HackTheBox | xorxorxor
Challenge Description:
Who needs AES when you have XOR?
What is XOR? (A short lesson in Logic)
Before we dive into this challenge, let’s take a short tour of logical operators, and XOR in particular. To understand what makes XOR special, we first need to break it down to its fundamental component: the OR statement.
Logic is built on the concept of propositions and their truth values. An OR statement is true whenever at least one side of the proposition is true, regardless of the other side. Put another way, a proposition built on OR is false only when both sides of the operator are false.
Truth table for OR:
OR | T | F |
T | T | T |
F | T | F |
The main difference between XOR and OR is in what the X stands for: exclusive. An XOR statement is true only when exactly one side is true. Here’s what that looks like in a truth table:
Truth table for XOR:
XOR | T | F |
T | F | T |
F | T | F |
Another way of thinking about this is that a proposition built on XOR is false any time both sides of the operator share the same truth value (even if both sides are true).
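To see both operators side by side, here is a minimal Python sketch that enumerates every combination of truth values. The helper name `truth_rows` is made up for this illustration; `^` applied to two Python booleans behaves as XOR.

```python
from itertools import product

def truth_rows(op):
    """Return (p, q, op(p, q)) for every combination of truth values."""
    return [(p, q, op(p, q)) for p, q in product([True, False], repeat=2)]

or_rows = truth_rows(lambda p, q: p or q)
xor_rows = truth_rows(lambda p, q: p ^ q)  # ^ on two bools is XOR

for name, rows in (("OR", or_rows), ("XOR", xor_rows)):
    print(name)
    for p, q, result in rows:
        print(f"  {p!s:5} {q!s:5} -> {result}")
```

The only row where the two tables disagree is the one where both sides are true.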
XOR in Computing
Now that we have a fundamental understanding of the logic behind XOR, let’s put it into practice within computing. Imagine we have some binary input A, 10011011, and we XOR it with another input B, 11101010. Any position where both inputs share the same bit outputs a 0, and any position where they differ outputs a 1. In this example, C represents our output after performing the XOR operation.
A: 1 0 0 1 1 0 1 1
B: 1 1 1 0 1 0 1 0
C: 0 1 1 1 0 0 0 1
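The same calculation can be checked in Python, where `^` is the bitwise XOR operator on integers. This is just a quick sketch using the A and B values from the example; the variable names are arbitrary.

```python
a = 0b10011011
b = 0b11101010
c = a ^ b  # bitwise XOR, applied to each bit position independently

# Print each value as an 8-bit binary string so the columns line up.
print(f"A: {a:08b}")
print(f"B: {b:08b}")
print(f"C: {c:08b}")
```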
Apply this to the Challenge
What do we know about HTB challenges? The goal is to get a flag in the format of HTB{some_value}, right? We can leverage that knowledge to help us understand more about how the challenge script behaves, even if our Python/coding chops aren’t great.
Okay, the challenge has us download two files: a Python script, challenge.py, and a text file containing the flag after it has been encrypted using this script. The encrypted flag is:
134af6e1297bc4a96f6a87fe046684e8047084ee046d84c5282dd7ef292dc9
Let’s examine the python script a little to get a basic idea of what’s happening under the hood. Here are some snippets that we want to pay attention to.
The function that encrypts the data:
    def encrypt(self, data: bytes) -> bytes:
        xored = b''
        for i in range(len(data)):
            # XOR each data byte with the key byte at the same position, wrapping around the key
            xored += bytes([data[i] ^ self.key[i % len(self.key)]])
        return xored
The line that outputs the encrypted data:
    print('Flag:', crypto.encrypt(flag).hex())
Most importantly, the function that creates the key:
    def __init__(self):
        self.key = os.urandom(4)
So what can we infer from just looking at the script?
- The encryption key is only 4 bytes long, and it repeats: byte i of the data is XORed with key byte i % 4
- The flag passed in is handled as raw bytes
- The encrypted output is represented in hexadecimal
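These three observations can be tried out with a small stand-alone sketch. It is not the challenge script itself: the function name `xor_with_key` and the sample message are made up for illustration, and only the 4-byte `os.urandom` key and the `i % len(key)` indexing mirror the snippets above.

```python
import os

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(byte ^ key[i % len(key)] for i, byte in enumerate(data))

key = os.urandom(4)              # same key size as the challenge
message = b"HTB{example_only}"   # hypothetical stand-in, not the real flag

ciphertext = xor_with_key(message, key)
roundtrip = xor_with_key(ciphertext, key)  # applying the same key again undoes it

print(ciphertext.hex())
print(roundtrip)
```

Because XORing with the same key twice cancels out, the same function serves for both encryption and decryption.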
Think back to what we know about HTB challenges: the flag ALWAYS starts with “HTB{”. Now look at what we discovered from analyzing the script: the key is 4 bytes long. That means we know exactly as many plaintext bytes as there are key bytes. Can you see the correlation? Good.
Decrypting using XOR (the “hard” way)
Since we know the first 4 bytes of the original message (“HTB{”) and the first 4 bytes of the ciphertext (13 4a f6 e1), and we know that an XOR operation is being performed, we can figure out the key! We’ll do it one byte at a time. Using an ASCII table, we can grab the hex values of the characters we know.
Flag Plaintext (in Hex):
H = 48
T = 54
B = 42
{ = 7b
Flag Encrypted:
13
4a
f6
e1
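If you would rather not look these values up by hand, a short Python sketch can produce both lists. The variable names are arbitrary; the hex string is the encrypted flag from the challenge.

```python
known_plaintext = b"HTB{"
ciphertext = bytes.fromhex(
    "134af6e1297bc4a96f6a87fe046684e8047084ee046d84c5282dd7ef292dc9"
)

# Two-digit hex value of each known plaintext byte and of the first 4 ciphertext bytes.
plain_hex = [f"{byte:02x}" for byte in known_plaintext]
cipher_hex = [f"{byte:02x}" for byte in ciphertext[:4]]

print(plain_hex)
print(cipher_hex)
```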
What do we need to do from here? Simple: find the value we need to XOR the input with to get the output. To do that, let’s go back to representing things in 1’s and 0’s, since that is easier to visualize. Hexadecimal is base 16 and binary is base 2, so each hex digit represents exactly 4 bits (16 = 2^4), and each two-digit hex value is one 8-bit byte. We need to do this for the first 4 bytes to get the entire encryption key, so let’s get started. (If you aren’t great with binary, just use a hex-to-binary converter.) I’m going to format this the way I did in the XOR in Computing section above: A represents our plaintext, B is the key, and C is the output. For the first byte, I will break it down into steps; for the rest, I’ll just show the solution.
48 –> 0100 1000
13 –> 0001 0011
A: 0 1 0 0 1 0 0 0
B: ? ? ? ? ? ? ? ?
C: 0 0 0 1 0 0 1 1
Looking at this, we just need to go bit by bit and think of what value for B will result in the output (C) we want. Remember that XOR is only false (0) when both inputs are the same. Using that knowledge, we can fill in B fairly quickly.
A: 0 1 0 0 1 0 0 0
B: 0 1 0 1 1 0 1 1
C: 0 0 0 1 0 0 1 1
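The bit-by-bit reasoning can be written out as a small sketch: wherever A and C match, the key bit must be 0, and wherever they differ, it must be 1. The variable names are illustrative only.

```python
a_bits = "01001000"  # plaintext byte 0x48 ("H")
c_bits = "00010011"  # ciphertext byte 0x13

# Key bit is 0 where A and C agree, 1 where they differ.
b_bits = "".join("0" if a == c else "1" for a, c in zip(a_bits, c_bits))
key_byte = int(b_bits, 2)

print(b_bits)
print(f"{key_byte:02x}")

# The same result falls out of XORing the two bytes directly.
assert key_byte == 0x48 ^ 0x13
```

In other words, since A XOR B = C, the key byte is simply A XOR C.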
From there, we just convert the binary back to hex. In this first case, 01011011 becomes 5b. Applying this process to the rest of the known characters, HTB{, we’re left with 5b1eb49a as our key. From here, we decode the ciphertext one byte at a time, cycling through the 4 key bytes and starting over at the first key byte after every fourth byte. Decoding works because XOR is its own inverse: given the output (C) and the key (B), C XOR B gives back the input/plaintext (A).
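Recovering all four key bytes at once takes only a few lines of Python. This sketch assumes nothing beyond the known “HTB{” prefix and the first four ciphertext bytes; the variable names are arbitrary.

```python
known_plaintext = b"HTB{"
ciphertext_prefix = bytes.fromhex("134af6e1")

# key byte = plaintext byte XOR ciphertext byte, position by position
key = bytes(p ^ c for p, c in zip(known_plaintext, ciphertext_prefix))

print(key.hex())  # 5b1eb49a
```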
We could write a Python script that iterates through each byte of the ciphertext and XORs it with the key (since the process works in reverse), or we can just use a convenient online XOR tool and plug in what we know (the encrypted text and the key to XOR it with).
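For the scripted route, a minimal sketch might look like the following. The function name `xor_decrypt` is made up for this example; the key is the value recovered above and the hex string is the encrypted flag from the challenge.

```python
ciphertext = bytes.fromhex(
    "134af6e1297bc4a96f6a87fe046684e8047084ee046d84c5282dd7ef292dc9"
)
key = bytes.fromhex("5b1eb49a")

def xor_decrypt(data: bytes, key: bytes) -> bytes:
    """XOR every ciphertext byte with the repeating key to recover the plaintext."""
    return bytes(byte ^ key[i % len(key)] for i, byte in enumerate(data))

plaintext = xor_decrypt(ciphertext, key)
print(plaintext.decode("ascii"))
```

Running it prints the flag in the familiar HTB{...} format.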