XOR Chain Obfuscation Reversal: Sequential Key Recovery and Position-Derived Key Derivation Analysis

encoding_crypto_classical Difficulty 1–5 30 min certifiable

Theory

Reverse Engineering Methodology

An XOR chain applies multiple XOR operations to a data buffer in sequence, each with a different key: data XOR key1 XOR key2 XOR ... XOR keyN = ciphertext. Because XOR is associative and commutative, this is mathematically equivalent to XORing with a single combined key: K_combined = key1 XOR key2 XOR ... XOR keyN. To decrypt, XOR the ciphertext with all keys: ciphertext XOR key1 XOR key2 XOR ... XOR keyN = data. The order of XOR operations does not matter for the final result.

Static XOR chains use fixed byte sequences stored in the binary. Recovery is straightforward: locate each key in the binary (they appear as data constants or immediates), XOR them all together to produce K_combined, then XOR the ciphertext with K_combined.

Position-derived keys are more subtle. The key applied to each byte depends on its index within the buffer: ciphertext[i] = data[i] XOR f(i) where f(i) is a derivation function. Common derivation functions seen in CTF challenges: - f(i) = i % 256 — key is the index modulo 256 - f(i) = (i * C) % 256 — key is a linear function of index - f(i) = prev_output[i-1] — key derived from previous ciphertext byte (CBC-like chaining) - f(i) = seed ^ (i << 2) — bitwise derivation with a seed constant

Identifying position-derived XOR in disassembly. Look for a loop where the XOR instruction's second operand is a register that is modified within the loop body (incremented, multiplied, masked) rather than loaded from a fixed address. The modification to that register is the derivation function f(i).

Known-plaintext recovery. If any plaintext bytes are known (e.g., the flag prefix CTF{ occupies the first four bytes), the key bytes at those positions can be recovered directly: key[i] = ciphertext[i] XOR known_plain[i]. For position-derived keys, this reveals f(0), f(1), f(2), f(3), which may be enough to identify the derivation function by pattern recognition.

Technical Deep-Dive

from functools import reduce

def xor_combine_keys(*keys: bytes) -> bytes:
    """XOR multiple byte-string keys together into one combined key."""
    max_len = max(len(k) for k in keys)
    result = bytearray(max_len)
    for key in keys:
        for i, b in enumerate(key):
            result[i % max_len] ^= b
    return bytes(result)

def xor_decrypt_static(ciphertext: bytes, *keys: bytes) -> bytes:
    combined = xor_combine_keys(*keys)
    return bytes(c ^ combined[i % len(combined)] for i, c in enumerate(ciphertext))

def xor_decrypt_position_derived(ciphertext: bytes, derive_key: callable) -> bytes:
    return bytes(c ^ derive_key(i) for i, c in enumerate(ciphertext))

# --- Example: position-derived with linear function key = (i * 7 + 0x13) % 256
def linear_key(i: int) -> int:
    return (i * 7 + 0x13) % 256

# --- Recover derivation function from known-plaintext ---
def recover_key_stream(ciphertext: bytes, known_plain: bytes) -> bytes:
    return bytes(c ^ p for c, p in zip(ciphertext, known_plain))

# --- CBC-style chaining: each key byte is previous ciphertext byte ---
def xor_cbc_decrypt(ciphertext: bytes, iv: int = 0x00) -> bytes:
    result = []
    prev = iv
    for b in ciphertext:
        result.append(b ^ prev)
        prev = b
    return bytes(result)

# --- Brute-force linear derivation parameters ---
def brute_linear_derivation(ciphertext: bytes, known_plain: bytes) -> list[tuple[int, int]]:
    """Find (multiplier, addend) such that key[i] = (i*mult + add) % 256."""
    key_stream = recover_key_stream(ciphertext[:len(known_plain)], known_plain)
    candidates = []
    for mult in range(256):
        for add in range(256):
            if all(key_stream[i] == (i * mult + add) % 256 for i in range(len(key_stream))):
                candidates.append((mult, add))
    return candidates

# Static XOR chain: XOR two keys together then decrypt ciphertext
python3 -c "
import sys
ct  = open('ciphertext.bin', 'rb').read()
k1  = bytes.fromhex('deadbeef')
k2  = bytes.fromhex('cafebabe')
combined = bytes(a ^ b for a, b in zip(k1, k2))
pt = bytes(c ^ combined[i % len(combined)] for i, c in enumerate(ct))
sys.stdout.buffer.write(pt)
" > plaintext.bin

# Identify XOR loop in Ghidra: search for XOR instruction inside a loop body
# Decompiler view: look for 'data[i] ^ key[i]' or '^ (i & 0xff)' patterns

# Dynamic: trace XOR operations with PIN (Intel Pin)
# pin -t source/tools/ManualExamples/obj-intel64/inscount0.so -- ./target
# (custom Pintool needed for XOR tracing — see Intel Pin documentation)

Common Reversing Errors

1. Treating a chain as a cipher. An XOR chain with N fixed keys has exactly the same security as a single-key XOR. Analysts sometimes assume the chaining adds security and look for complex structure that does not exist. Always reduce to K_combined first.

2. Missing key repetition. If the ciphertext is longer than the combined key, the key repeats. Many analysts test only the first key-length bytes; the rest of the plaintext remains encrypted. Apply the key cyclically: key[i % len(key)].

3. Confusing XOR-chain with XOR-CBC. In XOR-CBC, the key for byte i depends on the previously output byte, so the key is not static and cannot be precomputed. If static key recovery produces garbage beyond the first few bytes, suspect chaining.

4. Not accounting for known-plaintext overlap. The flag prefix CTF{ is 4 bytes. If the key length is, say, 16 bytes, you recover only 4 of the 16 key bytes. The remaining 12 require either more known-plaintext or brute force. Use all known structure: flag prefix, suffix }, and any null terminators that follow.

5. Ignoring index overflow. Position-derived functions using i directly (not i % 256) can produce key bytes outside 0–255 unless masked. Verify the derivation function masks to a byte before XOR; if not, the disassembly will show a movzx (zero-extend) or and 0xff instruction that you must replicate in your Python decoder.

6. Wrong endianness on multi-byte XOR keys. A 4-byte XOR key stored as a dword in the binary is subject to little-endian byte order. 0xDEADBEEF in memory is bytes EF BE AD DE. Always verify the byte order by cross-checking your decryption against a known portion of the output.

Challenge Lab

Reinforce your learning with a hands-on generated challenge based on this card's competency.