License key generation

reverse_engineering Difficulty 1–5 30 min certifiable

Theory

Why This Matters

License key reversal sits at the intersection of software protection research, intellectual property law, and practical exploit development. Security researchers routinely analyze commercial software key validation algorithms as part of authorised assessments — vendors sometimes use the same flawed keygen logic in both consumer software and industrial SCADA components, where an attacker who can generate valid "license" tokens can unlock hidden diagnostic modes or escalate privileges. Malware families including Mirai variants have adopted serial-number-style tokens to gate access to C2 panels, and recovering the generation algorithm lets defenders generate decoy tokens for honeypot attribution. In CTF competition, the license-key archetype teaches constraint solving and algorithmic thinking that directly transfers to firmware authentication research.

Core Concept

A license key is a structured token that encodes constraints a validator checks. The validator reads segments of the token, applies arithmetic transformations, and checks that the result equals an expected value. The reverser's goal is to understand those constraints well enough to generate arbitrary valid tokens — a keygen.

Checksum schemes fall into three categories commonly seen in binaries. The Luhn algorithm (used in credit card numbers) computes a weighted digit sum and checks divisibility by 10. CRC32 applies a polynomial-division algorithm against a 256-entry lookup table; the table is the fingerprint — search memory for 0x77073096 (the first non-zero CRC32 table entry). Custom checksums are the most common CTF variant: a loop that accumulates a sum, product, or XOR of key characters, then checks the final value against a hardcoded expected result.

Modular arithmetic constraints appear whenever the validator checks (key_value % N) == R. Reversing this means solving for any key_value satisfying the congruence — infinitely many solutions exist, so keygen is trivial: pick any multiple of N and add R. When the constraint is (a * x) % N == b, recovering x requires the modular inverse: x = b * modinv(a, N) (valid only when gcd(a, N) == 1). Python's pow(a, -1, N) computes this directly in Python 3.8+.

Z3 SMT solver handles complex multi-constraint systems: model the key bytes as BitVec variables, add each validator check as an assertion, call solve(), and extract any satisfying assignment. This replaces manual algebra for validators with five or more interacting constraints.

The crucial methodological distinction: finding a valid key answers the CTF challenge; understanding the algorithm produces a keygen and demonstrates full reversal — the higher-value skill in professional engagements.

Technical Deep-Dive

// Decompiled Ghidra output — typical 3-segment license validator:
// Key format:  XXXXX-YYYYY-ZZZZZ  (groups of 5 decimal digits)

int validate_license(char *key) {
    int seg[3];
    // parse three 5-digit segments
    sscanf(key, "%5d-%5d-%5d", &seg[0], &seg[1], &seg[2]);

    // Constraint 1: first segment must be divisible by 7
    if (seg[0] % 7 != 0) return 0;

    // Constraint 2: second segment is CRC32 of first (mod 100000)
    if (seg[1] != (crc32(seg[0]) % 100000)) return 0;

    // Constraint 3: (17 * seg[2]) mod 99991 == seg[0]
    // => seg[2] = seg[0] * modinv(17, 99991) mod 99991
    if ((17 * seg[2]) % 99991 != seg[0]) return 0;

    return 1;
}

# Keygen implementing the above constraints:
import binascii, math

def crc32_seg(n: int) -> int:
    data = str(n).encode()
    return binascii.crc32(data) & 0xFFFFFFFF

def modinv(a: int, n: int) -> int:
    return pow(a, -1, n)           # Python 3.8+

def generate_key(seed: int = 700) -> str:
    # Constraint 1: seg0 divisible by 7
    seg0 = seed - (seed % 7)      # round down to nearest multiple of 7
    # Constraint 2: CRC32-derived second segment
    seg1 = crc32_seg(seg0) % 100000
    # Constraint 3: modular inverse
    seg2 = (seg0 * modinv(17, 99991)) % 99991
    return f"{seg0:05d}-{seg1:05d}-{seg2:05d}"

print(generate_key(700))   # e.g. 00700-38471-05917

# Z3 approach for a multi-constraint validator (5+ constraints):
from z3 import BitVec, solve, UMod

k0, k1, k2 = BitVec('k0', 32), BitVec('k1', 32), BitVec('k2', 32)
solve(
    UMod(k0, 7) == 0,
    k0 >= 10000, k0 <= 99999,
    UMod(17 * k2, 99991) == k0,
    k1 >= 0, k1 <= 99999,
)

Reverse Engineering Methodology

Open the binary in Ghidra. Search strings for key-format hints (dashes, regex-like patterns). Cross-reference any sscanf or strtol calls to reach the parsing entry point.
Identify the segment count, allowed character set (decimal digits, Base32, hex), and separator format from the sscanf format string or explicit parser loop.
Trace each parsed segment into arithmetic operations. Note every % (modulo), *, +, and bitwise operator. Annotate in Ghidra's comment field.
Fingerprint any checksum: search Memory in Ghidra for 96 30 07 77 (little-endian CRC32 table entry 0x77073096). Presence of a 256-entry DWORD table confirms CRC32.
Model constraints in Python or z3-solver. For modular inverse constraints, confirm gcd(a, N) == 1 with math.gcd before calling pow(a, -1, N).
Write a keygen function and test against the binary with gdb: set a breakpoint at the final ret of the validation function, call the binary with your generated key, and confirm $eax == 1 (success).
In radare2, use axt @ sym.crc32 to find all callers of any suspected checksum routine, confirming it is only called from the validator (no other uses that might impose additional constraints).

Common Reverse Engineering Errors

Treating the first found valid key as full reversal: Manually guessing or brute-forcing a single key satisfies the challenge flag but provides no algorithmic understanding. Always document every constraint symbolically before writing the keygen.
Ignoring the gcd precondition for modular inverse: modinv(a, N) only exists when gcd(a, N) == 1. If the modulus is not prime and shares a factor with the multiplier, pow(a, -1, N) raises ValueError. Recognise this as evidence that the constraint has multiple solution classes, not that the constraint is impossible.
Misidentifying CRC32 as a custom hash: CRC32 is characterised by its 256-entry lookup table of 32-bit values. Seeing a loop that indexes a large DWORD table does not mean custom — search for the canonical first entry before concluding the algorithm is proprietary.
Parsing the wrong function as the validator: Binaries often contain multiple sanity checks before the real validator (format validation, length check, character-set validation). Ensure you have reached the final cryptographic constraint function, not an early-exit guard.
Not accounting for big-endian segment encoding: Some license validators read multi-byte segments in big-endian order. An offset-by-one error in byte ordering produces a key that passes manual calculation but fails the binary — always confirm byte order with GDB's x/4bx at the parsed variable address.
Z3 timeout on large BitVec widths: Z3 can time out on 64-bit BitVec with many constraints. Start with 32-bit and widen only if the constraint arithmetic overflows.

NICE Framework Alignment

Code	Knowledge/Skill/Task Statement	How This Card Develops It
K0168	Knowledge of assembly language	Reading arithmetic instruction sequences (IMUL, IDIV, CDQ) in the validator's assembly to identify modular arithmetic constraints
K0169	Knowledge of reverse engineering concepts and methodologies	Applying systematic constraint extraction: decompile → annotate → model → solve → verify
S0131	Skill in analyzing binary code	Identifying CRC32 tables by memory constant fingerprint and extracting modular constraint coefficients from Ghidra decompiler output
T0028	Conduct and/or support authorized penetration testing	Generating valid license tokens as part of software protection assessment to demonstrate algorithm weakness
T0286	Perform reverse engineering on software and/or firmware	Writing a fully algorithmic keygen from decompiled validation logic, proving complete understanding of the protection scheme