License key generation
Theory
Why This Matters
License key reversal sits at the intersection of software protection research, intellectual property law, and practical exploit development. Security researchers routinely analyze commercial software key validation algorithms as part of authorised assessments — vendors sometimes use the same flawed keygen logic in both consumer software and industrial SCADA components, where an attacker who can generate valid "license" tokens can unlock hidden diagnostic modes or escalate privileges. Malware families including Mirai variants have adopted serial-number-style tokens to gate access to C2 panels, and recovering the generation algorithm lets defenders generate decoy tokens for honeypot attribution. In CTF competition, the license-key archetype teaches constraint solving and algorithmic thinking that directly transfers to firmware authentication research.
Core Concept
A license key is a structured token that encodes constraints a validator checks. The validator reads segments of the token, applies arithmetic transformations, and checks that the result equals an expected value. The reverser's goal is to understand those constraints well enough to generate arbitrary valid tokens — a keygen.
Checksum schemes fall into three categories commonly seen in binaries. The Luhn algorithm (used in credit card numbers) computes a weighted digit sum and checks divisibility by 10. CRC32 applies a polynomial-division algorithm against a 256-entry lookup table; the table is the fingerprint — search memory for 0x77073096 (the first non-zero CRC32 table entry). Custom checksums are the most common CTF variant: a loop that accumulates a sum, product, or XOR of key characters, then checks the final value against a hardcoded expected result.
Modular arithmetic constraints appear whenever the validator checks (key_value % N) == R. Reversing this means solving for any key_value satisfying the congruence — infinitely many solutions exist, so keygen is trivial: pick any multiple of N and add R. When the constraint is (a * x) % N == b, recovering x requires the modular inverse: x = b * modinv(a, N) (valid only when gcd(a, N) == 1). Python's pow(a, -1, N) computes this directly in Python 3.8+.
Z3 SMT solver handles complex multi-constraint systems: model the key bytes as BitVec variables, add each validator check as an assertion, call solve(), and extract any satisfying assignment. This replaces manual algebra for validators with five or more interacting constraints.
The crucial methodological distinction: finding a valid key answers the CTF challenge; understanding the algorithm produces a keygen and demonstrates full reversal — the higher-value skill in professional engagements.
Technical Deep-Dive
// Decompiled Ghidra output — typical 3-segment license validator:
// Key format: XXXXX-YYYYY-ZZZZZ (groups of 5 decimal digits)
int validate_license(char *key) {
int seg[3];
// parse three 5-digit segments
sscanf(key, "%5d-%5d-%5d", &seg[0], &seg[1], &seg[2]);
// Constraint 1: first segment must be divisible by 7
if (seg[0] % 7 != 0) return 0;
// Constraint 2: second segment is CRC32 of first (mod 100000)
if (seg[1] != (crc32(seg[0]) % 100000)) return 0;
// Constraint 3: (17 * seg[2]) mod 99991 == seg[0]
// => seg[2] = seg[0] * modinv(17, 99991) mod 99991
if ((17 * seg[2]) % 99991 != seg[0]) return 0;
return 1;
}
# Keygen implementing the above constraints:
import binascii, math
def crc32_seg(n: int) -> int:
data = str(n).encode()
return binascii.crc32(data) & 0xFFFFFFFF
def modinv(a: int, n: int) -> int:
return pow(a, -1, n) # Python 3.8+
def generate_key(seed: int = 700) -> str:
# Constraint 1: seg0 divisible by 7
seg0 = seed - (seed % 7) # round down to nearest multiple of 7
# Constraint 2: CRC32-derived second segment
seg1 = crc32_seg(seg0) % 100000
# Constraint 3: modular inverse
seg2 = (seg0 * modinv(17, 99991)) % 99991
return f"{seg0:05d}-{seg1:05d}-{seg2:05d}"
print(generate_key(700)) # e.g. 00700-38471-05917
# Z3 approach for a multi-constraint validator (5+ constraints):
from z3 import BitVec, solve, UMod
k0, k1, k2 = BitVec('k0', 32), BitVec('k1', 32), BitVec('k2', 32)
solve(
UMod(k0, 7) == 0,
k0 >= 10000, k0 <= 99999,
UMod(17 * k2, 99991) == k0,
k1 >= 0, k1 <= 99999,
)
Reverse Engineering Methodology
- Open the binary in Ghidra. Search strings for key-format hints (dashes, regex-like patterns). Cross-reference any
sscanforstrtolcalls to reach the parsing entry point. - Identify the segment count, allowed character set (decimal digits, Base32, hex), and separator format from the
sscanfformat string or explicit parser loop. - Trace each parsed segment into arithmetic operations. Note every
%(modulo),*,+, and bitwise operator. Annotate in Ghidra's comment field. - Fingerprint any checksum: search Memory in Ghidra for
96 30 07 77(little-endian CRC32 table entry0x77073096). Presence of a 256-entry DWORD table confirms CRC32. - Model constraints in Python or z3-solver. For modular inverse constraints, confirm
gcd(a, N) == 1withmath.gcdbefore callingpow(a, -1, N). - Write a keygen function and test against the binary with
gdb: set a breakpoint at the finalretof the validation function, call the binary with your generated key, and confirm$eax == 1(success). - In radare2, use
axt @ sym.crc32to find all callers of any suspected checksum routine, confirming it is only called from the validator (no other uses that might impose additional constraints).
Common Reverse Engineering Errors
- Treating the first found valid key as full reversal: Manually guessing or brute-forcing a single key satisfies the challenge flag but provides no algorithmic understanding. Always document every constraint symbolically before writing the keygen.
- Ignoring the
gcdprecondition for modular inverse:modinv(a, N)only exists whengcd(a, N) == 1. If the modulus is not prime and shares a factor with the multiplier,pow(a, -1, N)raisesValueError. Recognise this as evidence that the constraint has multiple solution classes, not that the constraint is impossible. - Misidentifying CRC32 as a custom hash: CRC32 is characterised by its 256-entry lookup table of 32-bit values. Seeing a loop that indexes a large DWORD table does not mean custom — search for the canonical first entry before concluding the algorithm is proprietary.
- Parsing the wrong function as the validator: Binaries often contain multiple sanity checks before the real validator (format validation, length check, character-set validation). Ensure you have reached the final cryptographic constraint function, not an early-exit guard.
- Not accounting for big-endian segment encoding: Some license validators read multi-byte segments in big-endian order. An offset-by-one error in byte ordering produces a key that passes manual calculation but fails the binary — always confirm byte order with GDB's
x/4bxat the parsed variable address. - Z3 timeout on large BitVec widths: Z3 can time out on 64-bit
BitVecwith many constraints. Start with 32-bit and widen only if the constraint arithmetic overflows.
NICE Framework Alignment
| Code | Knowledge/Skill/Task Statement | How This Card Develops It |
|---|---|---|
| K0168 | Knowledge of assembly language | Reading arithmetic instruction sequences (IMUL, IDIV, CDQ) in the validator's assembly to identify modular arithmetic constraints |
| K0169 | Knowledge of reverse engineering concepts and methodologies | Applying systematic constraint extraction: decompile → annotate → model → solve → verify |
| S0131 | Skill in analyzing binary code | Identifying CRC32 tables by memory constant fingerprint and extracting modular constraint coefficients from Ghidra decompiler output |
| T0028 | Conduct and/or support authorized penetration testing | Generating valid license tokens as part of software protection assessment to demonstrate algorithm weakness |
| T0286 | Perform reverse engineering on software and/or firmware | Writing a fully algorithmic keygen from decompiled validation logic, proving complete understanding of the protection scheme |
Further Reading
- Reversing: Secrets of Reverse Engineering — Eldad Eilam, Chapter 9: Copy Protection (Wiley)
- The Art of Exploitation, 2nd Edition — Jon Erickson (No Starch Press)
- Z3 Tutorial — Leonardo de Moura & Nikolaj Bjørner, Microsoft Research (MSR-TR-2012)
Challenge Lab
Reinforce your learning with a hands-on generated challenge based on this card's competency.