Password check

reverse_engineering Difficulty 1–5 30 min certifiable

Theory

Why This Matters

Password validation logic is one of the most common targets in malware analysis, license enforcement research, and exploit development. Malware families routinely embed hardcoded C2 authentication passwords, operator passwords, or configuration decryption keys inside their binaries. Ransomware samples have been cracked by recovering the hardcoded seed passed to a weak PRNG. In vulnerability research, identifying how a firmware image validates administrative credentials opens the path to authentication bypass, a pattern seen repeatedly in consumer routers and industrial control systems. In CTF competitions, the password-check archetype is the entry point to nearly every reverse-engineering track, teaching the fundamental skill of reading a binary's logic without source code.

Core Concept

A password check in a compiled binary reduces to one of three families of comparison: direct string comparison, hash-and-compare, or encoding-then-compare. Understanding which family you are dealing with determines the correct extraction strategy.

Direct string comparison calls functions such as strcmp, strncmp, or memcmp. The correct password often appears in plaintext in the .rodata section and is immediately visible to the strings command.

Custom XOR comparison, a common encoding-then-compare variant, iterates over the input bytes and XORs each against a key byte (possibly cycling) before comparing the result to an encoded reference buffer. When the key is not stored next to the reference buffer, it can be recovered with known-plaintext analysis: if you know what one plaintext byte must be, XOR it with the corresponding encoded byte to recover the key byte at that position.
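To make the "visible to the strings command" point concrete, here is a minimal sketch of what a strings-style scan does: it collects runs of printable ASCII of at least a minimum length. The function name printable_runs and the sample buffer are illustrative assumptions, and the real strings(1) has more options (for example, it also accepts tabs and other encodings).

```python
import re

def printable_runs(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long (roughly strings -n)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical fragment of a .rodata section: NUL-terminated strings plus noise.
sample = b"\x00\x01Enter password: \x00SuperSecret\x00\x7f\x02ok\x00"

print(printable_runs(sample))  # ['Enter password: ', 'SuperSecret']
```

Note that the prompt string and the password are both reported; nothing in the output says which one is compared against user input, which is why the xref step is still needed.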

Hash identification by magic constant is the most reliable fingerprinting approach when the algorithm is not obfuscated. The MD5 initializer loads four 32-bit constants: 0x67452301, 0xEFCDAB89, 0x98BADCFE, and 0x10325476. SHA-1 initializes five 32-bit words: the same four, plus 0xC3D2E1F0. SHA-256 is identified by its array of 64 round constants beginning with 0x428a2f98. Searching a binary for these constants with Ghidra's Memory Search, or with grep over a hexdump, pinpoints the hash implementation quickly; on little-endian targets such as x86 the bytes appear reversed in memory (for example, 98 2f 8a 42 for 0x428a2f98).
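The constant search can also be scripted outside Ghidra. The following is a minimal sketch, assuming the binary has already been read into a bytes object; find_hash_constants and the SIGNATURES table are illustrative names, and only three of the constants listed above are included. It tries both byte orders so the endianness of the target does not have to be known in advance.

```python
import struct

# A few of the well-known constants named in the text.
SIGNATURES = {
    "MD5/SHA-1 init word 0x67452301": 0x67452301,
    "SHA-1 init word 0xC3D2E1F0": 0xC3D2E1F0,
    "SHA-256 K[0] 0x428a2f98": 0x428A2F98,
}

def find_hash_constants(data: bytes) -> list[tuple[str, str, int]]:
    """Return (label, byte_order, offset) for every occurrence of a known constant."""
    hits = []
    for label, value in SIGNATURES.items():
        for order, fmt in (("little", "<I"), ("big", ">I")):
            needle = struct.pack(fmt, value)
            offset = data.find(needle)
            while offset != -1:
                hits.append((label, order, offset))
                offset = data.find(needle, offset + 1)
    return hits

# Synthetic stand-in for a binary image: two constants stored little-endian.
blob = (b"\x90" * 8 + struct.pack("<I", 0x428A2F98)
        + b"\x00" * 4 + struct.pack("<I", 0x67452301))

for hit in find_hash_constants(blob):
    print(hit)
```

A hit on 0x67452301 alone does not separate MD5 from SHA-1; the presence or absence of 0xC3D2E1F0 nearby is the distinguishing signal.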

The cross-reference (xref) technique ties everything together: locate the string or hash constant, then use Ghidra's References → Show References To to list every function that touches it, and trace backward to the caller that performs the validation.

Technical Deep-Dive

// Typical decompiled password check (Ghidra output, cleaned):
// Three comparison families side-by-side for reference.

// --- Family 1: strcmp (plaintext) ---
int check_password_strcmp(char *input) {
    // "SuperSecret" lives in .rodata — visible to strings(1)
    return strcmp(input, "SuperSecret") == 0;
}

// --- Family 2: XOR-encoded comparison ---
// Encoded reference: { 0x31, 0x1c, 0x0d, 0x15, 0x06 }
// Key:               { 0x59, 0x79, 0x61, 0x79, 0x69 }  ("Yyayi")
// Plaintext:         { 0x31^0x59, 0x1c^0x79, ... } => "hello"
int check_password_xor(char *input) {
    static unsigned char encoded[] = {0x31, 0x1c, 0x0d, 0x15, 0x06};
    static unsigned char key[]     = {0x59, 0x79, 0x61, 0x79, 0x69};
    // Only the first 5 bytes are checked, so any input starting with the password passes.
    for (int i = 0; i < 5; i++) {
        if ((input[i] ^ key[i]) != encoded[i]) return 0;
    }
    return 1;
}

// --- Family 3: hash-and-compare (SHA-256 constant fingerprint) ---
// SHA-256 round constant K[0] = 0x428a2f98 located at 0x00404020 (example address)
// If you see this in Memory Search => SHA-256 implementation nearby

# XOR key recovery from known-plaintext (Python):
# Suppose we know the first byte of the correct password is 'h' (0x68)
encoded_byte = 0x31   # byte 0 of the encoded buffer
known_plain   = 0x68  # first char of guessed plaintext
key_byte      = encoded_byte ^ known_plain
print(hex(key_byte))  # => 0x59, which matches key[0]

# Bulk decryption once key is fully recovered:
encoded = bytes([0x31, 0x1c, 0x0d, 0x15, 0x06])
key     = bytes([0x59, 0x79, 0x61, 0x79, 0x69])
plain   = bytes(e ^ k for e, k in zip(encoded, key))
print(plain.decode())  # => hello

# Quick-win pipeline before opening a disassembler:
strings -n 6 target_binary | grep -E '[A-Za-z0-9_!@#$%]{6,}'
# Then confirm with:
objdump -s -j .rodata target_binary | less
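Family 3 above only fingerprints the algorithm. Once the expected digest constant has also been located, the usual next step is a guessing attack, since the hash cannot be inverted directly. The following is a minimal sketch, assuming an unsalted, single-pass hash of the raw password; crack_digest, target, and wordlist are illustrative names, and the target digest here is computed in the script as a stand-in for a value dumped from a binary.

```python
import hashlib

def crack_digest(target_hex: str, candidates, algorithm: str = "sha256"):
    """Return the first candidate whose digest matches target_hex, else None."""
    target = target_hex.lower()
    for word in candidates:
        if hashlib.new(algorithm, word.encode()).hexdigest() == target:
            return word
    return None

# Stand-in for the 32-byte digest constant recovered from the binary.
target = hashlib.sha256(b"hello").hexdigest()
wordlist = ["admin", "letmein", "hello", "SuperSecret"]

print(crack_digest(target, wordlist))  # hello
```

If the decompiled code mixes in a salt or iterates the hash, the candidate transformation inside the loop has to mirror that logic before the comparison can succeed.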

Reverse Engineering Methodology

Run checksec --file=target_binary to note protections (PIE, stripped symbols). Stripped binaries require more manual xref work.
Run strings -n 6 target_binary and pipe to grep for likely passwords, hash digests, or encoding keys. Note any candidate strings.
Open the binary in Ghidra. Auto-analyze. Navigate to Search → Memory and search for the byte pattern of hash initializers as they are stored on little-endian x86: 01 23 45 67 for the MD5/SHA-1 word 0x67452301, and 98 2f 8a 42 for SHA-256 K[0] (0x428a2f98).
Use Ghidra References (right-click any found constant or string → Show References To) to reach the comparison or hash function.
Identify the comparison function: strcmp/memcmp family → extract plaintext argument. XOR loop → recover key via known-plaintext. Hash → note algorithm, look for the expected digest constant.
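In radare2, cross-check the findings: aaa analyzes the binary, afl~check lists functions whose names contain "check", pdf @ sym.check_password disassembles one of them (the symbol name is an example and will be absent from a stripped binary), and iz lists strings in the data sections.
Set a GDB breakpoint at the comparison site and inspect both arguments at run time: b *0x00401234 (an example address), then r, then x/s $rdi and x/s $rsi once the breakpoint hits. On x86-64 System V, the first two arguments to strcmp are passed in rdi and rsi, so reading only one register shows only one side of the compare. This defeats obfuscation that decodes the reference value only at run time.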
If the check is a custom XOR, write a Python decryption script using the recovered key and verify that the output is printable ASCII.
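The last step can be sketched as follows. This is a minimal example under stated assumptions: the key is short and cycles, a prefix of the plaintext is known or guessable, and the encoded buffer is generated inside the script as a stand-in for bytes dumped from a binary. The helper names xor_cycle, recover_key_prefix, and is_printable_ascii are illustrative.

```python
from itertools import cycle

def xor_cycle(data: bytes, key: bytes) -> bytes:
    """XOR data against key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def recover_key_prefix(encoded: bytes, known_plain: bytes) -> bytes:
    """Known-plaintext step: key[i] = encoded[i] ^ plain[i] for each known byte."""
    return bytes(e ^ p for e, p in zip(encoded, known_plain))

def is_printable_ascii(data: bytes) -> bool:
    """Sanity check for a successful decryption."""
    return all(0x20 <= b <= 0x7E for b in data)

# Hypothetical 3-byte cycling key and encoded buffer (stand-in for dumped bytes).
key = bytes([0x59, 0x79, 0x61])
encoded = xor_cycle(b"flag{xor}", key)

# Guessing that the plaintext starts with "fla" yields the whole 3-byte key.
guess = recover_key_prefix(encoded, b"fla")
plain = xor_cycle(encoded, guess)

print(guess.hex())                                   # 597961
print(plain.decode() if is_printable_ascii(plain) else "not printable")  # flag{xor}
```

When the key is longer than the known prefix, only part of the output decodes cleanly; the printable-ASCII check then helps judge whether a guessed key length is plausible.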

Common Reverse Engineering Errors

Stopping at strings output without verifying: Strings often returns false positives from format strings, debug symbols, or library names. Always cross-reference any candidate string against actual code xrefs in Ghidra before treating it as the password.
Missing hash identification because the algorithm is inlined: Compilers inline small cryptographic functions. If you see a loop with bitwise rotations and a large constant array but no external call, search for the first constant from the array rather than looking for a function named sha256.
Assuming strcmp when the check is memcmp: memcmp does not stop at null bytes, meaning the "password" may contain embedded nulls. Always check the length argument and dump the full buffer, not just the string up to the first null.
Confusing key byte with encoded byte during XOR recovery: When recovering an XOR key from known-plaintext, you XOR the encoded byte with the known plaintext byte. Reversing the operands gives the same result mathematically but causes confusion when documenting the key; always label clearly.
Skipping dynamic analysis when static analysis stalls: Obfuscated or packed binaries decode their comparison target at runtime. Breakpointing at the actual cmp or call strcmp instruction and reading the argument registers in GDB (x/s $rdi and x/s $rsi on x86-64) reveals the live plaintext that static analysis missed.
Not accounting for little-endian byte order when searching for hash constants: The SHA-256 constant 0x428a2f98 is stored in memory as 98 2f 8a 42 on x86. Searching for 42 8a 2f 98 (big-endian) will return no results. Always confirm byte order before constructing a memory search pattern.
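The strcmp versus memcmp pitfall above is easy to demonstrate. The following is a small Python model, not real libc behavior: c_string_view mimics what strcmp, strings, or x/s would show, and memcmp_equal mimics memcmp(a, b, n) == 0 for buffers that are at least n bytes long. The reference buffer and both function names are hypothetical.

```python
def c_string_view(buf: bytes) -> bytes:
    """Bytes up to the first NUL: what a C-string tool or x/s would display."""
    return buf.split(b"\x00", 1)[0]

def memcmp_equal(a: bytes, b: bytes, n: int) -> bool:
    """Model of memcmp(a, b, n) == 0: exactly n bytes compared, NULs included."""
    return a[:n] == b[:n]

# Hypothetical 8-byte reference buffer with an embedded NUL at offset 2.
reference = b"ab\x00cd\x01ef"
length = 8

print(c_string_view(reference))                                   # b'ab'
print(memcmp_equal(b"ab" + b"\x00" * 6, reference, length))      # False
print(memcmp_equal(reference, reference, length))                 # True
```

Treating the reference as a C string recovers only "ab", and a guess built from that view fails the 8-byte compare; dumping the full length given by the third argument avoids the mistake.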

NICE Framework Alignment

Code	Knowledge/Skill/Task Statement	How This Card Develops It
K0168	Knowledge of assembly language	Reading CMP/JZ/CALL sequences around password comparison logic at the assembly level
K0169	Knowledge of reverse engineering concepts and methodologies	Applying static + dynamic dual-method workflow: Ghidra decompilation followed by GDB runtime verification
S0131	Skill in analyzing binary code	Extracting plaintext passwords, XOR keys, and hash algorithm identifiers from stripped ELF binaries
T0028	Conduct and/or support authorized penetration testing	Recovering hardcoded credentials from firmware/malware binaries to demonstrate authentication bypass risk
T0286	Perform reverse engineering on software and/or firmware	End-to-end binary analysis workflow: strings → disassembly → decompilation → dynamic confirmation