Detecting Length-Extension and Forgery Flaws in Custom MAC Implementations

encoding_crypto_classical Difficulty 1–5 30 min certifiable

Theory

Why This Matters

Custom MAC constructions have caused real-world authentication bypasses wherever developers believed that "wrapping a hash around a key" was sufficient for message authentication. CVE-2009-3555, the TLS renegotiation vulnerability, stemmed partly from protocol-level confusion about what the MAC actually authenticated. More directly, the Flickr API (2009) and the Vimeo API used MAC = MD5(secret || params), which is trivially vulnerable to the length extension attack — an attacker can append arbitrary parameters and compute a valid MAC without knowing the secret. Python's hmac module documentation explicitly warns against naive constructions and mandates the use of hmac.compare_digest for timing-safe comparison. Any CTF challenge providing a web API or file format with a custom "hash-then-sign" or "secret-prefix" MAC scheme will test one of these four broken constructions.

Core Concept

A Message Authentication Code (MAC) must satisfy two properties: (1) only parties possessing the secret key can produce a valid tag, and (2) an attacker who can observe valid (message, tag) pairs cannot forge a valid tag for any new message. A secure MAC is a pseudorandom function (PRF) family indexed by the key.

Construction 1 — Secret prefix: MAC = H(key || message). This is broken by the length extension attack against Merkle-Damgård hash functions (MD5, SHA-1, SHA-256, SHA-512 — but not SHA-3 or BLAKE2). After processing, a Merkle-Damgård hash function's output is exactly its internal chaining state. An attacker who knows H(key || message) can feed this state back into the hash function and extend the computation: they compute H(key || message || padding || extension) without knowing the key. The resulting hash is a valid MAC for the extended message.

Construction 2 — Secret suffix: MAC = H(message || key). This is broken by collision attacks: if the attacker can find two messages m_1 and m_2 such that H(m_1) = H(m_2) (a hash collision), then any MAC tag valid for m_1 is also valid for m_2, regardless of the key.

Construction 3 — XOR-based: MAC = H(key XOR message) or H(key XOR pad || message) with weak pad. These are trivially forgeable through algebraic manipulation of the XOR operation; an attacker who knows one (message, MAC) pair can compute valid MACs for related messages.

Construction 4 — Encrypt-only: MAC = Enc_key(message). Block cipher encryption (ECB, CBC) does not provide authentication. ECB is malleable block-by-block; CBC-MAC is secure only for fixed-length messages without proper domain separation.

HMAC (RFC 2104) is the correct construction: HMAC(k, m) = H(k XOR opad || H(k XOR ipad || m)), where ipad = 0x36...36 and opad = 0x5C...5C. The nested structure ensures the outer hash takes as input a value that depends on both the key and the inner hash output, closing the length extension vulnerability. HMAC's security is provably reducible to the PRF property of the underlying compression function.

Timing side-channel: comparing MACs with a non-constant-time == operator leaks information about how many bytes of the correct tag match the attacker's guess. This allows byte-by-byte MAC forgery with O(256 · tag_length) queries. The fix is hmac.compare_digest(expected, received).

Technical Deep-Dive

# Demonstrating the four broken constructions and HMAC correctness
import hashlib, hmac, struct

key = b"supersecretkey"
msg = b"amount=100&user=alice"

# --- Construction 1: secret prefix (BROKEN — length extension) ---
def mac_prefix(key, msg):
    return hashlib.sha256(key + msg).hexdigest()

# Attack: extend msg with &admin=true without knowing key
# (See card crypto.protocol-flaws.hmac-length-extension.v1 for full attack code)
# attacker knows: mac_prefix(key, msg) and len(key)+len(msg)
# attacker can compute: mac_prefix(key, msg || padding || b"&admin=true")

# --- Construction 2: secret suffix (BROKEN — collision forgery) ---
def mac_suffix(key, msg):
    return hashlib.md5(msg + key).hexdigest()

# If H(m1) == H(m2), then mac_suffix(key, m1) == mac_suffix(key, m2)
# MD5 collision pairs are publicly known (Wang et al. 2004)

# --- Construction 3: XOR-based (BROKEN — trivially forgeable) ---
def mac_xor(key, msg):
    # Pad key/msg to same length; XOR; hash
    padded_key = (key * ((len(msg) // len(key)) + 1))[:len(msg)]
    xored = bytes(a ^ b for a, b in zip(padded_key, msg))
    return hashlib.sha256(xored).hexdigest()

# Attack: given (msg, tag), compute tag for msg XOR delta XOR delta = msg
# Knowing the XOR structure reveals the keystream; any bit flip in msg
# is predictable and the attacker can adjust the input to produce the same tag.

# --- Construction 4: encrypt-only CBC-MAC (BROKEN for variable length) ---
from Crypto.Cipher import AES
def cbc_mac(key, msg):
    # Zero-pad msg to 16-byte boundary
    pad_len = (16 - len(msg) % 16) % 16
    msg_padded = msg + b'x00' * pad_len
    cipher = AES.new(key[:16], AES.MODE_CBC, iv=b'x00' * 16)
    ct = cipher.encrypt(msg_padded)
    return ct[-16:]  # last ciphertext block

# --- CORRECT: HMAC ---
def mac_correct(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify_correct(key, msg, tag):
    expected = mac_correct(key, msg)
    # CRITICAL: timing-safe comparison
    return hmac.compare_digest(expected, tag)

# hash_extender tool for automated length extension attack
# https://github.com/iagox86/hash_extender
./hash_extender --data "amount=100&user=alice" 
                --secret 14 
                --append "&admin=true" 
                --signature <known_mac_hex> 
                --format sha256

# Output: new_signature and new_string to submit to the server

Cryptanalysis Methodology

Identify the MAC construction — Read the challenge source or API documentation. Look for H(key + msg), H(msg + key), custom XOR constructions, or encrypt-only tags.
Test for length extension — If the construction is H(key || msg) with a Merkle-Damgård hash, use hash_extender or the hlextend Python library to extend the known message and forge a valid tag for the extended payload.
Test for collision-based forgery — If the construction is H(msg || key) with MD5 or SHA-1, search for or use published collision pairs (e.g., the Shattered SHA-1 collisions) to substitute an equivalent message that produces the same MAC.
Test timing oracle — Submit MACs that differ in the first byte, measure response time. A timing difference confirms non-constant-time comparison, enabling byte-by-byte brute-force with O(256 · len) queries.
Test CBC-MAC with variable-length messages — Submit two messages m_1 and m_2 where the CBC-MAC of m_1 equals the first block of m_2; the CBC-MAC of the concatenation m_1 || m_2' can be forged.
Forge and submit — Construct the forged (message, tag) pair and submit it to the challenge endpoint. Verify the server accepts it.

Secure Implementation Note — Use HMAC with a modern hash (SHA-256 or SHA-3-256) for all message authentication: hmac.new(key, message, hashlib.sha256).digest(). Always compare tags with hmac.compare_digest() to prevent timing attacks. Keys must be generated with a CSPRNG and be at least 256 bits. If authenticated encryption is needed, use AES-GCM or ChaCha20-Poly1305 instead of a separate MAC.

Common Cryptanalysis Errors

Confusing HMAC with H(key || msg) — HMAC uses a specific double-hash construction; H(key || msg) is the broken secret-prefix construction. They are not the same even if both involve a key and a hash.
Attempting length extension against SHA-3 or BLAKE2 — These are not Merkle-Damgård; they use sponge or HAIFA constructions and are immune to length extension. Only target MD5, SHA-1, SHA-256, SHA-512.
Wrong secret length in hash_extender — The tool requires the exact length of the secret key in bytes. Guessing wrong produces an invalid extended hash. Try all plausible lengths (e.g., 8–32 bytes).
Ignoring the padding bytes in the extended message — The extended payload includes SHA padding bytes (0x80 followed by length encoding) between the original message and the extension. The server must accept these bytes; if input validation rejects non-printable characters, the attack may be blocked.
Not checking for timing vulnerability — Even with HMAC, a non-constant-time comparison makes the tag brute-forceable. Always test the comparison path separately from the construction.
Treating CBC-MAC as broken unconditionally — CBC-MAC with a fixed message length is secure. The vulnerability arises specifically with variable-length inputs lacking proper domain separation or padding.

NICE Framework Alignment

Code	Knowledge/Skill/Task Statement	How This Card Develops It
K0007	Knowledge of authentication, authorisation, and access control methods	Demonstrates how broken MAC constructions undermine message authentication and access control decisions
K0018	Knowledge of encryption algorithms and their weaknesses	Explains the algebraic weaknesses in non-standard MAC constructions at the hash-function level
K0019	Knowledge of cryptography and cryptographic key management	Connects MAC security to PRF properties and the necessity of correct key usage
K0074	Knowledge of network security protocols	Contextualises MAC failures in API authentication and protocol-level message integrity schemes
S0138	Skill in using public-key infrastructure tools	Trains use of hash_extender and hlextend for automated length extension against web APIs
T0259	Use cryptanalysis tools to recover plaintext from ciphertext	Provides methodology for identifying, exploiting, and forging custom MAC constructions