Archive bomb (simulated)
Theory
Why This Matters
Archive bombs are a class of denial-of-service (DoS) attack that exploits the mismatch between a file's compressed size and its expanded size. The canonical example, 42.zip (also called "the zip bomb"), is a 42-kilobyte file that expands to approximately 4.5 petabytes when fully extracted. Any application that accepts archive uploads and extracts them without enforcing expansion limits is vulnerable to resource exhaustion: extraction can consume all available disk space, memory, or CPU, causing application downtime, host instability, or cascading failures. Real-world incidents include web application firewalls exhausting memory while scanning decompressed HTTP bodies, antivirus engines hanging on decompression, and mail servers crashing on receipt of archive attachments. Multiple libarchive CVEs involve resource exhaustion during archive handling. CVE-2019-14271 (Docker) is a code-injection flaw in the archive-based `docker cp` path rather than a resource-exhaustion bug, but it shows that archive handling in general is a recurring source of vulnerabilities.
Core Concept
A zip bomb exploits the fact that the DEFLATE compression algorithm achieves very high compression ratios on repetitive data. A file consisting entirely of zero bytes compresses at close to DEFLATE's ceiling of roughly 1032:1. A recursive zip bomb (the classic 42.zip design) nests archives: the outer archive contains 16 archives, each of which contains 16 more, and so on for 5 layers, with a 4.3 GB file of repetitive data at the bottom of every branch. The compressed size is tiny, while the fully expanded size grows exponentially with nesting depth (16^5 bottom-level files in total).
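The ratio claim is easy to check locally. The following is a minimal sketch using Python's standard `zlib` module. The helper name `deflate_ratio` and the buffer sizes are illustrative choices, not part of the original card. It compares a buffer of zero bytes with random data and then computes the fan-out of a 16-way, 5-layer nesting.

```python
import os
import zlib


def deflate_ratio(data: bytes, level: int = 9) -> float:
    """Return uncompressed_size / compressed_size for a zlib (DEFLATE) stream."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


zeros = bytes(10 * 1024 * 1024)        # 10 MiB of zero bytes
random_data = os.urandom(1024 * 1024)  # 1 MiB of incompressible data

print(f"zeros:  {deflate_ratio(zeros):.0f}:1")
print(f"random: {deflate_ratio(random_data):.2f}:1")

# Nesting multiplies the fan-out at every layer: 16 archives per layer, 5 layers.
leaf_count = 16 ** 5
print(f"bottom-level files in a 16-way, 5-layer bomb: {leaf_count:,}")
```

The exact ratio for the zero buffer depends on the zlib version and compression level, so treat the printed figure as indicative rather than a fixed constant.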
A non-recursive zip bomb (introduced by David Fifield in 2019) is more dangerous in practice because it needs no nested archives and therefore works against extractors that refuse to recurse. It uses overlapping entries: many ZIP directory entries whose file data overlaps a single shared block of highly compressed data, so the same compressed bytes are decompressed once per entry. Fifield's roughly 10 MB example declares 65,534 entries and expands to about 281 TB on extraction. This technique works against zip extractors that process entries one after another without checking whether their byte ranges overlap.
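One way to flag overlapping entries before extraction is to compare the byte ranges that the central directory claims for each entry. The sketch below is an illustration under stated assumptions, not a complete validator. `find_overlaps` and `zip_extents` are hypothetical helper names, and each entry's extent is a lower bound (header offset plus the 30-byte fixed local header plus the compressed size), so well-formed archives should never trigger it.

```python
import io
import zipfile

LOCAL_HEADER_FIXED = 30  # fixed-size part of a ZIP local file header, in bytes


def find_overlaps(extents):
    """extents: iterable of (name, start, end). Return overlapping neighbour pairs.

    After sorting by start offset, any overlap in the set implies an overlap
    between two neighbours, so checking adjacent pairs is sufficient to detect it.
    """
    ordered = sorted(extents, key=lambda e: e[1])
    overlaps = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if start_b < end_a:
            overlaps.append((name_a, name_b))
    return overlaps


def zip_extents(zf: zipfile.ZipFile):
    """Lower-bound byte range of each entry, from central directory metadata only."""
    return [
        (i.filename, i.header_offset,
         i.header_offset + LOCAL_HEADER_FIXED + i.compress_size)
        for i in zf.infolist()
    ]


# A well-formed archive has no overlapping entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("a.txt", b"a" * 1000)
    z.writestr("b.txt", b"b" * 1000)
with zipfile.ZipFile(buf) as z:
    clean_result = find_overlaps(zip_extents(z))

# Synthetic extents in which the second entry starts inside the first.
synthetic_result = find_overlaps([("a", 0, 100), ("b", 50, 150)])
print(clean_result, synthetic_result)
```

Because the offsets come from attacker-controlled metadata, this check complements a byte limit during extraction and does not replace it.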
Detection strategies:
- Check uncompressed size before extraction: The ZIP central directory header contains the `uncompressed size` field (4 bytes). A sum of all uncompressed sizes exceeding a threshold (e.g., 500 MB) allows rejection before any extraction occurs. Note: this field is attacker-controlled and may be falsified; it is a hint, not a guarantee.
- Compression ratio threshold: If `compressed_size / uncompressed_size < 0.02` (i.e., compression ratio > 50x), treat the archive as suspicious. Legitimate archives rarely exceed 10:1 overall.
- Byte limit on extraction: Enforce a hard limit on total bytes written during extraction (e.g., 100 MB). Abort and delete partial output if the limit is reached.
- Entry count limit: Limit the number of entries processed (e.g., 1000). Non-recursive bombs rely on large entry counts.
- Extraction time limit: Enforce a CPU time limit on the extraction process using `SIGXCPU` or a watchdog thread.
A quine archive is an archive whose extracted content is itself (the same archive). It demonstrates the possibility of infinite recursive extraction but does not cause disk exhaustion in a single extraction step.
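A depth cap is what stops a quine, or any deeply nested bomb, from driving an extractor into unbounded recursion. The following sketch inspects nesting in memory under assumed limits. `MAX_DEPTH`, `MAX_INNER_BYTES`, and `nested_depth` are illustrative names and values, not part of the original card.

```python
import io
import zipfile

MAX_DEPTH = 3                    # assumed cap on ZIP-in-ZIP nesting
MAX_INNER_BYTES = 1024 * 1024    # assumed cap on bytes read per inner member


def nested_depth(data: bytes, max_depth: int = MAX_DEPTH, _level: int = 1) -> int:
    """Return the ZIP nesting depth of data; raise ValueError past max_depth."""
    if _level > max_depth:
        raise ValueError(f"nesting deeper than {max_depth} levels")
    deepest = _level
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            with zf.open(info) as member:
                inner = member.read(MAX_INNER_BYTES + 1)  # bounded read
            if len(inner) > MAX_INNER_BYTES:
                raise ValueError(f"{info.filename} exceeds the per-member byte cap")
            if zipfile.is_zipfile(io.BytesIO(inner)):
                deepest = max(deepest, nested_depth(inner, max_depth, _level + 1))
    return deepest


def wrap(payload: bytes, name: str) -> bytes:
    """Helper for the demo: put payload into a new single-entry ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr(name, payload)
    return buf.getvalue()


level1 = wrap(b"hello", "a.txt")
level2 = wrap(level1, "inner.zip")
print(nested_depth(level1), nested_depth(level2))
```

A quine would keep presenting a valid inner ZIP at every level, so the function would raise once `_level` passes `max_depth` instead of recursing indefinitely.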
Technical Deep-Dive
```python
# ── Detection: analyse a ZIP before extraction ────────────────────────────
import os
import pathlib
import zipfile

MAX_UNCOMPRESSED_BYTES = 100 * 1024 * 1024  # 100 MB limit
MAX_COMPRESSION_RATIO = 50                  # reject if ratio > 50x
MAX_ENTRY_COUNT = 1000


def safe_zip_check(zip_path: str) -> dict:
    """
    Inspect a ZIP archive for archive bomb indicators.
    Returns a dict with findings. Does NOT extract any data.
    """
    findings = {"safe": True, "reason": None}
    with zipfile.ZipFile(zip_path, "r") as zf:
        infos = zf.infolist()

        # Check entry count
        if len(infos) > MAX_ENTRY_COUNT:
            findings["safe"] = False
            findings["reason"] = f"Entry count {len(infos)} exceeds limit {MAX_ENTRY_COUNT}"
            return findings

        total_compressed = sum(i.compress_size for i in infos)
        total_uncompressed = sum(i.file_size for i in infos)

        # Check total uncompressed size (may be falsified in central dir)
        if total_uncompressed > MAX_UNCOMPRESSED_BYTES:
            findings["safe"] = False
            findings["reason"] = (
                f"Declared uncompressed size {total_uncompressed/1e6:.1f} MB "
                f"exceeds {MAX_UNCOMPRESSED_BYTES/1e6:.0f} MB limit"
            )
            return findings

        # Check compression ratio
        if total_compressed > 0:
            ratio = total_uncompressed / total_compressed
            if ratio > MAX_COMPRESSION_RATIO:
                findings["safe"] = False
                findings["reason"] = (
                    f"Compression ratio {ratio:.0f}x exceeds threshold {MAX_COMPRESSION_RATIO}x"
                )
                return findings

    return findings  # {"safe": True, "reason": None}


# ── Byte-limited extraction ───────────────────────────────────────────────
def safe_extract(zip_path: str, dest: str) -> None:
    """Extract a ZIP with a byte-limit guard against runtime bombs."""
    total_written = 0
    os.makedirs(dest, exist_ok=True)
    dest_root = pathlib.Path(dest).resolve()
    with zipfile.ZipFile(zip_path, "r") as zf:
        for member in zf.infolist():
            # Path traversal check (see Card 4: Zip Slip).
            # Compare path components, not string prefixes, so that a sibling
            # such as "/tmp/out-evil" is not accepted for dest "/tmp/out".
            dest_path = (dest_root / member.filename).resolve()
            if not dest_path.is_relative_to(dest_root):  # Python 3.9+
                raise ValueError(f"Path traversal blocked: {member.filename}")

            # Directory entries carry no data; create them and move on
            if member.is_dir():
                dest_path.mkdir(parents=True, exist_ok=True)
                continue
            dest_path.parent.mkdir(parents=True, exist_ok=True)

            # Byte-limited write
            with zf.open(member) as src, open(dest_path, "wb") as dst:
                chunk_size = 65536
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    total_written += len(chunk)
                    if total_written > MAX_UNCOMPRESSED_BYTES:
                        dst.close()  # close before deleting the partial file
                        os.remove(dest_path)
                        raise ValueError(
                            f"Extraction aborted: exceeded {MAX_UNCOMPRESSED_BYTES} bytes"
                        )
                    dst.write(chunk)


print(safe_zip_check("suspicious.zip"))
```
```bash
# ── Create a test recursive zip bomb (for controlled lab use only) ─────────
# Layer structure: file -> z1 -> z2 -> z3 (3 levels)
python3 - <<'EOF'
import io
import os
import zipfile

# Innermost: 1 MB of zeros compressed to ~1 KB
zeros_1mb = b"\x00" * (1024 * 1024)
buf3 = io.BytesIO()
with zipfile.ZipFile(buf3, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("zeros.bin", zeros_1mb)
layer3 = buf3.getvalue()

# Middle layer: 16 copies of the innermost archive
buf2 = io.BytesIO()
with zipfile.ZipFile(buf2, "w", zipfile.ZIP_DEFLATED) as z:
    for i in range(16):
        z.writestr(f"l3_{i}.zip", layer3)
layer2 = buf2.getvalue()

# Outer layer: 16 copies of the middle archive
with zipfile.ZipFile("bomb.zip", "w", zipfile.ZIP_DEFLATED) as z:
    for i in range(16):
        z.writestr(f"l2_{i}.zip", layer2)

print(f"bomb.zip size: {os.path.getsize('bomb.zip')/1024:.1f} KB")
print(f"Full recursive expansion: {16*16*1} MB (3-level demo)")
EOF

# Check compression ratio with unzip -l
unzip -l bomb.zip | tail -5
# Column 1 is uncompressed size; compare with ls -lh bomb.zip
```
Security Assessment Methodology
- Locate archive upload endpoints: Identify all endpoints that accept `.zip`, `.tar`, `.gz`, `.bz2`, `.7z`, `.jar`, or `.war` uploads. Include automatic archive fetching from URLs (update endpoints, plugin installers).
- Submit a known zip bomb: Upload a well-known test bomb such as 42.zip or a purpose-built 3-layer recursive bomb. (The EICAR file is an antivirus test string, not a zip bomb, so it does not exercise this path.) Observe server behavior: does the request time out? Does the server return an error? Does disk usage spike?
- Test uncompressed size declared vs actual — Craft a ZIP with a falsified uncompressed size in the central directory (set to 1 byte) but actual content of 100 MB. Observe whether the server enforces extraction limits or relies on the declared size.
- Test entry count limits — Submit a ZIP with 5000 empty entries. If the server processes all entries, per-entry rate limits or entry count caps are absent.
- Measure response time and resource indicators — Compare response time between a normal upload and the bomb upload. Server-side timeouts, 503 responses, and memory error messages are indicators of unprotected extraction.
- Check for nested archive handling — Submit a 3-level nested ZIP. If the server recursively extracts inner archives, verify that recursion depth limits are enforced.
- Document impact — Record compression ratio, declared vs actual size, server response, and estimated resource consumption for the report.
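For the entry-count step above, a fixture can be generated in memory rather than hand-built. This is a minimal sketch for authorized lab testing. The function name `make_many_entries_zip` and the entry naming scheme are illustrative assumptions.

```python
import io
import zipfile


def make_many_entries_zip(count: int = 5000) -> bytes:
    """Build a ZIP holding `count` empty entries, entirely in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for i in range(count):
            zf.writestr(f"empty_{i:05d}.txt", b"")
    return buf.getvalue()


payload = make_many_entries_zip()
with zipfile.ZipFile(io.BytesIO(payload)) as zf:
    infos = zf.infolist()
    entry_count = len(infos)
    declared_bytes = sum(i.file_size for i in infos)

print(f"{entry_count} entries, {declared_bytes} declared bytes, "
      f"{len(payload)} bytes on the wire")
```

Because every entry is empty, this fixture passes any size or ratio check and isolates the entry-count control: a server that rejects it is enforcing a cap on entries specifically.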
Defensive Countermeasure: enforce extraction limits at three levels. (1) Reject archives where the declared total uncompressed size exceeds a configured threshold (e.g., 500 MB); (2) enforce a hard byte limit during extraction using a byte-counting wrapper around the decompressor; (3) limit entry count to a reasonable maximum (e.g., 1000). Extract archives in a sandboxed subprocess with a per-file size limit (`ulimit -f`) and a CPU time limit (`ulimit -t`); note that `ulimit -f` caps the size of each file written, not total disk usage, so it complements the byte counter and does not replace it. Use libarchive with `archive_read_set_format_filter_count` limits where applicable, after confirming that this call is available in the libarchive version in use.
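The `ulimit` controls have a direct equivalent in Python's standard `resource` module on POSIX systems. The sketch below applies them to a child process. `run_limited` is a hypothetical helper, the limit values are arbitrary, and the child command is a stand-in for a real extractor. It relies on `preexec_fn`, which is POSIX-only and not safe in multi-threaded parents.

```python
import os
import resource
import subprocess
import sys
import tempfile


def run_limited(cmd, max_file_bytes, cpu_seconds, wall_seconds=60):
    """Run cmd with RLIMIT_FSIZE and RLIMIT_CPU set in the child (POSIX only)."""
    def apply_limits():
        # Equivalent to `ulimit -f` (per-file size) and `ulimit -t` (CPU seconds)
        resource.setrlimit(resource.RLIMIT_FSIZE, (max_file_bytes, max_file_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
    return subprocess.run(cmd, preexec_fn=apply_limits,
                          capture_output=True, timeout=wall_seconds)


# Stand-in for an extractor: a child that tries to write 10 MiB to one file
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.bin")
    writer = f"open({target!r}, 'wb').write(bytes(10 * 1024 * 1024))"
    result = run_limited([sys.executable, "-c", writer],
                         max_file_bytes=1024 * 1024, cpu_seconds=5)
    written = os.path.getsize(target)

print("child exit code:", result.returncode, "bytes on disk:", written)
```

The child should fail once the file reaches the 1 MiB cap, leaving at most that many bytes on disk. The wall-clock `timeout` covers the case where the child blocks without consuming CPU.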
Common Assessment Errors
- Using 42.zip without understanding recursive vs non-recursive — 42.zip is a recursive bomb; many modern extractors detect recursive archives and refuse to process them. Use both recursive and non-recursive (overlapping-entry) test files to cover both cases.
- Assuming the declared size in the central directory is reliable — It is attacker-controlled. An extractor that enforces limits only based on declared size can be bypassed by falsifying it to 0. Always enforce limits during actual byte extraction.
- Forgetting other archive formats — Tar.gz, bzip2, 7zip, and RAR all support analogous compression bombs. If the application accepts multiple formats, test each independently.
- Testing only on the upload endpoint — Some applications extract archives server-side as part of processing pipelines (e.g., loading a plugin, processing a report template). These pipeline steps may lack the upload size limits applied to the initial upload.
- Not verifying server-side resource consumption — A server that returns a 200 response after "processing" a bomb may have truncated extraction early (a good sign) or may be asynchronously processing it (a bad sign). Verify actual disk usage server-side if possible.
- Conflating zip bomb with Zip Slip — These are distinct vulnerabilities. A zip bomb exploits decompression resource limits; Zip Slip exploits path traversal. An archive can contain both. Test for both simultaneously.
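The byte-limit principle carries over to other formats. As one illustration for gzip streams, the sketch below decompresses in bounded steps using `zlib.decompressobj` and its `max_length` argument, and aborts once the output cap is passed. `bounded_gunzip` and the cap values are illustrative assumptions, not a reference implementation.

```python
import gzip
import zlib


def bounded_gunzip(data: bytes, max_output: int, chunk: int = 64 * 1024) -> bytes:
    """Decompress a gzip stream, raising ValueError once output exceeds max_output."""
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS selects gzip framing
    out = bytearray()
    buf = data
    while not d.eof:
        piece = d.decompress(buf, chunk)  # produce at most `chunk` bytes per call
        buf = d.unconsumed_tail           # input not yet consumed by the decompressor
        if not piece and not buf:
            raise ValueError("truncated gzip stream")
        out += piece
        if len(out) > max_output:
            raise ValueError(f"decompressed output exceeds {max_output} bytes")
    return bytes(out)


small = gzip.compress(b"hello")
print(bounded_gunzip(small, max_output=100))

bomb = gzip.compress(bytes(5 * 1024 * 1024))  # 5 MiB of zeros
try:
    bounded_gunzip(bomb, max_output=1024 * 1024)
    blocked = False
except ValueError as exc:
    blocked = True
    print("blocked:", exc)
```

The cap is enforced on bytes actually produced, so it holds regardless of any size the stream declares about itself.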
NICE Framework Alignment
| Code | Knowledge/Skill/Task Statement | How This Card Develops It |
|---|---|---|
| K0009 | Knowledge of application vulnerabilities | Explains the recursive and non-recursive zip bomb mechanisms and detection strategies |
| K0070 | Knowledge of system and application security threats and vulnerabilities | Connects archive bombs to real-world DoS incidents in AV engines and WAFs |
| S0001 | Skill in conducting vulnerability scans and recognizing vulnerabilities in security systems | Trains systematic resource exhaustion testing across archive format variants |
| S0044 | Skill in mimicking threat behaviors to test defenses | Develops ability to craft multi-layer test bombs and analyze server responses |
| T0028 | Conduct and support authorized penetration testing on enterprise networks | Provides a stepwise methodology for archive bomb assessment with resource impact documentation |
| T0591 | Perform penetration testing as required for new or updated applications | Frames archive extraction limit testing as a required upload endpoint assessment step |
Further Reading
- Fifield, D. (2019). "A Better Zip Bomb" — bamsoftware.com (offline reference)
- CWE-409: Improper Handling of Highly Compressed Data — MITRE
- libarchive Upstream: archive_read_set_open_callback — libarchive.github.io Documentation