Security-focused archive extraction tool with policy enforcement

zipguard

Security-focused archive extraction with policy enforcement. Prevents Zip Slip, archive bombs, symlink abuse, executable drops, and ZIP64 manipulation — before writing anything to disk.

$ zipguard Autoruns.zip --out ./analysis
Extracting Autoruns.zip → ./analysis

  Decision    File               Reason
 ────────────────────────────────────────────────────────────────────────
  RENAMED     Autoruns.exe       executable extension blocked by policy (.exe)
  RENAMED     autorunsc.exe      executable extension blocked by policy (.exe)
  BLOCKED     ../../evil.txt     Path traversal detected
  BLOCKED     payload.pdf.exe    Double extension spoofing detected

  2 allowed  2 renamed  2 blocked

Why not just use safezip?

safezip is a Python library: a drop-in replacement for zipfile that adds Zip Slip and ZIP bomb protection. It's a great choice if you're building an application and want safe extraction in your code.

zipguard is different. It's a CLI tool designed for humans and pipelines working with untrusted archives. The core difference:

Feature                                     safezip          zipguard
Interface                                   Python library   CLI + library
Executable blocking (.exe, .ps1, .bat…)
RTLO filename spoofing detection
Double extension detection (doc.pdf.exe)
SHA-256 audit log per file
Rename-vs-block mode (.exe.exe.blocked)
Human-readable decision table
--dry-run / --verbose / JSON output
Atomic writes (no partial files on abort)
ZIP64 consistency checks
ZipSlip protection
ZIP bomb protection
Recursive nested ZIP extraction
Zero dependencies

Use safezip if you need a lightweight, zero-dependency library embedded in your application.

Use zipguard if you're a security analyst or DevOps engineer, or you run a CI pipeline, and you need to inspect and audit untrusted ZIP files with full visibility into every decision made.
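To make three of the features in the comparison concrete (atomic writes, a SHA-256 per file, and limits that apply to real bytes), here is a minimal sketch built only on the Python standard library. It is not zipguard's implementation; the function name `extract_member_atomically` and its parameters are hypothetical.

```python
import hashlib
import os
import tempfile
import zipfile
from pathlib import Path


def extract_member_atomically(
    zf: zipfile.ZipFile,
    info: zipfile.ZipInfo,
    target: Path,
    max_bytes: int,
    chunk_size: int = 64 * 1024,
) -> str:
    """Stream one entry to `target` and return its SHA-256 hex digest.

    Hypothetical helper, not part of zipguard. The limit is applied to the
    bytes actually decompressed, not to the size the header declares. Data
    is written to a temporary file and moved into place only on success.
    """
    target.parent.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256()
    written = 0
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as tmp, zf.open(info) as src:
            while chunk := src.read(chunk_size):
                written += len(chunk)
                if written > max_bytes:
                    raise ValueError(f"{info.filename}: exceeds {max_bytes} bytes")
                digest.update(chunk)
                tmp.write(chunk)
        # Rename within the same directory, so the move is atomic.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the partial file so an abort leaves nothing behind.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return digest.hexdigest()
```

Creating the temporary file in the destination directory keeps the final `os.replace` on one filesystem, which is what makes the rename atomic.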

What zipguard blocks

Attack                         Standard tools          zipguard
Zip Slip (../../evil.exe)      Extracts to parent dir  Blocked
Absolute path (/etc/passwd)    Extracts to root        Blocked
Archive bomb (42.zip)          Fills disk              Aborted
Symlink pointing outside dir   Follows link            Blocked
document.pdf.exe               Extracts as-is          Blocked
RTLO filename spoofing         Extracts as-is          Blocked
Forged size metadata           Trusts metadata         Counts real bytes
ZIP64 size inconsistency       Trusts header           Aborted
Duplicate entry names          Unpredictable           Aborted
.exe, .ps1, .bat drops         Extracts as-is          Renamed/blocked
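For readers who want to see what the path-related checks involve, the sketch below shows one common way to detect Zip Slip, absolute paths, and symlink entries with the standard library. It illustrates the general technique and is not taken from zipguard's source; `unsafe_reason` is a hypothetical name, and the code assumes Python 3.9+ for `Path.is_relative_to`.

```python
import stat
import zipfile
from pathlib import Path, PurePosixPath
from typing import Optional


def unsafe_reason(info: zipfile.ZipInfo, dest: Path) -> Optional[str]:
    """Return why an entry is unsafe to extract under `dest`, or None.

    Hypothetical helper for illustration only.
    """
    name = info.filename.replace("\\", "/")

    # Unix mode bits are stored in the high 16 bits of external_attr.
    if stat.S_ISLNK(info.external_attr >> 16):
        return "symlink entry"

    # Leading slash or a drive-letter style prefix such as "C:".
    if name.startswith("/") or name[1:2] == ":":
        return "absolute path"

    if ".." in PurePosixPath(name).parts:
        return "path traversal"

    # Final containment check after resolving symlinks already on disk.
    root = dest.resolve()
    if not (root / name).resolve().is_relative_to(root):
        return "escapes output directory"
    return None
```

The checks run before any file is opened for writing, so a rejected entry never touches the disk.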

Install

pip install git+https://github.com/Mhacker1020/zipguard.git

PyPI release coming soon.

Usage

Basic extraction

zipguard archive.zip
zipguard archive.zip --out ./output_dir

Dry run — analyze without extracting

zipguard archive.zip --dry-run --verbose

JSON output — for automation and CI/CD

zipguard archive.zip --format json
zipguard archive.zip --format json --log audit.json

Custom limits

zipguard archive.zip --max-size 50MB
zipguard archive.zip --block-ext .exe,.ps1,.lnk

Policy config file

zipguard archive.zip --config policy.json

All options

zipguard <archive> [options]

  --out, -o PATH        Output directory (default: ./extracted)
  --dry-run             Analyze without extracting
  --config, -c FILE     Policy config file (JSON)
  --verbose, -v         Show all entries including allowed ones
  --format [table|json] Output format (default: table)
  --log FILE            Save JSON audit log to file
  --max-size SIZE       Max file size, e.g. 100MB, 500KB, 1GB
  --block-ext EXTS      Comma-separated extensions to block
  --version             Show version
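If you generate `--max-size` values or policy files from scripts, a small converter for strings such as 100MB can help. The sketch below assumes binary units (1 MB = 1024 × 1024 bytes), because the policy section equates 100 MB with 104857600 bytes. zipguard's own parser may accept other spellings; `parse_size` is a hypothetical helper.

```python
import re

# Assumption: binary multiples, matching 100 MB == 104857600 in the policy config.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text: str) -> int:
    """Convert strings such as '100MB', '500KB' or '1GB' to a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMG]?B)?\s*", text, re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[(unit or "B").upper()])


print(parse_size("100MB"))  # 104857600
```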

Exit codes

Code  Meaning
0     All entries allowed
1     One or more entries blocked or renamed
2     Extraction aborted (archive bomb, malformed archive)

Use exit codes in CI/CD pipelines to fail builds on suspicious archives.
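When calling the CLI from a Python script rather than a shell step, the same exit codes can be read from `subprocess`. The sketch below uses only the command-line options documented above; the function names `run_zipguard` and `describe_exit` are hypothetical.

```python
import subprocess

# Meanings copied from the exit code table in this README.
EXIT_MEANINGS = {
    0: "all entries allowed",
    1: "one or more entries blocked or renamed",
    2: "extraction aborted",
}


def describe_exit(code: int) -> str:
    """Translate a zipguard exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_zipguard(archive: str, out_dir: str) -> int:
    """Run `zipguard <archive> --out <out_dir>` and return its exit code."""
    result = subprocess.run(["zipguard", archive, "--out", out_dir], check=False)
    return result.returncode


# Example use in a pipeline script:
#   code = run_zipguard("artifact.zip", "./artifact")
#   print(describe_exit(code))
#   raise SystemExit(code)
```

Passing `check=False` keeps a non-zero exit from raising, so the caller can decide whether a renamed entry (code 1) should fail the build.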

Policy config

Create a policy.json to enforce consistent rules across your team or pipeline:

{
  "max_file_size": 104857600,
  "max_total_size": 524288000,
  "max_files": 1000,
  "max_compression_ratio": 100,
  "block_extensions": [".exe", ".dll", ".ps1", ".js", ".lnk", ".vbs", ".bat", ".cmd"],
  "rename_blocked": true,
  "allow_symlinks": false,
  "allow_overwrite": false,
  "scan_hashes": true,
  "block_rtlo": true,
  "block_double_extension": true,
  "block_ambiguous_archives": true
}

Field                     Default    Description
max_file_size             100 MB     Max uncompressed size per file
max_total_size            500 MB     Max total extracted size
max_files                 1000       Max number of files
max_compression_ratio     100×       Abort if compression ratio exceeds this
block_extensions          See below  File extensions to block
rename_blocked            true       Rename blocked files (.exe.exe.blocked) instead of hard-blocking
allow_symlinks            false      Allow symlinks and hardlinks
allow_overwrite           false      Allow overwriting existing files
scan_hashes               true       Compute SHA-256 for each extracted file
block_rtlo                true       Block Unicode Right-to-Left Override filename tricks
block_double_extension    true       Block document.pdf.exe style spoofing
block_ambiguous_archives  true       Abort if archive has duplicate entry names or ZIP64 inconsistencies

Default blocked extensions: .exe .dll .sys .drv .ps1 .psm1 .psd1 .bat .cmd .com .vbs .vbe .js .jse .wsf .wsh .lnk .pif .scr .msi .msp .msc .hta .cpl
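The filename rules (blocked extensions, double extensions, RTLO) can be pictured with a short classifier. This is an illustrative sketch, not zipguard's detection logic: `BLOCKED` is only a subset of the default list above, and the `DECOY` set of harmless-looking extensions is an assumption made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional

# Unicode bidirectional controls, including Right-to-Left Override (U+202E).
BIDI_CONTROLS = {"\u202a", "\u202b", "\u202d", "\u202e", "\u2066", "\u2067", "\u2068"}

# Subset of the default blocked extensions listed in this README.
BLOCKED = {".exe", ".ps1", ".bat", ".cmd", ".js", ".vbs", ".lnk", ".scr"}

# Hypothetical list of document-like extensions used to spot spoofing.
DECOY = {".pdf", ".doc", ".docx", ".txt", ".jpg", ".png", ".xls", ".xlsx"}


def classify_name(filename: str) -> Optional[str]:
    """Return a reason label for a suspicious filename, or None if it looks fine."""
    if any(ch in BIDI_CONTROLS for ch in filename):
        return "rtlo"
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    if not suffixes or suffixes[-1] not in BLOCKED:
        return None
    if len(suffixes) >= 2 and suffixes[-2] in DECOY:
        return "double extension"
    return "blocked extension"


print(classify_name("payload.pdf.exe"))  # double extension
print(classify_name("Autoruns.exe"))     # blocked extension
```

The RTLO test runs first because a reordering character can hide the real extension from anyone reading the name.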

Use as a library

from pathlib import Path
from zipguard import SafeExtractor, ExtractionPolicy

policy = ExtractionPolicy(
    max_file_size=50 * 1024 * 1024,  # 50 MB
    block_extensions=[".exe", ".ps1"],
    rename_blocked=True,
)

extractor = SafeExtractor(policy)
report = extractor.extract(Path("archive.zip"), Path("./output"))

print(f"Allowed: {report.allowed_count}")
print(f"Blocked: {report.blocked_count}")
print(report.to_json())
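The keyword arguments in the example above share their names with fields in policy.json. Assuming the remaining fields are accepted the same way (this README does not confirm it), a shared policy file could be loaded and validated before being handed to `ExtractionPolicy`. The helper `load_policy_kwargs` below is hypothetical.

```python
import json
from pathlib import Path

# Field names taken from the policy config table in this README.
KNOWN_FIELDS = {
    "max_file_size", "max_total_size", "max_files", "max_compression_ratio",
    "block_extensions", "rename_blocked", "allow_symlinks", "allow_overwrite",
    "scan_hashes", "block_rtlo", "block_double_extension",
    "block_ambiguous_archives",
}


def load_policy_kwargs(path: Path) -> dict:
    """Read a policy JSON file and reject keys the README does not document."""
    data = json.loads(path.read_text(encoding="utf-8"))
    unknown = set(data) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown policy fields: {sorted(unknown)}")
    return data


# Assumed usage; the constructor accepting every field is not confirmed here:
#   policy = ExtractionPolicy(**load_policy_kwargs(Path("policy.json")))
```

Rejecting unknown keys catches typos such as `max_filesize`, which could otherwise be dropped silently and leave a limit at its default.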

Integration with gate

zipguard works alongside gate, a supply chain security scanner. Use gate to validate packages from registries, and zipguard to safely unpack archive files before inspection:

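from pathlib import Path
from zipguard import SafeExtractor, ExtractionPolicy

class SecurityError(Exception):
    """Raised when an archive fails the extraction policy."""

wheel_path = Path("downloads/package.whl")  # example: archive fetched earlier
staging_dir = Path("./staging")             # example: directory used for inspection

# Safely unpack a downloaded .whl or .tar.gz before analyzing contents
policy = ExtractionPolicy(rename_blocked=False)  # hard block in CI
report = SafeExtractor(policy).extract(wheel_path, staging_dir)

# Fail if extraction was aborted or any entry was blocked
if report.aborted or report.blocked_count > 0:
    raise SecurityError(f"Unsafe archive: {report.abort_reason or 'blocked entries'}")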

CI/CD example

# GitHub Actions
- name: Extract and validate artifact
  run: |
    pip install git+https://github.com/Mhacker1020/zipguard.git
    zipguard artifact.zip --out ./artifact --format json --log audit.json
    # Exits with code 1 if any entries were blocked or renamed, 2 if extraction was aborted

What it does NOT do

  • Replace antivirus or EDR
  • Prevent execution of extracted files
  • Scan file contents for malware (use alongside ClamAV or YARA for that)

Threat model

Detailed threat model, architecture, and security design decisions are documented in SECURITY.md.

License

MIT
