unarch

Recursively extract ZIP, 7z, tar, RAR, and single-file compressed archives from directory trees

Features

  • Recursive discovery — finds all supported archives in a directory tree
  • Password list support — tries passwords from a wordlist for encrypted ZIP, 7z, and RAR archives
  • Path traversal protection — rejects absolute paths and .. sequences uniformly across all formats
  • Folder structure preservation — extracts each archive into its own named subfolder
  • Single-file compression support — extracts .gz, .bz2, and .xz payloads without shelling out
  • Custom output directory — extract to a location separate from the source tree
  • Configurable output naming — append a suffix and skip archives whose destination is already populated
  • Rich progress output — styled progress indicators and results tables via rich
  • Library API — use programmatically in your own Python projects

Quick Start

pip install unarch
unarch /path/to/search

Installation

Install with pip:

pip install unarch

Install as an isolated CLI tool with pipx:

pipx install unarch

Install with uv:

uv tool install unarch

RAR Support

RAR extraction requires the unrar system binary in addition to the rarfile Python package:

# macOS
brew install unrar

# Debian / Ubuntu
apt install unrar

CLI Reference

unarch [OPTIONS] PATH

Flag               Short   Description
--dry-run                  List archives found without extracting
--passwords FILE           Password wordlist (one per line) for encrypted archives
--output-dir DIR           Base directory for extraction output
--verbose          -v      Print each file path during extraction
--quiet            -q      Suppress all output
--version                  Show installed version and exit

Examples

# Preview archives before extracting
unarch --dry-run /path/to/search

# Extract with a password wordlist
unarch /path/to/search --passwords passwords.txt

# Extract to a custom output directory
unarch /path/to/search --output-dir /path/to/output

# Verbose extraction showing each file
unarch -v /path/to/search

# Quiet mode (no output)
unarch -q /path/to/search

# Show version
unarch --version
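
The wordlist given to --passwords holds one password per line. As a rough sketch, the same file could be read in Python and handed to the passwords argument shown under Library Usage. The helper name load_passwords is hypothetical, and skipping blank lines while keeping other whitespace intact is an assumption, not documented unarch behavior.

```python
from pathlib import Path


def load_passwords(wordlist_path: str) -> list[str]:
    """Read a wordlist with one password per line (hypothetical helper)."""
    text = Path(wordlist_path).read_text(encoding="utf-8")
    passwords = []
    for line in text.splitlines():
        # Assumption: an empty line is not a meaningful password, so skip it.
        # Leading and trailing spaces are kept, since they may be significant.
        if line:
            passwords.append(line)
    return passwords
```

The resulting list could then be passed along as extract_archives("/path/to/search", passwords=load_passwords("passwords.txt")).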

Library Usage

from unarch import extract_archives, list_archives, __version__

# List archives without extracting
archives = list_archives("/path/to/search")
# [{"path": "/path/to/a.zip", "type": "zip", "member_count": 42}, ...]

# Extract all archives in a directory
results = extract_archives("/path/to/search")

# Extract with a password list
results = extract_archives("/path/to/search", passwords=["pass1", "pass2"])

# Extract to a custom output directory
results = extract_archives("/path/to/search", output_dir="/path/to/output")

# Match existing destination naming conventions
results = extract_archives(
    "/path/to/search",
    output_dir="/path/to/output",
    output_suffix="_archive",
    skip_existing=True,
)

extract_archives() returns a dictionary mapping archive paths to extracted file counts. A count of -1 indicates failure.
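
Because failures are reported as a count of -1 rather than as exceptions, callers may want to separate them out. The sketch below works on a plain dictionary shaped like the documented return value. The helper name split_results and the sample data are hypothetical, and it assumes the keys are path strings.

```python
def split_results(results: dict[str, int]) -> tuple[dict[str, int], list[str]]:
    """Separate successful extractions from failures (hypothetical helper)."""
    # A count of zero or more means the archive was processed.
    extracted = {path: count for path, count in results.items() if count >= 0}
    # A count of -1 is the documented failure marker.
    failed = [path for path, count in results.items() if count == -1]
    return extracted, failed


# Stand-in data shaped like the documented return value; not real output.
sample = {"/path/to/a.zip": 42, "/path/to/b.7z": -1, "/path/to/c.tar": 0}
extracted, failed = split_results(sample)
print(f"{len(extracted)} extracted, {len(failed)} failed")
for path in failed:
    print(f"failed: {path}")
```

In real use, sample would be replaced by the value returned from extract_archives().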

Supported Formats

Extension                 Format                  Password Support   Backend
.zip                      ZIP                     Yes                stdlib
.7z                       7-Zip                   Yes                py7zr
.tar                      Tar                     No                 stdlib
.tar.gz, .tgz             Tar + Gzip              No                 stdlib
.tar.bz2, .tbz2, .tbz     Tar + Bzip2             No                 stdlib
.tar.xz, .txz             Tar + XZ                No                 stdlib
.gz                       Gzip-compressed file    No                 stdlib
.bz2                      Bzip2-compressed file   No                 stdlib
.xz                       XZ-compressed file      No                 stdlib
.rar                      RAR                     Yes                rarfile + unrar
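
Several extensions in the table above overlap: a name ending in .tar.gz also ends in .gz. One way to tell them apart is to test the longest extensions first. The following is a minimal sketch of that idea, not unarch's actual detection logic. The names FORMATS and classify are hypothetical, and case-insensitive matching is an assumption.

```python
from pathlib import PurePath
from typing import Optional

# Extension -> format label, taken from the table above.
FORMATS = {
    ".tar.gz": "tar+gzip", ".tgz": "tar+gzip",
    ".tar.bz2": "tar+bzip2", ".tbz2": "tar+bzip2", ".tbz": "tar+bzip2",
    ".tar.xz": "tar+xz", ".txz": "tar+xz",
    ".tar": "tar",
    ".zip": "zip", ".7z": "7z", ".rar": "rar",
    ".gz": "gzip", ".bz2": "bzip2", ".xz": "xz",
}


def classify(filename: str) -> Optional[str]:
    """Return a format label for a file name, or None if unsupported."""
    name = PurePath(filename).name.lower()
    # Check longer extensions first so "x.tar.gz" is not treated as plain ".gz".
    for ext in sorted(FORMATS, key=len, reverse=True):
        if name.endswith(ext):
            return FORMATS[ext]
    return None
```

Sorting by length keeps the mapping itself order-independent, at the cost of re-sorting on each call. For a large tree the sorted list could be computed once.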

Security

unarch enforces path safety uniformly across all formats via validate_member_path():

  • Absolute path rejection — skips members with absolute paths (e.g. /etc/passwd)
  • Traversal detection — rejects any path containing .. components
  • Real-path check — resolves the final path and confirms it stays within the output directory
  • Symlink skipping — tar archives skip symlink and hardlink members entirely
  • 7z two-pass validation — member paths are listed and validated before extraction begins; only safe members are extracted
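
The first three checks in the list above could be sketched roughly as follows. This is an illustration of the general technique using only the standard library, not the body of validate_member_path(). The function name is_safe_member_path and its signature are hypothetical.

```python
import os


def is_safe_member_path(member_name: str, output_dir: str) -> bool:
    """Return True if an archive member would land inside output_dir."""
    # 1. Absolute path rejection, also covering backslash-rooted names.
    if os.path.isabs(member_name) or member_name.startswith(("/", "\\")):
        return False

    # 2. Traversal detection: reject any ".." component, with either separator.
    parts = member_name.replace("\\", "/").split("/")
    if ".." in parts:
        return False

    # 3. Real-path check: resolve symlinks, then confirm containment.
    root = os.path.realpath(output_dir)
    target = os.path.realpath(os.path.join(root, member_name))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Comparing with os.path.commonpath rather than a string prefix avoids treating /out-evil as being inside /out.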

License

This project is licensed under the MIT License.
