Memory-safe archive extraction library with built-in security validation

exarch

Memory-safe archive extraction and creation library for Python.

Important: exarch is designed as a secure replacement for vulnerable archive libraries like Python's tarfile, which has known CVEs with CVSS scores up to 9.4.

This package provides Python bindings for exarch-core, a Rust library with built-in protection against common archive vulnerabilities.

Installation

pip install exarch

Tip: Use uv pip install exarch for faster installation.

Alternative Package Managers

# Poetry
poetry add exarch

# Pipenv
pipenv install exarch

Requirements

  • Python >= 3.9

Quick Start

Extraction

import exarch

result = exarch.extract_archive("archive.tar.gz", "/output/path")
print(f"Extracted {result.files_extracted} files")

Creation

import exarch

result = exarch.create_archive("backup.tar.gz", ["src/", "Cargo.toml"])
print(f"Created archive with {result.files_added} files")

Usage

Basic Extraction

import exarch

result = exarch.extract_archive("archive.tar.gz", "/output/path")

print(f"Files extracted: {result.files_extracted}")
print(f"Bytes written: {result.bytes_written}")
print(f"Duration: {result.duration_ms}ms")

With pathlib.Path

from pathlib import Path
import exarch

archive = Path("archive.tar.gz")
output = Path("/output/path")

result = exarch.extract_archive(archive, output)

Custom Security Configuration

import exarch

config = exarch.SecurityConfig()
config = config.max_file_size(100 * 1024 * 1024)  # 100 MB

result = exarch.extract_archive("archive.tar.gz", "/output", config)

Error Handling

import exarch

try:
    result = exarch.extract_archive("archive.tar.gz", "/output")
    print(f"Extracted {result.files_extracted} files")
except exarch.PathTraversalError as e:
    print(f"Blocked path traversal: {e}")
except exarch.ZipBombError as e:
    print(f"Zip bomb detected: {e}")
except exarch.SecurityViolationError as e:
    print(f"Security violation: {e}")
except exarch.ExtractionError as e:
    print(f"Extraction failed: {e}")

API Reference

extract_archive(archive_path, output_dir, config=None)

Extract an archive to the specified directory with security validation.

Parameters:

Name          Type            Description
archive_path  str | Path      Path to the archive file
output_dir    str | Path      Directory where files will be extracted
config        SecurityConfig  Optional security configuration

Returns: ExtractionReport

Attribute            Type       Description
files_extracted      int        Number of files extracted
directories_created  int        Number of directories created
symlinks_created     int        Number of symlinks created
bytes_written        int        Total bytes written
duration_ms          int        Extraction duration in milliseconds
files_skipped        int        Number of files skipped (e.g. duplicates)
warnings             list[str]  Warning messages generated during extraction
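
The report fields above lend themselves to a one-line log summary. The helper below is a minimal sketch and not part of exarch: summarize_report is a hypothetical name, and it assumes only an object that exposes the attributes listed in the table. A stand-in object with made-up values is used so the snippet runs without an archive.

```python
from types import SimpleNamespace


def summarize_report(report) -> str:
    """Format an ExtractionReport-like object as a single log line.

    Only attributes documented in the table above are read.
    """
    mib_written = report.bytes_written / (1024 * 1024)
    seconds = report.duration_ms / 1000
    # Guard against a zero duration on very small archives.
    rate = mib_written / seconds if seconds > 0 else 0.0
    return (
        f"{report.files_extracted} files, {mib_written:.1f} MiB "
        f"in {report.duration_ms} ms ({rate:.1f} MiB/s), "
        f"{report.files_skipped} skipped, {len(report.warnings)} warnings"
    )


# Stand-in for the object returned by exarch.extract_archive (hypothetical values).
demo_report = SimpleNamespace(
    files_extracted=3,
    bytes_written=2 * 1024 * 1024,
    duration_ms=1000,
    files_skipped=0,
    warnings=[],
)
print(summarize_report(demo_report))
```

With exarch installed, the same function should accept the result of exarch.extract_archive(...) directly, provided the attribute names match the table.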

Raises:

Exception               Description
PathTraversalError      Path traversal attempt detected
SymlinkEscapeError      Symlink points outside extraction directory
HardlinkEscapeError     Hardlink target outside extraction directory
ZipBombError            Potential zip bomb detected
QuotaExceededError      Resource quota exceeded
SecurityViolationError  Security policy violation
UnsupportedFormatError  Archive format not supported
InvalidArchiveError     Archive is corrupted
IOError                 I/O operation failed
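
When extraction runs unattended, it can help to sort failures into a few buckets, for example to quarantine an archive that tripped a security check but only log a corrupted one. The sketch below is an illustration rather than part of exarch: classify_failure and the grouping into "security" and "format" are assumptions, and exceptions are matched by the class names listed above so the snippet runs without the package installed.

```python
SECURITY_ERROR_NAMES = frozenset({
    "PathTraversalError",
    "SymlinkEscapeError",
    "HardlinkEscapeError",
    "ZipBombError",
    "QuotaExceededError",
    "SecurityViolationError",
})
FORMAT_ERROR_NAMES = frozenset({"UnsupportedFormatError", "InvalidArchiveError"})


def classify_failure(exc: BaseException) -> str:
    """Return "security", "format", or "other" for an extraction failure."""
    # Walk the class hierarchy so subclasses of a listed exception also match.
    names = {cls.__name__ for cls in type(exc).__mro__}
    if names & SECURITY_ERROR_NAMES:
        return "security"
    if names & FORMAT_ERROR_NAMES:
        return "format"
    return "other"


class ZipBombError(Exception):
    """Stand-in that only shares its name with the exarch exception."""


print(classify_failure(ZipBombError("ratio too high")))  # security
print(classify_failure(OSError("disk full")))            # other
```

In real code the function would be called from an except block wrapped around exarch.extract_archive.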

SecurityConfig

Builder-style security configuration.

config = exarch.SecurityConfig()
config = config.max_file_size(100 * 1024 * 1024)    # 100 MB per file
config = config.max_total_size(1024 * 1024 * 1024)  # 1 GB total
config = config.max_file_count(10_000)               # Max 10k files
config = config.allow_solid_archives(True)           # Allow solid 7z archives
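
Because every call returns a config that is then reassigned, limits kept in a settings file or dictionary can be applied in a loop. This is a minimal sketch under that assumption: apply_limits and DemoConfig are hypothetical names, and DemoConfig is only a stand-in that mimics the builder pattern so the snippet runs without exarch.

```python
def apply_limits(config, limits: dict):
    """Call config.<name>(value) for each entry and return the final config."""
    for name, value in limits.items():
        # Each builder method is assumed to return the updated config.
        config = getattr(config, name)(value)
    return config


class DemoConfig:
    """Stand-in builder, not the real exarch.SecurityConfig."""

    def __init__(self, **settings):
        self.settings = settings

    def max_file_size(self, value):
        return DemoConfig(**{**self.settings, "max_file_size": value})

    def max_file_count(self, value):
        return DemoConfig(**{**self.settings, "max_file_count": value})


limits = {"max_file_size": 100 * 1024 * 1024, "max_file_count": 10_000}
config = apply_limits(DemoConfig(), limits)
print(config.settings)
```

With exarch installed, passing exarch.SecurityConfig() in place of DemoConfig() should behave the same way as long as the dictionary keys match the builder method names shown above.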

Security Features

The library provides built-in protection against:

Protection               Description
Path traversal           Blocks ../ and absolute paths
Symlink attacks          Prevents symlinks escaping extraction directory
Hardlink attacks         Validates hardlink targets
Zip bombs                Detects high compression ratios
Permission sanitization  Strips setuid/setgid bits
Size limits              Enforces file and total size limits
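
To make two of these checks concrete, the sketch below shows the general idea behind compression-ratio detection and permission sanitization using only the standard library. It is not exarch's implementation: the function names and the 100:1 threshold are assumptions chosen for illustration.

```python
import io
import stat
import zipfile


def strip_special_bits(mode: int) -> int:
    """Clear the setuid and setgid bits from a file mode."""
    return mode & ~(stat.S_ISUID | stat.S_ISGID)


def compression_ratio(info: zipfile.ZipInfo) -> float:
    """Uncompressed size divided by compressed size for one ZIP entry."""
    if info.compress_size == 0:
        return 0.0 if info.file_size == 0 else float("inf")
    return info.file_size / info.compress_size


def suspicious_entries(zip_bytes: bytes, max_ratio: float = 100.0) -> list:
    """Names of entries whose ratio exceeds max_ratio (hypothetical threshold)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [i.filename for i in zf.infolist() if compression_ratio(i) > max_ratio]


def build_demo_zip() -> bytes:
    """Create an in-memory ZIP with one ordinary and one highly compressible entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("notes.txt", "hello")
        zf.writestr("zeros.bin", b"\0" * 1_000_000)
    return buf.getvalue()


print(suspicious_entries(build_demo_zip()))  # ['zeros.bin']
print(oct(strip_special_bits(0o4755)))       # 0o755
```

The ratio check here relies on sizes declared in the ZIP metadata, which a hostile archive can misstate, so it should be read as a demonstration of the concept only.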

Caution: Unlike Python's standard tarfile module, exarch applies security validation by default.

Supported Formats

Format     Extensions        Extract  Create  List  Verify
TAR        .tar
TAR+GZIP   .tar.gz, .tgz
TAR+BZIP2  .tar.bz2, .tbz2
TAR+XZ     .tar.xz, .txz
TAR+ZSTD   .tar.zst, .tzst
ZIP        .zip
7z         .7z

Note: 7z creation is not yet supported. Solid 7z archives are rejected by default for security reasons (SecurityConfig.allow_solid_archives(True) opts in), and encrypted 7z archives are rejected. Unix symlinks inside 7z archives are reported as regular files (a sevenz-rust2 API limitation).

Comparison with tarfile

# UNSAFE - tarfile has known vulnerabilities (CVE-2007-4559)
import tarfile
with tarfile.open("archive.tar.gz") as tar:
    tar.extractall("/output")  # May extract outside target directory!

# SAFE - exarch validates all paths
import exarch
exarch.extract_archive("archive.tar.gz", "/output")  # Protected by default
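
To see the kind of entry behind CVE-2007-4559 without extracting anything, the sketch below builds a small tar archive in memory and lists the member names that would land outside the target directory. It uses only the standard library. The helper names are hypothetical, and the check is a simplified illustration, not the validation exarch performs.

```python
import io
import posixpath
import tarfile


def build_demo_tar() -> bytes:
    """Create an in-memory tar with one normal and two hostile member names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("docs/readme.txt", "../escape.txt", "/etc/passwd-demo"):
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def escapes_target(member_name: str) -> bool:
    """True for absolute names or names that resolve above the target directory."""
    if member_name.startswith("/"):
        return True
    normalized = posixpath.normpath(member_name)
    return normalized == ".." or normalized.startswith("../")


def unsafe_members(tar_bytes: bytes) -> list:
    """List member names that a naive extractall could write outside the target."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        return [m.name for m in tar.getmembers() if escapes_target(m.name)]


print(unsafe_members(build_demo_tar()))
```

This name-only check ignores symlinks, hardlinks, and Windows-style separators, which is part of why a dedicated validator is preferable to ad hoc filtering.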

Development

This package is built using PyO3 and maturin.

# Clone repository
git clone https://github.com/bug-ops/exarch
cd exarch/crates/exarch-python

# Build with maturin
pip install maturin
maturin develop

# Run tests
pytest tests/

Related Packages

License

Licensed under either of:

at your option.

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • exarch-0.3.1-cp39-abi3-win_amd64.whl (1.1 MB): CPython 3.9+, Windows x86-64
  • exarch-0.3.1-cp39-abi3-musllinux_1_2_x86_64.whl (1.6 MB): CPython 3.9+, musllinux (musl 1.2+) x86-64
  • exarch-0.3.1-cp39-abi3-musllinux_1_2_aarch64.whl (1.5 MB): CPython 3.9+, musllinux (musl 1.2+) ARM64
  • exarch-0.3.1-cp39-abi3-manylinux_2_34_x86_64.whl (1.4 MB): CPython 3.9+, manylinux (glibc 2.34+) x86-64
  • exarch-0.3.1-cp39-abi3-manylinux_2_34_aarch64.whl (1.3 MB): CPython 3.9+, manylinux (glibc 2.34+) ARM64
  • exarch-0.3.1-cp39-abi3-macosx_11_0_arm64.whl (1.1 MB): CPython 3.9+, macOS 11.0+ ARM64
  • exarch-0.3.1-cp39-abi3-macosx_10_12_x86_64.whl (1.2 MB): CPython 3.9+, macOS 10.12+ x86-64
