Universal archive extractor supporting 30+ formats including ZIP, RAR, 7Z, TAR, and many more
UnzipAll - Universal Archive Extractor
A comprehensive Python library for extracting archive files in 30+ formats with a simple, unified API. No more juggling multiple extraction libraries or dealing with format-specific quirks.
Features
- Universal Format Support: ZIP, RAR, 7Z, TAR (all variants), ISO, MSI, and 25+ more formats
- Security First: Built-in protection against path traversal attacks (Zip Slip)
- Password Support: Handles encrypted archives
- Simple API: One function call to extract any supported archive
- CLI Tool: Extract archives from the command line with the unzipall command
- Cross-Platform: Works on Windows, macOS, and Linux
- Type Safe: Full type hints for better IDE support and development experience
- Graceful Degradation: Dependencies are optional; a missing library does not break the formats that do not need it
Quick Start
Installation
pip install unzipall
Basic Usage
import unzipall
# Extract any archive format - it just works!
unzipall.extract('archive.zip')
unzipall.extract('data.tar.gz', 'output_folder')
unzipall.extract('encrypted.7z', password='secret')
# Check if format is supported
if unzipall.is_supported('mystery_file.xyz'):
    unzipall.extract('mystery_file.xyz')
# List all supported formats
formats = unzipall.list_supported_formats()
print(f"Supports {len(formats)} formats!")
Command Line Usage
# Extract to current directory
unzipall archive.zip
# Extract to specific directory
unzipall archive.tar.gz /path/to/output
# Extract password-protected archive
unzipall -p mypassword encrypted.7z
# List supported formats
unzipall --list-formats
# Verbose output
unzipall -v archive.rar output_dir
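For readers who want to wrap the library in their own script, the flags shown above can be modelled with the standard argparse module. This is a minimal sketch, not unzipall's actual CLI code: the function name build_parser, the dest names, and the decision to make both positionals optional (so that --list-formats can run alone) are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown above; not unzipall's own code."""
    parser = argparse.ArgumentParser(prog="unzipall")
    # Both positionals are optional so that `--list-formats` can be used on its own.
    parser.add_argument("archive", nargs="?", help="archive to extract")
    parser.add_argument("output", nargs="?", help="output directory")
    parser.add_argument("-p", dest="password", help="password for encrypted archives")
    parser.add_argument("-v", dest="verbose", action="store_true", help="verbose output")
    parser.add_argument("--list-formats", action="store_true", help="list supported formats")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-p", "mypassword", "encrypted.7z"])
print(args.archive, args.password, args.verbose)
```

Options are placed before the positionals here, matching the examples above.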
Supported Formats
| Category | Formats | Status |
|---|---|---|
| ZIP Family | .zip, .jar, .war, .ear, .apk, .epub, .cbz | ✅ Built-in |
| RAR Family | .rar, .cbr | ✅ Full Support |
| 7-Zip | .7z, .cb7 | ✅ Full Support |
| TAR Archives | .tar, .tar.gz, .tgz, .tar.bz2, .tbz2, .tar.xz, .txz, .tar.z, .tar.lzma | ✅ Built-in |
| Compression | .gz, .bz2, .xz, .lzma, .z | ✅ Built-in |
| Other Archives | .arj, .cab, .chm, .cpio, .deb, .rpm, .lzh, .lha | ✅ Via patool |
| Disk Images | .iso, .vhd, .udf | ✅ Full Support |
| Microsoft | .msi, .exe (self-extracting), .wim | ✅ Platform-aware |
| Specialized | .xar, .zpaq, .cso, .pkg, .cbt | ✅ Via patool |
30+ formats supported! If a format isn't working, it may require additional system tools (see System Dependencies).
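Several of the extensions above are compound (.tar.gz, .tar.bz2), so a naive check of the last suffix would classify data.tar.gz as plain .gz. The sketch below shows one way to match the longest known extension first. The helper match_extension and the KNOWN_EXTENSIONS subset are illustrative assumptions; how unzipall itself detects formats is not documented here.

```python
from pathlib import Path
from typing import Optional

# Illustrative subset of the extensions in the table above (not the library's full list).
KNOWN_EXTENSIONS = {".zip", ".rar", ".7z", ".tar", ".tar.gz", ".tgz", ".tar.bz2", ".gz", ".iso"}


def match_extension(file_path: str) -> Optional[str]:
    """Return the longest known extension that the file name ends with, or None."""
    name = Path(file_path).name.lower()
    # Try longer extensions first so 'data.tar.gz' matches '.tar.gz' rather than '.gz'.
    for ext in sorted(KNOWN_EXTENSIONS, key=len, reverse=True):
        if name.endswith(ext):
            return ext
    return None


print(match_extension("data.tar.gz"))       # .tar.gz
print(match_extension("Photos.ZIP"))        # .zip
print(match_extension("mystery_file.xyz"))  # None
```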
Advanced Usage
Programmatic API
from unzipall import ArchiveExtractor, ArchiveExtractionError
# Create extractor with custom settings
extractor = ArchiveExtractor(verbose=True)
# Check available features
features = extractor.get_available_features()
for feature, available in features.items():
    status = "✅" if available else "❌"
    print(f"{status} {feature}")
# Extract with error handling
try:
    success = extractor.extract(
        archive_path='large_archive.rar',
        extract_to='output_directory',
        password='optional_password'
    )
    if success:
        print("Extraction completed successfully!")
except ArchiveExtractionError as e:
    print(f"Extraction failed: {e}")
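A feature report like the one returned by get_available_features() can be approximated by checking which optional modules are importable. The following is a sketch under assumptions: the feature keys and the OPTIONAL_MODULES mapping are made up for illustration, and the real method may probe system tools as well as Python packages.

```python
from importlib.util import find_spec
from typing import Dict

# Assumed mapping of feature name -> importable module; the real library's keys may differ.
OPTIONAL_MODULES = {
    "zip": "zipfile",   # standard library, always present
    "7z": "py7zr",      # optional third-party package
    "rar": "rarfile",   # optional third-party package
}


def detect_features(modules: Dict[str, str]) -> Dict[str, bool]:
    """Report which modules can be found, without actually importing them."""
    return {feature: find_spec(module) is not None for feature, module in modules.items()}


for feature, available in detect_features(OPTIONAL_MODULES).items():
    print(f"{feature}: {'available' if available else 'missing'}")
```

Using find_spec avoids the import side effects and the try/except ImportError boilerplate, at the cost of not detecting packages that are installed but broken.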
Error Handling
UnzipAll provides specific exceptions for different failure scenarios:
import unzipall
from unzipall import (
    ArchiveExtractionError, UnsupportedFormatError,
    CorruptedArchiveError, PasswordRequiredError,
    InvalidPasswordError, ExtractionPermissionError,
    DiskSpaceError
)
try:
    unzipall.extract('archive.zip')
except UnsupportedFormatError:
    print("This archive format is not supported")
except PasswordRequiredError:
    # Ask for the password and retry once
    password = input("Enter password: ")
    unzipall.extract('archive.zip', password=password)
except CorruptedArchiveError:
    print("Archive file is corrupted")
except DiskSpaceError:
    print("Not enough disk space")
except ArchiveExtractionError as e:
    # Base class: catches any remaining extraction error
    print(f"Extraction failed: {e}")
Security Features
UnzipAll automatically protects against common archive-based attacks:
# Path traversal protection (Zip Slip)
# Malicious archives with paths like "../../etc/passwd" are safely handled
unzipall.extract('potentially_malicious.zip', 'safe_output_dir')
# Files are extracted only within the target directory
# Dangerous paths are logged and skipped
System Dependencies
While UnzipAll works out of the box for common formats (ZIP, TAR, GZIP, etc.), some formats require additional system tools:
Windows
# Install via Windows Package Manager
winget install 7zip.7zip
winget install RARLab.WinRAR
# Or install via Chocolatey
choco install 7zip winrar
macOS
# Using Homebrew
brew install p7zip unrar
# For additional formats
brew install cabextract unshield
Linux (Ubuntu/Debian)
sudo apt update
sudo apt install p7zip-full unrar-free
# For additional formats
sudo apt install cabextract unshield cpio
Linux (RHEL/CentOS/Fedora)
sudo dnf install p7zip p7zip-plugins unrar
# For additional formats
sudo dnf install cabextract unshield cpio
Docker Usage
FROM python:3.11-slim
# Install system dependencies
RUN apt-get update && apt-get install -y \
    p7zip-full \
    unrar-free \
    cabextract \
    && rm -rf /var/lib/apt/lists/*
# Install unzipall
RUN pip install unzipall
# Your application code
COPY . /app
WORKDIR /app
Performance
UnzipAll is designed for reliability and format support over raw speed. Benchmarks on typical archives:
- ZIP files: ~80 extractions/second
- TAR.GZ files: ~60 extractions/second
- 7Z files: ~40 extractions/second
- RAR files: ~35 extractions/second
Performance varies based on archive size, compression ratio, and available system resources.
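To get a comparable extractions/second figure on your own hardware, a timing loop along these lines can be used. This sketch is not the project's benchmark: it uses the standard library zipfile module as a stand-in for unzipall.extract, a small synthetic archive, and an arbitrary run count, and the measured time includes creating and cleaning up the temporary output directories.

```python
import tempfile
import time
import zipfile
from pathlib import Path


def extractions_per_second(archive: Path, runs: int = 20) -> float:
    """Extract `archive` `runs` times into fresh temp dirs and return the rate."""
    start = time.perf_counter()
    for _ in range(runs):
        with tempfile.TemporaryDirectory() as out_dir:
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(out_dir)
    elapsed = time.perf_counter() - start
    return runs / elapsed


with tempfile.TemporaryDirectory() as work_dir:
    sample = Path(work_dir) / "sample.zip"
    # Build a small synthetic archive: 10 text files of about 10 kB each.
    with zipfile.ZipFile(sample, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(10):
            zf.writestr(f"file_{i}.txt", "hello world\n" * 850)
    rate = extractions_per_second(sample)
    print(f"{rate:.1f} extractions/second")
```

Swapping the zipfile calls for unzipall.extract(sample, out_dir) would measure the library itself.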
Development & Testing
# Clone the repository
git clone https://github.com/mricardo/unzipall.git
cd unzipall
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=src/unzipall --cov-report=html
# Format code
black src/ tests/
# Type checking
mypy src/
# Lint code
flake8 src/ tests/
Running Specific Tests
# Test basic functionality
pytest tests/test_smoke.py -v
# Test specific archive format
pytest tests/test_core.py::test_extract_valid_zip -v
# Performance benchmarks
pytest tests/test_performance.py --benchmark-only
# Skip slow tests
pytest -m "not slow"
API Reference
Main Functions
extract(archive_path, extract_to=None, password=None, verbose=False)
Extract an archive to the specified directory.
Parameters:
- archive_path (str|Path): Path to the archive file
- extract_to (str|Path, optional): Output directory (defaults to archive stem name)
- password (str, optional): Password for encrypted archives
- verbose (bool): Enable detailed logging
Returns: bool - True if successful
Example:
# Extract to default location (archive stem name)
unzipall.extract('myfiles.zip') # Creates ./myfiles/
# Extract to specific directory
unzipall.extract('myfiles.zip', 'custom_output')
# Extract encrypted archive
unzipall.extract('secret.7z', password='mypassword')
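The "archive stem name" default can be pictured with pathlib. This is only a sketch: default_output_dir is a hypothetical helper, it assumes the folder is created beside the archive, and it uses Path.stem, which removes just the last suffix. Whether unzipall strips a compound suffix such as .tar.gz in full is not stated here.

```python
from pathlib import Path


def default_output_dir(archive_path: str) -> Path:
    """Hypothetical helper: a folder named after the archive stem, beside the archive."""
    path = Path(archive_path)
    return path.with_name(path.stem)


print(default_output_dir("myfiles.zip"))  # myfiles
print(default_output_dir("data.tar.gz"))  # data.tar
```

If the default name matters (for example with .tar.gz files), passing extract_to explicitly avoids any ambiguity.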
is_supported(file_path)
Check if a file format is supported.
Parameters:
- file_path (str|Path): Path to the file to check
Returns: bool - True if format is supported
Example:
if unzipall.is_supported('data.xyz'):
    print("This format is supported!")
else:
    print("Unsupported format")
list_supported_formats()
Get list of all supported file extensions.
Returns: List[str] - Sorted list of supported extensions
Example:
formats = unzipall.list_supported_formats()
print(f"Supported: {', '.join(formats)}")
ArchiveExtractor Class
For advanced usage with custom configuration:
from unzipall import ArchiveExtractor
extractor = ArchiveExtractor(verbose=True)
# Check what features are available
features = extractor.get_available_features()
# Extract with custom settings
success = extractor.extract('archive.zip', 'output_dir')
Exception Hierarchy
ArchiveExtractionError (base)
├── UnsupportedFormatError
├── CorruptedArchiveError
├── PasswordRequiredError
├── InvalidPasswordError
├── ExtractionPermissionError
└── DiskSpaceError
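Because every exception derives from ArchiveExtractionError, handlers for specific subclasses must come before the handler for the base class, which then acts as a catch-all. The sketch below demonstrates this with stand-in classes that reuse the names from the tree; they are not imported from unzipall, and classify is a hypothetical helper.

```python
class ArchiveExtractionError(Exception):
    """Stand-in base class; the real definitions live in unzipall."""


class UnsupportedFormatError(ArchiveExtractionError):
    pass


class PasswordRequiredError(ArchiveExtractionError):
    pass


def classify(exc: Exception) -> str:
    """Mimic an except chain: specific handlers first, base class last."""
    try:
        raise exc
    except PasswordRequiredError:
        return "ask for a password"
    except ArchiveExtractionError:
        # Reached by every subclass that has no dedicated handler above.
        return "generic extraction failure"


print(classify(PasswordRequiredError()))   # ask for a password
print(classify(UnsupportedFormatError()))  # generic extraction failure
```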
Contributing
Contributions are welcome! Here's how to get started:
- Fork the repository on GitHub
- Create a feature branch: git checkout -b feature/amazing-feature
- Install development dependencies: pip install -e ".[dev]"
- Make your changes and add tests
- Run the test suite: pytest
- Commit your changes: git commit -m "Add amazing feature"
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
Development Guidelines
- Write tests for new features
- Follow PEP 8 style guidelines (use black for formatting)
- Add type hints for new functions
- Update documentation for API changes
- Ensure all tests pass before submitting
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built on top of excellent libraries: py7zr, rarfile, patool, libarchive-c, and others
- Inspired by the need for a simple, unified archive extraction interface
- Thanks to all contributors and users who help improve this library
Related Projects
- patool - Command-line archive tool
- py7zr - Pure Python 7-zip library
- rarfile - RAR archive reader
- zipfile - Python standard library ZIP support
Support
- Documentation: Check this README and docstrings
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: ricardo.lee.cm@gmail.com
Star this repo if you find it useful! ⭐
Made with ❤️ by mricardo
File details
Details for the file unzipall-1.0.1.tar.gz.
File metadata
- Download URL: unzipall-1.0.1.tar.gz
- Upload date:
- Size: 19.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 360afbaa59f956bdc071d193f851eb111e1fcaf636eeb305386e6d5d983c5daa |
| MD5 | 37892dbc75815449104bc1a121d1817e |
| BLAKE2b-256 | f1e774d7f663cfa2548f5642b662c5fa7aa97aee94c786ea4c37a4e37a2f3de6 |
File details
Details for the file unzipall-1.0.1-py3-none-any.whl.
File metadata
- Download URL: unzipall-1.0.1-py3-none-any.whl
- Upload date:
- Size: 15.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5dc7c645bbdda5d05f4af76f6f8031bb4c173835bf7312397cfbc2a5734f09e8 |
| MD5 | 902f1bea58548079e099082d09a9f11e |
| BLAKE2b-256 | b5cd64da418a024d488b1f960f23439b6b4a2acc1053e7e5e59a45a54dcddca4 |