Extract files from compressed archives (.zip, .tar.gz, .gz)

pyextractme

📂 A Python utility for recursively extracting files matching a regex pattern from archives (.zip, .tar.gz, .gz)

✨ Features

  • 🔍 Extract files based on regular expression patterns
  • 📦 Supports .zip, .tar.gz, .tgz, and .gz archives
  • 🔄 Handles nested archives recursively (e.g., zip inside tar.gz)
  • 🎯 Simple and clear CLI interface using Typer
  • 🐍 Built with modern Python (3.12+)
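The recursive handling of nested archives described above can be sketched with the standard library alone. This is a minimal illustration, not pyextractme's actual implementation: the names `iter_members` and `extract_matching` are hypothetical, and matching against the base file name with `re.search` is an assumption.

```python
import io
import re
import tarfile
import zipfile
from pathlib import Path

ARCHIVE_SUFFIXES = (".zip", ".tar.gz", ".tgz")


def iter_members(data: bytes, name: str):
    """Yield (member_name, member_bytes) for every regular file in an archive."""
    if name.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield info.filename, zf.read(info)
    else:  # treated as .tar.gz / .tgz
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
            for member in tf:
                if member.isfile():
                    yield member.name, tf.extractfile(member).read()


def extract_matching(data: bytes, name: str, pattern: re.Pattern, out_dir: Path) -> list:
    """Write files whose base name matches `pattern` into out_dir, recursing into nested archives."""
    written = []
    for member_name, member_bytes in iter_members(data, name):
        base = Path(member_name).name  # base name only, so members cannot escape out_dir
        if member_name.endswith(ARCHIVE_SUFFIXES):
            # Nested archive: descend instead of matching it against the pattern.
            written += extract_matching(member_bytes, member_name, pattern, out_dir)
        elif pattern.search(base):
            out_dir.mkdir(parents=True, exist_ok=True)
            (out_dir / base).write_bytes(member_bytes)
            written.append(base)
    return written
```

Nested archives are read into memory here, which keeps the sketch short but would be costly for very large files; a real tool might stream them to a temporary directory instead.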

📦 Installation

Using pip

pip install pyextractme

Using uv (Recommended)

uv is a blazing-fast Python package installer:

# Install uv (if you haven't already)
pip install uv

# Install pyextractme using uv
uv pip install pyextractme

🚀 Quick Start

Command Line Usage

# Basic usage: Extract all .txt files from my_archive.zip to the 'output' directory
pyextractme my_archive.zip "\.txt$" ./output/

# Extract specific log file from a tar.gz archive
pyextractme logs.tar.gz "app\.log" ./extracted_logs/

# Extract from a gzipped file (pattern matches the archive name for .gz)
pyextractme config.json.gz "config\.json\.gz" ./config_files/
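
A plain .gz file holds a single compressed stream rather than a list of members, which is why the pattern is matched against the archive name in the last command. The sketch below shows one way that behavior could look; `extract_gz` is a hypothetical helper, and dropping the `.gz` suffix from the output name is an assumption, not confirmed behavior of pyextractme.

```python
import gzip
import re
import shutil
from pathlib import Path
from typing import Optional


def extract_gz(archive: Path, pattern: str, out_dir: Path) -> Optional[Path]:
    """Decompress `archive` into out_dir if its file name matches `pattern`."""
    if not re.search(pattern, archive.name):
        return None  # archive name does not match, nothing to extract
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the output file keeps the archive name minus the .gz suffix.
    target = out_dir / archive.name.removesuffix(".gz")
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream in chunks to bound memory use
    return target
```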

Python Module Usage

python -m pyextractme [OPTIONS] INPUT_FILE TARGET_PATTERN OUTPUT_PATH

🎛️ Command Line Arguments

Argument       | Description                                                           | Required
INPUT_FILE     | Path to the input archive file (.zip, .tar.gz, .tgz, .gz).            | Yes
TARGET_PATTERN | Regular expression pattern to match filenames within the archive.     | Yes
OUTPUT_PATH    | Directory to extract matching files into (will be created if needed). | Yes
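
For readers who want to see how three required positional arguments like these map onto a parser, here is a small sketch. It uses the standard library's `argparse` as a stand-in so that it runs without extra dependencies; pyextractme itself uses Typer, and the names `build_parser` and `regex` are hypothetical.

```python
import argparse
import re
from pathlib import Path


def regex(value: str) -> re.Pattern:
    """Compile the pattern early so a bad regex fails with a clean CLI error."""
    try:
        return re.compile(value)
    except re.error as exc:
        raise argparse.ArgumentTypeError(f"invalid regular expression: {exc}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pyextractme")
    parser.add_argument("input_file", type=Path, help="Path to the input archive file")
    parser.add_argument("target_pattern", type=regex, help="Regex to match filenames")
    parser.add_argument("output_path", type=Path, help="Directory to extract into")
    return parser


args = build_parser().parse_args(["my_archive.zip", r"\.txt$", "./output/"])
print(args.input_file, args.target_pattern.pattern, args.output_path)
```

Validating the pattern at parse time means a typo in the regex is reported before any archive is opened.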

🛠️ Development

Setup Development Environment

  1. Clone the repository

    git clone https://github.com/fxyzbtc/pyextractme.git
    cd pyextractme
    
  2. Create and activate virtual environment (using uv)

    uv venv .venv
    # On Windows
    .venv\Scripts\activate
    # On macOS/Linux
    source .venv/bin/activate
    
  3. Install dependencies (including development tools)

    uv pip install -e ".[dev]"
    

Running Tests

# Run all tests
uv run pytest

# Run tests with coverage report
uv run pytest --cov=pyextractme tests/

Code Style

This project uses black for formatting and isort for import sorting.

# Format code
uv run black .
uv run isort .

# Linting (if Ruff or similar is added later)
# uv run ruff check .

# Type checking (if MyPy is added later)
# uv run mypy .

📝 Example

Command:

pyextractme my_documents.zip "\.docx?$" ./extracted_docs/

Input: my_documents.zip containing:

- report.docx
- notes.txt
- archive/
  - presentation.pptx
  - backup.zip
    - important.doc

Output: Files extracted to ./extracted_docs/:

extracted_docs/
  - report.docx
  - important.doc

(Note: important.doc is extracted from the nested backup.zip)
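
The example above can be reproduced in memory to check which names the pattern `\.docx?$` selects. This is an illustrative sketch, not the tool's code: `make_zip` and `matching_names` are hypothetical helpers, and matching on the base name with `re.search` is an assumption.

```python
import io
import re
import zipfile
from pathlib import PurePosixPath


def make_zip(files: dict) -> bytes:
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def matching_names(zip_bytes: bytes, pattern: str) -> list:
    """Return base names matching `pattern`, descending into nested .zip files."""
    found = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            base = PurePosixPath(info.filename).name
            if base.endswith(".zip"):
                found += matching_names(zf.read(info), pattern)
            elif re.search(pattern, base):
                found.append(base)
    return found


backup = make_zip({"important.doc": b"nested"})
my_documents = make_zip({
    "report.docx": b"report",
    "notes.txt": b"notes",
    "archive/presentation.pptx": b"slides",
    "archive/backup.zip": backup,
})
print(matching_names(my_documents, r"\.docx?$"))  # ['report.docx', 'important.doc']
```

The `x?` makes the trailing `x` optional, so both `.doc` and `.docx` match, while `$` keeps `.pptx` and `.txt` out.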

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.
  2. Create your feature branch (git checkout -b feature/your-feature).
  3. Commit your changes (git commit -m 'Add some feature').
  4. Push to the branch (git push origin feature/your-feature).
  5. Open a Pull Request.

📄 License

This project is licensed under the MIT License. (Please add a LICENSE file if one doesn't exist).

🙏 Acknowledgments

  • Typer for the easy-to-use CLI interface.
  • Python's built-in zipfile, tarfile, and gzip modules for archive handling.

📞 Support

  • 📫 Report issues on GitHub Issues
  • 💬 Ask questions or discuss ideas in the project's discussion forum (if available).

Made with ❤️ using Python
