A tool to efficiently extract and compress Git repository contents for LLMs.

Siphon

Efficiently extract, compress, and cache Git repository contexts for seamless integration with Large Language Models (LLMs).

Funded by YC (Your Cents) 😉



Features

  • Efficient Extraction: Extracts and compresses repository contents while respecting .gitignore rules.
  • Customizable Filtering: Include or exclude files and directories with ease.
  • Multiple Output Formats: Supports text, tarball, and markdown formats optimized for LLM contexts.
  • Caching and Chunking: Pre-cache large repositories for faster querying (planned; the -c/--cache flag is currently a placeholder).
  • Token Count Estimates: Estimate token counts for specific LLMs such as GPT-3 and Claude.
  • Clipboard and Stdout Support: Streamline workflows with seamless copying options.
  • Modularity: Extend functionality with community-driven extensions.
  • Interactive Mode: Granular file selection through an interactive interface.
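
As a rough illustration of what a token count estimate involves, the sketch below divides the character count by an assumed average of four characters per token. This is a common rule of thumb for English text, not Siphon's actual tokenizer logic; `estimate_tokens` and `chars_per_token` are hypothetical names introduced here.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Crude token estimate: character count divided by an assumed average.

    Hypothetical helper for illustration; real tokenizers will give
    different counts, especially for source code.
    """
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


sample = "def add(a, b):\n    return a + b\n"
print(estimate_tokens(sample))
```

An estimate like this is only useful for checking whether extracted content is likely to fit in a model's context window; exact counts require the model's own tokenizer.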

Installation

Install Siphon using pip:

pip install siphon-cli

Usage

Navigate to your Git repository and run:

si -o context.txt

This command extracts the repository content into context.txt.


Examples

  • Include Specific File Types:

    si -i "*.py" -o python_files.txt
    
  • Exclude Directories:

    si -e "tests/*" -o code_without_tests.txt
    
  • Interactive Mode:

    si --interactive -o selected_files.txt
    
  • Copy Output to Clipboard:

    si --clipboard
    
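
The include and exclude patterns in these examples look like shell-style globs. As an illustration only (this is not Siphon's actual implementation, and `select_files` is a hypothetical helper), the sketch below shows how such patterns could be applied to repository-relative paths with Python's standard `fnmatch` module.

```python
from fnmatch import fnmatch


def select_files(paths, include=None, exclude=None):
    """Keep paths matching any include pattern and no exclude pattern.

    Hypothetical helper for illustration; not part of Siphon's API.
    """
    include = include or ["*"]  # no include patterns means "everything"
    exclude = exclude or []
    selected = []
    for path in paths:
        if not any(fnmatch(path, pattern) for pattern in include):
            continue  # not requested
        if any(fnmatch(path, pattern) for pattern in exclude):
            continue  # explicitly excluded
        selected.append(path)
    return selected


paths = ["main.py", "src/utils.py", "tests/test_main.py", "README.md"]
print(select_files(paths, include=["*.py"]))
print(select_files(paths, exclude=["tests/*"]))
```

Note that `fnmatch` lets `*` match across directory separators, so `*.py` also selects files in subdirectories. Whether Siphon matches patterns the same way is an assumption.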

Arguments

  • path: Path to the Git repository (default: current directory).
  • -i, --include: Include file patterns (e.g., *.py, src/*).
  • -e, --exclude: Exclude file patterns (e.g., tests/, *.md).
  • -o, --output: Output file name (default: output.txt).
  • -f, --format: Output format (text, tar, markdown).
  • -c, --cache: Enable caching (future feature placeholder).
  • --tokenizer: Tokenizer for token count estimation (gpt3, claude).
  • --interactive: Interactive mode for file selection.
  • --clipboard: Copy output to clipboard.
  • --stdout: Print output to stdout.
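
To make the argument list concrete, here is a minimal `argparse` sketch that mirrors the documented options. It is a guess at how such a parser could look, not Siphon's actual source: `build_parser` is a hypothetical name, the default format of `text` is assumed (none is documented), and whether `-i`/`-e` may be repeated is also an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags (not Siphon's real code)."""
    parser = argparse.ArgumentParser(prog="si")
    parser.add_argument("path", nargs="?", default=".",
                        help="Path to the Git repository")
    # Assumption: patterns can be given more than once and are collected in a list.
    parser.add_argument("-i", "--include", action="append",
                        help="Include file patterns")
    parser.add_argument("-e", "--exclude", action="append",
                        help="Exclude file patterns")
    parser.add_argument("-o", "--output", default="output.txt",
                        help="Output file name")
    # Assumption: "text" is the default format.
    parser.add_argument("-f", "--format", choices=["text", "tar", "markdown"],
                        default="text", help="Output format")
    parser.add_argument("-c", "--cache", action="store_true",
                        help="Enable caching (placeholder)")
    parser.add_argument("--tokenizer", choices=["gpt3", "claude"],
                        help="Tokenizer for token count estimation")
    parser.add_argument("--interactive", action="store_true",
                        help="Interactive mode for file selection")
    parser.add_argument("--clipboard", action="store_true",
                        help="Copy output to clipboard")
    parser.add_argument("--stdout", action="store_true",
                        help="Print output to stdout")
    return parser


args = build_parser().parse_args(["-e", "tests/*", "--stdout"])
print(args.path, args.exclude, args.output, args.stdout)
```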

Contributing

We welcome contributions from the community! To contribute:

  1. Fork the repository.

  2. Create a new branch:

    git checkout -b feature/your-feature-name
    
  3. Commit your changes:

    git commit -am 'Add a new feature'
    
  4. Push to the branch:

    git push origin feature/your-feature-name
    
  5. Open a Pull Request.

Please read our Contributing Guidelines for more details.


License

This project is licensed under the MIT License.


Download files

Source Distribution

siphon_cli-1.2.1.tar.gz (5.1 kB)

Uploaded Source

Built Distribution

siphon_cli-1.2.1-py3-none-any.whl (5.1 kB)

Uploaded Python 3

File details

Details for the file siphon_cli-1.2.1.tar.gz.

File metadata

  • Download URL: siphon_cli-1.2.1.tar.gz
  • Upload date:
  • Size: 5.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for siphon_cli-1.2.1.tar.gz
Algorithm Hash digest
SHA256 0ffe5cba1c2eb4a564f42d677b8c073e4589d44faf4b716b3cc896d5cf01d6af
MD5 92c8d2f17cc8fde27d42e5d21579b717
BLAKE2b-256 ece04ee2d66b2991c78353c12f78dd625ecf32b3b302fe36c6e8256e9394cc02

File details

Details for the file siphon_cli-1.2.1-py3-none-any.whl.

File metadata

  • Download URL: siphon_cli-1.2.1-py3-none-any.whl
  • Upload date:
  • Size: 5.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for siphon_cli-1.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f0e7aa699f603ac0d6c1177727b66f1eb544f7294e029e1f79355341aba1d4c8
MD5 cbe143c34f20781b6c13723e96492cae
BLAKE2b-256 91613dfaacaa6cc09450562f8ad86de95d8263f2cb8ce922e1f6b1b53801f9c3
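
The published digests can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical and not part of Siphon.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_hash` with the path to a downloaded `siphon_cli-1.2.1-py3-none-any.whl` and the SHA256 value listed for that file should return `True` for an unmodified download.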
