Easily copy all relevant source files in a repository to clipboard

repogather

repogather is a command-line tool that copies all relevant files (with their relative paths) in a repository to the clipboard. It is intended to be used in LLM code understanding or code generation workflows. It uses gpt-4o-mini (configurable) to decide file relevance, but can also be used without an LLM to return all files, with non-AI filters (such as excluding tests or config files).

Features

  • Filters and analyzes code files in a repository
  • Excludes test and configuration files by default (with options to include them)
  • Estimates token count and API usage cost before processing
  • Uses OpenAI's GPT models to evaluate file relevance
  • Supports various methods of providing the OpenAI API key
  • Copies relevant files and their contents to the clipboard
  • Can return all files without LLM analysis

Installation

Install repogather using pip:

pip install repogather

Setup

Set up your OpenAI API key using one of the following methods:

  • As an environment variable: export OPENAI_API_KEY=your_api_key_here
  • In a .env file in your current working directory:
    OPENAI_API_KEY=your_api_key_here
    
  • Provide it as a command-line argument when running the tool (see Usage section)
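
The three methods above could be combined into a single lookup. The following is a minimal sketch, not repogather's actual code: the function names are hypothetical, and the precedence (command-line value first, then environment variable, then .env file) is an assumption, since the README does not state which source wins.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_dotenv_value(path: Path, name: str) -> Optional[str]:
    """Return the value of `name` from a simple KEY=VALUE .env file, if present."""
    if not path.is_file():
        return None
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("\"'")
    return None


def resolve_api_key(
    cli_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    dotenv_path: Path = Path(".env"),
) -> Optional[str]:
    """Assumed precedence: command-line value, then environment, then .env file."""
    env = os.environ if env is None else env
    return (
        cli_key
        or env.get("OPENAI_API_KEY")
        or read_dotenv_value(dotenv_path, "OPENAI_API_KEY")
    )
```

A caller would pass the value of --openai-key as cli_key and treat a None result as "no key configured", which is acceptable only when --all is used.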

Usage

After installation, you can run repogather from the command line:

repogather [QUERY] [OPTIONS]

Options

  • --include-test: Include test files in the analysis
  • --include-config: Include configuration files in the analysis
  • --relevance-threshold THRESHOLD: Set the relevance threshold (0-100, default: 50)
  • --model MODEL: Specify the OpenAI model to use (default: gpt-4o-mini-2024-07-18)
  • --openai-key KEY: Provide the OpenAI API key directly
  • --all: Return all files without using LLM analysis
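
For readers who want to see how these options fit together, here is a sketch of an equivalent argparse parser. The flag names and defaults come from the list above; the parser itself, the build_parser name, and the choice of an integer type for the threshold are assumptions, not repogather's real source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented repogather options."""
    parser = argparse.ArgumentParser(prog="repogather")
    # QUERY is optional because --all does not need one.
    parser.add_argument("query", nargs="?", default=None,
                        help="natural-language description of the files wanted")
    parser.add_argument("--include-test", action="store_true",
                        help="include test files in the analysis")
    parser.add_argument("--include-config", action="store_true",
                        help="include configuration files in the analysis")
    parser.add_argument("--relevance-threshold", type=int, default=50,
                        metavar="THRESHOLD", help="relevance threshold (0-100)")
    parser.add_argument("--model", default="gpt-4o-mini-2024-07-18",
                        metavar="MODEL", help="OpenAI model to use")
    parser.add_argument("--openai-key", default=None, metavar="KEY",
                        help="OpenAI API key")
    parser.add_argument("--all", action="store_true",
                        help="return all files without LLM analysis")
    return parser
```

Calling build_parser().parse_args() with no argument list reads sys.argv, and the dashes in flag names become underscores on the result (for example args.relevance_threshold).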

Examples

  1. Analyze files with a query:

    repogather "Find files related to user authentication" --include-config --relevance-threshold 70 --model gpt-4o-2024-08-06

    This command will:

      • Search for files related to user authentication
      • Include configuration files in the search
      • Only return files with a relevance score of 70 or higher
      • Use the GPT-4o model from August 2024 for analysis

  2. Return all files without LLM analysis:

    repogather --all --include-test --include-config

    This command will:

      • Gather all code files in the repository
      • Include test and config files in the output (if present, inferred from file extension)
      • Copy all gathered files to the clipboard without using LLM analysis

How It Works

repogather performs the following steps:

  1. Scans the current directory and its subdirectories for code files
  2. Filters out test and configuration files (unless included via options)
  3. If the --all option is used, returns all filtered files
  4. Otherwise:
    a. Counts the tokens in the filtered files and estimates the API usage cost
    b. Asks for user confirmation before proceeding
    c. Sends the file contents and the query to the specified OpenAI model
    d. Processes the model's response to rank files by relevance
    e. Filters the files by the specified relevance threshold
  5. Copies the relevant file paths and contents to the clipboard
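
The non-LLM parts of these steps (scanning, filtering, thresholding, and assembling the output) can be sketched as follows. This is an illustration under stated assumptions, not repogather's implementation: the extension sets, the test-file heuristic, the "--- path ---" output format, and all function names are hypothetical, and the final clipboard copy is left out because it needs a platform-specific or third-party helper.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Hypothetical heuristics; the README does not list the real rules.
CODE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs", ".java"}
CONFIG_EXTENSIONS = {".json", ".yaml", ".yml", ".toml", ".ini"}


def is_test_file(rel_path: Path) -> bool:
    """Guess whether a repository-relative path is a test file."""
    name = rel_path.name.lower()
    return name.startswith("test_") or "_test." in name or "tests" in rel_path.parts


def gather_files(root: Path, include_test: bool = False,
                 include_config: bool = False) -> List[Path]:
    """Steps 1-2: scan `root` recursively and apply the non-AI filters."""
    wanted = set(CODE_EXTENSIONS)
    if include_config:
        wanted |= CONFIG_EXTENSIONS
    found = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        rel = path.relative_to(root)
        if not include_test and is_test_file(rel):
            continue
        found.append(rel)
    return found


def filter_by_relevance(scores: Dict[str, int], threshold: int = 50) -> List[str]:
    """Steps 4d-4e: keep paths scoring at or above the threshold, best first."""
    kept = [path for path, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda path: scores[path], reverse=True)


def format_for_clipboard(root: Path, rel_paths: Iterable[Path]) -> str:
    """Step 5 (without the copy itself): join relative paths and contents."""
    sections = []
    for rel in rel_paths:
        body = (root / rel).read_text(encoding="utf-8", errors="replace")
        sections.append(f"--- {rel.as_posix()} ---\n{body}")
    return "\n\n".join(sections)
```

In this sketch the scores dictionary stands in for whatever the model returns; how repogather actually parses the model's response is not described in the README.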

Note

repogather requires an active OpenAI API key when using LLM analysis. It will prompt you to confirm the expected cost of the query (in input tokens) before proceeding. When using the --all option, no API key is required.
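
The cost confirmation can be approximated with a rough estimate like the one below. It is only a sketch: the four-characters-per-token ratio is a common rule of thumb rather than the tokenizer repogather uses, the price is a caller-supplied parameter (no real price is baked in), and all function names are hypothetical.

```python
import math
from typing import Callable, Iterable, Tuple


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; a real tokenizer would give exact counts."""
    return math.ceil(len(text) / chars_per_token)


def estimate_cost(file_contents: Iterable[str], query: str,
                  usd_per_million_input_tokens: float) -> Tuple[int, float]:
    """Return (estimated input tokens, estimated cost in USD)."""
    total = estimate_tokens(query) + sum(estimate_tokens(t) for t in file_contents)
    return total, total * usd_per_million_input_tokens / 1_000_000


def confirm(tokens: int, cost: float, ask: Callable[[str], str] = input) -> bool:
    """Ask the user to approve the estimated spend; anything but yes declines."""
    answer = ask(f"About {tokens} input tokens (~${cost:.4f}). Continue? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

Passing ask as a parameter keeps the prompt testable without a terminal; defaulting to "no" on an empty answer avoids accidental API spend.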

