
Search your code repository using GPT-4.

Project description

Repo GPT

Repo-GPT is a Python CLI tool that uses OpenAI's GPT models for code analysis and semantic search within your repositories.

Repo-GPT in action

Features

  • Code extraction and processing from your repositories.
  • Semantic search within your codebase through natural language queries.
  • Response generation to natural language queries about your code.
  • Specific file analysis within your codebase.
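
Semantic search of this kind typically works by embedding both the query and each code snippet as vectors and ranking snippets by cosine similarity. Below is a toy sketch of that ranking step, using three-dimensional stand-ins for real OpenAI embeddings; it illustrates the mechanism only and is not Repo-GPT's actual code:

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def rank_snippets(query_vec, snippet_vecs):
    # Sort snippet names by similarity to the query vector, best match first.
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in snippet_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional "embeddings"; real OpenAI embeddings have over a thousand dimensions.
snippets = {
    "extract_handler": [0.9, 0.1, 0.0],
    "calculate_sum":   [0.1, 0.8, 0.2],
}
print(rank_snippets([0.85, 0.15, 0.05], snippets))
```

A query vector close to `extract_handler`'s embedding ranks that snippet first, which is how a natural-language query can surface related code without exact keyword matches.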

Installation

Repo-GPT can be installed via pip. It depends on Graphviz, so install that first (on macOS, via Homebrew):

brew install graphviz
pip install repo-gpt

Alternatively, you can clone and install from the source code:

git clone https://github.com/yourusername/repo-gpt.git
cd repo-gpt
poetry install

Setting Up

Before starting, make sure to set up your OpenAI key in your environment variables.

export OPENAI_API_KEY=<insert your openai key>
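
Repo-GPT reads the key from your environment at run time. The helper below is a hypothetical pre-flight check (not part of the CLI) that confirms the key is visible to Python without printing the secret:

```python
import os

def require_api_key(env=None):
    # Look up OPENAI_API_KEY; fail fast with a clear message if it is missing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise SystemExit("OPENAI_API_KEY is not set; export it before running repo-gpt")
    # Never print the secret itself; return a masked prefix for logging.
    return f"{key[:3]}***"
```

For example, `require_api_key({'OPENAI_API_KEY': 'sk-abc123'})` returns `'sk-***'`.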

To set up Repo-GPT, run the following command at the root of the project you want to search. This will create a .repo_gpt directory and store the code embeddings there:

repo-gpt setup

Repo-GPT will only add or update embeddings for new files or changed files. You can rerun the setup command as many times as needed.
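
This incremental behaviour can be pictured as content-hash bookkeeping: hash each file, compare against the digests recorded on the previous run, and re-embed only the mismatches. A minimal sketch of the idea (hypothetical helpers, not Repo-GPT's actual implementation):

```python
import hashlib
from pathlib import Path

def file_digest(path):
    # Content hash of a file; identical content yields an identical digest.
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def files_needing_embedding(paths, previous_digests):
    """Return the files that are new or changed since the digests were recorded."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if previous_digests.get(str(path)) != digest:
            changed.append(path)
            previous_digests[str(path)] = digest  # remember for the next run
    return changed
```

On a second run with an unchanged file set, the returned list is empty, which is why rerunning setup is cheap.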

Usage

After setup, you can perform various tasks:

  • Semantic Search: Find semantically similar code snippets in your codebase:

    repo-gpt search <text/question>
    
  • Codebase Query: Ask questions about your codebase:

    repo-gpt query <text/question>
    
  • File Analysis: Analyze a specific file:

    repo-gpt analyze <file_path>
    
  • Help: Access the help guide:

    repo-gpt help
    
  • Generate Tests: Generate tests for a function. Note: this assumes the function name is unique in the codebase; otherwise, Repo-GPT will pick the first function it finds with that name.

    repo-gpt add-test <unique function name> --test_save_file_path <absolute filepath to add tests to> --testing_package <testing package to use e.g. pytest>
    

Example:

repo-gpt setup --root_path ./my_project
repo-gpt search "extract handler"
repo-gpt query "What does the function `calculate_sum` do?"
repo-gpt analyze ./my_project/main.py
repo-gpt add-test function_name --test_save_file_path $PWD/test.py --testing_package pytest

Contributing

We welcome your contributions! Before starting, install Python 3.11 and the latest version of Poetry. pyenv is a convenient tool for managing multiple Python versions on your machine.

Here are the steps to set up your development environment:

  0. Install global dependencies:

    nvm use --lts

    brew install graphviz
    export CFLAGS="-I $(brew --prefix graphviz)/include"
    export LDFLAGS="-L $(brew --prefix graphviz)/lib"
    pip install poetry
  1. Export your OpenAI key to your environment variables:

    export OPENAI_API_KEY=<insert your openai key>
    
  2. Install dependencies:

    poetry install --no-root
    jupyter lab build
    
  3. Install pre-commit hooks:

    poetry run pre-commit install
    
  4. Seed data:

    poetry run python cli.py setup
    
  5. Query data:

    poetry run python cli.py search <text/question>
    

Testing

Integration Tests

Run pytest with the --language option to filter tests by language:

# Run only Python tests
pytest --language python test/it
# Run only TypeScript tests
pytest --language typescript test/it
# Run only PHP tests
pytest --language php test/it

If the --language option is omitted, all tests will be run:

# Run all tests (default behavior)
pytest test/it
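
`--language` is not a built-in pytest flag; custom options like it are usually registered in a `conftest.py` via pytest's hook functions. The sketch below shows one common way to wire up such an option (an illustration of the mechanism, not necessarily this project's exact conftest):

```python
# conftest.py (sketch): register a --language option and skip non-matching tests.
import pytest

def pytest_addoption(parser):
    parser.addoption("--language", action="store", default=None,
                     help="only run tests for the given language")

def matches_language(test_language, selected):
    # A test runs if no --language was given, or its language marker matches.
    return selected is None or test_language == selected

def pytest_collection_modifyitems(config, items):
    selected = config.getoption("--language")
    for item in items:
        marker = item.get_closest_marker("language")
        test_language = marker.args[0] if marker else None
        if not matches_language(test_language, selected):
            item.add_marker(pytest.mark.skip(reason=f"needs --language {test_language}"))
```

In this sketch, tests are tagged with `@pytest.mark.language("python")` and anything not matching the selected language is skipped at collection time.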

Unit Tests

pytest test/unit

Debugging

You can inspect the contents of code_embeddings.pkl from a Python shell:

poetry shell
python
import pandas as pd

# The embeddings are stored as a pickled pandas DataFrame
df = pd.read_pickle('./.repo_gpt/code_embeddings.pkl', compression='infer')
print(df.head())

Interpreter

poetry shell
ipython
%load_ext autoreload
%autoreload 2

Roadmap

Here are the improvements we are currently considering:

  • Publishing to PyPI
  • Test suite addition
  • Add CI/CD
  • Prettify output
  • Add readme section about how folks can contribute parsers for their own languages
  • Save # of tokens each code snippet has so we can ensure we don't pass too many tokens to GPT
  • Add SQL file handler
  • Add DBT file handler -- this may be a break in pattern as we'd want to use the manifest.json file
  • Create VSCode extension
  • Ensure files can be added & deleted and the indexing picks up on the changes.
  • Add .repogptignore file to config & use it in the indexing command
  • Use pygments library for prettier code formatting

Download files

Download the file for your platform.

Source Distribution

repo_gpt-0.4.2.tar.gz (38.9 kB)

Uploaded Source

Built Distribution


repo_gpt-0.4.2-py3-none-any.whl (52.1 kB)

Uploaded Python 3

File details

Details for the file repo_gpt-0.4.2.tar.gz.

File metadata

  • Download URL: repo_gpt-0.4.2.tar.gz
  • Upload date:
  • Size: 38.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.12 Linux/6.11.0-1014-azure

File hashes

Hashes for repo_gpt-0.4.2.tar.gz
Algorithm Hash digest
SHA256 44426c967f97948f1f484a1d4c4c05ad8b34387b5c5c38d78249035a79b7d505
MD5 4de4e75278d27b7d95d0fc7d5311a33b
BLAKE2b-256 e2fb947bcd622572789cb75d85415c598bafa80c560b5e24fa9c6c0691811b44


File details

Details for the file repo_gpt-0.4.2-py3-none-any.whl.

File metadata

  • Download URL: repo_gpt-0.4.2-py3-none-any.whl
  • Upload date:
  • Size: 52.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.11.12 Linux/6.11.0-1014-azure

File hashes

Hashes for repo_gpt-0.4.2-py3-none-any.whl
Algorithm Hash digest
SHA256 26063c969c4e546436ba7c9493153d8c4d7e09434714b15a00850c018de0d20f
MD5 861a158e95f253fa4ae8a6eddacf18eb
BLAKE2b-256 c6bc5b498526dd96ea846ccc05afedc67051cd35d5e6bd713678041c1ad9e6ba

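
The published hashes let you verify a download before installing it. Here is a small stdlib check using the SHA256 digest for the wheel from the table above (the commented assertion assumes the wheel has been downloaded into the current directory):

```python
import hashlib

def sha256_of(path):
    # Hex SHA-256 digest of a file, read in chunks to bound memory use.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "26063c969c4e546436ba7c9493153d8c4d7e09434714b15a00850c018de0d20f"
# After downloading the wheel into the current directory:
# assert sha256_of("repo_gpt-0.4.2-py3-none-any.whl") == expected
```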
