
explore

Interactively explore a codebase with an LLM


explore is a script to interactively explore a codebase by chatting with an LLM. It uses retrieval-augmented generation via chromadb to provide the LLM with relevant source code from the codebase.

explore uses OpenAI models by default, so you'll need an OpenAI API key.

Installation

explore is available on PyPI. I recommend installing it with pipx:

pipx install explore-cli
export OPENAI_API_KEY=<your OpenAI API key>
explore <directory>

Alternatively, you can clone this repository and run the script with poetry:

poetry install
poetry build
export OPENAI_API_KEY=<your OpenAI API key>
poetry run explore <directory>

Usage

usage: explore [-h] [--skip-index] [--no-ignore] [--documents-only] [--question QUESTION] [--no-progress-bar]
               [--index-only]
               directory

Interactively explore a codebase with an LLM.

positional arguments:
  directory            The directory to index and explore.

options:
  -h, --help           show this help message and exit
  --skip-index         skip indexing the directory (warning: if the directory hasn't been indexed at least once, it
                       will be indexed anyway)
  --no-ignore          Disable respecting .gitignore files
  --documents-only     Only print documents, then exit. --question must be provided
  --question QUESTION  Initial question to ask (will prompt if not provided)
  --no-progress-bar    Disable progress bar
  --index-only         Only index the directory
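
For example, the flags can be combined for scripted or one-off runs (the directory and questions below are placeholders):

# Build or refresh the index without starting a chat session
explore --index-only ~/src/myproject

# Start a session with an initial question
explore --skip-index --question "Where is the retrieval logic implemented?" ~/src/myproject

# Print only the documents retrieved for a question, then exit
explore --documents-only --question "How does indexing work?" ~/src/myproject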

Configuration

There are a few environment variables you can set to configure explore:

  • OPENAI_API_KEY: Required. Your API key for the OpenAI API.
  • OPENAI_BASE_URL: The base URL used for OpenAI API requests. You can set this to use any OpenAI-compatible API (e.g. Ollama, to run models locally). Default: https://api.openai.com/v1
  • OPENAI_MODEL: The model to request from the OpenAI API. The default is gpt-4o-mini, which strikes a good balance between coherence and price. You can get better results with gpt-4o, but bear in mind that explore can generate extremely long prompts, so costs can add up quickly.
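
For example, to point explore at a local Ollama server (Ollama exposes an OpenAI-compatible endpoint on port 11434 by default; the model name below is just an example):

export OPENAI_API_KEY=unused  # Ollama ignores the key, but explore still requires it to be set
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_MODEL=llama3.1
explore <directory>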

How it works

  1. The directory is indexed into a local Chroma store. Only files that have been modified since the last time they were indexed get re-indexed, so this step will be quite slow on the first execution but pretty quick after that.

  2. Documents relevant to the query are collected in three ways:

    1. The question is embedded as a vector and used to search for the nearest matches in the Chroma DB
    2. The entire conversation so far is embedded as a vector and used to search for more matches in Chroma
    3. Search keywords are extracted from the question and used to find exact matching text in the indexed documents

    By default, explore will fetch 4 documents using the first approach, 3 using the second and 4 using the third.

  3. The documents are deduplicated, concatenated, and prepended to the ongoing conversation, then the latest question is appended. The whole thing is sent to the LLM, which returns an answer to the question based on the provided documents (see the sketch below).
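
To make this concrete, here is a minimal sketch of the indexing and retrieval flow using chromadb. The storage path, collection name, one-document-per-file granularity, and single-keyword search are illustrative assumptions, not explore's actual implementation:

import os
import chromadb

# Illustrative names only: the real tool may use a different path, collection name and chunking.
client = chromadb.PersistentClient(path=os.path.expanduser("~/.explore-chroma"))
collection = client.get_or_create_collection("codebase")

def index_file(path):
    """Step 1: re-index a file only if it changed since the last run."""
    mtime = os.path.getmtime(path)
    existing = collection.get(ids=[path])
    if existing["ids"] and existing["metadatas"][0].get("mtime") == mtime:
        return  # unchanged since the last indexing pass
    with open(path, errors="ignore") as f:
        collection.upsert(ids=[path], documents=[f.read()], metadatas=[{"mtime": mtime}])

def retrieve(question, conversation, keyword):
    """Steps 2-3: collect relevant documents three ways, then deduplicate."""
    by_question = collection.query(query_texts=[question], n_results=4)
    by_conversation = collection.query(query_texts=["\n".join(conversation)], n_results=3)
    by_keyword = collection.query(
        query_texts=[question],
        n_results=4,
        where_document={"$contains": keyword},  # exact-text match on the indexed documents
    )
    seen, documents = set(), []
    for result in (by_question, by_conversation, by_keyword):
        for doc in result["documents"][0]:
            if doc not in seen:
                seen.add(doc)
                documents.append(doc)
    return documents  # prepended to the conversation before the latest question is sent to the LLM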



Download files

Download the file for your platform.

Source Distribution

explore_cli-0.2.3.tar.gz (7.0 kB)


Built Distribution

explore_cli-0.2.3-py3-none-any.whl (7.9 kB)


File details

Details for the file explore_cli-0.2.3.tar.gz.

File metadata

  • Download URL: explore_cli-0.2.3.tar.gz
  • Upload date:
  • Size: 7.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0

File hashes

Hashes for explore_cli-0.2.3.tar.gz:

  • SHA256: 5f85d28f73fbd6af085171c85734e02cd0ad90a97514a7c857ad447e15bc3864
  • MD5: a32bd1299b7f14328c0b30f678134b51
  • BLAKE2b-256: e5506dfbc9b96c101dd15852b6b76602652eef28066b85891e8be0636a450438


File details

Details for the file explore_cli-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: explore_cli-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 7.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0

File hashes

Hashes for explore_cli-0.2.3-py3-none-any.whl:

  • SHA256: 01ed30e01ba7ddbce013bb576140aa2f1958458dd16cdd1f354b43562843bcad
  • MD5: 12dc20271cc6157c7131c8616d0a104b
  • BLAKE2b-256: 5c772bdea32f6c4865aa6e6e12798a1f33b9c6053696aa24ad7a271d26611a4d

