Interactively explore a codebase with an LLM
Project description
explore
Interactively explore a codebase with an LLM
explore
is a script to interactively explore a codebase by chatting with an LLM. It uses retrieval-augmented generation via chromadb
to provide the LLM with relevant source code from the codebase.
explore
uses OpenAI models by default, so you'll need an OpenAI API key.
Installation
explore
is available on PyPI. I recommend installing it with pipx
:
pipx install explore-cli
export OPENAI_API_KEY=<your OpenAI API key>
explore <directory>
Alternatively, you can clone this repository and run the script with poetry
:
poetry install
poetry build
export OPENAI_API_KEY=<your OpenAI API key>
poetry run explore <directory>
Usage
usage: explore [-h] [--skip-index] [--no-ignore] [--documents-only] [--question QUESTION] [--no-progress-bar]
[--index-only]
directory
Interactively explore a codebase with an LLM.
positional arguments:
directory The directory to index and explore.
options:
-h, --help show this help message and exit
--skip-index skip indexing the directory (warning: if the directory hasn't been indexed at least once, it
will be indexed anyway)
--no-ignore Disable respecting .gitignore files
--documents-only Only print documents, then exit. --question must be provided
--question QUESTION Initial question to ask (will prompt if not provided)
--no-progress-bar Disable progress bar
--index-only Only index the directory
Configuration
There are a couple of environment variables you can set to configure explore
:
Name | Description |
---|---|
OPENAI_API_KEY |
Required. Your API key for the OpenAI API |
OPENAI_BASE_URL |
The base URL used for OpenAI API requests. You can set this to use any OpenAI-compatible APIs (e.g. Ollama to run models locally). Default: https://api.openai.com/v1 |
OPENAI_MODEL |
Which model to tell the OpenAI API to use. The default is gpt-4o-mini , which strikes a good balance between coherence and price. You can get better results if you set this to gpt-4o , but bear in mind explore can generate extremely long prompts so that could get expensive quickly |
How it works
-
The directory is indexed into a local Chroma store. Only files that have been modified since the last time they were indexed get re-indexed, so this step will be quite slow on the first execution but pretty quick after that.
-
Documents relevant to the query are collected in three ways:
- The question is embedded as a vector and used to search for the nearest matches in the Chroma DB
- The entire conversation so far is embedded as a vector and used to search for more matches in Chroma
- Search keywords are extracted from the question and used to find exact matching text in the indexed documents
By default,
explore
will fetch 4 documents using the first approach, 3 using the second and 4 using the third. -
The documents are deduplicated, concatenated and prepended to the ongoing conversation, then the latest question is appended. The whole thing is sent to the LLM, which returns an answer to the question based on the provided documents.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file explore_cli-0.2.3.tar.gz
.
File metadata
- Download URL: explore_cli-0.2.3.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5f85d28f73fbd6af085171c85734e02cd0ad90a97514a7c857ad447e15bc3864 |
|
MD5 | a32bd1299b7f14328c0b30f678134b51 |
|
BLAKE2b-256 | e5506dfbc9b96c101dd15852b6b76602652eef28066b85891e8be0636a450438 |
File details
Details for the file explore_cli-0.2.3-py3-none-any.whl
.
File metadata
- Download URL: explore_cli-0.2.3-py3-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 01ed30e01ba7ddbce013bb576140aa2f1958458dd16cdd1f354b43562843bcad |
|
MD5 | 12dc20271cc6157c7131c8616d0a104b |
|
BLAKE2b-256 | 5c772bdea32f6c4865aa6e6e12798a1f33b9c6053696aa24ad7a271d26611a4d |