Interactively explore a codebase with an LLM

explore

explore is a script for interactively exploring a codebase by chatting with an LLM. It uses retrieval-augmented generation (RAG), with chromadb as the vector store, to give the LLM relevant source code from the codebase.

explore uses OpenAI models by default, so you'll need an OpenAI API key.

Installation

explore is available on PyPI. I recommend installing it with pipx:

pipx install explore-cli
export OPENAI_API_KEY=<your OpenAI API key>
explore <directory>

Alternatively, you can clone this repository and run the script with poetry:

poetry install
poetry build
export OPENAI_API_KEY=<your OpenAI API key>
poetry run explore <directory>

Usage

usage: explore [-h] [-l LLM] [-m MODEL] directory

Interactively explore a codebase with an LLM.

positional arguments:
  directory             The directory to index and explore.

options:
  -h, --help            show this help message and exit
  -l LLM, --llm LLM     The LLM backend, one of openai, ollama, or azure. Default: openai. If using Azure, make sure to
                        set the AZURE_OPENAI_ENDPOINT and OPENAI_API_VERSION environment variables.
  -m MODEL, --model MODEL
                        The LLM model to use. Default: gpt-4o-mini for openai, mistral-nemo:latest for ollama, or
                        gpt-4o for azure.
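
The per-backend defaults in the help text can be sketched with argparse. This is only an illustration of the documented interface, not the project's actual source: `DEFAULT_MODELS`, `build_parser`, and `resolve_model` are hypothetical names chosen for this example.

```python
import argparse

# Default model per backend, copied from the help text above.
DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "ollama": "mistral-nemo:latest",
    "azure": "gpt-4o",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line interface."""
    parser = argparse.ArgumentParser(
        prog="explore",
        description="Interactively explore a codebase with an LLM.",
    )
    parser.add_argument("directory", help="The directory to index and explore.")
    parser.add_argument(
        "-l", "--llm",
        metavar="LLM",
        choices=sorted(DEFAULT_MODELS),
        default="openai",
        help="The LLM backend, one of openai, ollama, or azure.",
    )
    # No static default here: the effective default depends on the backend.
    parser.add_argument("-m", "--model", default=None, help="The LLM model to use.")
    return parser


def resolve_model(args: argparse.Namespace) -> str:
    """Return the explicit --model value, or the backend's documented default."""
    return args.model or DEFAULT_MODELS[args.llm]


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--llm", "ollama", "/some/directory"])
print(resolve_model(args))  # mistral-nemo:latest
```

Leaving `--model` unset in the parser and resolving it afterwards is one way to let the default follow the chosen backend.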

How it works

  1. The codebase is indexed into a local Chroma store. Each file is split into chunks using language-specific separators.
  2. Documents relevant to the query are collected using multiple retrieval strategies:
    • Primary retrieval is done through vector similarity using the indexed embeddings.
    • A multi-query retriever issues multiple variations of the query to increase the diversity and relevance of retrieved documents.
    • Additionally, a history-aware retriever reformulates the user query, considering the conversation history to better capture context.
  3. Retrieved documents are deduplicated, concatenated, and passed to the LLM as context. The LLM then generates an answer to the user's question, including specific references to the files and code pertinent to the query.
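
The steps above can be sketched in plain Python. This is a toy illustration under loud assumptions, not the project's implementation: a bag-of-words cosine score stands in for the embeddings stored in Chroma, the query variations are written by hand where the tool would have the LLM generate them, the history-aware reformulation is omitted, and `PYTHON_SEPARATORS` and every function name are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical separators for Python source; a real splitter has one list per language.
PYTHON_SEPARATORS = ["\nclass ", "\ndef "]


def split_chunks(text: str, separators: list[str]) -> list[str]:
    """Split just before each separator so every chunk starts at a definition."""
    pattern = "|".join(f"(?={re.escape(sep)})" for sep in separators)
    return [chunk.strip() for chunk in re.split(pattern, text) if chunk.strip()]


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: counts of lowercase alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vector = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, vectorize(c)), reverse=True)
    return ranked[:k]


def build_context(queries: list[str], chunks: list[str], k: int = 2) -> str:
    """Retrieve for every query variation, deduplicate, and concatenate."""
    seen: dict[str, None] = {}  # a dict keeps first-seen order and drops repeats
    for query in queries:
        for chunk in retrieve(query, chunks, k):
            seen.setdefault(chunk, None)
    return "\n\n".join(seen)


source = '''import os

def load_config(path):
    return open(path).read()

def save_config(path, data):
    open(path, "w").write(data)

class Indexer:
    def run(self):
        pass
'''

chunks = split_chunks(source, PYTHON_SEPARATORS)
# Two hand-written variations of the same question.
print(build_context(["load config", "config load path"], chunks))
```

Both variations rank the same two functions highest, so deduplication keeps each chunk once in the context that would be handed to the LLM.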

Using Azure OpenAI

explore can connect to an Azure OpenAI instance using Azure Active Directory authentication. First, set the relevant environment variables (the values are available in the Azure portal):

export AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com # <- endpoint for the Azure OpenAI instance
export OPENAI_API_VERSION=2024-10-01-preview # <- API version for the deployment you want to use

Make sure you are authenticated via the Azure CLI:

az login

When you invoke explore, pass the Azure OpenAI deployment name as the --model argument and specify azure as the --llm:

explore --llm azure --model gpt-4o /some/directory
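
Before launching, it can help to confirm that both environment variables are set. The following is a minimal sketch of such a pre-flight check, assuming nothing about how explore itself validates its configuration; `REQUIRED_AZURE_VARS`, `missing_azure_vars`, and `azure_command` are hypothetical helpers written for this example.

```python
import os
from collections.abc import Mapping

# Environment variables the Azure backend expects, per the section above.
REQUIRED_AZURE_VARS = ("AZURE_OPENAI_ENDPOINT", "OPENAI_API_VERSION")


def missing_azure_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name, "").strip()]


def azure_command(deployment: str, directory: str) -> list[str]:
    """Build the documented invocation for an Azure OpenAI deployment."""
    return ["explore", "--llm", "azure", "--model", deployment, directory]


missing = missing_azure_vars()
if missing:
    print("Set these before running explore:", ", ".join(missing))
else:
    print("Run:", " ".join(azure_command("gpt-4o", "/some/directory")))
```

The check covers only the environment variables; authentication still depends on a valid `az login` session.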
