Chat with any codebase using AI — local or GitHub repos

askmy-codebase

Chat with any codebase using AI — from your terminal or via REST API. Point it at a local folder or a GitHub URL, ask questions in plain English, and get answers grounded in the actual source code.

Key Features

Terminal CLI — askmy-codebase --repo_path . works from any directory after a one-time install
GitHub URL support — clones and indexes any public repo on the fly
Hybrid search — combines FAISS vector search with BM25 keyword search for better retrieval
PR review mode — feed it a .diff file and get a structured code review
CLAUDE.md generator — auto-generate a codebase summary file for any repo
REST API — run it as a FastAPI server for programmatic access
Incremental indexing — only re-embeds files that changed since the last run
No cloud embedding — uses BAAI/bge-small-en-v1.5 locally (free, private)

Table of Contents

Tech Stack
Prerequisites
Installation
Quick Start
Usage
Environment Variables
REST API
Deployment on Render
Architecture
Project Structure
Running Tests
Troubleshooting

Tech Stack

Layer	Technology
Language	Python 3.9+
LLM	OpenAI GPT (default: `gpt-4.1-nano-2025-04-14`)
Embeddings	HuggingFace `BAAI/bge-small-en-v1.5` (local)
Vector Store	FAISS (CPU)
Keyword Search	BM25 (rank-bm25)
Code Parsing	tree-sitter (Python, JS)
Orchestration	LangChain
REST API	FastAPI + Uvicorn
Deployment	Docker / Render

Prerequisites

Python 3.9 or higher
An OpenAI API key
Git (for cloning GitHub repos)

Installation

Option 1 — Install from PyPI (recommended)

pip install askmy-codebase

Option 2 — Install from source

git clone https://github.com/Nachiket1904/askmy-codebase.git
cd askmy-codebase
pip install -e .

Save your API key (one time only)

askmy-codebase configure --api-key sk-xxxxx

This saves the key to ~/.config/askmy-codebase/config.json so you never need a .env file. The key is loaded automatically on every run.

Alternative: Set OPENAI_API_KEY as an environment variable or add it to a .env file in your working directory.
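The key lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual code: the precedence (environment variable first, then the saved config file), the `api_key` field name inside config.json, and the helper name `resolve_api_key` are all hypothetical, and .env handling is left out.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Path taken from the README; the JSON field name "api_key" is an assumption.
DEFAULT_CONFIG = Path.home() / ".config" / "askmy-codebase" / "config.json"


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = DEFAULT_CONFIG,
) -> Optional[str]:
    """Return the key from the environment, else from the config file, else None."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if key:
        return key
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # config file missing or unreadable
    if isinstance(data, dict):
        return data.get("api_key") or None
    return None
```

Passing `env` and `config_path` explicitly keeps the lookup easy to test without touching the real home directory.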

Quick Start

# Chat with the current directory
askmy-codebase --repo_path .

# Chat with a GitHub repo (clones automatically)
askmy-codebase --repo_path https://github.com/username/reponame

First run downloads the embedding model (~130 MB) and builds the index. Subsequent runs reuse the index and only re-embed changed files.
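One common way to decide which files need re-embedding is to compare content hashes against a manifest saved on the previous run. The README does not say how askmy-codebase detects changes, so the sketch below is an assumption; `file_digest` and `diff_against_manifest` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple


def file_digest(content: bytes) -> str:
    """Hash file content so that any edit produces a different digest."""
    return hashlib.sha256(content).hexdigest()


def diff_against_manifest(
    files: Dict[str, bytes], manifest: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (paths to re-embed, paths deleted since the last run).

    `files` maps path -> current content; `manifest` maps path -> digest
    recorded when the index was last built.
    """
    stale = sorted(
        path for path, content in files.items()
        if manifest.get(path) != file_digest(content)
    )
    removed = sorted(path for path in manifest if path not in files)
    return stale, removed
```

New files and edited files both end up in the first list, so only those would be embedded again; unchanged files keep their cached embeddings.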

Usage

Chat mode (default)

askmy-codebase --repo_path .

[1/4] Indexing codebase from: /your/project
      Index saved to ./index/abc123/
[2/4] Building repository map...
      Mapped 12 file(s)
[3/4] Loading retrieval chain...
      Ready.

[4/4] Starting chat session.
Ask questions about the codebase. Type 'exit' to quit.

> How does authentication work?

The authentication flow uses JWT tokens issued at login...

Sources: src/auth.py, src/middleware.py

> exit
Bye.

PR review mode

# Generate a diff first
git diff main...my-branch > changes.diff

# Review it
askmy-codebase --repo_path . --mode pr-review --diff changes.diff

Outputs a JSON object with a summary, file-by-file feedback, and a risk score.
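If you capture that JSON, a few lines of Python can turn it into a readable report. The README does not document the field names, so `summary`, `risk_score`, `files`, `path`, and `feedback` below are placeholders; check them against real output before relying on this sketch.

```python
import json
from typing import Any, Dict, List


def format_review(raw: str) -> List[str]:
    """Render a review JSON string as plain text lines.

    All key names here are assumed, not taken from the tool's documentation.
    """
    review: Dict[str, Any] = json.loads(raw)
    lines = [
        f"Risk: {review.get('risk_score', 'n/a')}",
        f"Summary: {review.get('summary', '')}",
    ]
    for entry in review.get("files", []):
        lines.append(f"- {entry.get('path', '?')}: {entry.get('feedback', '')}")
    return lines
```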

Generate CLAUDE.md

Creates a structured CLAUDE.md context file for the repo — useful for AI assistants like Claude Code.

askmy-codebase --repo_path . --mode generate-claude-md

All flags

Flag	Default	Description
`--repo_path`	required	Local path or GitHub URL
`--index_path`	`./index`	Where to store/load the FAISS index
`--model`	`gpt-4.1-nano-2025-04-14`	OpenAI chat model to use
`--mode`	`chat`	`chat`, `pr-review`, or `generate-claude-md`
`--diff`	—	Path to `.diff` file (required for `pr-review`)
`--rebuild-index`	false	Force re-index even if index exists
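For reference, the flag table maps onto an `argparse` definition along these lines. This is a reconstruction from the table only, not the project's actual parser: the `configure` subcommand is omitted, and the check that `--diff` accompanies `pr-review` is an assumption about where that rule is enforced.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="askmy-codebase")
    parser.add_argument("--repo_path", required=True,
                        help="Local path or GitHub URL")
    parser.add_argument("--index_path", default="./index",
                        help="Where to store/load the FAISS index")
    parser.add_argument("--model", default="gpt-4.1-nano-2025-04-14",
                        help="OpenAI chat model to use")
    parser.add_argument("--mode", default="chat",
                        choices=["chat", "pr-review", "generate-claude-md"])
    parser.add_argument("--diff", default=None,
                        help="Path to .diff file (required for pr-review)")
    parser.add_argument("--rebuild-index", action="store_true",
                        help="Force re-index even if index exists")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv and enforce the pr-review / --diff pairing."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "pr-review" and not args.diff:
        parser.error("--diff is required when --mode is pr-review")
    return args
```

Note that `argparse` exposes `--rebuild-index` as `args.rebuild_index`.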

Environment Variables

Variable	Description	Required
`OPENAI_API_KEY`	Your OpenAI API key	Yes (or use `configure`)
`OPENAI_CHAT_MODEL`	Override the default chat model	No
`API_SECRET_KEY`	Secret key for REST API auth	No (dev mode if unset)
`REPO_PATH`	Pre-load a repo at API server startup	No
`INDEX_PATH`	Base directory for FAISS indexes	No (default: `./index`)
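A minimal sketch of reading these variables with the documented defaults is shown below. The `Settings` class and `load_settings` function are hypothetical names used for illustration; the real package may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]
    chat_model: str
    api_secret_key: Optional[str]  # None means the variable was unset
    repo_path: Optional[str]       # repo to pre-load at server startup
    index_path: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, applying documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        chat_model=env.get("OPENAI_CHAT_MODEL") or "gpt-4.1-nano-2025-04-14",
        api_secret_key=env.get("API_SECRET_KEY") or None,
        repo_path=env.get("REPO_PATH") or None,
        index_path=env.get("INDEX_PATH") or "./index",
    )
```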

REST API

Run as an API server for programmatic access:

uvicorn src.api:app --host 0.0.0.0 --port 8000

Interactive docs available at http://localhost:8000/docs.

Endpoints

`POST /index`

Index a repository (required before querying).

curl -X POST http://localhost:8000/index \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secret" \
  -d '{"repo_path": "https://github.com/username/repo", "rebuild": false}'

{ "status": "ok", "files_indexed": 24 }

`POST /query`

Ask a question about the indexed codebase.

curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secret" \
  -d '{"question": "How does the ingestion pipeline work?"}'

{
  "answer": "The ingestion pipeline loads source files using GenericLoader...",
  "sources": ["src/ingestion.py", "src/embedder.py"]
}

`POST /review`

Review a diff string.

curl -X POST http://localhost:8000/review \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secret" \
  -d '{"diff": "--- a/src/api.py\n+++ b/src/api.py\n..."}'
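The same calls can be made from Python using only the standard library. The sketch below mirrors the curl examples (JSON body, `X-API-Key` header); `build_request` and `call_api` are hypothetical helper names, and error handling is left out for brevity.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(
    base_url: str, endpoint: str, payload: Dict[str, Any], api_key: str
) -> urllib.request.Request:
    """Build a JSON POST request carrying the X-API-Key header."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def call_api(
    base_url: str,
    endpoint: str,
    payload: Dict[str, Any],
    api_key: str,
    timeout: float = 60.0,
) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    request = build_request(base_url, endpoint, payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running locally, `call_api("http://localhost:8000", "/index", {"repo_path": "https://github.com/username/repo", "rebuild": False}, "your-secret")` followed by a call to `/query` with `{"question": "..."}` reproduces the first two curl examples.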

Deployment on Render

The repo includes a render.yaml and Dockerfile for one-click deployment.

1. Fork/push this repo to GitHub
2. Go to render.com → New Web Service → connect your repo
3. Render detects render.yaml and configures automatically
4. Set these environment variables in the Render dashboard:
   - OPENAI_API_KEY: your OpenAI key
   - API_SECRET_KEY: a random secret for API auth
5. Deploy

Note: The free tier uses ephemeral storage (/tmp/index). The index is rebuilt after each redeploy. For persistence, attach a Render Disk and set INDEX_PATH to a persistent path.

Calling the deployed API

# Index a repo
curl -X POST https://your-app.onrender.com/index \
  -H "X-API-Key: your-secret" \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "https://github.com/username/repo"}'

# Query it
curl -X POST https://your-app.onrender.com/query \
  -H "X-API-Key: your-secret" \
  -H "Content-Type: application/json" \
  -d '{"question": "What does this project do?"}'

Architecture

repo_path (local or GitHub URL)
        │
        ▼
  [github_loader]  ← clones GitHub URLs to temp dir
        │
        ▼
  [ingestion]      ← loads .py/.js/.ts/.java files, respects .claudeignore
        │
        ▼
  [embedder]       ← splits by language, embeds with BAAI/bge-small-en-v1.5
        │           ← caches embeddings on disk (incremental re-indexing)
        ▼
  [FAISS index]  +  [BM25 index]
        │                │
        └──────┬──────────┘
               ▼
         [retriever]      ← hybrid search: vector + keyword, re-ranked
               │
               ▼
         [LangChain chain] ← ConversationalRetrievalChain with GPT
               │
               ▼
           answer + sources

How hybrid retrieval works

For each query, two retrievers run in parallel:

FAISS finds semantically similar chunks (meaning-based)
BM25 finds chunks with matching keywords (exact-match)

Results are merged and de-duplicated. This handles both conceptual questions ("how does auth work?") and exact lookups ("find where load_codebase is called").
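The merge-and-de-duplicate step can be sketched as below. The README does not specify the order in which results are combined, so the round-robin interleaving here is an assumption, and `merge_results` is a hypothetical name; hits are represented by any hashable identifier, such as a chunk ID.

```python
from itertools import zip_longest
from typing import Hashable, List, Sequence

_MISSING = object()  # filler for the shorter of the two result lists


def merge_results(
    vector_hits: Sequence[Hashable],
    keyword_hits: Sequence[Hashable],
    limit: int = 8,
) -> List[Hashable]:
    """Interleave two ranked lists, keeping the first occurrence of each hit."""
    merged: List[Hashable] = []
    seen = set()
    for pair in zip_longest(vector_hits, keyword_hits, fillvalue=_MISSING):
        for hit in pair:
            if hit is _MISSING or hit in seen:
                continue
            seen.add(hit)
            merged.append(hit)
    return merged[:limit]
```

Interleaving gives both retrievers a say in the top positions, so a chunk found only by keyword match is not pushed below every semantic match.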

Index isolation

Each repo gets its own index directory, keyed by a SHA-256 hash of the repo path, so indexing one repo never overwrites another repo's index.

./index/
  a3f7c12b4e/   ← hash of /projects/repo-a
  9d2e1f8c03/   ← hash of /projects/repo-b
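Deriving such a directory name takes only a few lines. In this sketch the 10-character truncation is inferred from the example names above, and whether the path is normalized before hashing is unknown; `index_dir_for` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def index_dir_for(repo_path: str, base: str = "./index", length: int = 10) -> Path:
    """Map a repo path or URL to its own index directory under `base`."""
    digest = hashlib.sha256(repo_path.encode("utf-8")).hexdigest()
    # Truncating the digest keeps names short; the length is an assumption.
    return Path(base) / digest[:length]
```

Because the hash depends only on the path string, the same repo always resolves to the same directory, which is what lets later runs find and reuse an existing index.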

Project Structure

askmy-codebase/
├── src/
│   ├── main.py                 # CLI entry point, all modes
│   ├── api.py                  # FastAPI REST server
│   ├── ingestion.py            # File loading, .claudeignore support
│   ├── embedder.py             # FAISS index build/save/load, incremental
│   ├── retriever.py            # Hybrid BM25+FAISS retrieval, LangChain chain
│   ├── github_loader.py        # Clone GitHub URLs to temp dir
│   ├── ast_parser.py           # tree-sitter repo map (functions, classes)
│   ├── pr_reviewer.py          # Diff review pipeline
│   ├── claude_md_generator.py  # CLAUDE.md generation
│   └── context_builder.py      # Load/save CLAUDE.md context
├── tests/                      # pytest test suite (21 tests)
├── .github/
│   └── workflows/
│       └── publish.yml         # Auto-publish to PyPI on GitHub release
├── Dockerfile                  # Docker build (pre-downloads embedding model)
├── render.yaml                 # Render deployment config
├── setup.py                    # Package entry point
├── pyproject.toml              # Build metadata
└── requirements.txt            # All dependencies

Running Tests

pip install pytest
pytest tests/ -v

Expected: 21 tests passing.

# Run a specific file
pytest tests/test_ingestion.py -v
pytest tests/test_retriever.py -v

Troubleshooting

`IndexError: list index out of range` on `/index`

The repo has no supported source files (.py, .js, .ts, .java). Check that repo_path points to a repo with code in those languages, or that the GitHub URL is correct and the repo is public.
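To check a local folder before indexing, a quick scan for the supported extensions helps. This sketch only looks at file extensions and skips `.git`; it does not apply `.claudeignore` rules, and `find_supported_files` is a hypothetical helper rather than part of the package.

```python
import os
from typing import List

# Extensions listed in the README as supported by the ingestion step.
SUPPORTED_EXTENSIONS = (".py", ".js", ".ts", ".java")


def find_supported_files(root: str) -> List[str]:
    """Return sorted paths under `root` whose extension is supported."""
    matches: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the .git directory in place so os.walk does not descend into it.
        dirnames[:] = [name for name in dirnames if name != ".git"]
        for name in filenames:
            if name.lower().endswith(SUPPORTED_EXTENSIONS):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

An empty result means the indexer would have nothing to embed, which is the situation that produces the error above.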

`OPENAI_API_KEY is not set`

Run askmy-codebase configure --api-key sk-xxxxx or export the variable:

export OPENAI_API_KEY=sk-xxxxx

First run is slow

The embedding model (BAAI/bge-small-en-v1.5, ~130 MB) downloads on first use and is cached in ~/.cache/huggingface/. Subsequent runs are fast.

Render free tier — index lost after redeploy

Free tier uses ephemeral storage. The index rebuilds on every deploy. To persist it, add a Render Disk and set INDEX_PATH=/data/index in your environment variables.

API returns 503 "Index not loaded"

Call POST /index first to build the index before calling /query or /review.

License

MIT

Version 0.1.0, released May 5, 2026.

