AI-powered code retrieval — index any GitHub repo by URL, search with natural language, get token-budget-aware context packs for LLMs

Maintainers

aayush17


RepoMemory

AI-powered code retrieval engine — index any GitHub repo, search with natural language

Point it at any GitHub URL. Get token-budget-aware context packs — ready to paste into any LLM. Free to run, free to deploy.

What is RepoMemory?

When you ask an LLM to fix a bug or trace a feature, it needs the right source files. Pasting the whole codebase wastes the context window, and guessing which files to include risks leaving out critical pieces.

RepoMemory solves this with a hybrid retrieval pipeline that runs on any public or private GitHub repo:

repomemory index https://github.com/pallets/flask
repomemory search "Where is request routing handled?"

Query  →  Task Classification (trace_flow)
       →  BM25 Lexical + FAISS Semantic + Fuzzy Path + Symbol search  (parallel)
       →  Reciprocal Rank Fusion  →  Top-20 ranked files
       →  Dependency-graph expansion  →  Re-rank with adaptive weights
       →  Token-budget packer  →  Context pack ready to paste into any LLM
       →  (optional) Groq AI summary
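
The fusion step in the pipeline above can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion in general, not RepoMemory's actual code: the function name, the constant `k=60` (a commonly used value), and the sample file paths are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=20):
    """Fuse several ranked lists of file paths into one ranking.

    Each file scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed constant, not a value
    taken from RepoMemory.
    """
    scores = defaultdict(float)
    for ranked_paths in rankings:
        for rank, path in enumerate(ranked_paths, start=1):
            scores[path] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by path so output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n]


# Hypothetical per-retriever results for a single query.
bm25 = ["src/routing.py", "src/app.py", "src/views.py"]
semantic = ["src/app.py", "src/routing.py", "src/ctx.py"]
fuzzy_path = ["src/routing.py", "docs/routing.md"]
symbol = ["src/app.py", "src/routing.py"]

fused = reciprocal_rank_fusion([bm25, semantic, fuzzy_path, symbol])
for path, score in fused:
    print(f"{score:.4f}  {path}")
```

A file that ranks well in several retrievers beats one that ranks first in only a single retriever, which is why fusion tends to be more robust than any one signal.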

Install

# Core — CLI + library, uses HuggingFace API for embeddings (no GPU needed)
pip install repomemory

# With local embeddings (~80 MB model download, fully offline)
pip install "repomemory[local]"

# With FastAPI web server
pip install "repomemory[server]"

# With Groq LLM explanations (free API key)
pip install "repomemory[llm]"

# Everything
pip install "repomemory[all]"

Quick Start

CLI

# Index a repo
repomemory index https://github.com/pallets/flask

# Search it
repomemory search "How does request routing work?"

# With AI explanations (free Groq key)
export REPOMEMORY_GROQ_API_KEY=gsk_...
repomemory search "Where is token rotation handled?"

# Adjust result count and token budget
repomemory search "auth middleware" --top-k 10 --budget 4000

# Force a task mode
repomemory search "test coverage for auth" --mode test_lookup

# Private repos
repomemory index https://github.com/myorg/private-repo --token ghp_...

# List indexed repos
repomemory list

# Start the web UI + API server
repomemory serve

Python library

from repomemory import RepoMemory

rm = RepoMemory()

# Index
repo = rm.index("https://github.com/pallets/flask")

# Search
result = rm.search("How does request dispatching work?")
for file in result.context_pack.files:
    print(f"{file.path}  score={file.relevance_score:.2f}")
    # Show the first 300 characters of the top snippet, if there is one
    if file.snippets:
        print(file.snippets[0].content[:300])

REST API

# Install the server extra, then start the server
pip install "repomemory[server]"
repomemory serve

# Index a repo
curl -X POST http://localhost:8000/api/repos \
  -H "Content-Type: application/json" \
  -d '{"url": "https://github.com/pallets/flask"}'

# Search
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"repo_id": 1, "query": "How does routing work?", "token_budget": 8000}'
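
The same two calls can be made from Python with only the standard library. This is a rough sketch: the endpoints and payloads are copied from the curl commands above, while the helper names `build_post` and `send` are hypothetical, and the assumption that the server replies with JSON is not confirmed by this page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address used in the curl examples


def build_post(path, payload):
    """Build (but do not send) a JSON POST request for the given API path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the body, assuming the reply is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


index_req = build_post("/api/repos", {"url": "https://github.com/pallets/flask"})
search_req = build_post(
    "/api/search",
    {"repo_id": 1, "query": "How does routing work?", "token_budget": 8000},
)
print(search_req.get_method(), search_req.full_url)

# With `repomemory serve` running, uncomment to send the requests:
# send(index_req)
# print(send(search_req))
```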

Features

| Feature | Details |
| --- | --- |
| Hybrid search | BM25 lexical + FAISS semantic (384-dim) + fuzzy path + symbol name, fused with Reciprocal Rank Fusion |
| Dependency-graph retrieval | Builds file-level import edges at index time; expands results through related files via BFS |
| Adaptive weight learning | Online SGD learning from user feedback (accept / dismiss / thumbs); falls back to static mode weights |
| Symbol-aware indexing | tree-sitter extracts functions, classes, and methods from Python, JavaScript, and TypeScript |
| 5 Task Modes | `bug_fix`, `trace_flow`, `test_lookup`, `config_lookup`, `general`; auto-detected from query or set manually |
| Token-budget packer | Greedy packer respects any token limit (default 8 000 tokens, configurable up to 100 000+) |
| Behavioral memory | Frecency scoring from opened / accepted / thumbs-up actions; boosts relevant files in future queries |
| RAG evaluation | End-to-end pipeline scoring retrieval impact on LLM answer quality (relevance, completeness, faithfulness) |
| Flexible embeddings | Local `sentence-transformers` (offline) or free HuggingFace Inference API |
| AI explanations | Optional Groq LLM (free tier) explains why each result matters |
| Incremental indexing | SHA-256 per file; only changed files are re-embedded on re-index |
| Web UI + REST API | React 19 frontend + FastAPI backend; deploy on Render + Vercel (both free) |
| Export as Markdown | Copy context pack as a formatted Markdown block to paste directly into an LLM prompt |
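
To make the token-budget packer row concrete, here is a minimal sketch of a greedy packer. It is an illustration of the general technique only: the `Snippet` type, the four-characters-per-token estimate, and the skip-and-continue policy are assumptions, not RepoMemory's implementation.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    path: str
    text: str
    score: float


def estimate_tokens(text):
    # Rough heuristic (about 4 characters per token). This is an assumption;
    # a real tokenizer would give exact counts.
    return max(1, len(text) // 4)


def pack_greedy(snippets, budget=8000):
    """Take snippets in descending score order, skipping any that would overflow."""
    packed, used = [], 0
    for snip in sorted(snippets, key=lambda s: s.score, reverse=True):
        cost = estimate_tokens(snip.text)
        if used + cost <= budget:
            packed.append(snip)
            used += cost
    return packed, used


# Hypothetical candidates with estimated costs of 100, 200, and 50 tokens.
candidates = [
    Snippet("src/routing.py", "x" * 400, 0.9),
    Snippet("src/app.py", "x" * 800, 0.8),
    Snippet("src/ctx.py", "x" * 200, 0.5),
]
packed, used = pack_greedy(candidates, budget=160)
print([s.path for s in packed], used)
```

With a 160-token budget the second candidate is skipped because it would overflow, and the smaller third candidate still fits. Greedy packing is simple and fast, though it does not guarantee the best possible use of the budget.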

Task Modes

RepoMemory classifies each query and adjusts retrieval weights automatically:

| Mode | Auto-detected from | What it boosts |
| --- | --- | --- |
| `bug_fix` | `error`, `exception`, `crash`, `fix`, `traceback` | Lexical signal, error-adjacent files |
| `trace_flow` | `trace`, `flow`, `route`, `handler`, `how does...work` | Symbol matching, call-chain ordering |
| `test_lookup` | `test`, `spec`, `mock`, `fixture`, `coverage` | Path matching for `tests/` / `spec/` dirs |
| `config_lookup` | `config`, `env`, `setting`, `yaml`, `toml` | Path matching for config-like files |
| `general` | (fallback) | Balanced across all signals |
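
A keyword-based classifier along these lines might look like the sketch below. The keywords come from the table; the matching order, the use of whole-word regular expressions, and the extra inflections (`tests`, `settings`, `routing`, `handled`) are assumptions made for illustration, and the real classifier may work differently.

```python
import re

# Checked in order; the first matching mode wins. Order is an assumption.
MODE_PATTERNS = [
    ("bug_fix", r"\b(error|exception|crash|fix|traceback)\b"),
    ("test_lookup", r"\b(tests?|spec|mock|fixture|coverage)\b"),
    ("config_lookup", r"\b(config|env|settings?|yaml|toml)\b"),
    ("trace_flow", r"\b(trace|flow|route|routing|handler|handled)\b|how does .+ work"),
]


def classify(query):
    """Return the first task mode whose keywords appear in the query."""
    text = query.lower()
    for mode, pattern in MODE_PATTERNS:
        if re.search(pattern, text):
            return mode
    return "general"  # fallback when no keyword matches


for q in [
    "Where is request routing handled?",
    "test coverage for auth",
    "Why does login crash with a traceback?",
    "Where is the YAML config loaded?",
    "auth middleware",
]:
    print(f"{classify(q):14s} {q}")
```

Whole-word matching keeps `fix` from firing on `fixture`, which is one reason to prefer it over plain substring checks.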

Configuration

All settings are read from environment variables with the `REPOMEMORY_` prefix (via pydantic-settings):

export REPOMEMORY_HF_API_KEY=hf_...          # HuggingFace free API key
export REPOMEMORY_GROQ_API_KEY=gsk_...       # Groq free API key (for AI summaries)
export REPOMEMORY_EMBEDDING_PROVIDER=local   # 'local' or 'huggingface'
export REPOMEMORY_DATA_DIR=/data/repomemory  # where SQLite + FAISS live
export REPOMEMORY_TOKEN_BUDGET=16000         # context pack size in tokens (overrides the 8 000 default)
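
For readers unfamiliar with prefix-based settings, the idea can be sketched with the standard library alone. This is not how RepoMemory loads its configuration (it uses pydantic-settings); the `load_settings` helper and the lower-cased key names are hypothetical.

```python
import os

PREFIX = "REPOMEMORY_"


def load_settings(environ=None):
    """Collect REPOMEMORY_* variables into a dict with lower-cased keys."""
    environ = os.environ if environ is None else environ
    settings = {
        key[len(PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }
    # Convert the one numeric setting; 8000 is the default token budget
    # listed in the Features table.
    settings["token_budget"] = int(settings.get("token_budget", 8000))
    return settings


# A hypothetical environment; unrelated variables such as PATH are ignored.
example_env = {
    "REPOMEMORY_EMBEDDING_PROVIDER": "local",
    "REPOMEMORY_TOKEN_BUDGET": "16000",
    "PATH": "/usr/bin",
}
print(load_settings(example_env))
```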

License

MIT © Aayush Kumar

Release history

0.2.3 (Apr 17, 2026)
0.2.2 (Apr 17, 2026)
0.2.1 (Apr 17, 2026, this version)
0.2.0 (Apr 17, 2026)

Download files

Source Distribution

repomemory-0.2.1.tar.gz (55.0 kB)

Uploaded Apr 17, 2026 Source

Built Distribution

repomemory-0.2.1-py3-none-any.whl (58.7 kB)

Uploaded Apr 17, 2026 Python 3

File details

Details for the file repomemory-0.2.1.tar.gz.

File metadata

Download URL: repomemory-0.2.1.tar.gz
Upload date: Apr 17, 2026
Size: 55.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for repomemory-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`cb238bdfbc605e3ada943b913d00f7273ffe8901e5ec96bdf8e0d6d139b05888`
MD5	`560976ca54cc002f293322b35fccebb7`
BLAKE2b-256	`4a1fd2315cd6c7fd16a8a9de3caacfed9629935bd9bd723934f28e245db7afc3`

Provenance

The following attestation bundles were made for repomemory-0.2.1.tar.gz:

Publisher: publish.yml on aayushakumar/RepoMemory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: repomemory-0.2.1.tar.gz
- Subject digest: cb238bdfbc605e3ada943b913d00f7273ffe8901e5ec96bdf8e0d6d139b05888
- Sigstore transparency entry: 1324363913
- Sigstore integration time: Apr 17, 2026
Source repository:
- Permalink: aayushakumar/RepoMemory@33e76aae1299186cdf2f159a5b78a96373e43fa3
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/aayushakumar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@33e76aae1299186cdf2f159a5b78a96373e43fa3
- Trigger Event: release

File details

Details for the file repomemory-0.2.1-py3-none-any.whl.

File metadata

Download URL: repomemory-0.2.1-py3-none-any.whl
Upload date: Apr 17, 2026
Size: 58.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for repomemory-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a8daf35982f8caacec88a5a009f439ef37dd79561ab9ef8a11da73d2f54c333c`
MD5	`cfa5d563909c84094b996e78483dba98`
BLAKE2b-256	`6834d085c004e0c5460220ce0fd43398517d67edcb67e9c3ecb73c051a3103dc`

Provenance

The following attestation bundles were made for repomemory-0.2.1-py3-none-any.whl:

Publisher: publish.yml on aayushakumar/RepoMemory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: repomemory-0.2.1-py3-none-any.whl
- Subject digest: a8daf35982f8caacec88a5a009f439ef37dd79561ab9ef8a11da73d2f54c333c
- Sigstore transparency entry: 1324364003
- Sigstore integration time: Apr 17, 2026
Source repository:
- Permalink: aayushakumar/RepoMemory@33e76aae1299186cdf2f159a5b78a96373e43fa3
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/aayushakumar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@33e76aae1299186cdf2f159a5b78a96373e43fa3
- Trigger Event: release
