Local semantic code recall with mxbai embeddings and SQLite
RAGrep
RAGrep is a dead-simple local semantic recall tool for code and text files.
It uses:
- mxbai-embed-large embeddings in-process (no server)
- a single local SQLite database file: .ragrep.db
No ChromaDB. No remote API keys.
Install
pip install ragrep
Embedding Model Storage
RAGrep downloads the embedding model automatically on first use.
Default model directories:
- Linux: ~/.config/ragrep/models
- macOS: ~/Library/Application Support/ragrep/models
- Windows: %APPDATA%\ragrep\models
Override with:
- env var: RAGREP_MODEL_DIR
- CLI flag: --model-dir
- Python API: RAGrep(model_dir="...")
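The lookup described above can be sketched as a small function. This is only an illustration, not RAGrep's actual code: `resolve_model_dir` is a hypothetical helper, and the precedence (explicit argument first, then `RAGREP_MODEL_DIR`, then the platform default) is an assumption the README does not spell out.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_dir(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    platform: Optional[str] = None,
) -> Path:
    """Hypothetical helper: choose a model directory as the text describes."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    # Assumed precedence: explicit value (CLI flag / Python API) beats the env var.
    if explicit:
        return Path(explicit).expanduser()
    if env.get("RAGREP_MODEL_DIR"):
        return Path(env["RAGREP_MODEL_DIR"]).expanduser()

    # Platform defaults as listed in the README.
    home = Path.home()
    if platform.startswith("win"):
        base = Path(env.get("APPDATA", str(home / "AppData" / "Roaming")))
        return base / "ragrep" / "models"
    if platform == "darwin":
        return home / "Library" / "Application Support" / "ragrep" / "models"
    return home / ".config" / "ragrep" / "models"
```

Calling `resolve_model_dir()` with no arguments reads the real environment; passing `env` and `platform` explicitly makes the behavior easy to check in isolation.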
GPU Usage
RAGrep can use GPU for embeddings when available.
- Default behavior: RAGREP_DEVICE=auto (prefers cuda, then mps, then cpu)
- Override via env: RAGREP_DEVICE=cpu|cuda|mps|cuda:0
- Override via CLI: --device ...
- Override via Python API: RAGrep(embedding_device="...")
- Note: GPU usage requires a GPU-capable PyTorch build in your environment.
- Check runtime GPU support: ragrep --check-gpu (or ragrep --check-gpu --json)
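The `auto` preference order (cuda, then mps, then cpu) can be sketched roughly as follows. The function names `resolve_device` and `detect_backends` are hypothetical and not part of RAGrep; the probe relies only on the standard PyTorch calls `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and falls back to CPU when PyTorch is not installed.

```python
from typing import Tuple


def resolve_device(
    requested: str = "auto",
    cuda_available: bool = False,
    mps_available: bool = False,
) -> str:
    """Hypothetical helper mirroring the documented preference order."""
    requested = requested.strip().lower()
    if requested != "auto":
        # Explicit choice such as "cpu", "cuda", "mps", or "cuda:0" is passed through.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_backends() -> Tuple[bool, bool]:
    """Report (cuda, mps) availability; both False if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return cuda, mps
```

A caller would combine the two as `resolve_device("auto", *detect_backends())`. Keeping the decision separate from the hardware probe means the preference order can be checked on a machine without a GPU.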
CLI Usage
Recall is the default command.
# Implied recall (auto-indexes when needed)
ragrep "authentication middleware"
# Explicit recall (same behavior)
ragrep recall "authentication middleware"
# Build/update index manually
ragrep index .
# Show stats
ragrep stats
# Stats alias
ragrep --stats
When --path is omitted, auto-indexing uses the previously indexed root if one exists;
otherwise it uses the current directory.
Indexing is incremental by default:
- new files are added
- modified files are re-embedded
- removed files are deleted from the index
CLI runs that index files print newly indexed file paths to stdout.
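One way to picture incremental indexing is as a comparison of stored modification times against the files currently on disk. The sketch below is an assumption about the general technique, not RAGrep's implementation: `IndexPlan`, `scan_mtimes`, and `plan_incremental` are hypothetical names, and treating `force` as "re-embed every known file" is a guess at what `--force` does.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndexPlan:
    added: List[str] = field(default_factory=list)
    modified: List[str] = field(default_factory=list)
    removed: List[str] = field(default_factory=list)


def scan_mtimes(root: str) -> Dict[str, float]:
    """Walk root and record each file's mtime, keyed by path relative to root."""
    mtimes: Dict[str, float] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mtimes[os.path.relpath(full, root)] = os.path.getmtime(full)
    return mtimes


def plan_incremental(
    indexed: Dict[str, float],
    current: Dict[str, float],
    force: bool = False,
) -> IndexPlan:
    """Compare stored mtimes with the current scan and decide what to do."""
    plan = IndexPlan()
    for path, mtime in sorted(current.items()):
        if path not in indexed:
            plan.added.append(path)        # new file: embed it
        elif force or mtime != indexed[path]:
            plan.modified.append(path)     # changed (or forced): re-embed
    # Files that were indexed before but no longer exist on disk.
    plan.removed = sorted(p for p in indexed if p not in current)
    return plan
```

In use, `indexed` would come from the stored index and `current` from `scan_mtimes(".")`; unchanged files appear in none of the three lists and cost no embedding work.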
Useful flags:
ragrep "query text" --path . --limit 10 --db-path ./.ragrep.db
ragrep "query text" --model-dir ~/.config/ragrep/models --json
ragrep "query text" --device auto
ragrep index . --force
Python Usage
from ragrep import RAGrep

rag = RAGrep(
    db_path="./.ragrep.db",
    embedding_model="mxbai-embed-large",
    embedding_device="auto",
)
rag.index(".")
result = rag.recall("database transactions", limit=5)
for match in result["matches"]:
    print(match["score"], match["metadata"]["source"])
print(rag.stats())
rag.close()
Library methods:
- index(path=".", force=False)
- recall(query, limit=20, path=".", auto_index=True)
- stats()
Backwards-compatible aliases still available:
- RAGSystem (alias of RAGrep)
- dump(...) (alias of the recall(..., auto_index=False) result list)
Local Database
RAGrep stores everything in one SQLite file (default ./.ragrep.db):
- indexed files and mtimes
- chunked source text
- embedding vectors
- index metadata (model, chunk settings, root path)
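To make the single-file layout concrete, here is a minimal sketch of how files, chunks, vectors, and metadata could share one SQLite database. The table and column names are invented for illustration and are not RAGrep's actual schema; the brute-force cosine scan and the toy 3-dimensional vectors are likewise assumptions (real embeddings are far higher-dimensional).

```python
import math
import sqlite3
import struct
from typing import List, Sequence, Tuple

# Hypothetical schema: one table per kind of data listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, mtime REAL NOT NULL);
CREATE TABLE IF NOT EXISTS chunks (
    id        INTEGER PRIMARY KEY,
    path      TEXT NOT NULL,
    text      TEXT NOT NULL,
    embedding BLOB NOT NULL
);
CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT NOT NULL);
"""


def pack(vec: Sequence[float]) -> bytes:
    """Serialize a vector as little-endian float32 values."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob: bytes) -> Tuple[float, ...]:
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn: sqlite3.Connection, query_vec: Sequence[float], limit: int = 5) -> List[tuple]:
    """Score every stored chunk against the query and return the best matches."""
    rows = conn.execute("SELECT path, text, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vec, unpack(blob)), path, text) for path, text, blob in rows]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:limit]


conn = sqlite3.connect(":memory:")  # a real index would use a file such as ./.ragrep.db
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO files VALUES (?, ?)", [("auth.py", 1.0), ("db.py", 2.0)])
conn.executemany(
    "INSERT INTO chunks (path, text, embedding) VALUES (?, ?, ?)",
    [
        ("auth.py", "def login(): ...", pack([1.0, 0.0, 0.0])),
        ("db.py", "def commit(): ...", pack([0.0, 1.0, 0.0])),
    ],
)
conn.execute("INSERT INTO meta VALUES (?, ?)", ("model", "mxbai-embed-large"))
results = search(conn, [0.9, 0.1, 0.0], limit=1)
```

Storing vectors as BLOBs keeps everything in one portable file; a linear scan like this is simple and works for small indexes, though it is only one possible search strategy.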
Development
pip install -e .[dev]
python -m unittest discover -s tests -p 'test_*.py'
python -m build
twine check dist/*
License
MIT