Local semantic code recall with mxbai embeddings and SQLite

RAGrep

RAGrep is a dead-simple local semantic recall tool for code and text files.

It uses:

  • mxbai-embed-large embeddings in-process (no server)
  • a single local SQLite database file: .ragrep.db

No ChromaDB. No remote API keys.

Install

pip install ragrep

Embedding Model Storage

RAGrep downloads the embedding model automatically on first use.

Default model directories:

  • Linux: ~/.config/ragrep/models
  • macOS: ~/Library/Application Support/ragrep/models
  • Windows: %APPDATA%\ragrep\models

Override with:

  • env var: RAGREP_MODEL_DIR
  • CLI flag: --model-dir
  • Python API: RAGrep(model_dir="...")
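The lookup described above can be sketched as a small resolver. This is an illustration only: `resolve_model_dir` is a hypothetical helper, not part of RAGrep, and the precedence it applies (explicit argument, then `RAGREP_MODEL_DIR`, then the platform default) is an assumption, since the README does not state how the three overrides rank against each other.

```python
import os
import sys
from pathlib import Path


def resolve_model_dir(explicit=None, env=None, platform=None, home=None):
    """Hypothetical helper: choose a model directory.

    Assumed precedence (not confirmed by the README):
    explicit argument, then RAGREP_MODEL_DIR, then the platform default.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)

    if explicit:
        return Path(explicit).expanduser()
    if env.get("RAGREP_MODEL_DIR"):
        return Path(env["RAGREP_MODEL_DIR"]).expanduser()

    # Platform defaults as listed in the README.
    if platform == "darwin":
        return home / "Library" / "Application Support" / "ragrep" / "models"
    if platform.startswith("win"):
        # %APPDATA% normally points at <home>\AppData\Roaming.
        appdata = env.get("APPDATA")
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
        return base / "ragrep" / "models"
    return home / ".config" / "ragrep" / "models"


print(resolve_model_dir())
```

Passing `env`, `platform`, and `home` explicitly makes the function easy to check without touching the real environment.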

GPU Usage

RAGrep can use GPU for embeddings when available.

  • Default behavior: RAGREP_DEVICE=auto (prefers cuda, then mps, then cpu)
  • Override via env: RAGREP_DEVICE=cpu|cuda|mps|cuda:0
  • Override via CLI: --device ...
  • Override via Python API: RAGrep(embedding_device="...")
  • Note: GPU usage requires a GPU-capable PyTorch build in your environment.
  • Check runtime GPU support: ragrep --check-gpu (or ragrep --check-gpu --json)
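As a rough sketch of the `auto` preference order above (cuda, then mps, then cpu), the selection could look like the following. `pick_device` and `probe_backends` are hypothetical names used for illustration and are not RAGrep functions; the probe relies only on PyTorch's `torch.cuda.is_available()` and `torch.backends.mps.is_available()` and falls back to CPU when PyTorch is not installed.

```python
def pick_device(requested="auto", cuda_available=False, mps_available=False):
    """Hypothetical resolver mirroring the documented preference order."""
    if requested != "auto":
        # Explicit values such as "cpu" or "cuda:0" pass through unchanged.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def probe_backends():
    """Return (cuda_available, mps_available); both False without PyTorch."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    # Older PyTorch builds may not ship the mps backend module at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend and mps_backend.is_available())
    return cuda, mps


print(pick_device("auto", *probe_backends()))
```

For the authoritative answer on a given machine, `ragrep --check-gpu` remains the supported check.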

CLI Usage

Recall is the default command.

# Implied recall (auto-indexes when needed)
ragrep "authentication middleware"

# Explicit recall (same behavior)
ragrep recall "authentication middleware"

# Build/update index manually
ragrep index .

# Show stats
ragrep stats

# Stats alias
ragrep --stats

When --path is omitted, auto-indexing uses the previously indexed root if one exists; otherwise it uses the current directory.

Indexing is incremental by default:

  • new files are added
  • modified files are re-embedded
  • removed files are deleted from the index
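The three cases above amount to a diff between what the index remembers and what is on disk. A minimal sketch, assuming change detection compares stored modification times (the README says mtimes are stored, but not exactly how changes are detected); `plan_incremental_update` is a hypothetical name, not a RAGrep function:

```python
def plan_incremental_update(indexed_mtimes, current_mtimes):
    """Classify files by comparing stored mtimes with those on disk.

    Both arguments map file path -> modification time. Using mtimes as the
    change signal is an assumption made for this sketch.
    """
    added = sorted(p for p in current_mtimes if p not in indexed_mtimes)
    removed = sorted(p for p in indexed_mtimes if p not in current_mtimes)
    modified = sorted(
        p for p, mtime in current_mtimes.items()
        if p in indexed_mtimes and indexed_mtimes[p] != mtime
    )
    return {"added": added, "modified": modified, "removed": removed}


# Illustrative data only.
indexed = {"a.py": 100.0, "b.py": 200.0, "c.py": 300.0}
on_disk = {"a.py": 100.0, "b.py": 250.0, "d.py": 400.0}
print(plan_incremental_update(indexed, on_disk))
# {'added': ['d.py'], 'modified': ['b.py'], 'removed': ['c.py']}
```

Only the `added` and `modified` files would need new embeddings, which is what keeps repeat runs cheap.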

When a CLI run indexes files, it prints the newly indexed file paths to stdout.

Useful flags:

ragrep "query text" --path . --limit 10 --db-path ./.ragrep.db
ragrep "query text" --model-dir ~/.config/ragrep/models --json
ragrep "query text" --device auto
ragrep index . --force
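When calling the CLI from a script, it can help to assemble the argument list in one place. The sketch below uses only flags shown above; `build_recall_command` is a hypothetical helper, and the shape of the `--json` output is not documented here, so the example stops at building the command.

```python
import shlex


def build_recall_command(query, path=None, limit=None, db_path=None,
                         device=None, as_json=False):
    """Hypothetical helper: build an argv list for a ragrep recall call."""
    argv = ["ragrep", query]
    if path is not None:
        argv += ["--path", str(path)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if db_path is not None:
        argv += ["--db-path", str(db_path)]
    if device is not None:
        argv += ["--device", device]
    if as_json:
        argv.append("--json")
    return argv


argv = build_recall_command("query text", path=".", limit=10, as_json=True)
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv, capture_output=True, text=True)`, which avoids shell quoting issues with queries that contain spaces.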

Python Usage

from ragrep import RAGrep

rag = RAGrep(
    db_path="./.ragrep.db",
    embedding_model="mxbai-embed-large",
    embedding_device="auto",
)
rag.index(".")

result = rag.recall("database transactions", limit=5)
for match in result["matches"]:
    print(match["score"], match["metadata"]["source"])

print(rag.stats())
rag.close()
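The result shape used above (`result["matches"]`, each with `score` and `metadata["source"]`) is easy to post-process. The following sketch formats matches and optionally filters them; `format_matches` is a hypothetical helper, the sample data is invented for illustration, and the `min_score` filter assumes that a higher score means a better match, which the README does not state.

```python
def format_matches(result, min_score=None):
    """Hypothetical helper: render recall matches as 'score  source' lines."""
    lines = []
    for match in result["matches"]:
        score = match["score"]
        # Assumption: higher scores are better matches.
        if min_score is not None and score < min_score:
            continue
        lines.append(f"{score:.3f}  {match['metadata']['source']}")
    return lines


# Invented sample in the same shape as the loop above; not real RAGrep output.
sample = {"matches": [
    {"score": 0.82, "metadata": {"source": "db/session.py"}},
    {"score": 0.41, "metadata": {"source": "README.md"}},
]}
for line in format_matches(sample, min_score=0.5):
    print(line)
```

With a real instance, `format_matches(rag.recall("database transactions", limit=5))` would take the place of the sample dictionary.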

Library methods:

  • index(path=".", force=False)
  • recall(query, limit=20, path=".", auto_index=True)
  • stats()

Backwards-compatible aliases still available:

  • RAGSystem (alias of RAGrep)
  • dump(...) (equivalent to recall(..., auto_index=False), returning the result list)

Local Database

RAGrep stores everything in one SQLite file (default ./.ragrep.db):

  • indexed files and mtimes
  • chunked source text
  • embedding vectors
  • index metadata (model, chunk settings, root path)
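Because the index is an ordinary SQLite file, it can be inspected with Python's standard `sqlite3` module. The sketch below makes no assumption about RAGrep's table names or schema, which are not documented here: it lists whatever tables exist and counts their rows. `summarize_sqlite` is a hypothetical helper, and the file is opened read-only so that inspection cannot modify or accidentally create a database.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    # mode=ro opens the file read-only and fails if it does not exist.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        summary = {}
        for name in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            count = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            summary[name] = count
        return summary
    finally:
        conn.close()
```

For example, `summarize_sqlite("./.ragrep.db")` run after indexing gives a quick view of how much has been stored; `ragrep stats` remains the supported way to get index statistics.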

Development

pip install -e ".[dev]"
python -m unittest discover -s tests -p 'test_*.py'
python -m build
twine check dist/*

License

MIT
