Local semantic code recall with mxbai embeddings and SQLite

RAGrep

RAGrep is a dead-simple local semantic recall tool for code and text files.

It uses:

  • mxbai-embed-large embeddings in-process (no server)
  • a single local SQLite database file: .ragrep.db

No ChromaDB. No remote API keys.

Install

pip install ragrep

Embedding Model Storage

RAGrep downloads the embedding model automatically on first use.

Default model directories:

  • Linux: ~/.config/ragrep/models
  • macOS: ~/Library/Application Support/ragrep/models
  • Windows: %APPDATA%\ragrep\models

Override with:

  • env var: RAGREP_MODEL_DIR
  • CLI flag: --model-dir
  • Python API: RAGrep(model_dir="...")
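The defaults and overrides above can be sketched as a small resolver. This is an illustration only, not RAGrep's actual code: `resolve_model_dir` is a hypothetical helper, and the precedence it uses (explicit value, then `RAGREP_MODEL_DIR`, then the platform default) is an assumption the README does not spell out. The Windows fallback when `APPDATA` is unset is also an assumption.

```python
import os
import sys
from pathlib import PurePosixPath, PureWindowsPath


def resolve_model_dir(explicit=None, env=None, platform=None, home=None):
    """Hypothetical helper: choose a model directory.

    Assumed precedence: explicit argument (CLI flag or API parameter),
    then the RAGREP_MODEL_DIR env var, then the per-platform default.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = os.path.expanduser("~") if home is None else home

    if explicit:
        return str(explicit)
    if env.get("RAGREP_MODEL_DIR"):
        return env["RAGREP_MODEL_DIR"]

    if platform.startswith("win"):
        # Assumption: fall back to the usual roaming profile if APPDATA is unset.
        appdata = env.get("APPDATA") or str(PureWindowsPath(home) / "AppData" / "Roaming")
        return str(PureWindowsPath(appdata) / "ragrep" / "models")
    if platform == "darwin":
        return str(
            PurePosixPath(home) / "Library" / "Application Support" / "ragrep" / "models"
        )
    # Linux and other POSIX platforms.
    return str(PurePosixPath(home) / ".config" / "ragrep" / "models")
```

Passing `env`, `platform`, and `home` explicitly makes the helper easy to check on any machine; with no arguments it reads the real environment.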

GPU Usage

RAGrep can use GPU for embeddings when available.

  • Default behavior: RAGREP_DEVICE=auto (prefers cuda, then mps, then cpu)
  • Override via env: RAGREP_DEVICE=cpu|cuda|mps|cuda:0
  • Override via CLI: --device ...
  • Override via Python API: RAGrep(embedding_device="...")
  • Note: GPU usage requires a GPU-capable PyTorch build in your environment.
  • Check runtime GPU support: ragrep --check-gpu (or ragrep --check-gpu --json)
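The `auto` preference order above (cuda, then mps, then cpu) can be sketched as follows. `resolve_device` and `detect_backends` are hypothetical names used for illustration, not part of RAGrep's API. The probe uses only `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and reports no GPU if PyTorch is not installed.

```python
def resolve_device(requested="auto", cuda_available=False, mps_available=False):
    """Hypothetical helper: map a requested device string to a concrete one."""
    requested = requested.strip().lower()
    if requested != "auto":
        # Explicit choices such as "cpu", "mps" or "cuda:0" pass through unchanged.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_backends():
    """Return (cuda_available, mps_available), or (False, False) without PyTorch."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    # Older PyTorch builds have no torch.backends.mps attribute.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return cuda, mps
```

A caller might combine the two as `resolve_device("auto", *detect_backends())`.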

CLI Usage

Recall is the default command.

# Implied recall (auto-indexes when needed)
ragrep "authentication middleware"

# Explicit recall (same behavior)
ragrep recall "authentication middleware"

# Build/update index manually
ragrep index .

# Show stats
ragrep stats

# Stats alias
ragrep --stats

When --path is omitted, auto-indexing uses the previously indexed root if one exists; otherwise it uses the current directory.

Indexing is incremental by default:

  • new files are added
  • modified files are re-embedded
  • removed files are deleted from the index
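The three incremental cases can be sketched as a comparison between stored and current modification times. This is a minimal sketch, not RAGrep's implementation: `scan_mtimes` and `plan_incremental_update` are hypothetical names, and using mtime inequality as the change test is an assumption.

```python
from pathlib import Path


def scan_mtimes(root):
    """Map each file under root (relative path) to its modification time."""
    root = Path(root)
    return {
        str(p.relative_to(root)): p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file()
    }


def plan_incremental_update(indexed_mtimes, disk_mtimes):
    """Hypothetical helper: classify paths as added, modified, or removed."""
    indexed = set(indexed_mtimes)
    on_disk = set(disk_mtimes)
    return {
        # On disk but not yet in the index.
        "added": sorted(on_disk - indexed),
        # In both, but the stored mtime no longer matches.
        "modified": sorted(
            p for p in indexed & on_disk if disk_mtimes[p] != indexed_mtimes[p]
        ),
        # In the index but gone from disk.
        "removed": sorted(indexed - on_disk),
    }
```

For example, `plan_incremental_update(stored, scan_mtimes("."))` would yield the work list for one update pass, where `stored` is whatever mapping the index kept from the previous run.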

CLI runs that index files print newly indexed file paths to stdout.

Useful flags:

ragrep "query text" --path . --limit 10 --db-path ./.ragrep.db
ragrep "query text" --model-dir ~/.config/ragrep/models --json
ragrep "query text" --device auto
ragrep index . --force
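To call the CLI from a script, the flags shown above can be assembled programmatically. The sketch below uses only flags that appear in this README; `build_recall_command` and `run_recall` are hypothetical helpers. It assumes that stdout contains a single JSON document when `--json` is passed, which the README does not confirm (runs that auto-index may print additional lines), and it makes no assumption about the shape of that JSON.

```python
import json
import subprocess


def build_recall_command(query, path=None, limit=None, db_path=None, device=None):
    """Hypothetical helper: build an argv list for `ragrep "<query>" --json`."""
    cmd = ["ragrep", query, "--json"]
    if path is not None:
        cmd += ["--path", str(path)]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if db_path is not None:
        cmd += ["--db-path", str(db_path)]
    if device is not None:
        cmd += ["--device", device]
    return cmd


def run_recall(query, **options):
    """Run ragrep and parse stdout as JSON (assumption: stdout is pure JSON)."""
    proc = subprocess.run(
        build_recall_command(query, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)
```

Passing the query as a list element rather than through a shell string avoids quoting problems with spaces or special characters.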

Python Usage

from ragrep import RAGrep

rag = RAGrep(
    db_path="./.ragrep.db",
    embedding_model="mxbai-embed-large",
    embedding_device="auto",
)
rag.index(".")

result = rag.recall("database transactions", limit=5)
for match in result["matches"]:
    print(match["score"], match["metadata"]["source"])

print(rag.stats())
rag.close()

Library methods:

  • index(path=".", force=False)
  • recall(query, limit=20, path=".", auto_index=True)
  • stats()

Backwards-compatible aliases still available:

  • RAGSystem (alias of RAGrep)
  • dump(...) (returns the result list of recall(..., auto_index=False))

Local Database

RAGrep stores everything in one SQLite file (default ./.ragrep.db):

  • indexed files and mtimes
  • chunked source text
  • embedding vectors
  • index metadata (model, chunk settings, root path)
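To show how these four kinds of data can live in one SQLite file, here is a self-contained sketch. The schema, table names, float32 BLOB encoding, and brute-force cosine ranking are all assumptions made for illustration; RAGrep's actual schema and scoring may differ.

```python
import math
import sqlite3
from array import array

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path  TEXT PRIMARY KEY,
    mtime REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS chunks (
    id        INTEGER PRIMARY KEY,
    path      TEXT NOT NULL,
    text      TEXT NOT NULL,
    embedding BLOB NOT NULL
);
CREATE TABLE IF NOT EXISTS meta (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
"""


def pack(vector):
    """Encode a list of floats as a float32 BLOB."""
    return array("f", vector).tobytes()


def unpack(blob):
    """Decode a float32 BLOB back into a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return values.tolist()


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query_vector, limit=5):
    """Score every stored chunk against the query and return the best ones."""
    rows = conn.execute("SELECT path, text, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vector, unpack(blob)), path, text) for path, text, blob in rows]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:limit]
```

A linear scan like this is simple and adequate for small indexes; larger collections would usually want an approximate nearest-neighbor index instead.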

Development

pip install -e ".[dev]"
python -m unittest discover -s tests -p 'test_*.py'
python -m build
twine check dist/*

License

MIT
