RAGLens CLI for debugging retrieval behavior in RAG systems

RAGLens

RAGLens is a CLI to debug retrieval behavior in RAG systems.

MVP loop:

  1. explain: diagnose one bad query
  2. simulate: run many queries
  3. fix: get the first change to try

Scope: retrieval diagnostics only.
RAGLens does not do answer grading, hallucination detection, prompt evaluation, or agent tracing.

Quick Start

# from repo root
cargo run -- explain inputs/docs --query "refund after 90 days"
cargo run -- simulate inputs/docs --queries inputs/queries.txt
cargo run -- fix inputs/docs --queries inputs/queries.txt

Use a richer sample corpus:

cargo run -- explain inputs/examples/ecommerce/docs --query "refund after 90 days"
cargo run -- simulate inputs/examples/ecommerce/docs --queries inputs/examples/ecommerce/queries.txt
cargo run -- fix inputs/examples/ecommerce/docs --queries inputs/examples/ecommerce/queries.txt

Install locally:

cargo install --path .
raglens --help

Install with pip (no Rust toolchain required; prebuilt wheels are published for Linux and macOS):

pip install raglens-cli
raglens --help

Primary Commands

explain

Explain why the top documents or chunks ranked where they did for a single query.

raglens explain ./docs --query "refund after 90 days"

Outputs:

  • top-ranked chunks/docs
  • score breakdown (semantic + lexical components)
  • quick signal for why rank #1 won

Optional artifacts:

raglens explain ./docs --query "refund after 90 days" \
  --json-out artifacts/explain.json \
  --html-out artifacts/explain.html

simulate

Simulate retrieval over a query set.

raglens simulate ./docs --queries ./queries.txt

Outputs:

  • top-1 document frequency
  • low-similarity query count
  • no-match query count
  • dominant-document warning

fix

Rules-based diagnostic advisor.
It does not mutate files or auto-run agents.

raglens fix ./docs --queries ./queries.txt

Outputs:

  • detected issue
  • likely causes
  • first fix to try
  • rerun command

Example:

Issue: refund_policy.md dominates 48% of top-1 results

Likely causes:
- chunk size too large for mixed-topic content
- duplicate/repeated chunk language boosts one document

Try first: reduce chunk_size from 400 to 200
Then rerun: raglens simulate <docs> --queries queries.txt
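
As a rough illustration of what "rules-based" means here, the sketch below encodes one rule that reproduces the shape of the example report. The function name, the dominance threshold, and the "halve the chunk size" heuristic are assumptions made for this example, not RAGLens internals.

```python
def advise(top_doc, top1_share, chunk_size, dominance_threshold=0.4):
    """Return a report dict if one document dominates top-1 results, else None.

    The threshold and the halving heuristic are hypothetical.
    """
    if top1_share < dominance_threshold:
        return None
    return {
        "issue": f"{top_doc} dominates {top1_share:.0%} of top-1 results",
        "likely_causes": [
            "chunk size too large for mixed-topic content",
            "duplicate/repeated chunk language boosts one document",
        ],
        "try_first": f"reduce chunk_size from {chunk_size} to {chunk_size // 2}",
        "rerun": "raglens simulate <docs> --queries queries.txt",
    }
```

A rule like this only reads statistics and returns text, which is consistent with the advisor not mutating files.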

Inputs

Recommended MVP inputs:

  • docs: .md, .txt
  • queries: plain text, one query per line

Supported (advanced) query formats:

  • YAML with a queries: key
  • tab-separated: id<TAB>query<TAB>expect_doc1,expect_doc2
  • plain text query files can include blank lines and # comment lines (ignored)
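
The plain-text and tab-separated formats are simple enough to sketch as parsers. The function names and the returned dictionary keys are hypothetical, and two behaviors are assumptions rather than documented facts: that the third tab-separated column may be omitted, and that blank and # comment lines are also skipped in tab-separated files.

```python
def parse_plain(text):
    """One query per line; blank lines and # comment lines are ignored."""
    queries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        queries.append(stripped)
    return queries


def parse_tsv(text):
    """Parse id<TAB>query<TAB>expect_doc1,expect_doc2 rows."""
    rows = []
    for line in text.splitlines():
        # Assumption: blank and comment lines are skipped here as well.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            raise ValueError(f"expected at least id and query: {line!r}")
        # Assumption: the expected-docs column is optional.
        expected = []
        if len(parts) > 2:
            expected = [d.strip() for d in parts[2].split(",") if d.strip()]
        rows.append(
            {"id": parts[0].strip(), "query": parts[1].strip(), "expected": expected}
        )
    return rows
```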

Deterministic by Default

  • default embedder: local deterministic null embedder
  • deterministic chunking and ranking pipeline
  • consistent outputs for same corpus + queries + config
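
For intuition about what a local, deterministic embedder can look like, here is a minimal feature-hashing sketch. It is not RAGLens's null embedder, whose implementation is not described here; the function name, the 64-dimension default, and the hashing scheme are all assumptions. The one point it is meant to show is that a stable hash such as SHA-256 gives identical vectors on every run.

```python
import hashlib
import math
import re


def embed(text, dim=64):
    """Map text to a fixed-size vector using stable token hashing.

    hashlib is used instead of Python's built-in hash(), which is
    randomized per process for strings and so is not reproducible.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # which bucket
        sign = 1.0 if digest[4] % 2 == 0 else -1.0  # reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # Normalize to unit length; leave an all-zero vector unchanged.
    return [x / norm for x in vec] if norm else vec
```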

Artifacts

All commands support --json-out. explain also supports --html-out.

You can also use --artifacts-dir to write standard report files.

Real-World Use

Run on your own corpus:

raglens simulate ./docs --queries ./queries.txt --artifacts-dir ./artifacts
raglens fix ./docs --queries ./queries.txt

If you want a simple wrapper:

scripts/run-audit.sh ./docs ./queries.txt ./artifacts

Use real web docs as input (optional):

scripts/import-web-docs.sh ./inputs/public_urls.txt ./inputs/docs_web
cargo run -- simulate ./inputs/docs_web --queries ./inputs/queries.txt

Notes:

  • imported files are saved as plain .txt with a Source: header
  • imported pages that are mostly one long line are still split safely (sentence/token-based) during chunking
  • keep only pages you are allowed to store/use in your environment
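
The note about long single-line pages suggests a sentence-first split with a token-window fallback. The sketch below shows one way that could work; the function name, the whitespace tokenization, and the max_tokens default are assumptions, and the real chunker may differ.

```python
import re


def split_long_text(text, max_tokens=200):
    """Pack whole sentences into chunks of at most max_tokens tokens.

    A single sentence longer than max_tokens is cut into fixed-size
    token windows instead. Tokens are whitespace-separated words.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], []
    for sentence in sentences:
        tokens = sentence.split()
        if not tokens:
            continue
        if len(tokens) > max_tokens:
            # Flush what we have, then window the oversized sentence.
            if current:
                chunks.append(" ".join(current))
                current = []
            for i in range(0, len(tokens), max_tokens):
                chunks.append(" ".join(tokens[i:i + max_tokens]))
            continue
        if len(current) + len(tokens) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.extend(tokens)
    if current:
        chunks.append(" ".join(current))
    return chunks
```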

Advanced / Experimental

RAGLens includes additional advanced commands for deeper workflows (comparison, optimization, etc.). They are intentionally hidden from default help to keep the MVP interface focused.

Non-Goals

  • Full RAG framework
  • Answer quality evaluator
  • Hallucination detector
  • Autonomous tuning agent

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

raglens_cli-0.1.1-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.6 MB)
Python 3, manylinux: glibc 2.17+, x86-64

raglens_cli-0.1.1-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.2 MB)
Python 3, manylinux: glibc 2.17+, ARM64

raglens_cli-0.1.1-py3-none-macosx_11_0_arm64.whl (6.1 MB)
Python 3, macOS 11.0+, ARM64

raglens_cli-0.1.1-py3-none-macosx_10_12_x86_64.whl (6.4 MB)
Python 3, macOS 10.12+, x86-64

File hashes

raglens_cli-0.1.1-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: 5d5d79b4056b45bc53c43146daf0308c6d4ea29b3e38e9251c675e352fa1c9fc
  • MD5: 98bf4924eccad9b8a07bc20bf9a8e7b5
  • BLAKE2b-256: 41e1fc7f8533c9b4d044324307e5a0173c9c02c3bfe2390d535104c9747cd97d

raglens_cli-0.1.1-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

  • SHA256: 845de34f0132009e946aed6028043021ea222c37c9326f6b64ff2f03c7c7c943
  • MD5: 7804186e082ca1bb0e6fa4447036d62d
  • BLAKE2b-256: 3b1b6612a6372d695a53aa6c844c976f3b35f03af1a2b4bfcef70bdf43c07d91

raglens_cli-0.1.1-py3-none-macosx_11_0_arm64.whl

  • SHA256: da43656bc16add6fc05c5c1b1b3c7bb88f2784ce9022db341137e7c29413f0c2
  • MD5: 077beeed4cfb21d6b0f0c11e53ed4adf
  • BLAKE2b-256: bbb5b33522479dbe05346a0128629c250205ec70a804792832e07fb90d513b9e

raglens_cli-0.1.1-py3-none-macosx_10_12_x86_64.whl

  • SHA256: c00f46f66cc7abc5516c3fe127098f590a89ed922faa15f0154e658346393049
  • MD5: 9e66ade3dcf21b00e86cd9eb97bf6f9c
  • BLAKE2b-256: 8f42879372c93955e1bf047cf58e3081db9fdccd7eb0df608f3b2b0c47c683bd

Provenance

Attestation bundles were made for each of the four wheels above.

Publisher: release.yml on kraftaa/raglens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
