RAGLens CLI for debugging retrieval behavior in RAG systems
RAGLens
RAGLens is a CLI to debug retrieval behavior in RAG systems.
MVP loop:
- explain: one bad query
- simulate: many queries
- fix: suggests the first change to try
Scope: retrieval diagnostics only.
Not answer grading, hallucination detection, prompt eval, or agent tracing.
Quick Start
# from repo root
cargo run -- explain inputs/docs --query "refund after 90 days"
cargo run -- simulate inputs/docs --queries inputs/queries.txt
cargo run -- fix inputs/docs --queries inputs/queries.txt
Use a richer sample corpus:
cargo run -- explain inputs/examples/ecommerce/docs --query "refund after 90 days"
cargo run -- simulate inputs/examples/ecommerce/docs --queries inputs/examples/ecommerce/queries.txt
cargo run -- fix inputs/examples/ecommerce/docs --queries inputs/examples/ecommerce/queries.txt
Install locally:
cargo install --path .
raglens --help
Install with pip (no Rust toolchain required once wheels are published):
pip install raglens-cli
raglens --help
Primary Commands
explain
Explains why the top documents/chunks ranked where they did for a single query.
raglens explain ./docs --query "refund after 90 days"
Outputs:
- top-ranked chunks/docs
- score breakdown (semantic + lexical components)
- quick signal for why rank #1 won
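To make the score breakdown concrete, here is a minimal Python sketch of how a semantic and a lexical component could be combined into a single ranking score. The 0.7/0.3 weights, the query-term overlap measure, and the names `cosine`, `lexical_overlap`, and `score_breakdown` are illustrative assumptions, not RAGLens internals.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_overlap(query, text):
    """Fraction of distinct query terms that also appear in the chunk text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def score_breakdown(query, query_vec, chunk_text, chunk_vec, w_sem=0.7, w_lex=0.3):
    """Return both components and the weighted total (weights are assumed values)."""
    sem = cosine(query_vec, chunk_vec)
    lex = lexical_overlap(query, chunk_text)
    return {"semantic": sem, "lexical": lex, "total": w_sem * sem + w_lex * lex}
```

Reporting the two components next to the total shows whether a chunk ranked first because of embedding similarity, shared terms, or both.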
Optional artifacts:
raglens explain ./docs --query "refund after 90 days" \
--json-out artifacts/explain.json \
--html-out artifacts/explain.html
simulate
Simulate retrieval over a query set.
raglens simulate ./docs --queries ./queries.txt
Outputs:
- top-1 document frequency
- low-similarity query count
- no-match query count
- dominant-document warning
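The aggregates above can be sketched in a few lines of Python. This is a rough illustration only: the result record shape, the 0.3 low-similarity cutoff, and the 0.4 dominance cutoff are assumptions chosen for the example, and `summarize` is a hypothetical helper rather than part of RAGLens.

```python
from collections import Counter


def summarize(results, low_sim_threshold=0.3, dominance_threshold=0.4):
    """Aggregate per-query top-1 results.

    results: list of {"query": str, "top1": str or None, "score": float}
    (an assumed record shape; "top1" is None when nothing matched).
    """
    matched = [r for r in results if r["top1"] is not None]
    top1_counts = Counter(r["top1"] for r in matched)
    summary = {
        "top1_frequency": dict(top1_counts),
        "low_similarity": sum(1 for r in matched if r["score"] < low_sim_threshold),
        "no_match": len(results) - len(matched),
        "warning": None,
    }
    if top1_counts:
        doc, count = top1_counts.most_common(1)[0]
        share = count / len(results)
        # Flag a document that wins top-1 for too large a share of all queries.
        if share > dominance_threshold:
            summary["warning"] = f"{doc} dominates {share:.0%} of top-1 results"
    return summary
```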
fix
Rules-based diagnostic advisor.
It does not mutate files or auto-run agents.
raglens fix ./docs --queries ./queries.txt
Outputs:
- detected issue
- likely causes
- first fix to try
- rerun command
Example:
Issue: refund_policy.md dominates 48% of top-1 results
Likely causes:
- chunk size too large for mixed-topic content
- duplicate/repeated chunk language boosts one document
Try first: reduce chunk_size from 400 to 200
Then rerun: raglens simulate <docs> --queries queries.txt
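A rules-based advisor of this kind can be pictured as a function from measured signals to a fixed piece of advice. The sketch below is a guess at the shape of one such rule, not the actual RAGLens rule set: the `advise` function, the 0.4 dominance threshold, and the "halve the chunk size" heuristic are all assumptions made for illustration.

```python
def advise(top1_share, dominant_doc, chunk_size, dominance_threshold=0.4):
    """Return advice for a dominant-document issue, or None if the rule does not fire.

    Hypothetical rule: when one document wins more than `dominance_threshold`
    of top-1 results, suggest halving the chunk size as the first change.
    """
    if top1_share <= dominance_threshold:
        return None
    return {
        "issue": f"{dominant_doc} dominates {top1_share:.0%} of top-1 results",
        "likely_causes": [
            "chunk size too large for mixed-topic content",
            "duplicate/repeated chunk language boosts one document",
        ],
        "try_first": f"reduce chunk_size from {chunk_size} to {chunk_size // 2}",
        "rerun": "raglens simulate <docs> --queries queries.txt",
    }
```

The rule only returns text; like the fix command it describes, it changes nothing on disk.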
Inputs
Recommended MVP inputs:
- docs: .md, .txt
- queries: plain text, one query per line
Supported (advanced) query formats:
- YAML with a queries: key
- tab-separated: id<TAB>query<TAB>expect_doc1,expect_doc2
- plain text query files can include blank lines and # comment lines (ignored)
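As an illustration of the plain-text and tab-separated formats, the following Python sketch parses both, skipping blank lines and # comment lines. The `parse_queries` function and its output record shape are hypothetical, and handling both formats in one pass is a simplification for the example, not a statement about how RAGLens reads query files.

```python
def parse_queries(text):
    """Parse query lines into {"id", "query", "expected"} records.

    Plain lines become queries with no id; tab-separated lines are read as
    id<TAB>query<TAB>expect_doc1,expect_doc2 (third field optional).
    """
    queries = []
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines and comment lines are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split("\t")
        if len(parts) >= 2:
            qid, query = parts[0], parts[1]
            expected = parts[2].split(",") if len(parts) > 2 and parts[2] else []
        else:
            qid, query, expected = None, stripped, []
        queries.append({"id": qid, "query": query, "expected": expected})
    return queries
```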
Deterministic by Default
- default embedder: local deterministic null embedder
- deterministic chunking and ranking pipeline
- consistent outputs for same corpus + queries + config
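How the built-in null embedder works is not described here, but the general idea of a deterministic local embedder can be sketched with feature hashing. Everything in this example is an assumption: the `null_embed` name, the 8-dimensional output, and the use of SHA-256 to pick a bucket and a sign for each token. The point is only that the same text always maps to the same vector, with no model download or network call.

```python
import hashlib
import math


def null_embed(text, dim=8):
    """Map text to a unit-length vector using stable token hashing.

    hashlib is used instead of Python's built-in hash(), which is salted
    per process and would break run-to-run determinism.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % dim   # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0      # stable pseudo-random sign
        vec[idx] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec
```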
Artifacts
All commands support --json-out.
explain also supports --html-out.
You can also use --artifacts-dir to write standard report files.
Real-World Use
Run on your own corpus:
raglens simulate ./docs --queries ./queries.txt --artifacts-dir ./artifacts
raglens fix ./docs --queries ./queries.txt
If you want a simple wrapper:
scripts/run-audit.sh ./docs ./queries.txt ./artifacts
Use real web docs as input (optional):
scripts/import-web-docs.sh ./inputs/public_urls.txt ./inputs/docs_web
cargo run -- simulate ./inputs/docs_web --queries ./inputs/queries.txt
Notes:
- imported files are saved as plain .txt with a Source: header
- imported pages that are mostly one long line are still split safely (sentence/token-based) during chunking
- keep only pages you are allowed to store/use in your environment
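The sentence/token-based splitting mentioned in the notes can be approximated as follows. This is a sketch under assumptions, not the RAGLens chunker: `split_long_text` is a hypothetical helper, sentences are detected with a simple punctuation regex, tokens are whitespace-separated words, and the 50-token limit is arbitrary.

```python
import re


def split_long_text(text, max_tokens=50):
    """Split text into chunks of at most max_tokens whitespace tokens.

    Sentences are kept together where possible; a single sentence longer
    than max_tokens is cut on token boundaries as a fallback.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], []
    for sentence in sentences:
        tokens = sentence.split()
        # Fallback: hard-split a sentence that cannot fit in one chunk.
        while len(tokens) > max_tokens:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(tokens[:max_tokens]))
            tokens = tokens[max_tokens:]
        # Start a new chunk when this sentence would overflow the current one.
        if len(current) + len(tokens) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.extend(tokens)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because the fallback works on tokens rather than newlines, a page that arrives as one very long line still yields bounded chunks.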
Advanced / Experimental
RAGLens includes additional advanced commands for deeper workflows (comparison, optimization, etc.). They are intentionally hidden from default help to keep the MVP interface focused.
Experimental deterministic answer checker (CSV truth layer):
raglens answer-audit \
--data ./inputs/examples/answer_audit/sales.csv \
--group-by region,channel \
--metric revenue \
--period-col period \
--baseline old \
--current new \
--question "Why did revenue increase?" \
--answer "Revenue increased due to EU growth"
Quick start for an unknown dataset (schema inferred automatically):
raglens answer-audit \
--data ./my_data.csv \
--auto \
--answer "Revenue increased because EU grew"
--auto infers:
- metric column
- period column
- baseline/current period values
- group-by columns
Optional period bucketing:
raglens answer-audit \
--data ./my_data.csv \
--auto \
--period-granularity month \
--answer "Revenue increased because EU grew"
- --period-granularity raw|month|week (default raw)
- month/week require parseable date-like period values
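To show what period bucketing means in practice, here is a small Python sketch that maps a period value to a raw, month, or week bucket. It assumes ISO-formatted dates (YYYY-MM-DD) and ISO week numbering; the `bucket_period` function and the bucket label formats are illustrative choices, and the date formats RAGLens itself accepts may differ.

```python
from datetime import date


def bucket_period(value, granularity="raw"):
    """Return the bucket label for a period value.

    "raw" returns the value unchanged; "month" and "week" need a value that
    parses as an ISO date and raise ValueError otherwise.
    """
    if granularity == "raw":
        return value
    d = date.fromisoformat(value)
    if granularity == "month":
        return f"{d.year:04d}-{d.month:02d}"
    if granularity == "week":
        iso = d.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]:04d}-W{iso[1]:02d}"
    raise ValueError(f"unknown granularity: {granularity}")
```

Labels such as old and new only work with raw, which is why month and week require date-like values.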
More answer-audit examples:
# expected verdict: SUPPORTED
raglens answer-audit \
--data ./inputs/examples/answer_audit/sales_supported.csv \
--group-by region,channel \
--metric revenue \
--period-col period \
--baseline old \
--current new \
--answer "Revenue increased due to strong US Direct growth"
# expected verdict: RISKY (mentions weak contributor)
raglens answer-audit \
--data ./inputs/examples/answer_audit/sales_risky.csv \
--group-by region,channel \
--metric revenue \
--period-col period \
--baseline old \
--current new \
--answer "Revenue increased due to US Direct and LATAM growth"
Non-Goals
- Full RAG framework
- Answer quality evaluator
- Hallucination detector
- Autonomous tuning agent
License
MIT