Vector lineage tracking for RAG pipelines.
VecTrace
Wrong RAG answer? Trace it to the exact vector, chunk, and source document.
vectrace is a CLI for debugging a single wrong RAG answer: it traces the answer back to the exact document and chunk that caused it.
Tagline:
vectrace — trace where your RAG answers come from
Start Here
If you only try one thing, run this:
vectrace ask-trace \
--db ./vectrace.db \
--collection support_kb \
--question "Can I get a refund after 90 days?" \
--final-answer "Yes, refunds are allowed." \
--top-k 3 \
--output ./ask-trace.html \
--json-output ./ask-trace.json
This runs retrieval, links results to source data, and generates a trace report.
This gives you:
- ask-trace.html (shareable report)
- ask-trace.json (machine-readable payload)
What You Get
VecTrace links these layers in one place:
- Retrieval context: question, answer, rank, score, metadata
- Retrieval mode: exact (from your retriever telemetry) or bootstrap (best-effort lexical fallback)
- Vector provenance: vector ID, embedding model, run info
- Source evidence: chunk ID/index/snippet and source document path/version
Incident Example
Scenario:
- Question: Can I get a refund after 90 days?
- Answer: Yes, refunds are allowed.
- Policy evidence: Refunds are only allowed within 30 days for eligible defects.
Run:
vectrace ask-trace \
--db ./vectrace.db \
--collection support_kb \
--question "Can I get a refund after 90 days?" \
--final-answer "Yes, refunds are allowed." \
--top-k 3 \
--output ./incident-refund.html \
--json-output ./incident-refund.json
Check in output:
- evidence.support_status (expected: unsupported)
- evidence.support_reason (shows why)
- evidence.chunk_text + source_path for exact source proof
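The checks above can also be scripted. The following is a minimal sketch, assuming the JSON payload carries an `evidence` object with the fields named above; the exact nesting is an assumption, the sample payload is made up for illustration, and `summarize_evidence` is a hypothetical helper rather than part of vectrace.

```python
import json
from typing import Any


def summarize_evidence(payload: dict[str, Any]) -> dict[str, Any]:
    """Collect the evidence fields worth checking from a trace payload.

    Assumes an `evidence` object (assumption about the JSON layout);
    any missing key comes back as None instead of raising.
    """
    evidence = payload.get("evidence") or {}
    status = evidence.get("support_status")
    return {
        "status": status,
        "unsupported": status == "unsupported",
        "reason": evidence.get("support_reason"),
        "chunk_text": evidence.get("chunk_text"),
        "source_path": evidence.get("source_path"),
    }


# Hypothetical payload for illustration only. In practice, load the real file:
#   with open("incident-refund.json") as fh: payload = json.load(fh)
sample_payload = json.loads("""
{
  "evidence": {
    "support_status": "unsupported",
    "support_reason": "answer contradicts the retrieved policy text",
    "chunk_text": "Refunds are only allowed within 30 days for eligible defects.",
    "source_path": "docs/refund_policy.md"
  }
}
""")

summary = summarize_evidence(sample_payload)
if summary["unsupported"]:
    print(f"Unsupported answer: {summary['reason']}")
    print(f"Source: {summary['source_path']}")
```

Returning None for missing keys keeps the helper usable even if the real payload omits a field or nests it differently.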
Core Commands
- vectrace ask-trace ... for first-time debugging with no manual vector IDs.
- vectrace trace ... when you already have a vector ID.
- vectrace report ... when you want HTML from a known vector ID.
Quickstart
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install setuptools wheel
python3 -m pip install -e . --no-build-isolation
vectrace init --db ./vectrace.db
vectrace onboard --db ./vectrace.db --output ./trace-demo.html
vectrace ask-trace \
--db ./vectrace.db \
--collection support_kb \
--question "Can I get a refund after 90 days?" \
--top-k 3 \
--output ./ask-trace.html \
--json-output ./ask-trace.json
Advanced / Integration
Use these when integrating with your existing retriever/serving pipeline.
Record real retrieval telemetry (record-retrieval)
To attach exact retriever rank/score/vector IDs from your app:
vectrace record-retrieval \
--db ./vectrace.db \
--collection support_kb \
--vector-id vec_101 \
--query-text "Can I get a refund after 90 days?" \
--final-answer "Yes, refunds are allowed." \
--rank 1 \
--score 0.87 \
--evidence-text "Refunds are only allowed within 30 days for eligible defects." \
--metadata-json '{"request_id":"req-123","session_id":"s-1"}'
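If your application is written in Python, one way to log each retrieved hit is to assemble this command as an argument list and hand it to `subprocess`. The sketch below only uses the flags shown above; `build_record_retrieval_cmd` is a hypothetical helper, not a vectrace API, and it builds the command without running it.

```python
import json
import shlex
from typing import Any, Optional


def build_record_retrieval_cmd(
    db: str,
    collection: str,
    vector_id: str,
    query_text: str,
    rank: int,
    score: float,
    final_answer: Optional[str] = None,
    evidence_text: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> list[str]:
    """Build the argv for `vectrace record-retrieval` from one retriever hit."""
    cmd = [
        "vectrace", "record-retrieval",
        "--db", db,
        "--collection", collection,
        "--vector-id", vector_id,
        "--query-text", query_text,
        "--rank", str(rank),
        "--score", str(score),
    ]
    # Optional flags are appended only when a value is available.
    if final_answer is not None:
        cmd += ["--final-answer", final_answer]
    if evidence_text is not None:
        cmd += ["--evidence-text", evidence_text]
    if metadata is not None:
        cmd += ["--metadata-json", json.dumps(metadata)]
    return cmd


cmd = build_record_retrieval_cmd(
    db="./vectrace.db",
    collection="support_kb",
    vector_id="vec_101",
    query_text="Can I get a refund after 90 days?",
    rank=1,
    score=0.87,
    final_answer="Yes, refunds are allowed.",
    metadata={"request_id": "req-123", "session_id": "s-1"},
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems when the question or answer contains quotes or other shell metacharacters.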
Bootstrap from question+answer (record-qa)
If retriever telemetry was not logged, create best-effort retrieval events from stored chunk previews:
vectrace record-qa \
--db ./vectrace.db \
--collection support_kb \
--question "Can I get a refund after 90 days?" \
--final-answer "Yes, refunds are allowed." \
--top-k 3
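How vectrace scores chunk previews in bootstrap mode is not described here. As an illustration only of what a best-effort lexical fallback can look like, the sketch below ranks chunk previews by token overlap (Jaccard similarity) with the question. This is a stand-in technique, not vectrace's implementation; the chunk IDs and preview text are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_rank(
    question: str, chunks: dict[str, str], top_k: int = 3
) -> list[tuple[str, float]]:
    """Rank chunk previews by Jaccard overlap with the question tokens."""
    q = tokens(question)
    scored = []
    for chunk_id, preview in chunks.items():
        c = tokens(preview)
        union = q | c
        score = len(q & c) / len(union) if union else 0.0
        scored.append((chunk_id, score))
    # Highest score first; ties broken by chunk ID for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical chunk previews for illustration.
chunks = {
    "chunk_refund": "Refunds are only allowed within 30 days for eligible defects.",
    "chunk_shipping": "Standard shipping takes one week.",
}
ranked = lexical_rank("Can I get a refund after 90 days?", chunks, top_k=3)
for chunk_id, score in ranked:
    print(chunk_id, round(score, 3))
```

Note that this naive tokenizer does not match "refund" to "refunds", so the refund chunk wins only on the shared word "days". That weakness is the reason to prefer exact telemetry from record-retrieval whenever it is available.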
Query recorded events (trace-qa, report-qa)
vectrace trace-qa --db ./vectrace.db --question "Can I get a refund after 90 days?" --answer "Yes, refunds are allowed." --collection support_kb --format json
vectrace report-qa --db ./vectrace.db --question "Can I get a refund after 90 days?" --answer "Yes, refunds are allowed." --collection support_kb --output ./qa-trace.html --json-output ./qa-trace.json
Command Reference
Core
- vectrace ask-trace --db ./vectrace.db --collection <name> --question "<query>" [--final-answer "<answer>"] [--top-k 3] [--match-index 1] --output ask-trace.html [--json-output ask-trace.json] [--redact-preview] [--redact-retrieval] [--metadata-json '{"k":"v"}'] [--format text|json]
- vectrace trace --db ./vectrace.db --vector-id <id> [--collection <name>] [--format text|json] [--plain] [--redact-preview] [--redact-retrieval] [--include-retrieval]
- vectrace report --db ./vectrace.db --vector-id <id> [--collection <name>] --output trace.html [--redact-preview] [--redact-retrieval] [--include-retrieval]
Advanced
- vectrace init --db ./vectrace.db
- vectrace onboard --db ./vectrace.db --output trace-demo.html
- vectrace seed-demo --db ./vectrace.db --collection support_kb --vectors 200 --docs 20
- vectrace connect --qdrant-url http://localhost:6333 --qdrant-collection support_kb
- vectrace record-retrieval --db ./vectrace.db --collection <name> --vector-id <id> --query-text "<query>" [--final-answer "<answer>"] [--rank <n>] [--score <s>] [--evidence-text "<snippet>"] [--metadata-json '{"k":"v"}']
- vectrace record-qa --db ./vectrace.db --collection <name> --question "<query>" --final-answer "<answer>" [--top-k 3] [--metadata-json '{"k":"v"}']
- vectrace trace-qa --db ./vectrace.db --question "<query>" [--answer "<answer>"] [--collection <name>] [--top-k 3] [--format text|json] [--redact-preview] [--redact-retrieval]
- vectrace report-qa --db ./vectrace.db --question "<query>" [--answer "<answer>"] [--collection <name>] [--top-k 5] [--match-index 1] --output qa-trace.html [--json-output qa-trace.json] [--redact-preview] [--redact-retrieval]
Output Shape (JSON)
trace --format json --include-retrieval and trace-qa --format json include:
- retrieval: question, answer, rank, score, metadata
- retrieval.trace_mode: exact or bootstrap
- trace / lineage: vector/chunk/document chain
- evidence: explicit snippet fields (chunk_text, source_path, etc.)
Privacy note:
- --redact-retrieval redacts retrieval question/answer fields (including common metadata duplicates) and hides assessment text/details that could echo the original query.
ask-trace --format json emits selected match + query_id in one payload.
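When consuming this JSON downstream, it helps to surface the trace mode next to the rank and score, since bootstrap results are best-effort rather than retriever telemetry. The following is a minimal sketch, assuming `rank`, `score`, and `trace_mode` sit directly under a top-level `retrieval` object; that nesting is an assumption, and `retrieval_summary` is a hypothetical helper.

```python
from typing import Any


def retrieval_summary(payload: dict[str, Any]) -> str:
    """Format a one-line summary of the retrieval block in a trace payload.

    Assumes `retrieval` holds rank, score and trace_mode (layout assumption).
    """
    retrieval = payload.get("retrieval") or {}
    mode = retrieval.get("trace_mode", "unknown")
    line = f"rank={retrieval.get('rank')} score={retrieval.get('score')} mode={mode}"
    if mode == "bootstrap":
        # Flag best-effort matches so they are not mistaken for telemetry.
        line += " (best-effort fallback, not retriever telemetry)"
    return line


# Hypothetical payloads for illustration.
exact_payload = {"retrieval": {"rank": 1, "score": 0.87, "trace_mode": "exact"}}
bootstrap_payload = {"retrieval": {"rank": 2, "score": 0.31, "trace_mode": "bootstrap"}}

print(retrieval_summary(exact_payload))
print(retrieval_summary(bootstrap_payload))
```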
Architecture
flowchart LR
A["RAG Ingest Pipeline"] --> B["VecTrace Tracker"]
B --> C["SQLite Trace DB"]
A --> D["Qdrant (vector values)"]
E["vectrace ask-trace/trace/report/trace-qa/report-qa"] --> C
E --> F["JSON/Text Output"]
E --> G["HTML Report"]
Qdrant Integration
Install optional dependency:
python3 -m pip install qdrant-client
Use connectors.qdrant.TrackedQdrant to upsert vectors and record trace metadata in one step.
Demo Assets
- Report screenshot: docs/assets/report-screenshot.svg
- JSON sample: docs/examples/ask-trace-sample.json
- Terminal tape: demo/vectrace-demo.tape
- Terminal GIF (generate):
brew install vhs
./scripts/make_terminal_demo_gif.sh
Release / CI
- CI: .github/workflows/ci.yml
- Release publish: .github/workflows/release.yml
- PyPI checklist: docs/PYPI_RELEASE.md
- Build script: scripts/build_dist.sh
Development Tests
python3 -m unittest discover -s tests -v
.venv/bin/python -m unittest discover -s tests -v