Vector lineage tracking for RAG pipelines.

VecTrace

Wrong RAG answer? Trace it to the exact vector, chunk, and source document.

vectrace is a CLI for debugging a single wrong RAG answer. It traces the answer back to the exact source document and chunk that produced it.

Tagline:

  • vectrace — trace where your RAG answers come from

Start Here

If you only try one thing, run this:

vectrace ask-trace \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --top-k 3 \
  --output ./ask-trace.html \
  --json-output ./ask-trace.json

This runs retrieval, links the results to their source data, and writes a trace report in two formats:

  • ask-trace.html (shareable report)
  • ask-trace.json (machine-readable payload)
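
If you call ask-trace from a script rather than a shell, building the argument list in Python avoids quoting problems with free-text questions and answers. The following is a minimal sketch, not part of VecTrace: the helper names build_ask_trace_cmd and run_ask_trace are hypothetical, and only the flags shown above are used.

```python
import subprocess
from typing import List, Optional


def build_ask_trace_cmd(
    db: str,
    collection: str,
    question: str,
    output: str,
    final_answer: Optional[str] = None,
    top_k: int = 3,
    json_output: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `vectrace ask-trace` (hypothetical helper)."""
    cmd = [
        "vectrace", "ask-trace",
        "--db", db,
        "--collection", collection,
        "--question", question,
        "--top-k", str(top_k),
        "--output", output,
    ]
    # Optional flags are appended only when a value is supplied.
    if final_answer is not None:
        cmd += ["--final-answer", final_answer]
    if json_output is not None:
        cmd += ["--json-output", json_output]
    return cmd


def run_ask_trace(**kwargs) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_ask_trace_cmd(**kwargs), check=True)
```

Passing a list to subprocess.run (instead of a single shell string) means a question containing quotes or question marks reaches the CLI unchanged.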

Output Preview

[Image: Ask Trace Report]

What You Get

VecTrace links these layers in one place:

  • Retrieval context: question, answer, rank, score, metadata
  • Retrieval mode: exact (from your retriever telemetry) or bootstrap (best-effort lexical fallback)
  • Vector provenance: vector ID, embedding model, run info
  • Source evidence: chunk ID/index/snippet and source document path/version

Incident Example

Scenario:

  • Question: Can I get a refund after 90 days?
  • Answer: Yes, refunds are allowed.
  • Policy evidence: Refunds are only allowed within 30 days for eligible defects.

Run:

vectrace ask-trace \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --top-k 3 \
  --output ./incident-refund.html \
  --json-output ./incident-refund.json

Check in output:

  • evidence.support_status (expected: unsupported)
  • evidence.support_reason (shows why)
  • evidence.chunk_text + source_path for exact source proof
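
These checks can be automated against the JSON file. The sketch below assumes the fields listed above sit under a top-level "evidence" object in the payload; that nesting, the helper names, and the sample values are assumptions for illustration, so inspect your own incident-refund.json before relying on it.

```python
import json
from typing import Any, Dict


def load_payload(path: str) -> Dict[str, Any]:
    """Read a JSON payload written by --json-output."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_evidence(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the evidence fields out of a trace payload.

    Assumption: the fields live under a top-level "evidence" object.
    Missing keys come back as None instead of raising.
    """
    evidence = payload.get("evidence") or {}
    status = evidence.get("support_status")
    return {
        "support_status": status,
        "support_reason": evidence.get("support_reason"),
        "chunk_text": evidence.get("chunk_text"),
        "source_path": evidence.get("source_path"),
        # Flag the answer for review when the evidence does not support it.
        "needs_review": status == "unsupported",
    }


# Hand-written sample in the assumed shape; not real VecTrace output.
sample = {
    "evidence": {
        "support_status": "unsupported",
        "support_reason": "illustrative reason text",
        "chunk_text": "Refunds are only allowed within 30 days for eligible defects.",
        "source_path": "kb/refund_policy.md",
    }
}
summary = summarize_evidence(sample)
```

For a real report, call summarize_evidence(load_payload("./incident-refund.json")) and fail your check when needs_review is true.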

Core Commands

  • vectrace ask-trace ... for first-time debugging with no manual vector IDs.
  • vectrace trace ... when you already have a vector ID.
  • vectrace report ... when you want HTML from a known vector ID.

Quickstart

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install setuptools wheel
python3 -m pip install -e . --no-build-isolation

vectrace init --db ./vectrace.db
vectrace onboard --db ./vectrace.db --output ./trace-demo.html
vectrace ask-trace \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --top-k 3 \
  --output ./ask-trace.html \
  --json-output ./ask-trace.json

Advanced / Integration

Use these when integrating with your existing retriever/serving pipeline.

Record real retrieval telemetry (record-retrieval)

To attach exact retriever rank/score/vector IDs from your app:

vectrace record-retrieval \
  --db ./vectrace.db \
  --collection support_kb \
  --vector-id vec_101 \
  --query-text "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --rank 1 \
  --score 0.87 \
  --evidence-text "Refunds are only allowed within 30 days for eligible defects." \
  --metadata-json '{"request_id":"req-123","session_id":"s-1"}'
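
In an application you would typically record one event per retrieved hit. The sketch below builds one record-retrieval command per hit using only the flags shown above. The hit dictionary shape (vector_id, score, optional text) and the helper name are hypothetical, and it assumes hits arrive best-first so that rank starts at 1.

```python
import json
from typing import Any, Dict, List, Optional


def record_retrieval_cmds(
    db: str,
    collection: str,
    query_text: str,
    hits: List[Dict[str, Any]],
    final_answer: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> List[List[str]]:
    """Build one `vectrace record-retrieval` argv list per hit."""
    cmds = []
    # Assumption: hits are ordered best-first, so rank is the 1-based position.
    for rank, hit in enumerate(hits, start=1):
        cmd = [
            "vectrace", "record-retrieval",
            "--db", db,
            "--collection", collection,
            "--vector-id", str(hit["vector_id"]),
            "--query-text", query_text,
            "--rank", str(rank),
            "--score", str(hit["score"]),
        ]
        if final_answer is not None:
            cmd += ["--final-answer", final_answer]
        if hit.get("text"):
            cmd += ["--evidence-text", hit["text"]]
        if metadata:
            # Serialize once per command so the CLI receives valid JSON.
            cmd += ["--metadata-json", json.dumps(metadata)]
        cmds.append(cmd)
    return cmds
```

Each returned list can be passed to subprocess.run(cmd, check=True) after your retriever responds.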

Bootstrap from question+answer (record-qa)

If retriever telemetry was not logged, create best-effort retrieval events from stored chunk previews:

vectrace record-qa \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --top-k 3
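
To give a feel for what a best-effort lexical fallback means, here is a small illustration that ranks chunk previews by token overlap (Jaccard similarity) with the question. This is not VecTrace's actual matching logic, which is not documented here; the function names and the sample previews are assumptions.

```python
import re
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_by_overlap(
    question: str, previews: Dict[str, str], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank chunk previews by Jaccard overlap with the question tokens."""
    q_tokens = tokenize(question)
    scored = []
    for chunk_id, preview in previews.items():
        p_tokens = tokenize(preview)
        union = q_tokens | p_tokens
        score = len(q_tokens & p_tokens) / len(union) if union else 0.0
        scored.append((chunk_id, score))
    # Highest score first; ties are broken by chunk ID for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Illustrative previews only; real previews come from your trace DB.
previews = {
    "c1": "Refunds are only allowed within 30 days for eligible defects.",
    "c2": "Shipping usually takes one week.",
    "c3": "Contact support to reset your password.",
}
top = rank_by_overlap("Can I get a refund after 90 days?", previews, top_k=2)
```

Note that "refund" and "refunds" do not match under plain token overlap, which is one reason bootstrap results are weaker than exact retriever telemetry.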

Query recorded events (trace-qa, report-qa)

vectrace trace-qa --db ./vectrace.db --question "Can I get a refund after 90 days?" --answer "Yes, refunds are allowed." --collection support_kb --format json
vectrace report-qa --db ./vectrace.db --question "Can I get a refund after 90 days?" --answer "Yes, refunds are allowed." --collection support_kb --output ./qa-trace.html --json-output ./qa-trace.json

Command Reference

Core

  • vectrace ask-trace --db ./vectrace.db --collection <name> --question "<query>" [--final-answer "<answer>"] [--top-k 3] [--match-index 1] --output ask-trace.html [--json-output ask-trace.json] [--redact-preview] [--redact-retrieval] [--metadata-json '{"k":"v"}'] [--format text|json]
  • vectrace trace --db ./vectrace.db --vector-id <id> [--collection <name>] [--format text|json] [--plain] [--redact-preview] [--redact-retrieval] [--include-retrieval]
  • vectrace report --db ./vectrace.db --vector-id <id> [--collection <name>] --output trace.html [--redact-preview] [--redact-retrieval] [--include-retrieval]

Advanced

  • vectrace init --db ./vectrace.db
  • vectrace onboard --db ./vectrace.db --output trace-demo.html
  • vectrace seed-demo --db ./vectrace.db --collection support_kb --vectors 200 --docs 20
  • vectrace connect --qdrant-url http://localhost:6333 --qdrant-collection support_kb
  • vectrace record-retrieval --db ./vectrace.db --collection <name> --vector-id <id> --query-text "<query>" [--final-answer "<answer>"] [--rank <n>] [--score <s>] [--evidence-text "<snippet>"] [--metadata-json '{"k":"v"}']
  • vectrace record-qa --db ./vectrace.db --collection <name> --question "<query>" --final-answer "<answer>" [--top-k 3] [--metadata-json '{"k":"v"}']
  • vectrace trace-qa --db ./vectrace.db --question "<query>" [--answer "<answer>"] [--collection <name>] [--top-k 3] [--format text|json] [--redact-preview] [--redact-retrieval]
  • vectrace report-qa --db ./vectrace.db --question "<query>" [--answer "<answer>"] [--collection <name>] [--top-k 5] [--match-index 1] --output qa-trace.html [--json-output qa-trace.json] [--redact-preview] [--redact-retrieval]

Output Shape (JSON)

trace --format json --include-retrieval and trace-qa --format json include:

  • retrieval: question, answer, rank, score, metadata
  • retrieval.trace_mode: exact or bootstrap
  • trace/lineage: vector/chunk/document chain
  • evidence: explicit snippet fields (chunk_text, source_path, etc.)
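
When reviewing many traces, it helps to separate exact events from bootstrap ones, since only the former reflect what your retriever actually returned. The sketch below assumes each record is a JSON object with a "retrieval" object that carries trace_mode; the exact nesting and the helper name are assumptions to verify against your own output.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_trace_mode(
    records: Iterable[Dict[str, Any]]
) -> Dict[str, List[Dict[str, Any]]]:
    """Group trace records by retrieval.trace_mode.

    Records without a trace_mode are collected under "unknown".
    """
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for record in records:
        retrieval = record.get("retrieval") or {}
        groups[retrieval.get("trace_mode", "unknown")].append(record)
    return dict(groups)


# Hand-written records in the assumed shape; not real VecTrace output.
records = [
    {"retrieval": {"trace_mode": "exact", "rank": 1, "score": 0.87}},
    {"retrieval": {"trace_mode": "bootstrap", "rank": 1}},
    {"retrieval": {"rank": 2}},
]
groups = group_by_trace_mode(records)
```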

Privacy note:

  • --redact-retrieval redacts retrieval question/answer fields (including common metadata duplicates) and hides assessment text/details that could echo the original query.

ask-trace --format json emits the selected match and the query_id in a single payload.

Architecture

flowchart LR
    A["RAG Ingest Pipeline"] --> B["VecTrace Tracker"]
    B --> C["SQLite Trace DB"]
    A --> D["Qdrant (vector values)"]
    E["vectrace ask-trace/trace/report/trace-qa/report-qa"] --> C
    E --> F["JSON/Text Output"]
    E --> G["HTML Report"]
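
To make the vector, chunk, and document chain concrete, the sketch below resolves a vector ID to its source document with a single SQLite join. The table and column names are hypothetical and are not VecTrace's real schema; they only mirror the layers described in this README.

```python
import sqlite3
from typing import Any, Dict, Optional

# In-memory database with a hypothetical three-table lineage schema.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE documents (doc_id TEXT PRIMARY KEY, source_path TEXT, version TEXT);
    CREATE TABLE chunks (
        chunk_id TEXT PRIMARY KEY,
        doc_id TEXT REFERENCES documents(doc_id),
        chunk_index INTEGER,
        chunk_text TEXT
    );
    CREATE TABLE vectors (
        vector_id TEXT PRIMARY KEY,
        chunk_id TEXT REFERENCES chunks(chunk_id),
        embedding_model TEXT
    );
    """
)
conn.execute("INSERT INTO documents VALUES (?, ?, ?)", ("doc_1", "kb/refund_policy.md", "v3"))
conn.execute(
    "INSERT INTO chunks VALUES (?, ?, ?, ?)",
    ("chunk_7", "doc_1", 0, "Refunds are only allowed within 30 days for eligible defects."),
)
conn.execute("INSERT INTO vectors VALUES (?, ?, ?)", ("vec_101", "chunk_7", "example-embedding-model"))


def lineage(db: sqlite3.Connection, vector_id: str) -> Optional[Dict[str, Any]]:
    """Follow vector -> chunk -> document and return one flat record."""
    row = db.execute(
        """
        SELECT v.vector_id, v.embedding_model,
               c.chunk_id, c.chunk_index, c.chunk_text,
               d.source_path, d.version
        FROM vectors v
        JOIN chunks c ON c.chunk_id = v.chunk_id
        JOIN documents d ON d.doc_id = c.doc_id
        WHERE v.vector_id = ?
        """,
        (vector_id,),
    ).fetchone()
    return dict(row) if row is not None else None
```

The point of the design is that vector values stay in Qdrant while the trace DB only needs IDs and provenance, so a lookup like this stays cheap.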

Qdrant Integration

Install optional dependency:

python3 -m pip install qdrant-client

Use connectors.qdrant.TrackedQdrant to upsert vectors and record trace metadata in one step.

Demo Assets

  • Report screenshot: docs/assets/report-screenshot.svg
  • JSON sample: docs/examples/ask-trace-sample.json
  • Terminal tape: demo/vectrace-demo.tape
  • Terminal GIF (generate):
brew install vhs
./scripts/make_terminal_demo_gif.sh

Release / CI

  • CI: .github/workflows/ci.yml
  • Release publish: .github/workflows/release.yml
  • PyPI checklist: docs/PYPI_RELEASE.md
  • Build script: scripts/build_dist.sh

Development Tests

# With the virtual environment activated:
python3 -m unittest discover -s tests -v
# Or call the environment's interpreter directly:
.venv/bin/python -m unittest discover -s tests -v
