Vector lineage tracking for RAG pipelines.

These details have not been verified by PyPI

Project description

VecTrace

Wrong RAG answer? Trace it to the exact vector, chunk, and source document.

vectrace is a CLI for debugging a single wrong RAG answer: it traces the answer back to the exact document and chunk that produced it.

Tagline:

vectrace — trace where your RAG answers come from

Start Here

If you only try one thing, run this:

vectrace ask-trace \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --top-k 3 \
  --output ./ask-trace.html \
  --json-output ./ask-trace.json

This runs retrieval, links the results to their source data, and writes two files:

ask-trace.html (shareable report)
ask-trace.json (machine-readable payload)
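
If you want to launch the same command from a script or test harness, a minimal Python sketch could look like the following. It only assembles the flags shown above; the helper names `build_ask_trace_cmd` and `run_ask_trace` are hypothetical and not part of VecTrace, and running the command assumes the `vectrace` CLI is on your PATH.

```python
import shlex
import subprocess
from typing import List, Optional


def build_ask_trace_cmd(
    db: str,
    collection: str,
    question: str,
    output: str,
    final_answer: Optional[str] = None,
    top_k: int = 3,
    json_output: Optional[str] = None,
) -> List[str]:
    """Assemble the argv list for `vectrace ask-trace` (hypothetical helper)."""
    cmd = [
        "vectrace", "ask-trace",
        "--db", db,
        "--collection", collection,
        "--question", question,
        "--top-k", str(top_k),
        "--output", output,
    ]
    # Optional flags are appended only when a value was supplied.
    if final_answer is not None:
        cmd += ["--final-answer", final_answer]
    if json_output is not None:
        cmd += ["--json-output", json_output]
    return cmd


def run_ask_trace(cmd: List[str]) -> int:
    """Run the command and return its exit code (needs vectrace installed)."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    example = build_ask_trace_cmd(
        db="./vectrace.db",
        collection="support_kb",
        question="Can I get a refund after 90 days?",
        output="./ask-trace.html",
        final_answer="Yes, refunds are allowed.",
        json_output="./ask-trace.json",
    )
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(example))
```

Passing an argv list rather than a shell string avoids quoting problems when the question contains quotes or question marks.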

Output Preview

[Screenshot: Ask Trace Report]

What You Get

VecTrace links these layers in one place:

Retrieval context: question, answer, rank, score, metadata
Retrieval mode: exact (from your retriever telemetry) or bootstrap (best-effort lexical fallback)
Vector provenance: vector ID, embedding model, run info
Source evidence: chunk ID/index/snippet and source document path/version

Incident Example

Scenario:

Question: Can I get a refund after 90 days?
Answer: Yes, refunds are allowed.
Policy evidence: Refunds are only allowed within 30 days for eligible defects.

Run:

vectrace ask-trace \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --top-k 3 \
  --output ./incident-refund.html \
  --json-output ./incident-refund.json

Check in output:

evidence.support_status (expected: unsupported)
evidence.support_reason (explains why the answer is unsupported)
evidence.chunk_text and evidence.source_path (the exact source text and where it came from)
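
The checks above can also be scripted against the JSON payload. The sketch below is an assumption-heavy illustration: the field names come from this README, but the exact nesting of `evidence` inside the payload is not specified here, so the code searches for the first `evidence` object instead of assuming a path. The function names and the sample payload are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def find_first(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_first(child, key)
        if found is not None:
            return found
    return None


def summarize_evidence(payload: Any) -> Dict[str, Any]:
    """Pull the evidence fields named in this README; missing ones become None."""
    evidence = find_first(payload, "evidence")
    if not isinstance(evidence, dict):
        evidence = {}
    fields = ("support_status", "support_reason", "chunk_text", "source_path")
    return {name: evidence.get(name) for name in fields}


def summarize_file(path: str) -> Dict[str, Any]:
    """Load a JSON report written via --json-output and summarize it."""
    with open(path, "r", encoding="utf-8") as fh:
        return summarize_evidence(json.load(fh))


if __name__ == "__main__":
    # Hypothetical payload shape, used only to exercise the helper.
    sample = {
        "match": {
            "evidence": {
                "support_status": "unsupported",
                "chunk_text": "Refunds are only allowed within 30 days.",
            }
        }
    }
    print(summarize_evidence(sample))
```

In a CI check you could call `summarize_file("./incident-refund.json")` and fail the job when `support_status` is `unsupported`.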

Core Commands

vectrace ask-trace ... for first-time debugging with no manual vector IDs.
vectrace trace ... when you already have a vector ID.
vectrace report ... when you want HTML from a known vector ID.

Quickstart

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install setuptools wheel
python3 -m pip install -e . --no-build-isolation

vectrace init --db ./vectrace.db
vectrace onboard --db ./vectrace.db --output ./trace-demo.html
vectrace ask-trace \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --top-k 3 \
  --output ./ask-trace.html \
  --json-output ./ask-trace.json

Advanced / Integration

Use these when integrating with your existing retriever/serving pipeline.

Auto-record from a Qdrant search (`TrackedQdrant.search_with_tracking`)

For real RAG requests you can skip the CLI: wrap your Qdrant client, and each search call records its own retrieval events.

from connectors.qdrant import TrackedQdrant

# embed() below is a placeholder for your own embedding function.
with TrackedQdrant(qdrant_url="http://localhost:6333", db_path="./vectrace.db") as rag:
    hits, query_id = rag.search_with_tracking(
        collection_name="support_kb",
        query_text="Can I get a refund after 90 days?",
        query_vector=embed("Can I get a refund after 90 days?"),
        limit=3,
        final_answer="Yes, refunds are allowed.",  # optional; can attach later
    )
    # ... pass `hits` to your LLM as usual

# Then debug the answer the same way you would with the CLI:
#   vectrace report-qa --db ./vectrace.db --collection support_kb \
#     --question "Can I get a refund after 90 days?" --output ./trace.html

Lineage writes are best-effort: if SQLite fails, the search still returns and the failure is logged to stderr. The returned query_id lets you correlate later commands or attach a final answer to the same retrieval event group.
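
To make the best-effort behavior concrete, here is a minimal sketch of the general pattern using only the standard library. It is not VecTrace's implementation: the `retrieval_events` table, its columns, and the function name are hypothetical, chosen only to show a search result surviving a failed SQLite write.

```python
import sqlite3
import sys
import uuid
from typing import Callable, List, Tuple


def search_with_best_effort_log(
    search: Callable[[str], List[str]],
    query_text: str,
    conn: sqlite3.Connection,
) -> Tuple[List[str], str]:
    """Run `search`, then try to log the hits; logging errors never propagate."""
    query_id = uuid.uuid4().hex
    hits = search(query_text)  # errors from the search itself still propagate
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.executemany(
                "INSERT INTO retrieval_events (query_id, hit_rank, hit) "
                "VALUES (?, ?, ?)",
                [(query_id, rank, hit) for rank, hit in enumerate(hits, start=1)],
            )
    except sqlite3.Error as exc:
        # Best-effort: report the failure and carry on with the results.
        print(f"lineage write failed: {exc}", file=sys.stderr)
    return hits, query_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # The table does not exist yet, so the write fails but the hits come back.
    hits, query_id = search_with_best_effort_log(
        lambda q: ["chunk-a", "chunk-b"], "refund policy", conn
    )
    print(hits, query_id)
```

The design choice is that tracing must never take down a production request; the cost is that a silent write failure leaves gaps in the trace, which is why the failure goes to stderr.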

Record real retrieval telemetry (`record-retrieval`)

To attach exact retriever rank/score/vector IDs from your app:

vectrace record-retrieval \
  --db ./vectrace.db \
  --collection support_kb \
  --vector-id vec_101 \
  --query-text "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --rank 1 \
  --score 0.87 \
  --evidence-text "Refunds are only allowed within 30 days for eligible defects." \
  --metadata-json '{"request_id":"req-123","session_id":"s-1"}'
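
When your app returns several hits per request, you would typically issue one `record-retrieval` call per hit. The sketch below builds those commands from a list of hits. It assumes hits are already sorted best-first and that ranks start at 1, as in the example above; the `Hit` type and `record_retrieval_cmds` helper are hypothetical names, not part of VecTrace.

```python
import json
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    """One retriever result (hypothetical shape for this sketch)."""
    vector_id: str
    score: float
    text: str


def record_retrieval_cmds(
    db: str,
    collection: str,
    query_text: str,
    hits: List[Hit],
    final_answer: Optional[str] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> List[List[str]]:
    """Build one `vectrace record-retrieval` argv list per hit."""
    cmds = []
    for rank, hit in enumerate(hits, start=1):
        cmd = [
            "vectrace", "record-retrieval",
            "--db", db,
            "--collection", collection,
            "--vector-id", hit.vector_id,
            "--query-text", query_text,
            "--rank", str(rank),
            "--score", str(hit.score),
            "--evidence-text", hit.text,
        ]
        if final_answer is not None:
            cmd += ["--final-answer", final_answer]
        if metadata:
            # Compact JSON, matching the style of the CLI example.
            cmd += ["--metadata-json", json.dumps(metadata, separators=(",", ":"))]
        cmds.append(cmd)
    return cmds


if __name__ == "__main__":
    demo_hits = [
        Hit("vec_101", 0.87, "Refunds are only allowed within 30 days for eligible defects."),
    ]
    for cmd in record_retrieval_cmds(
        "./vectrace.db", "support_kb", "Can I get a refund after 90 days?",
        demo_hits, metadata={"request_id": "req-123"},
    ):
        print(cmd)
```

Each argv list can then be passed to `subprocess.run`.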

Bootstrap from question+answer (`record-qa`)

If retriever telemetry was not logged, create best-effort retrieval events from stored chunk previews:

vectrace record-qa \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --final-answer "Yes, refunds are allowed." \
  --top-k 3
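
To give a feel for what a best-effort lexical fallback can look like, here is a small token-overlap ranker. This is only an illustration: the scoring VecTrace actually uses for `record-qa` is not documented here, and the Jaccard measure, function names, and sample chunks below are all assumptions.

```python
import re
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_chunks(
    question: str, previews: Dict[str, str], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank chunk previews by Jaccard overlap with the question tokens."""
    q_tokens = tokenize(question)
    scored = []
    for chunk_id, preview in previews.items():
        tokens = tokenize(preview)
        union = q_tokens | tokens
        score = len(q_tokens & tokens) / len(union) if union else 0.0
        scored.append((chunk_id, score))
    # Highest score first; ties are broken by chunk ID for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


if __name__ == "__main__":
    chunks = {
        "c1": "Refunds are only allowed within 30 days for eligible defects.",
        "c2": "Shipping is free for orders over 50 dollars.",
        "c3": "You can get a refund within 30 days.",
    }
    for chunk_id, score in rank_chunks("Can I get a refund after 90 days?", chunks):
        print(chunk_id, round(score, 3))
```

Lexical overlap has no notion of meaning ("refunds" and "refund" do not even match here), which is why events produced this way should be read as approximate rather than as what the retriever actually returned.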

Query recorded events (`trace-qa`, `report-qa`)

vectrace trace-qa \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --answer "Yes, refunds are allowed." \
  --format json

vectrace report-qa \
  --db ./vectrace.db \
  --collection support_kb \
  --question "Can I get a refund after 90 days?" \
  --answer "Yes, refunds are allowed." \
  --output ./qa-trace.html \
  --json-output ./qa-trace.json

Command Reference

Core

vectrace ask-trace --db ./vectrace.db --collection <name> --question "<query>" [--final-answer "<answer>"] [--top-k 3] [--match-index 1] --output ask-trace.html [--json-output ask-trace.json] [--redact-preview] [--redact-retrieval] [--metadata-json '{"k":"v"}'] [--format text|json]
vectrace trace --db ./vectrace.db --vector-id <id> [--collection <name>] [--format text|json] [--plain] [--redact-preview] [--redact-retrieval] [--include-retrieval]
vectrace report --db ./vectrace.db --vector-id <id> [--collection <name>] --output trace.html [--redact-preview] [--redact-retrieval] [--include-retrieval]

Advanced

vectrace init --db ./vectrace.db
vectrace onboard --db ./vectrace.db --output trace-demo.html
vectrace seed-demo --db ./vectrace.db --collection support_kb --vectors 200 --docs 20
vectrace connect --qdrant-url http://localhost:6333 --qdrant-collection support_kb
vectrace record-retrieval --db ./vectrace.db --collection <name> --vector-id <id> --query-text "<query>" [--final-answer "<answer>"] [--rank <n>] [--score <s>] [--evidence-text "<snippet>"] [--metadata-json '{"k":"v"}']
vectrace record-qa --db ./vectrace.db --collection <name> --question "<query>" --final-answer "<answer>" [--top-k 3] [--metadata-json '{"k":"v"}']
vectrace trace-qa --db ./vectrace.db --question "<query>" [--answer "<answer>"] [--collection <name>] [--top-k 3] [--format text|json] [--redact-preview] [--redact-retrieval]
vectrace report-qa --db ./vectrace.db --question "<query>" [--answer "<answer>"] [--collection <name>] [--top-k 5] [--match-index 1] --output qa-trace.html [--json-output qa-trace.json] [--redact-preview] [--redact-retrieval]

Output Shape (JSON)

The JSON output of trace --format json --include-retrieval and of trace-qa --format json includes:

retrieval: question, answer, rank, score, metadata
retrieval.trace_mode: exact or bootstrap
trace/lineage: vector/chunk/document chain
evidence: explicit snippet fields (chunk_text, source_path, etc.)
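
Because `retrieval.trace_mode` tells you how much to trust the recorded rank and score, it can be worth surfacing it explicitly when you post-process the JSON. The sketch below assumes you already hold the `retrieval` object as a dict; the helper name and the wording of the notes are hypothetical.

```python
from typing import Any, Dict


def trace_mode_note(retrieval: Dict[str, Any]) -> str:
    """Return a short human-readable caveat for the recorded trace mode."""
    mode = retrieval.get("trace_mode")
    if mode == "exact":
        return "exact: rank and score come from recorded retriever telemetry"
    if mode == "bootstrap":
        return "bootstrap: best-effort lexical fallback; treat rank and score as approximate"
    return f"unknown trace_mode: {mode!r}"


if __name__ == "__main__":
    print(trace_mode_note({"trace_mode": "bootstrap", "rank": 1, "score": 0.42}))
```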

Privacy note:

--redact-retrieval redacts retrieval question/answer fields (including common metadata duplicates) and hides assessment text/details that could echo the original query.
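
As a rough illustration of this kind of redaction, the sketch below masks the question and answer fields of a retrieval record, plus metadata keys that commonly duplicate them. It is not the tool's implementation: the placeholder string, the set of metadata keys, and the function name are assumptions, and it does not cover the assessment text that the flag also hides.

```python
import copy
from typing import Any, Dict

REDACTED = "[REDACTED]"
# Hypothetical list of metadata keys that may echo the query or answer.
SENSITIVE_METADATA_KEYS = {"question", "answer", "query_text", "final_answer"}


def redact_retrieval(retrieval: Dict[str, Any]) -> Dict[str, Any]:
    """Return a redacted copy; the input record is left unchanged."""
    out = copy.deepcopy(retrieval)
    for key in ("question", "answer"):
        if key in out:
            out[key] = REDACTED
    metadata = out.get("metadata")
    if isinstance(metadata, dict):
        for key in metadata:
            if key in SENSITIVE_METADATA_KEYS:
                metadata[key] = REDACTED
    return out


if __name__ == "__main__":
    record = {
        "question": "Can I get a refund after 90 days?",
        "answer": "Yes, refunds are allowed.",
        "rank": 1,
        "score": 0.87,
        "metadata": {"request_id": "req-123", "query_text": "Can I get a refund after 90 days?"},
    }
    print(redact_retrieval(record))
```

Working on a deep copy keeps the original record intact, so the same payload can still be written unredacted to a restricted location.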

ask-trace --format json emits the selected match and its query_id in one payload.

Architecture

flowchart LR
    A["RAG Ingest Pipeline"] --> B["VecTrace Tracker"]
    B --> C["SQLite Trace DB"]
    A --> D["Qdrant (vector values)"]
    E["vectrace ask-trace/trace/report/trace-qa/report-qa"] --> C
    E --> F["JSON/Text Output"]
    E --> G["HTML Report"]

Qdrant Integration

Install optional dependency:

python3 -m pip install qdrant-client

Use connectors.qdrant.TrackedQdrant to upsert vectors and record trace metadata in one step.

Demo Assets

Report screenshot: docs/assets/report-screenshot.svg
JSON sample: docs/examples/ask-trace-sample.json
Terminal tape: demo/vectrace-demo.tape
Terminal GIF (generate):

brew install vhs
./scripts/make_terminal_demo_gif.sh

Release / CI

CI: .github/workflows/ci.yml
Release publish: .github/workflows/release.yml
PyPI checklist: docs/PYPI_RELEASE.md
Build script: scripts/build_dist.sh

Development Tests

# With the virtual environment activated:
python3 -m unittest discover -s tests -v
# Or call the virtualenv interpreter directly:
.venv/bin/python -m unittest discover -s tests -v

Release history

0.1.4 (this version): May 13, 2026
0.1.3: May 10, 2026
0.1.2: May 9, 2026
0.1.1: May 9, 2026
0.1.0: Apr 30, 2026

Download files

Source Distribution

vectrace-0.1.4.tar.gz (44.4 kB), uploaded May 13, 2026, Source

Built Distribution

vectrace-0.1.4-py3-none-any.whl (32.2 kB), uploaded May 13, 2026, Python 3

File details

Details for the file vectrace-0.1.4.tar.gz.

File metadata

Download URL: vectrace-0.1.4.tar.gz
Upload date: May 13, 2026
Size: 44.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vectrace-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`69fcca1fd074a1fd5b053244f6cea9f44b54873aefe1e6efef7e14cb25b0caea`
MD5	`5ca8c11fae3007d897d6acc1342fdf89`
BLAKE2b-256	`9794df3286a1d5ae041779670416d6d520deb0340c274e0348721cdae9957340`
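
To check a downloaded file against the SHA256 digest listed above, a short standard-library sketch is enough. The helper names are hypothetical, and the local file path in the usage example is an assumption about where you saved the download.

```python
import hashlib
from typing import Iterable, Iterator

# SHA256 digest listed above for vectrace-0.1.4.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "69fcca1fd074a1fd5b053244f6cea9f44b54873aefe1e6efef7e14cb25b0caea"
)


def sha256_of_chunks(chunks: Iterable[bytes]) -> str:
    """Hash a stream of byte chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path: str, size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the file in fixed-size chunks so large files stay out of memory."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(size)
            if not chunk:
                return
            yield chunk


def verify_file(path: str, expected_hex: str) -> bool:
    """Return True when the file's SHA256 matches the expected digest."""
    return sha256_of_chunks(read_chunks(path)) == expected_hex.lower()


if __name__ == "__main__":
    import os

    local_path = "./vectrace-0.1.4.tar.gz"  # assumed download location
    if os.path.exists(local_path):
        print(verify_file(local_path, EXPECTED_SDIST_SHA256))
```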

Provenance

The following attestation bundles were made for vectrace-0.1.4.tar.gz:

Publisher: release.yml on kraftaa/vectrace

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vectrace-0.1.4.tar.gz
- Subject digest: 69fcca1fd074a1fd5b053244f6cea9f44b54873aefe1e6efef7e14cb25b0caea
- Sigstore transparency entry: 1526616293
- Sigstore integration time: May 13, 2026
Source repository:
- Permalink: kraftaa/vectrace@37d75b68f74f28bd6fb675a1ccd3f38a2593be66
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/kraftaa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@37d75b68f74f28bd6fb675a1ccd3f38a2593be66
- Trigger Event: push

File details

Details for the file vectrace-0.1.4-py3-none-any.whl.

File metadata

Download URL: vectrace-0.1.4-py3-none-any.whl
Upload date: May 13, 2026
Size: 32.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vectrace-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`34ef57c62fe2c7c36981aac02322315355d2fc981f44173dbdd776b47a9cdba9`
MD5	`d330216d23caae50b3be618986663065`
BLAKE2b-256	`fad8a7be794e7d1b7355960cfc703977b82f5ff3169e4d1b2b9c65099f1813d6`

Provenance

The following attestation bundles were made for vectrace-0.1.4-py3-none-any.whl:

Publisher: release.yml on kraftaa/vectrace

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vectrace-0.1.4-py3-none-any.whl
- Subject digest: 34ef57c62fe2c7c36981aac02322315355d2fc981f44173dbdd776b47a9cdba9
- Sigstore transparency entry: 1526617018
- Sigstore integration time: May 13, 2026
Source repository:
- Permalink: kraftaa/vectrace@37d75b68f74f28bd6fb675a1ccd3f38a2593be66
- Branch / Tag: refs/tags/v0.1.4
- Owner: https://github.com/kraftaa
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@37d75b68f74f28bd6fb675a1ccd3f38a2593be66
- Trigger Event: push
