Local-first SDK and CLI for RAG and agent reliability tracing, citation checks, and failure diagnosis.

ContextTrace

Debug RAG failures before users find them.

ContextTrace is a local-first Python SDK and CLI for evaluating existing RAG and AI agent systems. It records retrieved chunks, selected context, answer claims, citations, token usage, latency, and agent events, then writes local traces and HTML reports without requiring a hosted dashboard.

Install

pip install contexttrace
contexttrace --version
contexttrace init

Optional integrations:

pip install "contexttrace[langchain]"
pip install "contexttrace[llamaindex]"
pip install "contexttrace[fastapi]"
pip install "contexttrace[langgraph]"
pip install "contexttrace[otel]"
pip install "contexttrace[all]"

Quickstart

contexttrace init
contexttrace demo --dataset refund_policy
contexttrace report --last
contexttrace doctor

By default, traces are stored locally in:

.contexttrace/contexttrace.db

SDK Example

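from contexttrace import ContextTrace

# `retriever` and `llm` stand in for your existing retrieval and generation components.
ct = ContextTrace(project="support-rag")
query = "What is the refund policy?"

with ct.trace(query=query) as trace:
    # Log everything that was retrieved, then the subset actually passed to the model.
    chunks = retriever.search(query)
    trace.log_retrieval(chunks)
    context = chunks[:5]
    trace.log_context(context)

    answer = llm.generate(query, context)
    trace.log_answer(answer, usage={"total_tokens": 1200})
    # Each citation ties one answer claim to the chunk that is meant to support it.
    trace.log_citations([
        {"claim": "Refunds are available within 30 days.", "source_chunk_id": "chunk_12"}
    ])

    # Evaluate the trace and print the diagnosed failure type.
    result = trace.evaluate()
    print(result["failure"]["failure_type"])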

BYO RAG Endpoint

Evaluate a running local or hosted RAG API without adding SDK code:

contexttrace eval \
  --dataset evals/questions.json \
  --endpoint http://localhost:8000/query \
  --method POST \
  --input-key question \
  --answer-path '$.answer' \
  --contexts-path '$.contexts' \
  --citations-path '$.citations' \
  --fail-on "failure_rate>0.25"
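
As a rough illustration of what the flags above imply, the endpoint receives a JSON body keyed by question and returns a JSON object whose top-level answer, contexts, and citations fields are read through the three path flags. The sketch below is not part of ContextTrace: handle_query and resolve are hypothetical names, and the shape of each context and citation entry is an assumption borrowed from the SDK example.

```python
import json


def handle_query(payload: dict) -> dict:
    """Hypothetical handler: build a response in the shape the eval flags read."""
    question = payload["question"]  # matches --input-key question
    if not question:
        raise ValueError("empty question")
    # Stand-in retrieval result; a real service would call its retriever here.
    contexts = [
        {"id": "chunk_12", "text": "Refunds are available within 30 days of purchase."},
    ]
    answer = "Refunds are available within 30 days."
    return {
        "answer": answer,      # read via --answer-path $.answer
        "contexts": contexts,  # read via --contexts-path $.contexts
        "citations": [         # read via --citations-path $.citations
            {"claim": answer, "source_chunk_id": "chunk_12"},
        ],
    }


def resolve(path: str, body: dict):
    """Resolve a simple dotted path such as '$.answer' (illustration only)."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = body
    for key in path[2:].split("."):
        value = value[key]
    return value


response = handle_query({"question": "What is the refund policy?"})
print(json.dumps(response, indent=2))
```

If your API nests these fields differently, adjust the path flags to match rather than reshaping the response.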

Claim-Level Evidence Verification

Verify a portable RAG trace artifact without a hosted dashboard:

contexttrace verify-demo unsupported_claim --report
contexttrace verify trace.json
contexttrace verify trace.json --json
contexttrace verify trace.json --report --out reports/example.html
contexttrace verify trace.json --mode semantic
contexttrace verify trace.json --fail-on unsupported --fail-on citation_mismatch
contexttrace verify-benchmark --mode semantic

The input trace must contain query, answer, and contexts, and each context needs an id and text. Citations are optional; when present, each one is checked so that a cited source that does not actually support its matched claim is flagged.
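
A minimal sketch of producing such a trace file from Python follows. The top-level field names come from the requirement above; the keys inside each citation (claim, source_chunk_id) are an assumption carried over from the SDK example, and check_required_fields is a hypothetical helper, not a ContextTrace function.

```python
import json
from pathlib import Path
from typing import List

trace = {
    "query": "What is the refund policy?",
    "answer": "Refunds are available within 30 days.",
    "contexts": [
        {"id": "chunk_12", "text": "Refunds are available within 30 days of purchase."},
    ],
    # Optional. Citation key names are assumed, mirroring the SDK example.
    "citations": [
        {"claim": "Refunds are available within 30 days.", "source_chunk_id": "chunk_12"},
    ],
}


def check_required_fields(t: dict) -> List[str]:
    """Return a list of problems; an empty list means the required fields exist."""
    problems = []
    for key in ("query", "answer", "contexts"):
        if key not in t:
            problems.append(f"missing top-level field: {key}")
    for i, ctx in enumerate(t.get("contexts", [])):
        for key in ("id", "text"):
            if key not in ctx:
                problems.append(f"contexts[{i}] missing field: {key}")
    return problems


problems = check_required_fields(trace)
if problems:
    print("\n".join(problems))
else:
    # Write the artifact that `contexttrace verify trace.json` would read.
    Path("trace.json").write_text(json.dumps(trace, indent=2), encoding="utf-8")
```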

verify-demo uses bundled demo traces, so it works immediately after pip install contexttrace. Available demos include unsupported_claim, partial_support, citation_mismatch, should_abstain, and supported_answer.

Use --mode semantic for local paraphrase-aware matching, and verify-benchmark to inspect bundled precision/recall metrics.

ContextTrace verifies whether each generated claim is actually supported by retrieved evidence. Instead of only showing a trace or a score, it tells you where the evidence chain broke: unsupported claim, citation mismatch, insufficient context, or should-have-abstained.

The v0.2.0 verifier uses local lexical heuristics by default: claim extraction is rule-based and contradiction detection is conservative. Semantic matching (--mode semantic) or an LLM judge can be layered on when lexical matching is not enough.
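
To make the idea of a lexical heuristic concrete, here is a toy token-overlap check. It is not ContextTrace's actual algorithm; the function names, the scoring rule, and the 0.6 threshold are all assumptions chosen for illustration.

```python
import re


def tokens(text: str) -> set:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_support(claim: str, contexts: list, threshold: float = 0.6) -> dict:
    """Score each context by the fraction of claim tokens it contains."""
    claim_tokens = tokens(claim)
    best_id, best_score = None, 0.0
    for ctx in contexts:
        overlap = len(claim_tokens & tokens(ctx["text"]))
        score = overlap / len(claim_tokens) if claim_tokens else 0.0
        if score > best_score:
            best_id, best_score = ctx["id"], score
    return {
        "context_id": best_id,
        "score": best_score,
        "supported": best_score >= threshold,  # assumed cutoff, not a library default
    }


contexts = [
    {"id": "chunk_12", "text": "Refunds are available within 30 days of purchase."},
    {"id": "chunk_40", "text": "Shipping takes five business days."},
]

print(best_support("Refunds are available within 30 days.", contexts))
print(best_support("Refunds are available within 90 days and include shipping.", contexts))
```

The first claim is fully covered by chunk_12, while the second shares only about half of its tokens with any chunk and falls below the cutoff. Pure overlap cannot see paraphrases, which is the gap a semantic mode is meant to close.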

What It Catches

  • retrieval_miss
  • citation_mismatch
  • unsupported_answer
  • contradicted_answer
  • conflicting_sources
  • should_have_abstained
  • agent failures such as stale_memory_used and tool_error
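
When evaluating many traces, it can help to tally these failure types. The sketch below assumes each result has the result["failure"]["failure_type"] shape shown in the SDK example and, as a further assumption, that a passing trace reports None there. summarize is a hypothetical helper, not part of the SDK.

```python
from collections import Counter


def summarize(results: list) -> dict:
    """Count failure types and compute the share of results that failed."""
    failure_types = [r["failure"]["failure_type"] for r in results]
    # Assumption: None marks a trace with no diagnosed failure.
    counts = Counter(ft for ft in failure_types if ft is not None)
    failed = sum(counts.values())
    rate = failed / len(results) if results else 0.0
    return {"counts": dict(counts), "failure_rate": rate}


results = [
    {"failure": {"failure_type": "retrieval_miss"}},
    {"failure": {"failure_type": None}},
    {"failure": {"failure_type": "citation_mismatch"}},
    {"failure": {"failure_type": None}},
]

summary = summarize(results)
print(summary)
```

Here two of four results failed, so the rate is 0.5.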

Privacy

Local mode is the default. ContextTrace makes no network calls unless you configure an LLM judge provider or evaluate a RAG endpoint you provide.
