Provenance tracking and citation verification for pydantic-ai agents

Maintainers

dugarsumit

Project description

pydantic-ai-provenance

Provenance tracking and citation verification for pydantic-ai agents.

Attach ProvenanceCapability to any pydantic-ai agent and get:

- A full execution DAG: every tool call, model request, and response is linked in a directed acyclic graph.
- Automatic citation keys (d_1, d_2, a_1, …) injected into source tool results so the LLM can cite them inline.
- Multi-agent attribution: subagent outputs propagate through a shared store via contextvars, enabling transitive citation resolution across agent boundaries.
- Citation verification: key sanitisation (Step 1), TF-IDF cosine overlap (Step 2), and optional LLM entailment (Step 3) validate every [REF|…] tag in the final output.
- Graph visualisation: export as interactive HTML, Mermaid, GraphViz DOT, or JSON.
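To make the citation-key idea in the list above concrete, here is a minimal sketch of handing out sequential keys and attaching them to tool results. It is not the library's implementation: the class name `SourceRegistry`, its methods, and the `[source key: …]` header format are all assumptions chosen for illustration.

```python
from itertools import count


class SourceRegistry:
    """Hypothetical registry that hands out sequential citation keys."""

    def __init__(self, prefix: str = "d") -> None:
        self._prefix = prefix
        self._counter = count(1)
        self.sources: dict[str, str] = {}

    def register(self, content: str) -> str:
        """Store the content under a fresh key and return the key."""
        key = f"{self._prefix}_{next(self._counter)}"
        self.sources[key] = content
        return key

    def annotate(self, content: str) -> str:
        """Prefix a tool result with its key so a model could cite it."""
        key = self.register(content)
        # The header format below is an assumption, not the library's format.
        return f"[source key: {key}]\n{content}"
```

Keeping the key-to-content mapping alongside the annotated text is what would later allow a cited key to be resolved back to the source it came from.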

Installation

pip install pydantic-ai-provenance

Or with uv:

uv add pydantic-ai-provenance

Install directly from GitHub (latest development version):

pip install git+https://github.com/dugarsumit/pydantic-ai-provenance.git

uv add git+https://github.com/dugarsumit/pydantic-ai-provenance

Requirements: Python ≥ 3.12, pydantic-ai ≥ 1.80.

Quick start

import asyncio
from pydantic_ai import Agent
from pydantic_ai_provenance.capability import ProvenanceCapability
from pydantic_ai_provenance.attribution import attribute_output

provenance = ProvenanceCapability(
    agent_name="summariser",
    source_tools=["read_file"],   # tools whose results are raw data sources
)

agent = Agent(
    "anthropic:claude-sonnet-4-6",
    capabilities=[provenance],
    system_prompt="Summarise the content of files.",
)

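@agent.tool_plain
def read_file(path: str) -> str:
    # Read the file as text; the context manager closes the handle afterwards.
    with open(path, encoding="utf-8") as f:
        return f.read()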

async def main():
    result = await agent.run("Read report.txt and summarise it.")

    store = provenance.store

    # Path-level attribution
    print(attribute_output(store).summary())

    # Mermaid diagram
    print(store.to_mermaid())

    # Citation verification (Steps 1 + 2, no extra API calls)
    report = await provenance.verify(result.output)
    print(report.text_with_verified_citations)

asyncio.run(main())

Citation format

The LLM is instructed to emit [REF|key] tags immediately after any claim derived from a source:

The report states revenue grew 12% YoY. [REF|d_1]

Multi-source claims use pipe-separated keys:

Both documents confirm the finding. [REF|d_1|d_2]
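The tag grammar shown above is simple enough to parse with a regular expression. The sketch below is only an illustration of that grammar, not the library's `parse_citations` implementation; the helper name `extract_ref_keys` and the exact pattern are assumptions.

```python
import re

# Assumed grammar: "[REF|" followed by one or more pipe-separated keys, then "]".
REF_TAG = re.compile(r"\[REF\|([^\[\]]+)\]")


def extract_ref_keys(text: str) -> list[list[str]]:
    """Return the keys of each [REF|...] tag, in order of appearance."""
    return [
        [key.strip() for key in match.group(1).split("|") if key.strip()]
        for match in REF_TAG.finditer(text)
    ]


sample = (
    "The report states revenue grew 12% YoY. [REF|d_1] "
    "Both documents confirm the finding. [REF|d_1|d_2]"
)
print(extract_ref_keys(sample))  # [['d_1'], ['d_1', 'd_2']]
```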

Multi-agent usage

Give each agent its own ProvenanceCapability. A coordinator and its subagents then share one store automatically via contextvars:

import asyncio

from pydantic_ai import Agent, RunContext
from pydantic_ai_provenance.capability import ProvenanceCapability

research_cap = ProvenanceCapability(agent_name="researcher", source_tools=["fetch_url"])
coord_cap    = ProvenanceCapability(agent_name="coordinator")

research_agent = Agent("anthropic:claude-haiku-4-5-20251001", capabilities=[research_cap])
coord_agent    = Agent("anthropic:claude-sonnet-4-6",          capabilities=[coord_cap])

@research_agent.tool_plain
def fetch_url(url: str) -> str: ...  # placeholder: fetch and return the page text

@coord_agent.tool
async def delegate(ctx: RunContext[None], topic: str) -> str:
    # Pass the coordinator's usage so token counts accumulate across both agents
    result = await research_agent.run(f"Research: {topic}", usage=ctx.usage)
    return result.output

async def main():
    result = await coord_agent.run("Summarise pydantic-ai.")
    # Both agents share the same store automatically via contextvars
    store = coord_cap.store
    print(store.to_mermaid())

asyncio.run(main())

API reference

Core

Symbol	Description
`ProvenanceCapability`	pydantic-ai `AbstractCapability` that hooks into agent lifecycle
`ProvenanceStore`	Central registry: graph + citation key → node mapping

Graph primitives

Symbol	Description
`ProvenanceGraph`	DAG container with path traversal helpers
`ProvenanceNode`	Single execution step (id, type, label, data, timestamp)
`ProvenanceEdge`	Directed edge with optional label
`NodeType`	Enum: `INPUT`, `DATA_READ`, `TOOL_CALL`, `TOOL_RESULT`, `MODEL_REQUEST`, `MODEL_RESPONSE`, `AGENT_RUN`, `FINAL_OUTPUT`

Attribution

Symbol	Description
`attribute_output(store, output_node_id=None)`	Full path attribution for one `FINAL_OUTPUT` node
`attribute_all_outputs(store)`	Attribution for every `FINAL_OUTPUT`
`AttributionResult`	`.sources`, `.paths`, `.summary()`
`AttributionPath`	Single source-to-output path with `.hop_count`
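Path attribution, as summarised in the table above, amounts to enumerating source-to-output paths in a DAG. The sketch below illustrates that idea with a hypothetical `TinyGraph` class and made-up node ids; it does not use or reproduce `ProvenanceGraph`, and treating the hop count as the number of edges on a path is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TinyGraph:
    """Hypothetical miniature DAG: adjacency list keyed by node id."""

    edges: dict[str, list[str]] = field(default_factory=dict)

    def add_edge(self, src: str, dst: str) -> None:
        self.edges.setdefault(src, []).append(dst)
        self.edges.setdefault(dst, [])

    def paths(self, src: str, dst: str) -> list[list[str]]:
        """Enumerate every path from src to dst with a depth-first search."""
        if src == dst:
            return [[dst]]
        found = []
        for nxt in self.edges.get(src, []):
            for tail in self.paths(nxt, dst):
                found.append([src, *tail])
        return found


graph = TinyGraph()
graph.add_edge("read_file:d_1", "model_request")
graph.add_edge("read_file:d_2", "model_request")
graph.add_edge("model_request", "model_response")
graph.add_edge("model_response", "final_output")

for source in ("read_file:d_1", "read_file:d_2"):
    for path in graph.paths(source, "final_output"):
        hop_count = len(path) - 1  # edges traversed; assumed meaning of "hops"
        print(" -> ".join(path), f"({hop_count} hops)")
```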

Citations

Symbol	Description
`parse_citations(text)`	Extract all `[REF|…]` tags
`citation_tag_spans(text)`	Same but with `(start, end, CitationRef)` positions
`strip_inline_citation_tags(text)`	Remove all `[REF|…]` tags
`strip_inline_citation_tags_preserve_leading_ref_header(text)`	Strip body tags but keep an opening block header

Verification

Symbol	Description
`await verify_citations(text, store)`	Steps 1 (key sanitisation) + 2 (TF-IDF overlap)
`strip_unresolvable_citation_keys(text, store)`	Step 1 only: remove keys not in the store
`claim_source_tfidf_cosine(claim, source)`	Max cosine similarity over sliding source windows
`entailment_agent(model)`	Build a pydantic-ai Step 3 entailment judge
`refine_claim_source_similarities(records)`	Narrow results by top-N and min-score filters
`CitationVerificationReport`	`.original_text`, `.text_with_verified_citations`, `.claim_source_similarities`
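The Step 2 check in the table above is described as a maximum cosine similarity over sliding source windows. The sketch below is one possible reading of that description, written from scratch with the standard library. It is not the code behind `claim_source_tfidf_cosine`: the function name `max_window_similarity`, the tokeniser, the window and stride sizes, and the smoothed IDF formula are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_window_similarity(
    claim: str, source: str, window: int = 20, stride: int = 10
) -> float:
    """Max TF-IDF cosine between the claim and any sliding window of the source."""
    claim_tokens = tokenize(claim)
    source_tokens = tokenize(source)
    if not claim_tokens or not source_tokens:
        return 0.0
    last_start = max(len(source_tokens) - window, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail of the source is covered
    windows = [source_tokens[s : s + window] for s in starts]

    # Smoothed IDF computed over the claim plus all windows (an assumed choice).
    docs = [claim_tokens, *windows]
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + len(docs)) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    def vectorize(tokens: list[str]) -> dict[str, float]:
        return {t: n * idf[t] for t, n in Counter(tokens).items()}

    claim_vec = vectorize(claim_tokens)
    return max(cosine(claim_vec, vectorize(w)) for w in windows)
```

Scoring against windows rather than the whole source keeps a short claim from being diluted by a long document: one closely matching passage is enough to produce a high score.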

Visualisation

Symbol	Description
`store.to_html(title="Provenance Graph")`	Self-contained interactive HTML page (Cytoscape.js)
`store.open_in_browser(title="Provenance Graph")`	Write HTML to a temp file and open in the default browser
`store.to_mermaid()`	Mermaid flowchart string
`store.to_dot(graph_name="provenance")`	GraphViz DOT string
`store.to_json()`	`dict` with `nodes` and `edges` lists
`store.to_json_str(indent=2)`	JSON string
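For readers unfamiliar with Mermaid, the sketch below shows roughly what turning a node and edge list into a flowchart string involves. It is a standalone illustration with a hypothetical `to_mermaid` function and made-up node ids; the actual output of `store.to_mermaid()` may be structured differently.

```python
def to_mermaid(nodes: dict[str, str], edges: list[tuple[str, str]]) -> str:
    """Render nodes (id -> label) and directed edges as a Mermaid flowchart."""
    lines = ["flowchart TD"]
    for node_id, label in nodes.items():
        safe_label = label.replace('"', "'")  # a double quote would end the label
        lines.append(f'    {node_id}["{safe_label}"]')
    for src, dst in edges:
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


print(to_mermaid({"n1": "read_file", "n2": "FINAL_OUTPUT"}, [("n1", "n2")]))
```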

Running the examples

# Offline citation verification (no API keys required)
uv run python examples/verify_citations.py

# Single-agent example
ANTHROPIC_API_KEY=... uv run python examples/single_agent.py
# or Azure OpenAI:
AZURE_OPENAI_ENDPOINT=https://... AZURE_OPENAI_API_KEY=... uv run python examples/single_agent.py

# Multi-agent example (opens interactive provenance graph in browser after the run)
ANTHROPIC_API_KEY=... uv run python examples/multi_agent.py

Development

git clone https://github.com/dugarsumit/pydantic-ai-provenance.git
cd pydantic-ai-provenance
uv sync --extra dev
uv run pytest
uv run ruff check .

See CONTRIBUTING.md for the full contributing guide.

License

MIT

Release history

This version

0.1.0

May 11, 2026

Download files

Source Distribution

pydantic_ai_provenance-0.1.0.tar.gz (146.8 kB)

Uploaded May 11, 2026 Source

Built Distribution

pydantic_ai_provenance-0.1.0-py3-none-any.whl (36.0 kB)

Uploaded May 11, 2026 Python 3

File details

Details for the file pydantic_ai_provenance-0.1.0.tar.gz.

File metadata

Download URL: pydantic_ai_provenance-0.1.0.tar.gz
Upload date: May 11, 2026
Size: 146.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pydantic_ai_provenance-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6cf844c5ffa6fe70213cc9163323039ef76ddcdc5a3e94e4ffa9c7a43e764eb5`
MD5	`9bff8ce47597c87ace8c99e7a0c7c669`
BLAKE2b-256	`dc40e39cd034ca408926a6e49c59e028a1fca6c639b271a5fb7b78f4d162d71a`
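A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper name `sha256_of` and the local file path are assumptions, and the file is assumed to have been downloaded into the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


EXPECTED = "6cf844c5ffa6fe70213cc9163323039ef76ddcdc5a3e94e4ffa9c7a43e764eb5"
sdist = Path("pydantic_ai_provenance-0.1.0.tar.gz")  # assumed local download path
if sdist.exists():
    print("match" if sha256_of(sdist) == EXPECTED else "MISMATCH")
```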

Provenance

The following attestation bundles were made for pydantic_ai_provenance-0.1.0.tar.gz:

Publisher: publish.yml on dugarsumit/pydantic-ai-provenance

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pydantic_ai_provenance-0.1.0.tar.gz
- Subject digest: 6cf844c5ffa6fe70213cc9163323039ef76ddcdc5a3e94e4ffa9c7a43e764eb5
- Sigstore transparency entry: 1508145736
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: dugarsumit/pydantic-ai-provenance@cb3111a211fdfe0571ad2945d832d235cee53fcf
- Branch / Tag: refs/tags/0.0.1
- Owner: https://github.com/dugarsumit
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@cb3111a211fdfe0571ad2945d832d235cee53fcf
- Trigger Event: release

File details

Details for the file pydantic_ai_provenance-0.1.0-py3-none-any.whl.

File metadata

Download URL: pydantic_ai_provenance-0.1.0-py3-none-any.whl
Upload date: May 11, 2026
Size: 36.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pydantic_ai_provenance-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`41b782525aa96801ece4ffe9da59f6f534aa06c7fd38c147c7bc2607a3f13931`
MD5	`8c027731282c4d11ae36f12dca4e4f59`
BLAKE2b-256	`010361a487090e0df3e652f9b987f971c888d599f5c4aef3fdbd522aa5f8ddb7`

Provenance

The following attestation bundles were made for pydantic_ai_provenance-0.1.0-py3-none-any.whl:

Publisher: publish.yml on dugarsumit/pydantic-ai-provenance

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pydantic_ai_provenance-0.1.0-py3-none-any.whl
- Subject digest: 41b782525aa96801ece4ffe9da59f6f534aa06c7fd38c147c7bc2607a3f13931
- Sigstore transparency entry: 1508146033
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: dugarsumit/pydantic-ai-provenance@cb3111a211fdfe0571ad2945d832d235cee53fcf
- Branch / Tag: refs/tags/0.0.1
- Owner: https://github.com/dugarsumit
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@cb3111a211fdfe0571ad2945d832d235cee53fcf
- Trigger Event: release
