An open specification for versioning agent runtimes and keeping datasets valid.

AgentVersion

Your agent changed. Is your saved data still valid?

agentversion turns an agent version into a diffable, hashable contract — so when prompts, tools, models, or graphs change, you know exactly what broke and which traces, eval sets, and training data survived.

When you ship a new version of an agent, everything you collected against the old one — production traces, eval datasets, SFT examples — quietly drifts out of date. There's no package.json to pin an agent's contract, and no git diff to tell you what changed. agentversion is that missing format: a JSON manifest describing an agent version, a diff that classifies every change as breaking or non-breaking, and a compatibility decision that tells you whether to keep, repair, replay, or drop your old data.

It's a dependency-light Python package with a CLI — and an open spec any tool can implement.

See it in action

Two production manifests of the same finance-agent, v1 and v2. One command:

$ agentversion diff finance-agent-v1.json finance-agent-v2.json --compat

                                     Manifest Diff
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Surface         ┃ Change Type  ┃ Details                                             ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ environment     │ non_breaking │ environment added                                   │
│ model_runtime   │ breaking     │ provider: 'google' → 'openai'                       │
│                 │              │ runtime_version: 'app-runtime@1.5.0' →              │
│                 │              │ 'app-runtime@1.8.2'                                 │
│                 │              │ envelope changed                                    │
│ output_contract │ breaking     │ format: 'text' → 'json'                             │
│                 │              │ strict: False → True                                │
│                 │              │ output schema changed                               │
│ prompt_stack    │ non_breaking │ system_prompt hash changed                          │
│                 │              │ developer_prompt hash changed                       │
│ subagents       │ breaking     │ subagents added: ['finance_subagent',               │
│                 │              │ 'spreadsheet_subagent']                             │
│ tool_registry   │ breaking     │ search_population removed                            │
│                 │              │ get_population added                                │
│                 │              │ write_spreadsheet_cell added                        │
│                 │              │ get_market_cap modified (non-schema)                │
│ workflow        │ breaking     │ graph topology changed                              │
│                 │              │ routing_policy_version: '2' → '4'                   │
│                 │              │ graph_version: '3' → '6'                            │
│                 │              │ graph_name: 'finance-simple-graph' →                │
│                 │              │ 'finance-router-graph'                              │
└─────────────────┴──────────────┴─────────────────────────────────────────────────────┘

  Breaking: 5  Non-breaking: 2

  Recommendation: replay
  Breaking changes in model_runtime, output_contract, subagents, tool_registry,
  workflow — existing data should be replayed against the new agent version.

Between v1 and v2 the team swapped the model provider (Google → OpenAI), renamed a tool (search_population became get_population), added two subagents, switched to strict JSON output, and reworked the workflow graph. agentversion flagged all five breaking surfaces and recommended replaying the old traces: not a guess, but a classification you can gate CI on.
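To make the summary line concrete, here is a minimal sketch that recomputes "Breaking: 5  Non-breaking: 2" from the per-surface classifications in the table. The `summarize` helper and the "any breaking surface means replay" rule are illustrative assumptions, not the package's API; the real decision logic lives in the spec and may be finer-grained.

```python
from collections import Counter

# Surface -> change type, copied from the diff table above.
SURFACE_CHANGES = {
    "environment": "non_breaking",
    "model_runtime": "breaking",
    "output_contract": "breaking",
    "prompt_stack": "non_breaking",
    "subagents": "breaking",
    "tool_registry": "breaking",
    "workflow": "breaking",
}


def summarize(changes: dict[str, str]) -> dict:
    """Count change types and list the breaking surfaces (hypothetical helper)."""
    counts = Counter(changes.values())
    breaking = sorted(name for name, kind in changes.items() if kind == "breaking")
    return {
        "breaking": counts["breaking"],
        "non_breaking": counts["non_breaking"],
        "breaking_surfaces": breaking,
        # Assumption: a simplified rule for illustration only.
        "recommendation": "replay" if breaking else "keep",
    }


summary = summarize(SURFACE_CHANGES)
print(f"Breaking: {summary['breaking']}  Non-breaking: {summary['non_breaking']}")
print("Recommendation:", summary["recommendation"])
```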

Try it yourself — both manifests live in examples/manifest/.

Why an agent needs a version contract

You probably already have observability and a trace store. Neither answers "what is this agent version, and is my old data still compatible with the new one?"

| You already have | What it gives you | What it doesn't |
| --- | --- | --- |
| OpenTelemetry / LangSmith / Langfuse | rich execution traces | a versioned contract for the agent that produced them |
| A2A / ACP agent cards | runtime discovery + I/O types | version identity or data-compatibility |
| OpenAI JSONL / SFT files | a training format | provenance: which agent version produced each row |

Isn't this A2A? No — and they compose. A2A and ACP answer "how does Agent A discover and talk to Agent B?". agentversion answers "what changed in this agent, and what does that mean for my data?". An A2A Agent Card can carry an agentversion manifest hash so you know both at once.

Install

pip install agentversion

Apache-2.0, no config — just needs Python 3.10+. It implements the frozen v1.0 spec, but the Python package itself is early: 0.1.0, pre-1.0, with the API still settling.

Quickstart

Diff two versions (table by default; add --json for machine output, --compat for a keep/repair/replay/drop recommendation):

agentversion diff old-manifest.json new-manifest.json --compat

Gate breaking changes in CI — --fail-on-breaking exits non-zero when any surface is breaking:

# .github/workflows/agent.yml
- name: Block breaking agent changes
  run: agentversion diff baseline-manifest.json current-manifest.json --fail-on-breaking
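Outside GitHub Actions, the same gate can be driven from a small script. This is a sketch: the `build_gate_command` and `gate` helpers are hypothetical, and it assumes the command exits with 0 when no surface is breaking (the text above only states that it exits non-zero when one is).

```python
import subprocess
import sys


def build_gate_command(baseline: str, current: str) -> list[str]:
    """Build the argv for the documented diff command with the CI gate flag."""
    return ["agentversion", "diff", baseline, current, "--fail-on-breaking"]


def gate(baseline: str, current: str) -> bool:
    """Run the diff; True means no breaking surface was reported (assumed exit code 0)."""
    completed = subprocess.run(build_gate_command(baseline, current))
    return completed.returncode == 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python gate.py baseline-manifest.json current-manifest.json
    sys.exit(0 if gate(sys.argv[1], sys.argv[2]) else 1)
```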

Scaffold, hash, and validate a manifest:

agentversion init                     # interactively create a manifest
agentversion hash manifest.json       # canonical JCS-SHA256 identity hash
agentversion validate manifest.json   # check it against the spec
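For intuition about what a canonical hash buys you, the sketch below hashes a JSON object after sorting keys and stripping whitespace, so two files that differ only in key order get the same identity. It only approximates JCS (RFC 8785), which additionally fixes number formatting and key ordering rules, so it is not guaranteed to reproduce the output of `agentversion hash`; `canonical_hash` is a hypothetical name and the `sha256:` prefix follows the hash format shown in the Python example in this README.

```python
import hashlib
import json


def canonical_hash(obj) -> str:
    """Approximate canonical hash: sorted keys, no whitespace, UTF-8, SHA-256."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


# Same content, different key order: the identity hash should not change.
a = {"provider": "openai", "runtime_version": "app-runtime@1.8.2"}
b = {"runtime_version": "app-runtime@1.8.2", "provider": "openai"}
print(canonical_hash(a) == canonical_hash(b))
```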

Use it from Python — every line below is exercised by the test suite:

import json
from agentversion import AgentManifest, validate_manifest_file, hash_manifest
from agentversion.diff import diff_manifests
from agentversion.compatibility import classify_compatibility

# Load both manifests, closing each file once it has been read
with open("finance-agent-v1.json") as f:
    old = json.load(f)
with open("finance-agent-v2.json") as f:
    new = json.load(f)

# Validate + identify a version
assert validate_manifest_file("finance-agent-v2.json").valid
m = AgentManifest.model_validate(new)
print(m.agent_name, m.identity.overall_hash)   # finance-agent  sha256:767ebff1...

# Diff, then ask what to do with old data
result = diff_manifests(old, new)
print(result.summary.breaking_surfaces)                       # 5
print(classify_compatibility(result).recommended_decision)   # replay

What's in the box

A typed reference implementation: Pydantic models, canonical hashing, the diff/compatibility algorithms, and a CLI.

CLI

| Command | What it does |
| --- | --- |
| `agentversion diff A B` | Classify changes by surface (`--json`, `--compat`, `--fail-on-breaking`) |
| `agentversion validate M` | Validate a manifest against the spec |
| `agentversion hash M` | Compute the canonical JCS-SHA256 hash |
| `agentversion init` | Scaffold a new manifest interactively |
| `agentversion upgrade M --to X` | Bump a manifest to a newer spec version |
| `agentversion {decision,replay,dataset} validate` | Validate the other spec objects |

Library — top-level agentversion exports AgentManifest, validate_manifest / validate_manifest_file, hash_manifest / hash_surface, and SPEC_VERSION. The algorithms live in agentversion.diff and agentversion.compatibility; the other spec models live in agentversion.dataset, agentversion.replay, and agentversion.decision.

The manifest is organized as a contract surface per component — prompt_stack, model_runtime, tool_registry, skill_registry, workflow, subagents, output_contract, guardrails, context_config, environment — each independently hashed so the diff is surface-level and precise.
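The value of hashing each surface independently can be sketched in a few lines: hash every surface on both sides and compare, and only the surfaces whose hashes differ need a detailed diff. This is an illustration, not the package's algorithm. It assumes each surface is a top-level key of the manifest, uses a simplified hash, and the toy manifests (including the `pii` and `region` fields) are made up.

```python
import hashlib
import json

SURFACES = (
    "prompt_stack", "model_runtime", "tool_registry", "skill_registry", "workflow",
    "subagents", "output_contract", "guardrails", "context_config", "environment",
)


def surface_hashes(manifest: dict) -> dict[str, str]:
    """Hash each surface present in the manifest (simplified canonical form)."""
    hashes = {}
    for name in SURFACES:
        if name in manifest:
            blob = json.dumps(manifest[name], sort_keys=True, separators=(",", ":"))
            hashes[name] = hashlib.sha256(blob.encode("utf-8")).hexdigest()
    return hashes


def changed_surfaces(old: dict, new: dict) -> dict[str, str]:
    """Report which surfaces were added, removed, or changed between two manifests."""
    old_h, new_h = surface_hashes(old), surface_hashes(new)
    result = {}
    for name in SURFACES:
        if name in old_h and name not in new_h:
            result[name] = "removed"
        elif name not in old_h and name in new_h:
            result[name] = "added"
        elif old_h.get(name) != new_h.get(name):
            result[name] = "changed"
    return result


# Toy manifests for illustration only.
old = {
    "model_runtime": {"provider": "google"},
    "output_contract": {"format": "text", "strict": False},
    "guardrails": {"pii": True},
}
new = {
    "model_runtime": {"provider": "openai"},
    "output_contract": {"format": "json", "strict": True},
    "guardrails": {"pii": True},
    "environment": {"region": "us"},
}
print(changed_surfaces(old, new))
```

Unchanged surfaces (here `guardrails`) drop out without any field-by-field comparison, which is what keeps the diff scoped to the surfaces that actually moved.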

Use it anywhere — no platform required

The protocol is fully useful standalone:

Track versions locally — init to scaffold, hash for a stable id, diff between any two. No account, fully offline.
Gate CI/CD — diff --fail-on-breaking stops a breaking agent change from reaching production.
Annotate traces — stamp identity.overall_hash onto your OpenTelemetry spans as agentversion.manifest_hash for version-scoped filtering. See examples/integrations/otel_mapping.md.
Classify data compatibility — diff --compat (or decision generate) gives a per-episode keep / repair / replay / drop verdict you can act on.
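As a sketch of the trace annotation step, the helper below stamps the manifest hash onto a span under the `agentversion.manifest_hash` attribute and then filters recorded spans by version. `RecordingSpan` is a stand-in so the example runs without OpenTelemetry installed; a real OpenTelemetry span exposes the same `set_attribute(key, value)` method. The helper names and the placeholder hash value are hypothetical.

```python
ATTRIBUTE = "agentversion.manifest_hash"


class RecordingSpan:
    """Minimal stand-in for a tracing span; only set_attribute is needed here."""

    def __init__(self, name: str):
        self.name = name
        self.attributes = {}

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value


def stamp_manifest_hash(span, manifest: dict) -> None:
    """Copy identity.overall_hash from the manifest onto the span."""
    span.set_attribute(ATTRIBUTE, manifest["identity"]["overall_hash"])


def spans_for_version(spans, manifest_hash: str) -> list:
    """Version-scoped filter: keep only spans produced by the given agent version."""
    return [s for s in spans if s.attributes.get(ATTRIBUTE) == manifest_hash]


# Placeholder hashes; real values come from `agentversion hash`.
v1 = {"identity": {"overall_hash": "sha256:example-v1"}}
v2 = {"identity": {"overall_hash": "sha256:example-v2"}}

spans = [RecordingSpan("run-1"), RecordingSpan("run-2"), RecordingSpan("run-3")]
stamp_manifest_hash(spans[0], v1)
stamp_manifest_hash(spans[1], v2)
stamp_manifest_hash(spans[2], v2)

print([s.name for s in spans_for_version(spans, "sha256:example-v2")])
```

With the OpenTelemetry API installed, the same helper could be called on the object returned by `opentelemetry.trace.get_current_span()`.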

It interoperates with LangSmith, Langfuse, Phoenix, and W&B — annotate their traces/datasets with a manifest hash, or read/write compatibility decisions alongside your eval pipeline.

The spec & conformance

agentversion is an open spec so any tool, in any language, can produce interoperable manifests and diffs:

spec/manifest.md — the agent manifest
spec/diff.md — surface diffs, breaking vs non-breaking
spec/compatibility-decision.md — keep / repair / replay / drop
spec/replay.md · spec/dataset.md — replay jobs and dataset objects with provenance
spec/reference.md — full schemas and validation rules · schemas/ — JSON Schemas

CONFORMANCE.md and compatibility-tests/ together define conformance: the tests are golden input/output pairs that any implementation must reproduce before it can claim to conform.

Pairs with skillevaluation

A manifest can carry the eval results that gated its release in evaluation.gates[]:

{
  "evaluation": {
    "gates": [
      { "name": "regression-suite", "threshold": 0.95, "actual_score": 0.972, "passed": true }
    ]
  }
}

Those scores come from skillevaluation, the sibling open spec for A/B benchmarking skills. agentversion records what an agent version is; skillevaluation measures whether it's better.
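A consumer of the manifest might re-check those gates before trusting a release. The sketch below reads the fragment shown above and lists any gate that did not pass. It assumes a higher-is-better score where passing means `actual_score >= threshold`; `failed_gates` is a hypothetical helper, not part of the package.

```python
import json

manifest_fragment = json.loads("""
{
  "evaluation": {
    "gates": [
      { "name": "regression-suite", "threshold": 0.95, "actual_score": 0.972, "passed": true }
    ]
  }
}
""")


def failed_gates(manifest: dict) -> list[str]:
    """Return names of gates that are marked failed or whose score is below threshold."""
    failures = []
    for gate in manifest.get("evaluation", {}).get("gates", []):
        # Assumption: scores are higher-is-better.
        meets_threshold = gate["actual_score"] >= gate["threshold"]
        if not (gate["passed"] and meets_threshold):
            failures.append(gate["name"])
    return failures


print(failed_gates(manifest_fragment))
```

Checking both the recorded `passed` flag and the raw numbers catches a manifest whose flag and scores disagree.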

The decimalai Python SDK builds on agentversion to add framework adapters (capture a manifest straight from your LangGraph/CrewAI app), trace capture, and managed replay — but you never need it to use the spec.

Project

The spec is stable at v1.0, with a frozen wire format and conformance suite. The package is 0.1.0: pre-1.0 under semantic versioning, so the Python API may still change before the package reaches 1.0. Design decisions are logged in adrs/, releases in CHANGELOG.md. Contributions, especially new conformance cases, are welcome; see CONTRIBUTING.md:

git clone https://github.com/decimal-labs/agentversion
cd agentversion
pip install -e ".[dev]"
pytest

Licensed under Apache 2.0.

Release history

This version

0.1.0

May 30, 2026

Download files

Source Distribution

agentversion-0.1.0.tar.gz (115.8 kB view details)

Uploaded May 30, 2026 Source

Built Distribution

agentversion-0.1.0-py3-none-any.whl (46.7 kB view details)

Uploaded May 30, 2026 Python 3

File details

Details for the file agentversion-0.1.0.tar.gz.

File metadata

Download URL: agentversion-0.1.0.tar.gz
Upload date: May 30, 2026
Size: 115.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for agentversion-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`caf81efcb4394de905b7aa8bf02b40f1dbe01a1f88bcad2b1b45beec40f45772`
MD5	`4a4ca1edf7a3010940fe1daf195dc19b`
BLAKE2b-256	`5d31cfbee63fb189a289c7b14baaa2ce50e7ffbc84bd348394d45e6b2a7ffdf0`

File details

Details for the file agentversion-0.1.0-py3-none-any.whl.

File metadata

Download URL: agentversion-0.1.0-py3-none-any.whl
Upload date: May 30, 2026
Size: 46.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for agentversion-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`db0ae3b8cecf48d7bce1562bc4eb492728378fa9e56204bda867b3248010d4b4`
MD5	`f8a9983145258768306b0533fe328f10`
BLAKE2b-256	`d9ee53efef72d1b2886f0db70483ce776edde92c78ff7fc681076c4750b9518d`
