Reliability layer for AI agent workflows: validate state, handoffs, and outcomes before agents continue.

agent-consistency

Catch false-success bugs in AI agent workflows.

agent-consistency is a lightweight Python reliability layer for workflows where agents read state, hand off context, call tools, and claim real-world outcomes. It validates state reads, handoff contracts, proof artifacts, and outcome checks before the workflow continues.

Agent workflows can look successful while acting on stale state, missing handoff facts, or unverified tool results. agent-consistency adds lightweight contracts and receipts so workflows prove they read the right state, passed the right context, and verified the real business outcome.

Install

python -m pip install agent-consistency

From a local checkout:

python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[dev]"

Tiny Example

from agent_consistency import WorkflowRun

run = WorkflowRun("refund-ord-1", on_violation="record")

with run.step("intake-agent", "read_ticket", step_id="intake") as step:
    order = {"id": "ord_1", "version": "order-v3", "previous_refund_count": 0}
    order_snapshot = step.read_state("order", order, version=order["version"])
    handoff = step.handoff(
        to_agent="refund-agent",
        task="issue refund",
        facts={"order_id": "ord_1", "amount": 42.5, "previous_refund_count": 0},
        evidence={"order.previous_refund_count": order_snapshot.to_dict()},
        required_facts=["order_id", "amount", "previous_refund_count"],
        required_evidence=["order.previous_refund_count"],
    )

with run.step("refund-agent", "issue_refund", step_id="refund") as step:
    step.consume_handoff(handoff)
    # Simulated provider result: the refund is still pending, not settled.
    provider_result = {"refund_id": "rf_1", "status": "pending"}
    step.write_state("refund", provider_result, version="rf_1", include_value=True)
    # The check evaluates to False here, so this step's receipt reports a failure.
    step.verify_outcome(
        "refund_settled",
        lambda: provider_result["status"] == "settled",
        failure_reason="refund provider did not confirm settlement",
    )

receipt = run.receipts()[-1]
print(receipt.status)  # failed
print(receipt.issues[0].message)

The agent can call the tool, but the workflow does not get to claim completion until the provider confirms the refund is settled.

What It Verifies

  • State: which version of the order, policy, ticket, or record an agent read.
  • Handoff: whether required facts, assumptions, constraints, and evidence reached the next agent.
  • Proof artifacts: decisions, provider reads, approvals, files, tickets, or other evidence attached to a receipt.
  • Outcome verification: whether the business outcome became true after a side-effecting step.
  • Causality: which downstream step relied on which upstream handoff or artifact.
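
To make the handoff check concrete, here is a minimal standalone sketch of the idea, not the library's implementation. The names HandoffCheck and check_handoff are hypothetical and exist only for illustration; the sketch assumes a handoff is a pair of dictionaries (facts and evidence) compared against lists of required keys, mirroring the required_facts and required_evidence arguments in the Tiny Example.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffCheck:
    """Result of comparing a handoff payload against its contract (hypothetical type)."""
    missing_facts: list = field(default_factory=list)
    missing_evidence: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # The handoff is acceptable only when nothing required is missing.
        return not self.missing_facts and not self.missing_evidence


def check_handoff(facts, evidence, required_facts, required_evidence):
    """Report which required keys are absent from a handoff payload."""
    return HandoffCheck(
        missing_facts=[k for k in required_facts if k not in facts],
        missing_evidence=[k for k in required_evidence if k not in evidence],
    )


# A support handoff that omits previous refund history.
result = check_handoff(
    facts={"order_id": "ord_1", "amount": 42.5},
    evidence={},
    required_facts=["order_id", "amount", "previous_refund_count"],
    required_evidence=["order.previous_refund_count"],
)
print(result.ok)             # False
print(result.missing_facts)  # ['previous_refund_count']
```

The receiving agent can refuse to act when ok is False, instead of issuing a refund without knowing the refund history.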

Why Output Validation Is Not Enough

Output validation can check whether a model response is shaped correctly. False-success bugs happen after that:

  • a policy agent approves from an old policy snapshot
  • a support handoff omits previous refund history
  • a tool returns 200 OK, but the provider status is still pending
  • a customer-visible message says "done" before the business outcome happened

agent-consistency focuses on proof before progression. It blocks unsafe continuation when state, handoff, or outcome verification fails.
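
The "200 OK but still pending" case can be sketched in plain Python. This is an illustration of the principle under stated assumptions, not the library's code: require_outcome, OutcomeNotVerified, and the shape of the response dictionary are all hypothetical.

```python
class OutcomeNotVerified(Exception):
    """Raised when a side effect was requested but not confirmed (hypothetical)."""


def require_outcome(name, check, failure_reason):
    """Run a zero-argument check and refuse to continue unless it returns True."""
    if not check():
        raise OutcomeNotVerified(f"{name}: {failure_reason}")


# Assumed provider response: the HTTP call succeeded, but the refund has not settled.
response = {"http_status": 200, "body": {"refund_id": "rf_1", "status": "pending"}}

messages_sent = []
try:
    require_outcome(
        "refund_settled",
        lambda: response["body"]["status"] == "settled",
        failure_reason="refund provider did not confirm settlement",
    )
    # Only reached when the business outcome is confirmed.
    messages_sent.append("Your refund is complete.")
except OutcomeNotVerified as exc:
    blocked_reason = str(exc)

print(messages_sent)   # []
print(blocked_reason)  # refund_settled: refund provider did not confirm settlement
```

A check on the response shape or the HTTP status would pass here; only the check on the provider's own status field stops the customer-visible message.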

When To Use It

Use it around side-effecting agent workflows:

  • refunds
  • approvals
  • customer support actions
  • payment operations
  • ticket escalation
  • account access changes
  • records updates
  • workflows that send customer-visible messages

Where It Fits

agent-consistency is complementary to orchestration and observability tools.

Tool category | How it fits
LangGraph, CrewAI, AutoGen, custom orchestrators | Wrap steps with receipt gates before moving to the next node.
Langfuse, Phoenix, OpenTelemetry tracing | Keep traces; add contract and outcome checks for business correctness.
Guardrails and structured output validators | Validate output shape; use this to verify state, handoffs, and side effects.
Policy engines | Keep policy decisions; record the policy version and block stale reads.

It is not a replacement for your agent framework or tracing system. It is a reliability layer for workflows with side effects.
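
As a rough sketch of what "record the policy version and block stale reads" could look like next to a policy engine: the Snapshot type, the is_stale function, and the version strings below are assumptions made for illustration and are not taken from the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """What an agent read, and which version it was at read time (hypothetical type)."""
    name: str
    version: str
    value: dict


def is_stale(snapshot: Snapshot, current_version: str) -> bool:
    """A read is stale when the source no longer matches the recorded version."""
    return snapshot.version != current_version


# The policy agent read policy-v1; the policy store has since published policy-v2.
policy_read = Snapshot("refund_policy", "policy-v1", {"max_refund": 100})
current_policy_version = "policy-v2"

decision = "blocked" if is_stale(policy_read, current_policy_version) else "approved"
print(decision)  # blocked
```

The policy engine still makes the decision; the version comparison only establishes whether the decision was made from current state.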

Architecture

[Figure: Agent Consistency architecture flow]

Reporting

Summarize a run directory, summary.json, or receipts.jsonl file:

agent-consistency report runs/demo-happy-refund
agent-consistency report runs/demo-pending-refund/receipts.jsonl --html report.html

The report command prints step status, issues, and outcome checks, and can write a small static HTML summary.

Examples

Run the included examples from a local checkout:

python examples/refund_workflow.py
python examples/approval_gate.py
python examples/tool_outcome_verification.py
python examples/stale_state_prevention.py
python examples/langgraph_style_wrapper.py

The agent_consistency.integrations module includes a small run_gated_step helper for wrapping LangGraph-style nodes, CrewAI tasks, AutoGen steps, or custom orchestrator functions.
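
The signature of run_gated_step is not shown here, so the following does not reproduce it. It is an independent sketch of the general pattern of gating an orchestrator node, assuming a node is a function from a state dictionary to a new state dictionary. The gated decorator and GateFailed exception are hypothetical names.

```python
import functools


class GateFailed(Exception):
    """Raised when a node's result does not pass its gate (hypothetical)."""


def gated(check, reason):
    """Wrap a node so its result must pass `check` before it is returned."""
    def decorator(node):
        @functools.wraps(node)
        def wrapper(state):
            new_state = node(state)
            if not check(new_state):
                raise GateFailed(f"{node.__name__}: {reason}")
            return new_state
        return wrapper
    return decorator


@gated(lambda s: s.get("refund_status") == "settled", "refund not settled")
def issue_refund(state):
    # Stand-in for a real provider call; it always reports "pending" here.
    return {**state, "refund_status": "pending"}


try:
    issue_refund({"order_id": "ord_1"})
    outcome = "continued"
except GateFailed as exc:
    outcome = f"stopped ({exc})"

print(outcome)  # stopped (issue_refund: refund not settled)
```

Because the gate sits in the wrapper rather than in the node body, the same node function can be reused unchanged across orchestrators.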

Visual Demo

The companion demo is a browser-based Agent Reliability Control Center for a realistic refund workflow:

git clone https://github.com/karimbaidar/agent-consistency-refund-demo.git
cd agent-consistency-refund-demo
python -m pip install -r requirements-dev.txt
MODEL_PROVIDER=heuristic python -m uvicorn refund_demo.web:app --reload

Demo repo:

https://github.com/karimbaidar/agent-consistency-refund-demo

The key moment: the refund provider returns pending, so the workflow blocks the customer-facing "refund completed" message.

Development

python -m pip install -e ".[dev]"
python -m pytest
ruff check src tests examples

Build and check the package:

python -m build
python -m twine check dist/*

