Agent Contracts

A formal framework for governing autonomous AI agents through explicit resource constraints and temporal boundaries.

Overview

Agent Contracts transforms autonomous AI agents from unbounded explorers into bounded optimizers by introducing formal contracts that specify:

  • 🎯 Resource Budgets - Tokens, API calls, compute time, and costs
  • ⏱️ Temporal Constraints - Deadlines, duration limits, and lifecycle boundaries
  • 📊 Success Criteria - Measurable conditions for contract fulfillment
  • 🔄 Lifecycle Management - Clear states from activation to termination

The Problem

Current agentic AI systems face critical challenges:

  • Unbounded Resource Consumption - Agents can consume unpredictable amounts of tokens, API calls, and compute time
  • Unclear Lifecycles - No explicit termination criteria, leading to resource leaks
  • Difficult Governance - Hard to audit, ensure compliance, and attribute costs
  • Coordination Complexity - Multi-agent systems lack formal resource allocation mechanisms

The Solution

Agent Contracts provides a mathematical framework that enables:

  • Predictable Costs - Explicit resource budgets prevent runaway consumption
  • Formal Verification - Contract states and constraints are machine-verifiable
  • Time-Resource Tradeoffs - Strategic optimization between speed and economy
  • Multi-Agent Coordination - Hierarchical contracts and resource markets

Quick Examples

Basic LLM Integration

from agent_contracts import Contract, ContractedLLM, ResourceConstraints, ContractMode

# Define a contract with resource budgets
contract = Contract(
    id="research-task",
    name="Research Assistant",
    mode=ContractMode.BALANCED,  # Optimize for quality-cost-time balance
    resources=ResourceConstraints(
        tokens=10000,
        api_calls=50,
        cost_usd=1.0
    )
)

# Execute LLM calls within contract constraints
with ContractedLLM(contract) as llm:
    response = llm.completion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Summarize recent AI papers"}]
    )

# Contract automatically enforces:
# ✅ Token budget limits
# ✅ API call tracking
# ✅ Cost monitoring
# ✅ Violations trigger warnings or stops

Per-Tool Resource Limits

Fine-grained control over individual tool usage:

from agent_contracts import Contract, ResourceConstraints

contract = Contract(
    id="research-agent",
    name="Research Agent",
    resources=ResourceConstraints(
        tokens=10000,
        tool_invocations=20,  # Total limit across all tools
        per_tool_limits={
            "web_search": 5,   # Max 5 web searches
            "code_exec": 3,    # Max 3 code executions
            # Other tools limited only by aggregate
        }
    )
)
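How an aggregate limit and per-tool limits might interact can be sketched in plain Python. This is a hypothetical illustration, not the library's implementation: the `ToolBudget` class and its `try_invoke` method are assumed names, and the sketch simply checks the aggregate cap first and the per-tool cap second.

```python
from collections import Counter


class ToolBudget:
    """Hypothetical stand-in for aggregate plus per-tool invocation limits."""

    def __init__(self, total_limit: int, per_tool_limits: dict[str, int]):
        self.total_limit = total_limit
        self.per_tool_limits = per_tool_limits
        self.used = Counter()  # invocations recorded per tool name

    def try_invoke(self, tool: str) -> bool:
        """Record one invocation and return True, or return False if a limit is hit."""
        if sum(self.used.values()) >= self.total_limit:
            return False  # aggregate budget exhausted
        limit = self.per_tool_limits.get(tool)
        if limit is not None and self.used[tool] >= limit:
            return False  # this tool's own cap is exhausted
        self.used[tool] += 1
        return True


budget = ToolBudget(total_limit=20, per_tool_limits={"web_search": 5, "code_exec": 3})
results = [budget.try_invoke("web_search") for _ in range(6)]
```

In this sketch the sixth `web_search` call is refused, while a tool without its own entry is limited only by the aggregate total.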

Pre-Execution Hooks (Custom Policy)

Add custom governance logic that runs before every constraint check:

from agent_contracts import (
    Contract, ContractedLLM, CheckContext, HookResult,
    EnforcementAction, ResourceConstraints,
)

# Define a hook that blocks off-topic requests
def topic_guard(ctx: CheckContext) -> HookResult:
    messages = ctx.metadata.get("messages", [])
    if any("off-topic" in str(m) for m in messages):
        return HookResult(
            allow=False,
            reason="Request outside allowed domain",
            action=EnforcementAction.HARD_STOP,
        )
    return HookResult()  # allow by default

contract = Contract(
    id="guarded-agent",
    resources=ResourceConstraints(tokens=10000, cost_usd=1.0)
)

with ContractedLLM(contract) as llm:
    llm.enforcer.add_pre_check_hook(topic_guard)
    # Hooks fire automatically on every LLM call
    # Works with all integrations: LiteLLM, LangGraph, Google ADK, Claude SDK
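The idea of a pre-check hook chain can be illustrated without the library. The sketch below is an assumption about one reasonable design, not the enforcer's actual logic: `Outcome` and `run_pre_checks` are hypothetical names, hooks receive a plain metadata dict, and the first blocking outcome short-circuits the chain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Outcome:
    """Hypothetical result type; allow defaults to True, so Outcome() means "proceed"."""
    allow: bool = True
    reason: str = ""


Hook = Callable[[dict], Outcome]


def run_pre_checks(hooks: list[Hook], metadata: dict) -> Outcome:
    """Run hooks in registration order; the first blocking outcome wins (assumed policy)."""
    for hook in hooks:
        outcome = hook(metadata)
        if not outcome.allow:
            return outcome
    return Outcome()


def topic_guard(metadata: dict) -> Outcome:
    """Block any request whose messages mention "off-topic"."""
    messages = metadata.get("messages", [])
    if any("off-topic" in str(m) for m in messages):
        return Outcome(allow=False, reason="Request outside allowed domain")
    return Outcome()


blocked = run_pre_checks([topic_guard], {"messages": [{"content": "an off-topic question"}]})
allowed = run_pre_checks([topic_guard], {"messages": [{"content": "summarize this paper"}]})
```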

LangGraph Multi-Agent Workflows ⭐

For complex workflows with cycles and multi-agent coordination:

from langgraph.graph import StateGraph, END
from agent_contracts import Contract, ResourceConstraints
from agent_contracts.integrations.langgraph import ContractedGraph

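# Build complex graph with validation cycle.
# AgentState, research_agent, validate_agent, and should_retry are assumed
# to be defined elsewhere in your application.
workflow = StateGraph(AgentState)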
workflow.add_node("research", research_agent)
workflow.add_node("validate", validate_agent)
workflow.add_conditional_edges(
    "validate",
    should_retry,
    {True: "research", False: END}  # Can loop!
)
app = workflow.compile()

# Wrap with contract to prevent runaway loops
contract = Contract(
    id="research-workflow",
    resources=ResourceConstraints(
        tokens=50000,
        api_calls=25,  # Limit iterations!
        cost_usd=2.0
    )
)

contracted_workflow = ContractedGraph(contract=contract, graph=app)
result = contracted_workflow.invoke({"query": "Research topic"})

# Budget enforced across ALL nodes and cycles:
# ✅ Prevents infinite loops
# ✅ Multi-agent budget sharing
# ✅ Real-time violation detection
# ✅ Cumulative tracking across entire graph
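Why a shared call budget bounds a cycle can be seen in a small self-contained sketch. It does not use LangGraph or the ContractedGraph API; `CallBudget`, `BudgetExceeded`, and `run_cycle` are hypothetical names, and the loop models the worst case of a validator that always requests a retry.

```python
class BudgetExceeded(Exception):
    """Raised by the hypothetical counter when the call budget is used up."""


class CallBudget:
    """Counts calls cumulatively across every node that shares this object."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.calls = 0

    def charge(self) -> None:
        if self.calls >= self.max_calls:
            raise BudgetExceeded(f"call budget of {self.max_calls} exhausted")
        self.calls += 1


def run_cycle(budget: CallBudget) -> str:
    """Research/validate loop whose validator never approves (worst case)."""
    try:
        while True:
            budget.charge()  # research node makes one call
            budget.charge()  # validate node makes one call, then asks for a retry
    except BudgetExceeded:
        return "stopped"


budget = CallBudget(max_calls=25)
status = run_cycle(budget)
```

Because both nodes charge the same counter, the loop ends after exactly 25 calls regardless of how the calls are split between nodes.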

Google ADK Multi-Agent Systems

For Google ADK-based agents and multi-agent hierarchies:

from google.adk.agents import LlmAgent
from agent_contracts import Contract, ResourceConstraints
from agent_contracts.integrations.google_adk import ContractedAdkAgent

# Create multi-agent hierarchy
researcher = LlmAgent(
    name="researcher",
    model="gemini-2.0-flash",
    instruction="You research topics thoroughly."
)

summarizer = LlmAgent(
    name="summarizer",
    model="gemini-2.0-flash",
    instruction="You create concise summaries."
)

coordinator = LlmAgent(
    name="coordinator",
    model="gemini-2.0-flash",
    instruction="You coordinate research and summarization.",
    sub_agents=[researcher, summarizer]
)

# Single budget for ENTIRE multi-agent system
contract = Contract(
    id="research-system",
    resources=ResourceConstraints(
        tokens=50000,  # For ALL agents combined
        api_calls=25,
        cost_usd=2.0
    )
)

contracted_system = ContractedAdkAgent(contract=contract, agent=coordinator)
result = contracted_system.run(
    user_id="user1",
    session_id="session1",
    message="Research and summarize quantum computing"
)

# Budget enforced across ALL agents in hierarchy:
# ✅ Detailed token tracking (prompt/response/thinking/cached)
# ✅ Multi-turn conversation protection
# ✅ Multi-agent coordination governance
# ✅ Tool execution monitoring

Contract Modes

Choose the mode that fits your requirements:

from agent_contracts import Contract, ContractMode, ResourceConstraints

# URGENT mode: Minimize time, accept higher costs
contract = Contract(
    mode=ContractMode.URGENT,
    resources=ResourceConstraints(tokens=10000)
)
# → 50% faster execution, 20% more tokens

# BALANCED mode: Optimize quality-cost-time tradeoff
contract = Contract(
    mode=ContractMode.BALANCED,
    resources=ResourceConstraints(tokens=10000)
)
# → Standard execution with quality focus

# ECONOMICAL mode: Minimize costs, accept longer runtime
contract = Contract(
    mode=ContractMode.ECONOMICAL,
    resources=ResourceConstraints(tokens=10000)
)
# → 60% fewer tokens, 50% longer execution

Documentation

📚 Complete Documentation

Key Resources

  • Whitepaper - Complete theoretical framework with mathematical foundations
  • Pre-Execution Hooks - Custom governance hooks and behavioral monitor design
  • Examples - Coming soon: Practical implementation examples

Quick Start by Role

Key Concepts

Contract Definition

An Agent Contract C = (I, O, S, R, T, Φ, Ψ) includes:

  • I: Input specification
  • O: Output specification
  • S: Skills (tools, capabilities)
  • R: Resource constraints
  • T: Temporal constraints
  • Φ: Success criteria
  • Ψ: Termination conditions
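As a rough illustration, the seven components can be mirrored in a plain dataclass. This is a hypothetical sketch, not the library's `Contract` class: the name `ContractTuple`, its field names and types, and the example values (including `max_duration_s`) are all assumptions made for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractTuple:
    """Hypothetical mirror of C = (I, O, S, R, T, Φ, Ψ)."""
    inputs: dict             # I: input specification
    outputs: dict            # O: output specification
    skills: tuple            # S: tools and capabilities
    resources: dict          # R: resource constraints
    temporal: dict           # T: temporal constraints
    success_criteria: tuple  # Φ: success criteria
    termination: tuple       # Ψ: termination conditions


example = ContractTuple(
    inputs={"query": "str"},
    outputs={"summary": "str"},
    skills=("web_search",),
    resources={"tokens": 10000, "api_calls": 50},
    temporal={"max_duration_s": 300},  # illustrative key, not a library field
    success_criteria=("summary is non-empty",),
    termination=("budget exhausted", "deadline passed"),
)
```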

Time-Resource Tradeoff

Agents can optimize along multiple dimensions:

Mode        Time         Resources    Quality
Urgent      Low ⚡       High 💰      85%
Balanced    Medium ⏱️    Medium 💵    95%
Economical  High 🐢      Low 💸       90%

Contract States

DRAFTED → ACTIVE → {FULFILLED, VIOLATED, EXPIRED, TERMINATED}
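The lifecycle above reads naturally as a small state machine. The sketch below is one possible encoding and not the library's own types: `State`, `TRANSITIONS`, and `advance` are hypothetical names, and it assumes that the four end states are terminal because the diagram shows no transitions out of them.

```python
from enum import Enum


class State(Enum):
    DRAFTED = "drafted"
    ACTIVE = "active"
    FULFILLED = "fulfilled"
    VIOLATED = "violated"
    EXPIRED = "expired"
    TERMINATED = "terminated"


# Allowed transitions taken from the diagram; states not listed are terminal.
TRANSITIONS = {
    State.DRAFTED: {State.ACTIVE},
    State.ACTIVE: {State.FULFILLED, State.VIOLATED, State.EXPIRED, State.TERMINATED},
}


def advance(current: State, target: State) -> State:
    """Return the target state if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


state = advance(State.DRAFTED, State.ACTIVE)
state = advance(state, State.FULFILLED)
```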

Agent Skills (agentskills.io Standard)

Agent Contracts supports the agentskills.io open standard for defining reusable agent behaviors:

from agent_contracts import SkillSpec, Capabilities, Contract

# Define a rich skill with full instructions
code_review = SkillSpec(
    name="code-reviewer",
    description="Review code for best practices, security issues, and test coverage.",
    instructions="""
    ## Instructions
    1. Read the target files
    2. Check for common issues:
       - Error handling
       - Security vulnerabilities
       - Test coverage
    3. Provide detailed feedback
    """,
    allowed_tools=["Read", "Grep", "Glob"],
    version="1.0.0",
)

# Use in capabilities (mix strings and SkillSpec)
contract = Contract(
    id="review-task",
    name="Code Review",
    capabilities=Capabilities(
        skills=[code_review, "simple-skill"],  # Both types work
        tools=["web_search"],
    ),
)

# Access skills programmatically
skill = contract.capabilities.get_skill("code-reviewer")
print(skill.instructions)

Features:

  • ✅ Compatible with Microsoft, OpenAI, Cursor, and other adopters
  • ✅ SKILL.md import/export (to_skill_md(), from_skill_md())
  • ✅ Progressive disclosure (metadata vs full instructions)
  • ✅ Backward compatible (string skills still work)

Repository Status

🎉 Ready for Release (November 2025)

Current Version: 0.1.0 | Status: Production-ready, validated, documented

Phase 1: Core Framework ✅ Complete

  • ✅ Contract data structures (C = I, O, S, R, T, Φ, Ψ)
  • ✅ Resource monitoring and enforcement
  • ✅ Token counting and cost tracking
  • ✅ LiteLLM integration wrapper
  • ✅ 145 tests, 96% coverage
  • ✅ Live demo with Gemini 2.0 Flash

Phase 2A: Strategic Optimization ✅ Complete

  • ✅ Contract modes (URGENT, BALANCED, ECONOMICAL)
  • ✅ Budget-aware prompt generation
  • ✅ Strategic planning utilities
  • ✅ Quality-cost-time Pareto benchmark
  • ✅ 209 core tests passing

Phase 2B: Governance & Benchmarks ✅ Complete

  • ✅ Multi-step research benchmark (research agent with quality evaluation)
  • ✅ Budget violation policy testing (100% enforcement validation)
  • ✅ Cost governance validation (organizational policy compliance)
  • ✅ Variance reduction analysis (N=20 validation, temperature=0 effect discovered)
  • ✅ Quality metrics framework (3-phase validation study, CV=5.2%)
  • ✅ LangChain 1.0+ integration (governance & compliance)
  • ✅ Pre-commit hooks and code quality infrastructure

LangGraph Integration ✅ Complete (Premium Feature)

  • ✅ ContractedGraph for complex multi-agent workflows
  • ✅ Cumulative budget tracking across ALL nodes and cycles
  • ✅ Loop/retry protection (prevents runaway costs)
  • ✅ Multi-agent budget sharing
  • ✅ 27 comprehensive tests, 85% coverage
  • ✅ Real-world demos (validation cycles, parallel agents)

Google ADK Integration ✅ Complete

  • ✅ ContractedAdkAgent for Google ADK agents
  • ✅ Detailed token tracking (prompt, response, thinking, cached)
  • ✅ Multi-turn conversation protection
  • ✅ Multi-agent hierarchy governance
  • ✅ Tool execution monitoring
  • ✅ 11 comprehensive tests, 90% coverage
  • ✅ Real-world demos (multi-turn, multi-agent)

Claude Agent SDK Integration ✅ Complete

  • ✅ ContractedClaudeAgent with hook-based enforcement
  • ✅ Exact token tracking from AssistantMessage.usage
  • ✅ Per-tool limits and temporal enforcement via PreToolUse hooks
  • ✅ Audit trail via PostToolUse hooks
  • ✅ Full SDK passthrough (tools, MCP, subagents, skills, permissions)
  • ✅ Dual API: async aexecute() and sync execute()
  • ✅ 33 comprehensive tests

Pre-Execution Hooks ✅ Complete

  • ✅ User-defined pre/post-check hooks on ContractEnforcer
  • ✅ CheckContext, HookResult, CheckHook types for custom policy governance
  • ✅ Integration metadata pass-through (all 5 integrations)
  • ✅ Hook actions: WARN, THROTTLE (informational) and SOFT_STOP, HARD_STOP (blocking)
  • ✅ Post-check hooks are observational (cannot block)
  • ✅ Backward compatible: existing code works unchanged

Evaluation Pipelines ✅ Complete

  • ✅ Research Pipeline: Multi-agent report generation (25 topics)
  • ✅ Code Review Pipeline: Coder↔Reviewer loop (175 LiveCodeBench problems)
  • ✅ CONTRACTED vs UNCONTRACTED comparison framework
  • ✅ Conservation law enforcement in multi-agent delegation
  • ✅ Iteration limits prevent runaway agent loops
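One way to read "conservation law" in delegation is that budgets handed to sub-agents can never sum to more than the parent has left. The sketch below encodes that reading only; it is an assumption about the intent, not the evaluation pipeline's code, and `delegate` is a hypothetical helper.

```python
def delegate(parent_remaining: int, requests: dict[str, int]) -> dict[str, int]:
    """Grant sub-agent token budgets only if their sum fits in the parent's remainder."""
    total = sum(requests.values())
    if total > parent_remaining:
        raise ValueError(f"requested {total} tokens, only {parent_remaining} available")
    return dict(requests)


# Example: a 50,000-token parent budget split across two sub-agents.
grants = delegate(50_000, {"researcher": 30_000, "summarizer": 15_000})
leftover = 50_000 - sum(grants.values())
```

A request that would exceed the parent's remainder is rejected as a whole, so no tokens are created by delegation.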

Total: 646+ tests, 81%+ coverage

Use Cases

Agent Contracts is designed for:

  • Production AI Systems - Cost control and SLA compliance
  • Complex Multi-Agent Workflows ⭐ - LangGraph loops, retries, validation cycles
  • Enterprise Deployments - Governance, audit trails, and compliance
  • Claude Agent SDK - Govern Claude agents with per-tool limits and audit trails
  • Google ADK Applications - Multi-turn conversations and multi-agent hierarchies
  • LangChain Applications - Simple chains with budget enforcement
  • Research - Studying optimal agent behavior under constraints

Where Agent Contracts Shines

LangChain (simple chains):

  • 3-10 LLM calls per execution
  • Budget risk: LOW to MODERATE
  • Value: Governance, compliance, multi-call protection

LangGraph (complex workflows) ⭐:

  • 30+ LLM calls per execution (cycles, retries, parallel agents)
  • Budget risk: VERY HIGH (can spiral to $10+ without limits!)
  • Value: Loop protection, multi-agent coordination, cumulative tracking
  • This is the killer feature for production deployments

Claude Agent SDK (agentic coding & file/web/terminal):

  • 10-100+ tool calls per session (Read, Edit, Bash, WebSearch, subagents)
  • Budget risk: HIGH (open-ended agents with many tools can spiral)
  • Value: Per-tool limits, temporal enforcement, audit trail, hook-based governance
  • Ideal for: Claude-powered agents, coding assistants, research agents

Google ADK (multi-turn & multi-agent):

  • 10-50+ LLM calls per conversation (turns, agent coordination, tool use)
  • Budget risk: HIGH (multi-agent hierarchies can explode costs)
  • Value: Multi-turn protection, hierarchical governance, detailed token tracking
  • Ideal for: Google Cloud deployments, Gemini-based agents, conversational AI

Project Structure

agent-contracts/
├── src/agent_contracts/           # Core package
│   ├── core/
│   │   ├── contract.py           # Contract data structures
│   │   ├── monitor.py            # Resource monitoring
│   │   ├── enforcement.py        # Constraint enforcement
│   │   ├── tokens.py             # Token counting
│   │   ├── planning.py           # Strategic planning
│   │   └── prompts.py            # Budget-aware prompts
│   └── integrations/
│       ├── litellm_wrapper.py    # LiteLLM integration
│       ├── langchain.py          # LangChain integration
│       ├── langgraph.py          # LangGraph integration ⭐
│       ├── google_adk.py         # Google ADK integration
│       └── claude_agent_sdk.py   # Claude Agent SDK integration
├── tests/                         # 247+ tests, 94%+ coverage
│   ├── core/                     # Core module tests (209 tests)
│   └── integrations/             # Integration tests (38 tests)
├── benchmarks/                    # Live demonstrations & benchmarks
│   ├── demo_phase1.py            # Phase 1 interactive demo
│   ├── strategic/                # Strategic optimization benchmarks
│   ├── research_agent/           # Multi-step research benchmark
│   ├── governance/               # Policy & governance tests
│   ├── langchain/                # LangChain demos
│   ├── langgraph/                # LangGraph demos (multi-agent)
│   └── google_adk/               # Google ADK demos (multi-turn, multi-agent)
├── evaluation/                    # Experimental evaluations
│   ├── research_pipeline/        # Multi-agent research experiment
│   └── code_review_pipeline/     # Coder↔Reviewer experiment
├── docs/
│   ├── whitepaper.md             # Complete theoretical framework
│   └── testing-strategy.md       # Testing & validation plan
├── pyproject.toml                 # Package configuration
└── README.md                      # This file

Installation

# Install from PyPI
pip install ai-agent-contracts

# Or with uv
uv add ai-agent-contracts

The package is importable as agent_contracts:

from agent_contracts import Contract, ResourceConstraints

For development (from source):

git clone https://github.com/flyersworder/agent-contracts.git
cd agent-contracts
uv sync --dev

Requirements: Python ≥ 3.12

Optional dependencies:

  • litellm - For LLM integration (automatically installed)
  • langchain - For LangChain integration (uv sync --extra langchain)
  • langgraph - For LangGraph integration ⭐ (uv sync --extra langgraph)
  • google-adk - For Google ADK integration (uv sync --extra google-adk)
  • claude-agent-sdk - For Claude Agent SDK integration (uv sync --extra claude-agent-sdk)
  • matplotlib - For visualization benchmarks (pip install matplotlib)

Development

Setup

This project uses uv for dependency management. To set up the development environment:

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/flyersworder/agent-contracts.git
cd agent-contracts

# Install dependencies (including dev dependencies)
uv sync --dev

# Install pre-commit hooks
uv run pre-commit install

Code Quality

This project uses several tools to maintain code quality:

  • Ruff: Fast Python linter and formatter (replaces black, isort, flake8)
  • mypy: Static type checker
  • pre-commit: Git hooks for automated checks

Pre-commit hooks will automatically run on every commit. To manually run all checks:

# Run all pre-commit hooks
uv run pre-commit run --all-files

# Run specific tools
uv run ruff check .                    # Linting
uv run ruff format .                   # Formatting
uv run mypy .                          # Type checking

Running Tests

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=agent_contracts --cov-report=html

Project Structure

  • docs/ - Documentation (whitepaper, testing strategy)
  • src/ - Source code
  • tests/ - Test suite
  • pyproject.toml - Project configuration and dependencies
  • uv.lock - Locked dependencies for reproducibility

Contributing

This is an evolving framework. We welcome contributions in:

  • Reference implementations (Python, TypeScript)
  • Integration with existing frameworks (LangChain, AutoGPT, etc.)
  • Practical examples and tutorials
  • Empirical studies and benchmarks

License

This project is licensed under CC BY 4.0.

Authors

Qing Ye (with assistance from Claude, Anthropic)

Citation

If you use this framework in your research, please cite:

@techreport{ye2025agentcontracts,
  title={Agent Contracts: A Resource-Bounded Optimization Framework for Autonomous AI Systems},
  author={Ye, Qing},
  year={2025},
  month={October}
}

Learn More


Version: 0.3.0 | Last Updated: March 28, 2026 | Status: Production Ready ⭐
