Unopinionated contract-based verification for AI agents
The Problem
Typical deterministic software relies on tools like unit tests to ensure the code functions correctly. However, as we become more reliant on AI agents to do our work, we need a smarter and more efficient means of verifying that their output is correct. To address this, this library introduces a mental model known as the Agentic Contract Framework.
Its primary function is to produce a contract with a set of commitments before the agent's execution (this contract can be hardcoded, or dynamically generated by the agent itself). Each commitment on a contract has an attached verifier, which can be set by you, the developer. If you deem that a commitment can be deterministically verified, you are welcome to write a function for it (like a unit test). Otherwise, you can rely on the default semantic verifier, which uses another agent to verify the correctness of the output. All evaluations are collected and synced to the observability system of your choice.
Quick Start
from sworn import Contract, Commitment, DatadogObservability

# Define the contract with commitments
observer = DatadogObservability()
contract = Contract(
    observer=observer,
    commitments=[
        Commitment(
            name="no_harmful_content",
            terms="The agent must not produce harmful or offensive content"
        ),
        Commitment(
            name="stay_on_topic",
            terms="The agent must only discuss topics related to the user's query"
        )
    ]
)

# Decorate your tools
@contract.actuator
def send_message(content: str) -> dict:
    return {"status": "sent", "content": content}

# Run within an execution context
with contract.execution() as execution:
    # Run your agent (any framework)...
    send_message("Hello, world!")

    # Verify the execution
    results = execution.verify()
    print(results)
Progressive Hardening
Using this library is very much a process of continuous exploration: you observe your agents, determine their failure modes, and progressively "harden" your rules. If you discover that your agent commonly makes a particular mistake, you can simply create a commitment prohibiting that mistake and add a deterministic verifier to catch it 100% of the time, as sketched below. I would personally recommend starting with the default semantic verifier and understanding the failure modes of your agent in your domain first!
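As a sketch of what hardening might look like: suppose you notice the agent keeps emitting markdown-style links when it shouldn't. The exact verifier signature is not documented here, so the interface below (a callable that receives the execution and returns True on pass) is an assumption for illustration:

import re

# Illustrative sketch only: the real verifier signature may differ.
def no_markdown_links(execution) -> bool:
    # Assumes execution.format() returns the traced activity as text
    # (format() appears on the Execution class in the Architecture section).
    return re.search(r"\[.+?\]\(.+?\)", execution.format()) is None

commitment = Commitment(
    name="no_markdown_links",
    terms="The agent must not emit markdown-style links",
    verifier=no_markdown_links,
    semantic_sampling_rate=0.1,  # still spot-check for unknown failure modes
)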
Architecture
classDiagram
    Contract --> Commitment
    Contract --> Observer
    Contract ..> Execution : creates
    Execution --> Contract
    Commitment --> Verifier
    Commitment --> SemanticVerifier
    Observer <|-- DatadogObservability

    class Contract {
        +commitments: List~Commitment~
        +observer: Observer
        +execution() Execution
        +actuator(func) Callable
        +sensor(func) Callable
    }

    class Execution {
        +tool_calls: List~ToolCall~
        +verify() List~VerificationResult~
        +add_tool_call(ToolCall)
        +format() str
    }

    class Commitment {
        +name: str
        +terms: str
        +verifier: Callable
        +semantic_sampling_rate: float
    }

    class Observer {
        <<interface>>
        +capture_span()
        +submit_evaluation()
    }

    class DatadogObservability {
        +capture_span()
        +submit_evaluation()
    }

    class Verifier {
        <<deterministic>>
    }

    class SemanticVerifier {
        <<LLM-based>>
    }
Key Concepts
Contracts and Commitments
The main contribution of this library is the Contract class. A contract holds a set of commitments, and each commitment is an expectation of what the agent is supposed to deliver. It's like a freelance contract, but with your AI agent.
Verification Strategy
You build a contract by defining it and adding commitments to it, and each commitment can hold its own verifier. My approach is to be as critical as possible towards the output of the AI agent. If your verifier returns a violation, the agent is taken to have failed to deliver what it committed to. If your verifier returns a pass, the agent has merely cleared a deterministic test case and could still have other failure modes that we do not know of (after all, using this library is a process of exploration). In that case, the output is also run against a semantic verifier to check for these unknown failure modes.
Sampling Rate
If you are confident that the deterministic test case accounts for all failure modes, you can set semantic_sampling_rate to 0, meaning none of the agent executions for that particular commitment will be put through semantic verification (your deterministic verification will still run). If you are more cost-conscious, you can set it to a number between 0 and 1: the lower the number, the fewer semantic verifications are run and the lower the cost. At 1, the semantic verifier always runs whenever (1) no deterministic verifier exists or (2) your verifier returned a pass. The sketch below summarises this decision flow.
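Put together, the per-commitment decision flow looks roughly like this (a conceptual sketch, not the library's internal code; semantic_verify is a hypothetical stand-in for the LLM-based check):

import random

# Conceptual sketch of the per-commitment flow described above.
def verify_commitment(commitment, execution):
    if commitment.verifier is not None:
        if not commitment.verifier(execution):
            return "violation"  # deterministic failures are final
    # Verifier passed (or none exists): sample the LLM-based check
    if random.random() < commitment.semantic_sampling_rate:
        return semantic_verify(commitment.terms, execution)  # hypothetical helper
    return "pass"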
Framework Agnostic Design
This library is framework agnostic: it doesn't lock you into any agentic framework, and you can swap frameworks without changing anything about the contract. The library sets itself a clear boundary, namely everything before your agent starts running and everything after it finishes execution, which makes it independent of the execution itself. During the agent's runtime (outside that boundary), it traces tool calls by intercepting at the tool call level, meaning that if your agent calls Python functions, the library can trace them. Anything that can't be traced this way (like the agent's final output or reasoning) can be added manually using execution.add_tool_call(), as in the sketch below.
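For instance, recording the agent's final answer by hand might look roughly like this (whether ToolCall is importable from the package root, and which constructor arguments it takes, are assumptions on my part):

from sworn import ToolCall  # assumed import path

with contract.execution() as execution:
    final_answer = run_my_agent("Summarise the report")  # hypothetical agent entry point
    # The final output is not a traced Python call, so record it manually
    # (ToolCall's fields here are illustrative, not documented API):
    execution.add_tool_call(ToolCall(name="final_output", output=final_answer))
    results = execution.verify()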
Execution Context
You may need to capture some context so that you can access it within your verifier. To do this, use the add_context() method. The method currently accepts only a string, but I'm planning to add support for structured data so that deterministic verifiers can adapt to different contexts.
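A minimal sketch, assuming add_context() is called on the execution object (the prose above implies this but doesn't show it):

with contract.execution() as execution:
    # Make request context visible to verifiers (string-only for now)
    execution.add_context("user_locale=de-DE")
    send_message("Hallo, Welt!")
    results = execution.verify()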
Contract Coverage
Similar to how we compute test coverage of a codebase, it would also be interesting and useful to compute the coverage of your contracts over the agent's behaviour. Verifiers report the behaviours they enforce and cover, and the contract takes the complement of the union of these coverages to surface potential blind spots: behaviours that no commitment enforces. You can use this information to further tighten your contract by adding more commitments. The sketch below illustrates the idea.
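As set arithmetic, this is just the complement of a union (the behaviour labels below are made up for illustration and are not library API):

# Conceptual sketch of contract coverage, not library code.
agent_behaviours = {"tone", "topic", "formatting", "tool_use"}
verifier_coverages = [{"tone", "topic"}, {"formatting"}]

covered = set().union(*verifier_coverages)
blind_spots = agent_behaviours - covered  # behaviours no commitment enforces
print(blind_spots)  # {'tool_use'}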
Installation
pip install sworn
Set your environment variables:
# Required for the agent
export GEMINI_API_KEY=your_gemini_api_key
# Only needed if you use Datadog observability
export DD_LLMOBS_ENABLED=1
export DD_LLMOBS_ML_APP=your_app_name
export DD_LLMOBS_AGENTLESS_ENABLED=1
export DD_SITE=us5.datadoghq.com
export DD_API_KEY=your_datadog_api_key
export DD_ENV=development
export DD_SERVICE=sworn