CI for AI agents - behavioral fingerprinting and drift detection

Spooled — Behavioral CI for AI Agents

One prompt edit quietly turned this customer-support agent into a refund machine. Spooled caught it on the PR.

A PM asks for "a more helpful tone for frustrated customers." An engineer adds one sentence to the system prompt: "Resolve their issue when possible." Unit tests pass. The reviewer approves. The PR is ready to merge.

But the LLM now interprets "resolve" liberally. On complaint tickets, the agent stops escalating refund requests to humans and starts issuing refunds itself. The agent's tool-call structure changed even though the prompt edit looked harmless.

Spooled diffs the agent's behavior against the committed baseline and posts this on the PR:

🚨 Merge blocked: agent now calls `issue_refund`

This tool was never observed in the baseline. It appears in
2 of 5 traces in this PR (~40%).

Triggered by a one-sentence change to the system prompt.

Caught content-blind — Spooled compared tool graphs, not language. It never saw a customer message or an LLM response.
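As a rough illustration of the kind of check described above, the sketch below finds tools that appear in the PR traces but never in the baseline, and counts how many traces each one shows up in. It is not Spooled's implementation: the function name `new_tool_report`, the trace representation (a plain list of tool names per run), and the `lookup_order` tool are assumptions made for the example. Only `issue_refund` and `escalate_to_human` come from the scenario above.

```python
from collections import Counter


def new_tool_report(baseline_traces, pr_traces):
    """Map each tool never seen in the baseline to (traces_with_tool, total_traces)."""
    baseline_tools = {tool for trace in baseline_traces for tool in trace}
    # Count a new tool at most once per trace so the result reads "N of M traces".
    hits = Counter(
        tool
        for trace in pr_traces
        for tool in set(trace) - baseline_tools
    )
    total = len(pr_traces)
    return {tool: (count, total) for tool, count in hits.items()}


# Hypothetical traces: each trace is just the ordered list of tool names.
baseline = [["lookup_order", "escalate_to_human"]] * 5
pr = [
    ["lookup_order", "escalate_to_human"],
    ["lookup_order", "issue_refund"],
    ["lookup_order", "escalate_to_human"],
    ["lookup_order", "issue_refund"],
    ["lookup_order", "escalate_to_human"],
]

report = new_tool_report(baseline, pr)
for tool, (count, total) in report.items():
    print(f"{tool}: {count} of {total} traces ({count / total:.0%})")
```

Note that the inputs contain only tool names, so a check like this needs no access to customer messages or model output.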

Run it yourself in 60 seconds

pip install spooled-ai
spooled demo

Runs the entire scenario in your terminal — no API key, no setup, no files left behind. The variant agent differs from the baseline by exactly one line in the system prompt. The code is otherwise identical.

What It Does

Capture — wraps your LLM client and records the structural fingerprint of every agent run: which tools were called, in what order, how many times. Content-blind by architecture — prompts, customer data, and AI responses never leave your infrastructure.

Compare — diffs the current run against a committed baseline. Shows exactly what changed: tools added, tools removed, sequence reordered, token usage shifted.

Gate — posts a PR comment with the human-readable consequence as the headline. Blocks the merge if the policy says so. Resolution instructions included.
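The Capture and Compare steps can be pictured with a small sketch: hash the ordered tool names into a fingerprint, then diff two runs into added, removed, and reordered tools. This is an assumption about one way to do it, not Spooled's actual fingerprint format. The names `fingerprint` and `diff_runs` are hypothetical, and token usage is left out for brevity.

```python
import hashlib
import json
from collections import Counter


def fingerprint(tool_calls):
    """Hash the ordered tool names; arguments and message content are never included."""
    payload = json.dumps(tool_calls, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def diff_runs(baseline, current):
    """Summarize how the current run's tool sequence differs from the baseline."""
    base_counts, cur_counts = Counter(baseline), Counter(current)
    return {
        "added": sorted(set(cur_counts) - set(base_counts)),
        "removed": sorted(set(base_counts) - set(cur_counts)),
        # Reordered: the same calls the same number of times, in a different order.
        "reordered": base_counts == cur_counts and baseline != current,
        "changed": fingerprint(baseline) != fingerprint(current),
    }


# Hypothetical runs; `lookup_order` is an invented tool name.
baseline_run = ["lookup_order", "escalate_to_human"]
current_run = ["lookup_order", "issue_refund"]

result = diff_runs(baseline_run, current_run)
print(result)
```

Comparing fingerprints gives a cheap "did anything change" answer, while the added/removed/reordered breakdown is what a reviewer would actually read.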

Install

pip install spooled-ai

Quick Start

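import spooled
from spooled.wrappers import wrap_openai
from openai import OpenAI

# Hypothetical tool schema for illustration; replace with your agent's own tools.
MY_TOOLS = [{"type": "function", "function": {
    "name": "lookup_deal",
    "description": "Fetch a deal record by ID.",
    "parameters": {"type": "object", "properties": {"deal_id": {"type": "string"}}},
}}]

# Initialize Spooled for this agent and wrap the OpenAI client.
spooled.init(agent_id="my_agent")
client = wrap_openai(OpenAI())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyze this deal"}],
    tools=MY_TOOLS,
)

# Shut down Spooled once the run is finished.
spooled.shutdown()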

That's it. Every tool call is captured, and the trace is saved to .spooled/traces/. A hash chain signs every interaction at capture time.
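For readers unfamiliar with hash chains, here is a generic sketch of the idea: each record's hash covers the previous record's hash, so editing any earlier record invalidates everything after it. The record layout, the `GENESIS` value, and the helper names are assumptions for illustration. The README does not describe Spooled's on-disk format or whether a key is involved, and this sketch is an unkeyed chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # Placeholder "previous hash" for the first record.


def _digest(prev_hash, event):
    """Hash the previous hash together with a canonical encoding of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain, event):
    """Append an event, linking it to the hash of the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(chain):
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash or record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_event(chain, {"tool": "lookup_order", "seq": 0})
append_event(chain, {"tool": "escalate_to_human", "seq": 1})
print(verify(chain))  # True: the chain is intact

chain[0]["event"]["tool"] = "issue_refund"  # Tamper with an earlier record.
print(verify(chain))  # False: the stored hash no longer matches
```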

CI Integration

# .github/workflows/spooled.yml
- name: Generate traces
  run: python ci_runner.py
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

- name: Spooled behavioral check
  run: |
    pip install spooled-ai
    spooled ci compare .spooled/traces/*.jsonl \
      --baseline .github/baselines \
      --policy spooled-policy.yml \
      --enable-blocking

Example PR comment:

## ❌ Spooled Behavioral CI: FAIL
> Spooled Score: 59/100 (D) 🔴

> [!CAUTION]
> ## 🚨 Merge blocked: agent now calls `issue_refund`
>
> This tool was **never observed in the baseline**. It appears in
> **2 of 5** traces in this PR (~40%).

**5** traces analyzed  |  ✅ **3** passed  |  ❌ **2** policy failures

### Trace Results
| Agent          | Fingerprint     | Status        | Score |
|----------------|-----------------|---------------|-------|
| support_agent  | `4d893b5cef...` | ⚠️ Behavior change | 59 |

<details>
  <summary>🔧 Tool Changes (2 traces)</summary>

  - ➕ `issue_refund` added
  - ➖ `escalate_to_human` removed
</details>
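To make the gating step concrete, the sketch below turns per-trace diffs into a pass/fail decision and a short Markdown summary in the spirit of the comment above. Everything here is hypothetical: the `block_on_new_tools` policy key, the diff shape, and `evaluate_gate` are invented for the example, the real `spooled-policy.yml` schema is not shown in this README, and the Spooled Score is not modeled.

```python
def evaluate_gate(trace_diffs, policy):
    """Return (blocked, markdown_comment) for a list of per-trace diffs."""
    total = len(trace_diffs)
    # A trace fails when the (hypothetical) policy forbids tools absent from the baseline.
    failures = [d for d in trace_diffs if policy["block_on_new_tools"] and d["added"]]
    new_tools = sorted({tool for d in failures for tool in d["added"]})
    blocked = bool(failures)

    lines = [f"## {'❌' if blocked else '✅'} Behavioral CI: {'FAIL' if blocked else 'PASS'}"]
    for tool in new_tools:
        hits = sum(1 for d in failures if tool in d["added"])
        lines.append(f"> Merge blocked: agent now calls `{tool}` ({hits} of {total} traces)")
    passed = total - len(failures)
    lines.append(
        f"**{total}** traces analyzed  |  **{passed}** passed  |  **{len(failures)}** policy failures"
    )
    return blocked, "\n".join(lines)


# Hypothetical diffs for five traces, two of which introduce `issue_refund`.
diffs = [
    {"added": []},
    {"added": ["issue_refund"]},
    {"added": []},
    {"added": ["issue_refund"]},
    {"added": []},
]
blocked, comment = evaluate_gate(diffs, {"block_on_new_tools": True})
print(comment)
```

In a CI job, a script like this would typically exit non-zero when `blocked` is true, which is what makes the check fail and stops the merge.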

What Spooled Catches

| Change type          | Example                              | Unit tests | Spooled         |
|----------------------|--------------------------------------|------------|-----------------|
| Prompt tweak         | "Be concise" drops compliance tools  | ✅ Pass    | Behavior change |
| Model swap           | Model drops sanctions screening      | ✅ Pass    | Behavior change |
| Tool deprecation     | Agent proceeds without critical data | ✅ Pass    | Behavior change |
| KB refresh           | Ticket response path changes         | ✅ Pass    | Behavior change |
| Schema migration     | Field rename breaks detection        | ✅ Pass    | Behavior change |
| Upstream degradation | Retry paths appear in fingerprint    | ✅ Pass    | Behavior change |

Content-Blind Architecture

Spooled never captures prompts, customer data, or AI responses. Only structural metadata: tool names, call sequence, token counts, timing. This is enforced in code — content is stripped before the trace reaches disk.
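One common way to enforce this kind of guarantee is an allowlist: only fields known to be structural are copied into the record that gets written, so any new or unexpected field is dropped by default. The sketch below shows that pattern under stated assumptions. The field names and the `strip_content` helper are hypothetical and do not reflect Spooled's actual trace schema.

```python
import json

# Hypothetical allowlist of structural fields; everything else is treated as content.
STRUCTURAL_FIELDS = {"tool_name", "sequence", "prompt_tokens", "completion_tokens", "duration_ms"}


def strip_content(event):
    """Keep only allowlisted structural fields; drop everything else."""
    return {key: value for key, value in event.items() if key in STRUCTURAL_FIELDS}


raw_event = {
    "tool_name": "issue_refund",
    "sequence": 2,
    "prompt_tokens": 812,
    "completion_tokens": 41,
    "duration_ms": 1430,
    "arguments": {"order_id": "A-1001", "amount": 59.0},  # content: dropped
    "messages": [{"role": "user", "content": "I want my money back"}],  # content: dropped
}

safe_event = strip_content(raw_event)
line = json.dumps(safe_event, sort_keys=True)  # The only form that would be written to disk.
print(line)
```

An allowlist fails closed: if an upstream library starts attaching a new content field, it is excluded without any code change, whereas a denylist would let it through.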

Supported Libraries

LLM Providers (explicit wrappers):

OpenAI (sync/async, streaming)
Anthropic (sync/async, streaming)

HTTP & Cloud (auto-instrumented via hooks):

AWS Bedrock
requests, httpx, aiohttp

Frameworks (callback handlers):

LangChain, LlamaIndex, AutoGen, CrewAI, LangGraph

Documentation

License

Proprietary.

Release history

0.6.0: May 26, 2026
0.5.1: May 22, 2026 (this version)
0.5.0: May 17, 2026
0.4.4: Apr 30, 2026

