Skip to main content

Record & replay the claude-agent-sdk wire for deterministic, offline tests.

Project description

Claude Agent Cassette

Record & replay the claude-agent-sdk wire for deterministic, offline tests — no API key, no subprocess, no mocks.

Why

Apps built on claude-agent-sdk read a stream of typed messages (assistant turns, tool results, task notifications, control-protocol frames) and drive logic off them. The nasty bugs live at that stream → your-handler seam: the SDK emits a slightly different shape than you expected, and your handler quietly does the wrong thing.

Mocked tests can't catch this — you build the mock, so you only test your understanding of your own mock. A cassette records the real wire once and replays it through the SDK's real parser, so:

  • a shape change in the SDK turns your test red instead of shipping to prod;
  • tests run with no API cost, no network, no claude subprocess;
  • the replayed frames go through the genuine message_parser, not a stand-in.
  PRODUCTION:   real CLI ──raw frames──► SDK parser ──► your code
                                              ▲
  REPLAY:       ReplayTransport ──raw frames──┘   (same parser, same code)

Install

pip install claude-agent-cassette   # (or: uv add claude-agent-cassette)

Replay (the common case — offline, no key)

from claude_agent_cassette import replay, load_cassette

async def test_my_handler():
    async with replay(load_cassette("tests/cassettes/happy_path.jsonl")) as client:
        kinds = [type(m).__name__ async for m in client.receive_messages()]
        assert "ResultMessage" in kinds
        # ...or feed client.receive_messages() into your own dispatcher and
        #    assert on what it produces.

A cassette is a JSONL file of raw inbound stream-json frames — the exact dicts the CLI emits. replay() injects them into a real ClaudeSDKClient and answers the SDK's initialize control handshake for you.

Record (capture a real session)

record_sdk_wire() works with both SDK entry points — the one-shot query() and the interactive ClaudeSDKClient (it patches both transport-construction sites the SDK uses):

from pathlib import Path
from claude_agent_cassette import record_sdk_wire, serialize_tape

# one-shot query()
from claude_agent_sdk import query

with record_sdk_wire() as tape:                  # tees the full duplex wire
    async for _ in query(prompt="...", options=...):
        pass
Path("session.jsonl").write_text(serialize_tape(tape))
# interactive ClaudeSDKClient
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

with record_sdk_wire() as tape:
    async with ClaudeSDKClient(options=ClaudeAgentOptions()) as client:
        await client.query("...")
        async for _ in client.receive_messages():
            pass
Path("session.jsonl").write_text(serialize_tape(tape))

record_sdk_wire() captures both directions, including the control plane (control_request/control_response, mcp_message, hook_callback, the handshake), so one recording can feed both conversation replay and control-protocol replay. Derive a conversation cassette with conversation_messages(tape).

Drift detection (gate SDK bumps)

Re-parse a cassette's message frames through the installed SDK's own message_parser. A frame that no longer parses — or whose content blocks the parser silently drops — is flagged. Because it reuses the SDK's own parser, there is no schema to maintain: the judge is the thing being judged.

claude-agent-cassette drift tests/cassettes/      # *.jsonl files, or dirs of them
drift: 5 cassette(s) vs claude-agent-sdk 0.2.87

  ok    happy_path.jsonl
  DRIFT stop_midtask.jsonl — 1 frame(s):
          frame[3] assistant: content_dropped — 1 of 2 content block(s) dropped on parse
  ok    notification.jsonl

5 checked, 1 drifted (1 frame) — re-record the drifted cassettes.
  • Exits non-zero on drift — use it to gate an SDK-bump PR in CI.
  • Fails closed: if no cassette files are found it exits non-zero (a mispointed path can't pass as a false green); pass --allow-empty to override.
  • Three drift signals: parse_error (the parser rejected the frame), unrecognized_type (the message type is gone), content_dropped (a content block silently vanished).
  • Scope: catches parse-level drift (rejected/skipped frames) + dropped content blocks. It does not catch additive field-level drift (a still-parsing frame that gained a field) — see ROADMAP.md.

In Python: parse_drift(frames) / check_tape(tape)list[DriftFinding].

Examples

examples/ has a runnable, no-key demo:

python examples/replay_cassette.py
# AssistantMessage:
# ResultMessage: Hello! How can I help?

It replays the saved examples/cassettes/hello_world.jsonl through a real ClaudeSDKClient. (That cassette is a small, illustrative hand-written sample with realistic wire shapes; real cassettes are recorded — see above.)

API

replay(messages, options=None) async CM → a connected ClaudeSDKClient over a ReplayTransport
ReplayTransport(messages) raw frames → real parser (answers the initialize handshake)
RecordingTransport(inner, tape) passive MITM tee, both directions
record_sdk_wire() CM that wraps the SDK's transport to capture a query's wire
serialize_tape / load_tape / load_cassette tape & cassette I/O
read_frames(tape) / conversation_messages(tape) derive replay views from a tape
parse_drift(frames) / check_tape(tape) drift findings vs the installed SDK
claude-agent-cassette drift <path…> CLI drift gate (non-zero on drift / empty)

How it works (the non-obvious bits)

  • Replay rides the public Transport ABC (ClaudeSDKClient(transport=...), stable since SDK 0.0.22). It's solid across versions.
  • The initialize handshake: connect() writes a control_request with a fresh request_id and blocks until it sees a control_response echoing it. So ReplayTransport reads that id off write() and synthesises the response — otherwise replay hangs.
  • Record patches two sites: ClaudeSDKClient does a call-time import of the transport from its source module, while one-shot query() uses the name bound in _internal.client. Patching only one silently misses the other.

Compatibility

Replay uses only the public Transport API. Record and drift reach into claude_agent_sdk._internal (the subprocess transport, control-protocol shape, and message_parser), so they are version-sensitive — this release targets claude-agent-sdk 0.2.x. Pin your SDK and re-verify on bumps. (Drift being version-sensitive is the point: it tells you when a bump broke a cassette.)

Roadmap

See ROADMAP.md. Shipped: conversation replay, recording, Direction-A control replay (ReplayTransport.from_tape), and drift detection. Next up: faithful Direction-B control replay (can_use_tool/hook_callback/mcp_message stubbing) + interrupt lockstep, a pytest plugin with record-on-miss, field-level drift, and a redaction helper.

License

MIT.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

claude_agent_cassette-0.2.0.tar.gz (28.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

claude_agent_cassette-0.2.0-py3-none-any.whl (18.6 kB view details)

Uploaded Python 3

File details

Details for the file claude_agent_cassette-0.2.0.tar.gz.

File metadata

  • Download URL: claude_agent_cassette-0.2.0.tar.gz
  • Upload date:
  • Size: 28.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.2 {"installer":{"name":"uv","version":"0.11.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for claude_agent_cassette-0.2.0.tar.gz
Algorithm Hash digest
SHA256 6e291b592ef8d1749756b0c48cf54e9576d14576bb2009938b14cd3d3c56d62e
MD5 cb9fd6135e0d1ed141bac87bbb1d6a13
BLAKE2b-256 61aebea543e632d8b089c074ebc28ae70c9b064d78c686963264945c9a0bd55a

See more details on using hashes here.

File details

Details for the file claude_agent_cassette-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: claude_agent_cassette-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.2 {"installer":{"name":"uv","version":"0.11.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for claude_agent_cassette-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b8a4089c0d48e66b560db0d70af9522d28f9c8b03e7b61bdfd636726860ee257
MD5 52a2f168fb4e1065820b6b8ad8963ed9
BLAKE2b-256 9e4b04251fa4f4d67f141ae84da63c88ca8b33ad8c37b842376ccba2775d4584

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page