Elegant Python API for Claude CLI agents

Gyre

A Python orchestration layer over the Claude Code CLI. Gyre drives claude -p as a subprocess and gives you four primitives the official Agent SDK doesn't: check() as control flow, count-constrained typed list extraction, scoped Memory handles, and parallelism-capped loops with a dumpable call log.

What it is

Gyre is not an Anthropic API wrapper. It shells out to your locally installed claude CLI, inheriting the CLI's auth, its built-in tools (Read, Edit, Bash, Grep, …), MCP servers, and plan mode — without reimplementing any of them.

What Gyre adds is the loop:

  • check() as control flow: a yes/no Claude call that reads as ordinary Python (while not await agent.check(...)).
  • Typed extraction with count constraint: agent.get(T, ...) and agent.get_list(T, ..., n=N) return Python values; n= injects minItems/maxItems into the JSON schema.
  • Memories as named handles: memorize() returns a Memory object you pass into specific calls, scoped per call site rather than accumulating in a session buffer.
  • Budget, iteration, and parallelism caps: agent_context(budget_usd=..., max_iterations=..., max_parallel=...) enforced across every call in the loop. The call that crosses raises after incurring its cost; that call still lands in the log.
  • A dumpable, replayable call log: every call records a CallRecord; agent.dump_log(path) writes JSON for inspection or eval. Pass that path back via agent_context(replay_from=...) and the next run is served from the log without invoking the CLI.

If you're calling the Anthropic API directly with an API key, Anthropic's official claude-agent-sdk is the right tool. If you're driving the claude CLI and want loops that read as Python, this is the library for that.

Installation

pip install gyre

Requires the claude CLI on PATH and Python ≥ 3.11.

Gyre calls claude -p directly. If claude --version works in your shell, gyre will work too. If not, install Claude Code first and run claude once interactively to authenticate.
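
If you want to verify that prerequisite from Python before starting a run, a preflight check might look like the sketch below. The helper name claude_cli_available is hypothetical and not part of Gyre; it uses only the standard library to confirm that an executable is on PATH and that running it with --version exits cleanly.

```python
import shutil
import subprocess


def claude_cli_available(executable: str = "claude") -> bool:
    """Return True if `executable` is on PATH and `--version` exits with 0.

    Hypothetical helper for illustration; not part of gyre.
    """
    path = shutil.which(executable)
    if path is None:
        return False
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("claude CLI found" if claude_cli_available() else "claude CLI not found")
```

This only checks that the binary runs; it does not confirm that the CLI is authenticated.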

Quick start

The smallest runnable example — save as a .py file, run with python:

import asyncio
from gyre import check, get


async def main() -> None:
    if await check("Is Python 3.11 or newer installed?"):
        version = await get[str]("What's the running Python version?")
        print(version)


asyncio.run(main())

Once that runs, the headline example below shows the loop shape this library is actually built for.

Headline example

A stewarded loop: generate candidates, score them, iterate until "good enough", under a budget.

import asyncio
from pydantic import BaseModel
from gyre import Tool, agent_context, memorize


class Draft(BaseModel):
    text: str


class Score(BaseModel):
    value: int
    reason: str


async def main() -> None:
    voice = memorize(
        "Brand voice: terse, no marketing fluff, no exclamation marks.",
        label="voice",
    )

    async with agent_context(
        tools=[Tool.Read],
        memories=[voice],
        max_iterations=20,
        budget_usd=2.00,
        max_parallel=4,
        model="sonnet",
    ) as agent:
        candidates = await agent.get_list(
            Draft, "Generate distinct intros for: kettle launch", n=8
        )

        while not await agent.check("Is the best candidate good enough to ship?"):
            scores = await asyncio.gather(*[
                agent.get(Score, f"Score this draft 1–10: {c.text}")
                for c in candidates
            ])
            survivors = [c for c, s in zip(candidates, scores) if s.value >= 7]
            if not survivors:
                break
            candidates = await agent.get_list(
                Draft,
                f"Produce 8 variations of these survivors: {[s.text for s in survivors]}",
                n=8,
            )

        winner = await agent.get(Draft, "Return the best candidate.")
        print(winner.text)
        print(agent.log_summary())
        agent.dump_log("run.json")


asyncio.run(main())

What the agent_context is doing for you:

  • Every call inside flows through one budget and one iteration counter. The call that pushes cumulative cost_usd past 2.00 raises BudgetExceededError after incurring its cost; the cost is recorded, not pre-paid.
  • max_parallel=4 is a real asyncio.Semaphore around run_claude, so the gather(...) fanout never has more than 4 in flight.
  • n=8 injects minItems/maxItems into the JSON schema and tells the model the target count in the prompt.
  • voice is read fresh from disk on every call and injected as <context label="voice">…</context> — edit the file and the next call picks it up.
  • Every call lands in agent.log() with kind, prompt, schema, tools, cost, and a truncated result; dump_log serializes it to JSON.

budget_usd=2.00 here is intentionally tight — set so the budget engages as a stop signal, not a worst-case bound. Raise it for longer sweeps; the iteration count it buys depends on prompt size and model.

Pydantic models

from pydantic import BaseModel
from gyre import get_list


class Task(BaseModel):
    title: str
    priority: int
    done: bool


tasks = await get_list[Task]("Extract all tasks from this document")

The schema is generated from the model; the response is validated against it before the value comes back.

Memories

memorize() writes a string to disk and returns a Memory handle. By default it writes to a temp file, which is fine for ad-hoc memories that live with the process. For memories you want under version control, pass path=:

from gyre import memorize

# Default — temp file, gone with the process.
voice = memorize("Brand voice: terse.", label="voice")
# Memory(label='voice', path=PosixPath('/tmp/gyre_memory_…'), created_at=…)

# Versioned — exact path under your repo.
voice = memorize(
    "Brand voice: terse.",
    label="voice",
    path="memories/voice.md",
)

Pass the handle via memories=[...] to any call (top-level, or via agent_context(memories=...), or Agent.with_memories(...)):

async with agent_context(memories=[voice]) as agent:
    await agent.check("On brand?")

Semantics, precisely:

  • The file is re-read on every call that uses the handle. Edit the file, the next call sees the new content.
  • The content is wrapped in <context label="LABEL">…</context> and appended to the user prompt.
  • path= writes to exactly that path (creating parent directories); directory= writes a gyre-named file in that directory; passing both is a ValueError. Without either, the file lands in the system temp directory.
  • A missing file at handle construction is FileNotFoundError; a file removed after construction raises on the next read (the failure is surfaced, not swallowed).
  • Memories are not deduplicated. Pass them once.
  • Memories do not persist across agent_context blocks — a new context starts with no implicit memories.
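
The re-read and wrapping semantics above can be sketched with the standard library alone. MemorySketch and build_prompt are hypothetical names, not Gyre's classes, and the blank-line separator between the prompt and the context block is an assumption.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MemorySketch:
    """Hypothetical stand-in for a Memory handle: a label plus a file path."""

    label: str
    path: Path

    def render(self) -> str:
        # Re-read on every call so edits to the file are picked up.
        # A removed file raises FileNotFoundError here instead of being swallowed.
        content = self.path.read_text(encoding="utf-8")
        return f'<context label="{self.label}">{content}</context>'


def build_prompt(prompt: str, memories: list[MemorySketch]) -> str:
    """Append each memory's context block to the user prompt, in order."""
    blocks = [m.render() for m in memories]
    return "\n\n".join([prompt, *blocks])  # separator is an assumption


with tempfile.TemporaryDirectory() as tmp:
    voice_path = Path(tmp) / "voice.md"
    voice_path.write_text("Brand voice: terse.", encoding="utf-8")
    voice = MemorySketch(label="voice", path=voice_path)

    first = build_prompt("On brand?", [voice])
    # Edit the file between calls; the next render sees the new content.
    voice_path.write_text("Brand voice: terse, no exclamation marks.", encoding="utf-8")
    second = build_prompt("On brand?", [voice])

print(first)
print(second)
```

Because the handle stores a path rather than the content, passing the same handle twice would emit the block twice, which matches the "not deduplicated" rule.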

do() requires a tool scope

Because do() can mutate the filesystem or shell state, it refuses to run unless the tool list is locally readable:

# raises MissingToolScopeError — tools not visible at the call site
await do("Edit src/foo.py")

# OK — explicit at the call site
await do("Edit src/foo.py", tools=[Tool.Read, Tool.Edit])

# OK — explicit at the surrounding scope
async with agent_context(tools=[Tool.Read, Tool.Edit]) as agent:
    await agent.do("Edit src/foo.py")

# OK — explicit on the Agent
agent = Agent().with_tools(Tool.Bash)
await agent.do("ls")

check(), get(), get_list() don't carry the same rule — they return values, not actions.

For a side-effect-free preview, set dry_run=True. This forces PermissionMode.PLAN (overriding any permission_mode= you passed) and rejects any tool from MUTATING_TOOLS (Bash, Edit, Write, MultiEdit, NotebookEdit) at config time:

async with agent_context(tools=[Tool.Read, Tool.Grep], dry_run=True) as agent:
    plan = await agent.get(str, "What changes would you make to fix the test?")

Custom or MCP tools (raw strings, not Tool members) pass through dry_run unclassified — gyre can't tell whether they mutate.

Budget, iterations, parallelism

agent_context accepts three independent caps:

  • budget_usd: cumulative USD across the session.
  • max_iterations: number of successful calls.
  • max_parallel: in-flight call count.

Limits are checked after each successful call (post-pay, not pre-pay). The call that crosses raises:

  • BudgetExceededError(cost_usd, budget_usd): cost_usd is the actual cumulative cost including the failing call.
  • IterationLimitError(iterations, max_iterations).

Both errors leave the session log fully populated, including the offending call. So agent.log() after the raise tells you exactly what was attempted.
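
Post-pay accounting is easy to get backwards, so here is a minimal sketch of the ordering: record the call, add its cost, then check the limit. SessionSketch and BudgetExceeded are hypothetical stand-ins, not Gyre's classes, and the per-call costs are invented for the example.

```python
class BudgetExceeded(Exception):
    """Hypothetical stand-in for BudgetExceededError."""

    def __init__(self, cost_usd: float, budget_usd: float) -> None:
        super().__init__(f"spent ${cost_usd:.4f} of ${budget_usd:.4f}")
        self.cost_usd = cost_usd
        self.budget_usd = budget_usd


class SessionSketch:
    """Tracks cumulative cost and checks the budget after each call."""

    def __init__(self, budget_usd: float) -> None:
        self.budget_usd = budget_usd
        self.cost_usd = 0.0
        self.log: list[float] = []

    def record(self, call_cost_usd: float) -> None:
        # Post-pay: log the call and add its cost first, then check the limit.
        self.log.append(call_cost_usd)
        self.cost_usd += call_cost_usd
        if self.cost_usd > self.budget_usd:
            raise BudgetExceeded(self.cost_usd, self.budget_usd)


session = SessionSketch(budget_usd=1.00)
try:
    for cost in [0.40, 0.40, 0.40, 0.40]:  # invented per-call costs
        session.record(cost)
except BudgetExceeded as exc:
    print(exc)  # spent $1.2000 of $1.0000

print(len(session.log))  # 3
```

The third call is the one that crosses the limit: it is paid for, it is in the log, and the fourth call never runs.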

Costs come from the total_cost_usd field of the claude CLI's JSON output, accumulated per call — gyre doesn't estimate from token counts. The numbers are as accurate as the CLI's own cost reporting.

async with agent_context(budget_usd=2.00) as agent:
    try:
        while not await agent.check("done?"):
            await agent.get(Step, "next step")
    except BudgetExceededError as e:
        print(f"hit ${e.cost_usd:.4f} of ${e.budget_usd:.4f}")
        print(agent.log_summary())

max_parallel is implemented as an asyncio.Semaphore around each subprocess launch, so asyncio.gather(...) over agent calls naturally respects the cap.
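
The same pattern can be demonstrated without Gyre or the CLI. In this sketch, fake_call is a hypothetical stand-in for a subprocess launch, and the counters exist only to show that the number of in-flight calls never exceeds the cap.

```python
import asyncio


async def main() -> int:
    max_parallel = 4
    semaphore = asyncio.Semaphore(max_parallel)
    in_flight = 0
    peak = 0

    async def fake_call(i: int) -> int:
        nonlocal in_flight, peak
        async with semaphore:  # at most max_parallel bodies run at once
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the subprocess
            in_flight -= 1
        return i

    # Fan out 16 calls; gather preserves input order in its results.
    results = await asyncio.gather(*(fake_call(i) for i in range(16)))
    assert results == list(range(16))
    return peak


peak = asyncio.run(main())
print(peak)  # 4
```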

Session log

Every call inside agent_context produces one CallRecord:

@dataclass(frozen=True)
class CallRecord:
    kind: str            # "check" | "do" | "get" | "get_list" | "raw"
    prompt: str
    type_name: str | None
    schema: dict | None
    tools: tuple[str, ...] | None
    memory_labels: tuple[str, ...]
    session_id: str
    cost_usd: float
    duration_ms: int
    timestamp: datetime
    result: str          # full CLI result, untruncated — for replay/eval
    result_summary: str  # truncated to ~500 chars, for `log_summary()`

Inspection:

agent.log()          # tuple[CallRecord, ...]
agent.log_summary()  # short text report
agent.dump_log("run.json")

result is the full untruncated CLI output, suitable for offline eval or as input to replay. result_summary is the roughly 500-character display version used by log_summary(). dump_log writes both, so JSON files can be large for verbose runs; post-process if that matters for your storage.

Memory contents are not inlined — only labels and sha256 content hashes — so logs stay tractable even with large context files. The hashes let replay distinguish two records with the same memory labels but different content.
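
For a feel of what a dumped record could look like, the sketch below serializes a reduced record with memory labels and content hashes instead of memory text. RecordSketch, content_hash, and dump_records are hypothetical, the field subset and values are invented, and the actual JSON layout written by dump_log may differ.

```python
import hashlib
import json
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class RecordSketch:
    """Reduced, hypothetical stand-in for CallRecord."""

    kind: str
    prompt: str
    memory_labels: tuple[str, ...]
    memory_hashes: tuple[str, ...]  # sha256 of each memory's content
    cost_usd: float
    timestamp: datetime
    result: str


def content_hash(text: str) -> str:
    """Return the hex sha256 digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def dump_records(records: list[RecordSketch], path: Path) -> None:
    """Write records as a JSON array; datetimes become ISO 8601 strings."""
    payload = []
    for record in records:
        row = asdict(record)
        row["timestamp"] = record.timestamp.isoformat()
        payload.append(row)
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


record = RecordSketch(
    kind="check",
    prompt="On brand?",
    memory_labels=("voice",),
    memory_hashes=(content_hash("Brand voice: terse."),),
    cost_usd=0.0123,  # invented value
    timestamp=datetime(2025, 1, 1, tzinfo=timezone.utc),
    result="yes",
)

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "run.json"
    dump_records([record], out)
    loaded = json.loads(out.read_text(encoding="utf-8"))

# The memory text itself never reaches the file, only its label and hash.
print(loaded[0]["memory_labels"], len(loaded[0]["memory_hashes"][0]))
```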

Replay

Pass a log path back via replay_from= and the session serves calls from the log instead of invoking the CLI:

from gyre import agent_context

async with agent_context(replay_from="run.json") as agent:
    # Same code path as the original run, but every call is matched
    # against the log by content hash and returned without hitting
    # Claude. cost_usd, iterations, and the new log mirror the original.
    ...

Matching is by sha256 of the full call signature: kind, prompt, schema, sorted tools, ordered (memory_label, memory_content_hash) pairs, system_prompt, append_prompt, model, permission_mode, and working_dir. Anything that affected the original output is in the key; tool order is normalized (it doesn't affect output) and memory order is preserved (it does).

Identical-input calls bucket FIFO: a gather(...) fanout of N identical prompts in the original run replays as N matches in insertion order, and an (N+1)-th call against an N-record log is a miss.

On a miss, the default is to raise ReplayCacheMissError with a short summary of the unmatched call. Pass replay_on_miss="passthrough" to fall through to a live call instead — useful when you're extending a run with one more step:

async with agent_context(
    replay_from="run.json", replay_on_miss="passthrough"
) as agent:
    # Existing calls hit cache; one new call goes live.
    ...

load_log(path) -> tuple[CallRecord, ...] is the lower-level primitive — useful when you want to read records yourself rather than replay. Old logs (pre-replay) load with default values for the new fields and only match calls with those same defaults.

Cost, iteration, and budget tracking apply during replay using the original costs, so a replayed run hits the same BudgetExceededError at the same point as the original. max_parallel is not enforced during replay (the semaphore is a real-call concept).

Top-level vs scoped

There are two ways to call gyre:

  • agent_context / Agent: the primary surface. Carries tools, memories, budget, parallelism, and the call log.
  • Top-level check/do/get/get_list: convenience for one-shot calls outside a loop. Inside an agent_context, they automatically inherit the session via a contextvar, so a gather(...) fanout of top-level calls is also budgeted. Outside any context, they run unbounded.

from gyre import Tool, check, do, get, get_list

if await check("Is this Python file syntactically valid?"):
    await do("Format with black", tools=[Tool.Bash])

count = await get[int]("How many functions are in main.py?")
todos = await get_list[str]("List all TODO comments")

The get[int]("...") subscript form is the one-liner ergonomic; on an Agent use the direct form: await agent.get(int, "...").
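
If the subscript form looks unusual, the sketch below shows one way such an object can be built in plain Python with __getitem__. This is an illustration of the pattern, not Gyre's implementation: _Getter and fake_extract are hypothetical, the sketch is synchronous where Gyre's calls are awaited, and supporting the direct form on the same object is an assumption made for the example.

```python
from typing import Any, Callable


class _Getter:
    """Hypothetical sketch: supports get(T, prompt) and get[T](prompt)."""

    def __init__(self, extract: Callable[[type, str], Any]) -> None:
        self._extract = extract

    def __call__(self, type_: type, prompt: str) -> Any:
        # Direct form: get(int, "...")
        return self._extract(type_, prompt)

    def __getitem__(self, type_: type) -> Callable[[str], Any]:
        # Subscript form: get[int] returns a callable bound to the type.
        def bound(prompt: str) -> Any:
            return self._extract(type_, prompt)

        return bound


def fake_extract(type_: type, prompt: str) -> Any:
    # Stand-in for the real call: coerce a canned answer to the requested type.
    return type_("3")


get = _Getter(fake_extract)
print(get[int]("How many functions are in main.py?"))  # 3
print(repr(get(str, "How many functions are in main.py?")))  # '3'
```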

Reach for agent_context when you have more than one call. Reach for the top-level form when you have exactly one.

How Gyre compares

If you've used Anthropic's official claude-agent-sdk, the relationship is easy to describe. The Agent SDK is feature-rich and calls the Anthropic API via a bundled native binary; Gyre shells out to your installed claude CLI. The SDK already gives you typed structured output, max_budget_usd, max_turns, permission_mode="plan", allowed_tools, and full MCP support — so for most agentic work driven by an API key, it is what you want.

Gyre exists because five things are absent or awkward in the SDK:

  • check() as a boolean primitive. The SDK has no await check("?") -> bool. You'd define a Pydantic model with a single bool field and unwrap. Gyre makes the boolean call the unit so loops read as while not await agent.check(...).
  • get_list(..., n=N) count constraint. The SDK supports list extraction but not count constraints; you'd add minItems/maxItems to the schema yourself. Gyre injects them and tells the model the target count in the prompt.
  • Scoped memory handles. The SDK persists full conversation history to disk; it has no per-call labeled context blocks. Gyre's Memory handles are explicitly scoped — read fresh on each call, attached to the calls you choose, never the whole session buffer.
  • max_parallel cap. The SDK does not expose a concurrency cap. Gyre wraps each subprocess launch in an asyncio.Semaphore so gather(...) fanout naturally respects the limit.
  • Hash-cache replay. The SDK persists JSONL session histories but has no replay primitive. Gyre's agent_context(replay_from=path) matches calls against a dumped log by content hash of the full signature; misses raise by default, and identical-input duplicates bucket FIFO so parallel fanouts replay correctly.
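
To illustrate the first point, this is roughly the single-boolean-field workaround written out by hand, using a JSON schema and a manual unwrap instead of a Pydantic model. CHECK_SCHEMA, unwrap_check, and the field name "answer" are all hypothetical; the sketch only shows the shape of the boilerplate that a boolean primitive hides.

```python
import json

# Hypothetical single-field schema; the field name "answer" is an assumption.
CHECK_SCHEMA = {
    "type": "object",
    "properties": {"answer": {"type": "boolean"}},
    "required": ["answer"],
    "additionalProperties": False,
}


def unwrap_check(raw_json: str) -> bool:
    """Parse a structured response and return its single boolean field."""
    payload = json.loads(raw_json)
    if not isinstance(payload, dict) or not isinstance(payload.get("answer"), bool):
        raise ValueError(f"expected an object with a boolean 'answer', got: {raw_json!r}")
    return payload["answer"]


print(unwrap_check('{"answer": true}'))  # True
print(unwrap_check('{"answer": false}'))  # False
```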

There's also the architectural split. Gyre drives the CLI you've already authenticated, so it inherits whatever auth your claude install uses. The Agent SDK runs its own bundled binary and reads ANTHROPIC_API_KEY. Pick on auth and primitive needs, not feature count.

For pure structured-output libraries — Instructor, BAML — Gyre is in a different category. Those validate single-call typed responses; Gyre is built around the loop those calls live in.

Errors

Gyre raises a small hierarchy under GyreError:

  • AgentTimeoutError, AgentExecutionError — subprocess-level.
  • TypeExtractionError, SchemaValidationError — response parsing.
  • BudgetExceededError, IterationLimitError — session limits.
  • MissingToolScopeError: do() called without a visible tool list.
