Elegant Python API for Claude CLI agents
Gyre
A Python orchestration layer over the Claude Code CLI. Gyre drives
claude -p as a subprocess and gives you four primitives the official
Agent SDK doesn't: check() as control flow, count-constrained typed
list extraction, scoped Memory handles, and parallelism-capped loops
with a dumpable call log.
What it is
Gyre is not an Anthropic API wrapper. It shells out to your locally
installed claude CLI, inheriting the CLI's auth, its built-in tools
(Read, Edit, Bash, Grep, …), MCP servers, and plan mode — without
reimplementing any of them.
What Gyre adds is the loop:
- check() as control flow: a yes/no Claude call that reads as ordinary Python (while not await agent.check(...)).
- Typed extraction with count constraint: agent.get(T, ...) and agent.get_list(T, ..., n=N) return Python values; n= injects minItems/maxItems into the JSON schema.
- Memories as named handles: memorize() returns a Memory object you pass into specific calls, scoped per call site rather than accumulating in a session buffer.
- Budget, iteration, and parallelism caps: agent_context(budget_usd=..., max_iterations=..., max_parallel=...) enforced across every call in the loop. The call that crosses raises after incurring its cost; that call still lands in the log.
- A dumpable, replayable call log: every call records a CallRecord; agent.dump_log(path) writes JSON for inspection or eval. Pass that path back via agent_context(replay_from=...) and the next run is served from the log without invoking the CLI.
If you're calling the Anthropic API directly with an API key,
Anthropic's official claude-agent-sdk is the right tool. If you're
driving the claude CLI and want loops that read as Python, this is
the library for that.
Installation
pip install gyre
Requires the claude CLI on PATH and Python ≥ 3.11.
Gyre calls claude -p directly. If claude --version works in your
shell, gyre will work too. If not, install Claude Code first and run
claude once interactively to authenticate.
Quick start
The smallest runnable example — save as a .py file, run with
python:
import asyncio

from gyre import check, get


async def main() -> None:
    if await check("Is Python 3.11 or newer installed?"):
        version = await get[str]("What's the running Python version?")
        print(version)


asyncio.run(main())
Once that runs, the headline example below shows the loop shape this library is actually built for.
Headline example
A stewarded loop: generate candidates, score them, iterate until "good enough", under a budget.
import asyncio

from pydantic import BaseModel

from gyre import Tool, agent_context, memorize


class Draft(BaseModel):
    text: str


class Score(BaseModel):
    value: int
    reason: str


async def main() -> None:
    voice = memorize(
        "Brand voice: terse, no marketing fluff, no exclamation marks.",
        label="voice",
    )
    async with agent_context(
        tools=[Tool.Read],
        memories=[voice],
        max_iterations=20,
        budget_usd=2.00,
        max_parallel=4,
        model="sonnet",
    ) as agent:
        candidates = await agent.get_list(
            Draft, "Generate distinct intros for: kettle launch", n=8
        )
        while not await agent.check("Is the best candidate good enough to ship?"):
            scores = await asyncio.gather(*[
                agent.get(Score, f"Score this draft 1–10: {c.text}")
                for c in candidates
            ])
            survivors = [c for c, s in zip(candidates, scores) if s.value >= 7]
            if not survivors:
                break
            candidates = await agent.get_list(
                Draft,
                f"Produce 8 variations of these survivors: {[s.text for s in survivors]}",
                n=8,
            )
        winner = await agent.get(Draft, "Return the best candidate.")
        print(winner.text)
        print(agent.log_summary())
        agent.dump_log("run.json")


asyncio.run(main())
What the agent_context is doing for you:
- Every call inside flows through one budget and one iteration counter. The call that pushes cumulative cost past 2.00 raises BudgetExceededError after incurring its cost; the cost is recorded, not pre-paid.
- max_parallel=4 is a real asyncio.Semaphore around run_claude, so the gather(...) fanout never has more than 4 in flight.
- n=8 injects minItems/maxItems into the JSON schema and tells the model the target count in the prompt.
- voice is read fresh from disk on every call and injected as <context label="voice">…</context>; edit the file and the next call picks it up.
- Every call lands in agent.log() with kind, prompt, schema, tools, cost, and the result (in full, plus a truncated summary); dump_log serializes it to JSON.
budget_usd=2.00 here is intentionally tight — set so the budget
engages as a stop signal, not a worst-case bound. Raise it for longer
sweeps; the iteration count it buys depends on prompt size and model.
Pydantic models
from pydantic import BaseModel

from gyre import get_list


class Task(BaseModel):
    title: str
    priority: int
    done: bool


tasks = await get_list[Task]("Extract all tasks from this document")
The schema is generated from the model; the response is validated against it before the value comes back.
Memories
memorize() writes a string to disk and returns a Memory handle. By
default it writes to a temp file, which is fine for ad-hoc memories
that live with the process. For memories you want under version
control, pass path=:
from gyre import memorize

# Default: temp file, gone with the process.
voice = memorize("Brand voice: terse.", label="voice")
# Memory(label='voice', path=PosixPath('/tmp/gyre_memory_…'), created_at=…)

# Versioned: exact path under your repo.
voice = memorize(
    "Brand voice: terse.",
    label="voice",
    path="memories/voice.md",
)
Pass the handle via memories=[...] to any call (top-level, or via
agent_context(memories=...), or Agent.with_memories(...)):
async with agent_context(memories=[voice]) as agent:
    await agent.check("On brand?")
Semantics, precisely:
- The file is re-read on every call that uses the handle. Edit the file, the next call sees the new content.
- The content is wrapped in <context label="LABEL">…</context> and appended to the user prompt.
- path= writes to exactly that path (creating parent directories); directory= writes a gyre-named file in that directory; passing both is a ValueError. Without either, the file lands in the system temp directory.
- A missing file at handle construction is FileNotFoundError; a file removed after construction raises on the next read (the failure is surfaced, not swallowed).
- Memories are not deduplicated. Pass them once.
- Memories do not persist across agent_context blocks: a new context starts with no implicit memories.
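The re-read and wrapping semantics listed above can be illustrated with a minimal stand-in. MemorySketch and build_prompt are hypothetical names, not Gyre's classes, and the separator between the prompt and the context blocks is an assumption.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MemorySketch:
    """Hypothetical stand-in for a Memory handle: a label plus a file path."""

    label: str
    path: Path

    def __post_init__(self) -> None:
        # A missing file at construction fails immediately.
        if not self.path.is_file():
            raise FileNotFoundError(self.path)

    def render(self) -> str:
        # Re-read on every call, so edits to the file are picked up.
        content = self.path.read_text(encoding="utf-8")
        return f'<context label="{self.label}">{content}</context>'


def build_prompt(prompt: str, memories: list[MemorySketch]) -> str:
    """Append each rendered memory to the user prompt, in the order given."""
    blocks = [m.render() for m in memories]
    return "\n\n".join([prompt, *blocks])


with tempfile.TemporaryDirectory() as tmp:
    voice_file = Path(tmp) / "voice.md"
    voice_file.write_text("Brand voice: terse.", encoding="utf-8")
    voice = MemorySketch(label="voice", path=voice_file)

    first = build_prompt("On brand?", [voice])
    voice_file.write_text("Brand voice: terse, no exclamation marks.", encoding="utf-8")
    second = build_prompt("On brand?", [voice])  # sees the edited file

    voice_file.unlink()
    try:
        voice.render()
        removed_raises = False
    except FileNotFoundError:
        removed_raises = True  # removal after construction surfaces on read
```

Because the handle stores a path rather than the text, the same object passed twice would be rendered twice, which is consistent with the "not deduplicated" note.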
do() requires a tool scope
Because do() can mutate the filesystem or shell state, it refuses to
run unless the tool list is locally readable:
# raises MissingToolScopeError: tools not visible at the call site
await do("Edit src/foo.py")

# OK: explicit at the call site
await do("Edit src/foo.py", tools=[Tool.Read, Tool.Edit])

# OK: explicit at the surrounding scope
async with agent_context(tools=[Tool.Read, Tool.Edit]) as agent:
    await agent.do("Edit src/foo.py")

# OK: explicit on the Agent
agent = Agent().with_tools(Tool.Bash)
await agent.do("ls")
check(), get(), get_list() don't carry the same rule — they
return values, not actions.
For a side-effect-free preview, set dry_run=True. This forces
PermissionMode.PLAN (overriding any permission_mode= you passed)
and rejects any tool from MUTATING_TOOLS (Bash, Edit, Write,
MultiEdit, NotebookEdit) at config time:
async with agent_context(tools=[Tool.Read, Tool.Grep], dry_run=True) as agent:
    plan = await agent.get(str, "What changes would you make to fix the test?")
Custom or MCP tools (raw strings, not Tool members) pass through
dry_run unclassified — gyre can't tell whether they mutate.
Budget, iterations, parallelism
agent_context accepts three independent caps:
- budget_usd: cumulative USD across the session.
- max_iterations: number of successful calls.
- max_parallel: in-flight call count.
Limits are checked after each successful call (post-pay, not pre-pay). The call that crosses raises:
- BudgetExceededError(cost_usd, budget_usd): cost_usd is the actual cumulative cost including the failing call.
- IterationLimitError(iterations, max_iterations).
Both errors leave the session log fully populated, including the
offending call. So agent.log() after the raise tells you exactly
what was attempted.
Costs come from the total_cost_usd field of the claude CLI's JSON
output, accumulated per call — gyre doesn't estimate from token
counts. The numbers are as accurate as the CLI's own cost reporting.
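To make the post-pay ordering concrete, here is a minimal sketch of the accounting described above. SessionSketch and BudgetExceeded are hypothetical stand-ins, and the JSON payloads are illustrative; only the total_cost_usd field name comes from the text.

```python
import json


class BudgetExceeded(Exception):
    """Hypothetical stand-in for BudgetExceededError(cost_usd, budget_usd)."""

    def __init__(self, cost_usd: float, budget_usd: float) -> None:
        super().__init__(f"spent ${cost_usd:.4f} of ${budget_usd:.4f}")
        self.cost_usd = cost_usd
        self.budget_usd = budget_usd


class SessionSketch:
    def __init__(self, budget_usd: float) -> None:
        self.budget_usd = budget_usd
        self.cost_usd = 0.0
        self.log: list[dict] = []

    def record(self, cli_json: str) -> dict:
        """Account for one finished call, then check the cap (post-pay)."""
        payload = json.loads(cli_json)
        self.cost_usd += float(payload["total_cost_usd"])
        self.log.append(payload)  # logged before the limit check
        if self.cost_usd > self.budget_usd:
            raise BudgetExceeded(self.cost_usd, self.budget_usd)
        return payload


# Illustrative payloads only; real CLI output carries more fields.
session = SessionSketch(budget_usd=0.05)
outputs = [
    '{"result": "yes", "total_cost_usd": 0.02}',
    '{"result": "no", "total_cost_usd": 0.02}',
    '{"result": "yes", "total_cost_usd": 0.02}',
]
error = None
try:
    for out in outputs:
        session.record(out)
except BudgetExceeded as exc:
    error = exc  # raised by the third call, which is already in the log
```

Appending to the log before the check is what makes the crossing call visible afterwards; a pre-pay design would have to estimate cost up front, which the text says Gyre does not do.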
async with agent_context(budget_usd=2.00) as agent:
    try:
        while not await agent.check("done?"):
            await agent.get(Step, "next step")
    except BudgetExceededError as e:
        print(f"hit ${e.cost_usd:.4f} of ${e.budget_usd:.4f}")
    print(agent.log_summary())
max_parallel is implemented as an asyncio.Semaphore around each
subprocess launch, so asyncio.gather(...) over agent calls naturally
respects the cap.
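The semaphore pattern can be tried in isolation. In this sketch asyncio.sleep stands in for the subprocess launch, and run_capped and fake_call are hypothetical names; it only demonstrates that gather over semaphore-guarded coroutines stays within the cap.

```python
import asyncio


async def run_capped(n_calls: int, max_parallel: int) -> int:
    """Run n_calls fake calls under a semaphore; return the peak in-flight count."""
    semaphore = asyncio.Semaphore(max_parallel)
    in_flight = 0
    peak = 0

    async def fake_call(i: int) -> int:
        nonlocal in_flight, peak
        async with semaphore:  # at most max_parallel bodies run at once
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the subprocess
            in_flight -= 1
        return i

    results = await asyncio.gather(*(fake_call(i) for i in range(n_calls)))
    assert results == list(range(n_calls))  # gather preserves argument order
    return peak


peak = asyncio.run(run_capped(n_calls=10, max_parallel=4))
```

The caller still writes a plain gather(...) over all ten calls; the cap is applied inside each call, so no batching logic leaks into the loop.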
Session log
Every call inside agent_context produces one CallRecord:
@dataclass(frozen=True)
class CallRecord:
    kind: str                       # "check" | "do" | "get" | "get_list" | "raw"
    prompt: str
    type_name: str | None
    schema: dict | None
    tools: tuple[str, ...] | None
    memory_labels: tuple[str, ...]
    session_id: str
    cost_usd: float
    duration_ms: int
    timestamp: datetime
    result: str                     # full CLI result, untruncated, for replay/eval
    result_summary: str             # truncated to ~500 chars, for `log_summary()`
Inspection:
agent.log() # tuple[CallRecord, ...]
agent.log_summary() # short text report
agent.dump_log("run.json")
result is the full untruncated CLI output, suitable for offline eval
or as the input to replay. result_summary is the roughly
500-char display version used by log_summary(). dump_log writes
both, so JSON files can be large for verbose runs; post-process if
that matters for your storage.
Memory contents are not inlined — only labels and sha256 content hashes — so logs stay tractable even with large context files. The hashes let replay distinguish two records with the same memory labels but different content.
Replay
Pass a log path back via replay_from= and the session serves calls
from the log instead of invoking the CLI:
from gyre import agent_context
async with agent_context(replay_from="run.json") as agent:
    # Same code path as the original run, but every call is matched
    # against the log by content hash and returned without hitting
    # Claude. cost_usd, iterations, and the new log mirror the original.
    ...
Matching is by sha256 of the full call signature: kind, prompt,
schema, sorted tools, ordered (memory_label, memory_content_hash)
pairs, system_prompt, append_prompt, model, permission_mode,
and working_dir. Anything that affected the original output is in
the key; tool order is normalized (it doesn't affect output) and
memory order is preserved (it does).
Identical-input calls bucket FIFO: a gather(...) fanout of N
identical prompts in the original run replays as N matches in
insertion order, and an (N+1)-th call against an N-record log is a
miss.
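The FIFO bucketing can be sketched with a dict of deques. ReplayStore and ReplayMiss are hypothetical names standing in for whatever Gyre uses internally; the keys here are short placeholders rather than real sha256 digests.

```python
from collections import defaultdict, deque


class ReplayMiss(LookupError):
    """Hypothetical stand-in for ReplayCacheMissError."""


class ReplayStore:
    def __init__(self, records: list[tuple[str, str]]) -> None:
        # records are (key, result) pairs in original insertion order.
        self._buckets: dict[str, deque[str]] = defaultdict(deque)
        for key, result in records:
            self._buckets[key].append(result)

    def take(self, key: str) -> str:
        """Pop the oldest unused result for this key, or raise on a miss."""
        bucket = self._buckets.get(key)
        if not bucket:
            raise ReplayMiss(key)
        return bucket.popleft()


store = ReplayStore([("k1", "first"), ("k1", "second"), ("k2", "other")])
served = [store.take("k1"), store.take("k1")]  # two identical calls, in order

try:
    store.take("k1")  # third identical call against two records
    missed = False
except ReplayMiss:
    missed = True
```

Consuming records as they are served is what turns the (N+1)-th identical call into a miss rather than a silent repeat of the last result.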
On a miss, the default is to raise ReplayCacheMissError with a
short summary of the unmatched call. Pass
replay_on_miss="passthrough" to fall through to a live call
instead — useful when you're extending a run with one more step:
async with agent_context(
    replay_from="run.json", replay_on_miss="passthrough"
) as agent:
    # Existing calls hit cache; one new call goes live.
    ...
load_log(path) -> tuple[CallRecord, ...] is the lower-level
primitive — useful when you want to read records yourself rather than
replay. Old logs (pre-replay) load with default values for the new
fields and only match calls with those same defaults.
Cost, iteration, and budget tracking apply during replay using the
original costs, so a replayed run hits the same BudgetExceededError
at the same point as the original. max_parallel is not enforced
during replay (the semaphore is a real-call concept).
Top-level vs scoped
There are two ways to call gyre:
- agent_context / Agent: the primary surface. Carries tools, memories, budget, parallelism, and the call log.
- Top-level check / do / get / get_list: convenience for one-shot calls outside a loop. Inside an agent_context, they automatically inherit the session via a contextvar, so a gather(...) fanout of top-level calls is also budgeted. Outside any context, they run unbounded.
from gyre import Tool, check, do, get, get_list

if await check("Is this Python file syntactically valid?"):
    await do("Format with black", tools=[Tool.Bash])

count = await get[int]("How many functions are in main.py?")
todos = await get_list[str]("List all TODO comments")
The get[int]("...") subscript form is the one-liner ergonomic; on
an Agent use the direct form: await agent.get(int, "...").
Reach for agent_context when you have more than one call. Reach for
the top-level form when you have exactly one.
How Gyre compares
If you've used Anthropic's official claude-agent-sdk, the
relationship is easy to describe. The Agent SDK is feature-rich and
calls the Anthropic API via a bundled native binary; Gyre shells out
to your installed claude CLI. The SDK already gives you typed
structured output, max_budget_usd, max_turns,
permission_mode="plan", allowed_tools, and full MCP support — so
for most agentic work driven by an API key, it is what you want.
Gyre exists because five things are absent or awkward in the SDK:
- check() as a boolean primitive. The SDK has no await check("?") -> bool. You'd define a Pydantic model with a single bool field and unwrap. Gyre makes the boolean call the unit so loops read as while not await agent.check(...).
- get_list(..., n=N) count constraint. The SDK supports list extraction but not count constraints; you'd add minItems/maxItems to the schema yourself. Gyre injects them and tells the model the target count in the prompt.
- Scoped memory handles. The SDK persists full conversation history to disk; it has no per-call labeled context blocks. Gyre's Memory handles are explicitly scoped: read fresh on each call, attached to the calls you choose, never the whole session buffer.
- max_parallel cap. The SDK does not expose a concurrency cap. Gyre wraps each subprocess launch in an asyncio.Semaphore so gather(...) fanout naturally respects the limit.
- Hash-cache replay. The SDK persists JSONL session histories but has no replay primitive. Gyre's agent_context(replay_from=path) matches calls against a dumped log by content hash of the full signature; misses raise by default, and identical-input duplicates bucket FIFO so parallel fanouts replay correctly.
There's also the architectural split. Gyre drives the CLI you've
already authenticated, so it inherits whatever auth your claude
install uses. The Agent SDK runs its own bundled binary and reads
ANTHROPIC_API_KEY. Pick on auth and primitive needs, not feature
count.
For pure structured-output libraries — Instructor, BAML — Gyre is in a different category. Those validate single-call typed responses; Gyre is built around the loop those calls live in.
Errors
Gyre raises a small hierarchy under GyreError:
- AgentTimeoutError, AgentExecutionError: subprocess-level.
- TypeExtractionError, SchemaValidationError: response parsing.
- BudgetExceededError, IterationLimitError: session limits.
- MissingToolScopeError: do() called without a visible tool list.