Production-grade agent harness for building autonomous AI agents
Curio Agent SDK
A composable, async-first, production-grade agent harness for building AI agents — from simple tool-calling bots to coding agents, deep research systems, computer-use agents, and multi-agent orchestrations.
v0.6.0 — Every component is independently usable and replaceable. Convention over configuration. Zero mandatory dependencies beyond Python stdlib + httpx + python-dotenv.
Quick Start
5-Line Agent
from curio_agent_sdk import Agent, tool

@tool
def search(query: str) -> str:
    """Search the web for information."""
    return "Results for: " + query

agent = Agent(
    model="openai:gpt-4o",
    tools=[search],
    system_prompt="You are a helpful research assistant.",
)

result = agent.run("What are the latest developments in quantum computing?")
print(result.output)
Builder Pattern
from curio_agent_sdk import Agent, SubagentConfig, AllowReadsAskWrites
agent = Agent.builder() \
    .model("anthropic:claude-sonnet-4-6") \
    .system_prompt("You are a research agent.") \
    .tools([search, calculator]) \
    .instructions("Always cite sources.") \
    .instructions_file("./AGENT.md") \
    .hook("tool.call.before", lambda ctx: print(f"Calling {ctx.data['tool']}")) \
    .subagent("researcher", SubagentConfig(
        system_prompt="Research specialist",
        tools=[web_search],
    )) \
    .permissions(AllowReadsAskWrites()) \
    .build()
Full Configuration
from curio_agent_sdk import (
Agent, ToolCallingLoop, LLMClient, TieredRouter,
CostTracker, GuardrailsMiddleware, MemoryManager,
ConversationMemory, FileStateStore, ContextManager,
CLIHumanInput, SessionManager, InMemorySessionStore,
TaskManager, InstructionLoader, AllowReadsAskWrites,
HookRegistry, InMemoryEventBus,
)
agent = Agent(
    loop=ToolCallingLoop(tier="tier3"),
    llm=LLMClient(router=TieredRouter(), dedup_enabled=True),
    tools=[search, calculator, fetch_data],
    system_prompt="You are a research agent.",
    agent_id="research-agent",
    max_iterations=25,
    timeout=300,
    context_manager=ContextManager(max_tokens=128000),
    middleware=[
        CostTracker(budget=1.00, alert_thresholds=[0.5, 0.8]),
        GuardrailsMiddleware(
            block_patterns=[r"(?i)password"],
            block_prompt_injection=True,
        ),
    ],
    memory_manager=MemoryManager(memory=ConversationMemory(max_entries=50)),
    state_store=FileStateStore("./state"),
    session_manager=SessionManager(InMemorySessionStore()),
    human_input=CLIHumanInput(),
    instruction_loader=InstructionLoader(),
    permission_policy=AllowReadsAskWrites(),
    event_bus=InMemoryEventBus(),
)
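import asyncio

async def main():
    # `async with` and `await` must run inside a coroutine
    async with agent:
        result = await agent.arun("Research quantum computing advances")
        print(f"Output: {result.output}")

asyncio.run(main())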
Installation
pip install -e .
# Provider-specific extras
pip install -e ".[openai]" # OpenAI + tiktoken
pip install -e ".[anthropic]" # Anthropic
pip install -e ".[all]" # All providers + optional deps
# Or install providers directly
pip install openai anthropic groq
# Optional extras
pip install "pydantic>=2.0" # Structured output (quoted so the shell does not treat >= as a redirect)
pip install pyautogui # Computer use tools
pip install playwright # Browser automation
# Development
pip install -e ".[dev]" # pytest, black, isort, mypy, ruff
Requires Python 3.11+
Configuration
Environment Variables
# Provider API keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GROQ_API_KEY=gsk_...
# Tier configuration (optional — auto-detected from available keys)
TIER1_MODELS=groq:llama-3.1-8b-instant,openai:gpt-4o-mini
TIER2_MODELS=openai:gpt-4o,anthropic:claude-sonnet-4-6
TIER3_MODELS=anthropic:claude-sonnet-4-6,openai:gpt-4o
Model String Format
agent = Agent(model="openai:gpt-4o", ...)
agent = Agent(model="anthropic:claude-sonnet-4-6", ...)
agent = Agent(model="groq:llama-3.3-70b-versatile", ...)
agent = Agent(model="ollama:llama3.1:8b", ...)
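The model strings above share a `provider:model` shape, and the Ollama example shows that the model part can itself contain a colon. A minimal sketch of how such a string could be split, assuming only the first colon is the separator; `parse_model_string` is a hypothetical helper, not part of the SDK:

```python
def parse_model_string(spec: str) -> tuple[str, str]:
    """Split 'provider:model' on the first colon only (hypothetical helper)."""
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    return provider, model


# The Ollama tag keeps its own colon because only the first one is consumed.
provider, model = parse_model_string("ollama:llama3.1:8b")
```

Splitting on the first colon keeps tags such as `llama3.1:8b` intact, which a plain `spec.split(":")` would break apart.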
Architecture
Uses a src layout. Install with pip install -e . from the repo root.
curio_agent_sdk/
├── src/
│   └── curio_agent_sdk/
│       ├── __init__.py  # Public API (re-exports everything)
│       ├── base/
│       │   └── component.py  # Component ABC (startup/shutdown/health_check)
│       ├── core/
│       │   ├── agent/
│       │   │   ├── agent.py  # Agent (thin shell)
│       │   │   ├── builder.py  # AgentBuilder (fluent API)
│       │   │   └── runtime.py  # Runtime (orchestration engine)
│       │   ├── state/
│       │   │   ├── state.py  # AgentState + typed extensions
│       │   │   ├── state_store.py  # StateStore ABC + InMemory/File implementations
│       │   │   ├── checkpoint.py  # Checkpoint serialization
│       │   │   └── session.py  # SessionManager + SessionStore
│       │   ├── context/
│       │   │   ├── context.py  # ContextManager (token budgets)
│       │   │   └── instructions.py  # InstructionLoader (AGENT.md)
│       │   ├── events/
│       │   │   ├── hooks.py  # HookRegistry + HookContext
│       │   │   └── event_bus.py  # EventBus + InMemoryEventBus
│       │   ├── extensions/
│       │   │   ├── plugins.py  # Plugin ABC + entry-point discovery
│       │   │   ├── skills.py  # Skill + SkillRegistry
│       │   │   └── subagent.py  # SubagentConfig + AgentOrchestrator
│       │   ├── workflow/
│       │   │   ├── plan_mode.py  # PlanMode + TodoManager
│       │   │   ├── task_manager.py  # TaskManager (background tasks)
│       │   │   └── structured_output.py  # Pydantic structured output
│       │   ├── security/
│       │   │   ├── permissions.py  # PermissionPolicy + sandboxing
│       │   │   └── human_input.py  # Human-in-the-loop
│       │   ├── loops/
│       │   │   ├── base.py  # AgentLoop ABC
│       │   │   └── tool_calling.py  # Standard tool calling loop
│       │   ├── tools/
│       │   │   ├── tool.py  # Tool class + @tool decorator
│       │   │   ├── schema.py  # ToolSchema (JSON Schema from type hints)
│       │   │   ├── registry.py  # ToolRegistry
│       │   │   └── executor.py  # Async ToolExecutor
│       │   └── llm/
│       │       ├── client.py  # LLMClient (async, dedup, batch)
│       │       ├── router.py  # TieredRouter + degradation strategies
│       │       ├── token_counter.py  # Token counting (cached)
│       │       ├── batch_client.py  # BatchLLMClient
│       │       └── providers/
│       │           ├── base.py  # LLMProvider ABC
│       │           ├── openai.py  # OpenAI (tools, streaming, vision)
│       │           ├── anthropic.py  # Anthropic (tools, streaming, cache)
│       │           ├── groq.py  # Groq (OpenAI-compatible)
│       │           └── ollama.py  # Ollama (on-premise, OpenAI-compatible)
│       ├── models/
│       │   ├── llm.py  # Message, ToolCall, LLMRequest/Response
│       │   ├── agent.py  # AgentRun, AgentRunResult, AgentRunStatus
│       │   ├── events.py  # EventType, StreamEvent, AgentEvent
│       │   └── exceptions.py  # Custom exception hierarchy
│       ├── middleware/
│       │   ├── base.py  # Middleware ABC + MiddlewarePipeline
│       │   ├── logging_mw.py  # Structured logging
│       │   ├── cost_tracker.py  # Cost tracking, budgets, alerts
│       │   ├── rate_limit.py  # Per-user/agent rate limiting
│       │   ├── tracing.py  # OpenTelemetry tracing + metrics
│       │   ├── guardrails.py  # Content safety, PII, injection blocking
│       │   ├── consumers.py  # Hook-based observability consumers
│       │   └── prometheus.py  # Prometheus/Grafana export
│       ├── memory/
│       │   ├── base.py  # Memory ABC
│       │   ├── manager.py  # MemoryManager + injection/save/query strategies
│       │   ├── conversation.py  # Sliding window memory
│       │   ├── vector.py  # Semantic search (embeddings)
│       │   ├── key_value.py  # Key-value store
│       │   ├── composite.py  # Combine multiple memory types
│       │   ├── working.py  # Ephemeral scratchpad
│       │   ├── episodic.py  # Temporal experience memory
│       │   ├── graph.py  # Entity-relationship knowledge graph
│       │   ├── self_editing.py  # MemGPT/Letta-style core + archival
│       │   ├── file_memory.py  # File-based persistent memory
│       │   └── policies.py  # Decay, importance, summarization
│       ├── persistence/
│       │   ├── base.py  # BasePersistence ABC + audit logs
│       │   ├── audit_hooks.py  # register_audit_hooks
│       │   ├── sqlite.py  # SQLite backend
│       │   ├── postgres.py  # PostgreSQL backend
│       │   └── memory.py  # In-memory backend
│       ├── mcp/
│       │   ├── client.py  # MCPClient (stdio + HTTP)
│       │   ├── transport.py  # StdioTransport, HTTPTransport
│       │   ├── config.py  # MCPServerConfig, load from file
│       │   ├── adapter.py  # MCP → Curio Tool adapter
│       │   └── bridge.py  # MCPBridge (Component lifecycle)
│       ├── connectors/
│       │   ├── base.py  # Connector ABC + ConnectorResource
│       │   └── bridge.py  # ConnectorBridge (Component lifecycle)
│       ├── credentials/
│       │   └── credentials.py  # CredentialResolver (Vault, AWS, env)
│       ├── resilience/
│       │   └── circuit_breaker.py  # CircuitBreaker
│       ├── cli/
│       │   └── cli.py  # AgentCLI interactive harness
│       ├── tools/  # Built-in tools
│       │   ├── web.py  # web_fetch
│       │   ├── code.py  # python_execute, shell_execute
│       │   ├── file.py  # file_read, file_write
│       │   ├── http.py  # http_request
│       │   ├── computer_use.py  # ComputerUseToolkit
│       │   └── browser.py  # BrowserToolkit (Playwright)
│       └── testing/  # Testing utilities
│           ├── mock_llm.py  # MockLLM, text_response, tool_call_response
│           ├── harness.py  # AgentTestHarness
│           ├── fixtures.py  # Pytest fixtures
│           ├── coverage.py  # AgentCoverageTracker
│           ├── replay.py  # RecordingMiddleware, ReplayLLMClient
│           ├── toolkit.py  # ToolTestKit
│           ├── integration.py  # MultiAgentTestHarness
│           ├── snapshot.py  # SnapshotTester
│           ├── benchmark.py  # BenchmarkSuite
│           ├── eval.py  # AgentEvalSuite
│           └── regression.py  # RegressionDetector
├── tests/
│   ├── unit/  # Unit tests (tools, llm, middleware, memory, etc.)
│   ├── integration/  # Integration tests (25 test files)
│   ├── e2e/  # End-to-end tests
│   ├── performance/  # Performance/stress tests
│   └── conftest.py  # Shared pytest fixtures
├── docs/
├── pyproject.toml
├── Makefile
├── README.md
└── LICENSE
117 source files | 145 test files | 4 LLM providers | 9 memory types | 7 middleware | 10 built-in tools
Core Concepts
Agent, Runtime, and Builder
The Agent is a thin shell. The Runtime handles all orchestration. The AgentBuilder provides a fluent API for construction.
from curio_agent_sdk import Agent
# Simple constructor
agent = Agent(model="openai:gpt-4o", tools=[search])
# Builder pattern (recommended for complex agents)
agent = Agent.builder() \
.model("openai:gpt-4o") \
.tools([search]) \
.system_prompt("You are helpful.") \
.build()
# Direct Runtime access for advanced use
state = agent.runtime.create_state("Do something")
result = await agent.runtime.run_with_state(state)
Tools
Define tools with the @tool decorator. JSON schemas are auto-generated from type hints.
import httpx

from curio_agent_sdk import tool, ToolConfig

@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    # Caution: eval() runs arbitrary Python; restrict or replace it before exposing this tool to untrusted input.
    return str(eval(expression))

@tool(timeout=30, retries=2, require_confirmation=True, cache_ttl=60)
async def fetch_data(url: str) -> str:
    """Fetch data from a URL."""
    async with httpx.AsyncClient() as client:
        return (await client.get(url)).text

# Idempotent tools skip re-execution on checkpoint restore
@tool(config=ToolConfig(idempotent=True))
async def write_file(path: str, content: str) -> str:
    """Write content to a file."""
    ...
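To illustrate what "auto-generated from type hints" can mean in practice, here is a rough stdlib-only sketch that derives a JSON-Schema-like description from a function signature. The `schema_from_function` helper, the type mapping, and the output shape are assumptions for illustration; the SDK's actual `ToolSchema` may differ.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_function(fn) -> dict:
    """Build a JSON-Schema-like tool description from a function signature."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def fetch(url: str, retries: int = 2) -> str:
    """Fetch data from a URL."""
    return url


schema = schema_from_function(fetch)
```

Parameters without a default end up in `required`, and the docstring doubles as the tool description shown to the model.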
Built-in Tools
from curio_agent_sdk.tools import web_fetch, file_read, file_write, http_request, python_execute
from curio_agent_sdk import ComputerUseToolkit, BrowserToolkit
# Standard tools
agent = Agent(tools=[web_fetch, file_read, http_request], ...)
# Computer use (pip install curio-agent-sdk[computer-use])
agent = Agent(tools=ComputerUseToolkit().get_tools(), ...)
# Browser automation (pip install curio-agent-sdk[browser] && playwright install)
browser = BrowserToolkit(headless=True)
agent = Agent(tools=browser.get_tools(), ...)
Hooks / Lifecycle System
Hooks let you customize agent behavior at every lifecycle point — without subclassing. Hooks are mutable: they can modify context, cancel actions, and inject data.
# Block dangerous tool calls
agent = Agent.builder() \
.hook("tool.call.before", lambda ctx: ctx.cancel() if ctx.data["tool"] == "rm" else None) \
.hook("llm.call.after", lambda ctx: log_response(ctx.data["response"])) \
.hook("agent.run.after", lambda ctx: notify_slack("Agent completed")) \
.build()
# All hook events:
# agent.run.before/after/error, agent.iteration.before/after,
# llm.call.before/after/error, tool.call.before/after/error,
# memory.inject.before, memory.save.before,
# state.checkpoint.before/after
# Load hooks from config file (YAML/TOML)
# hooks.yaml:
# - event: tool.call.before
# handler: my_module:validate_tool_call
# - event: agent.run.after
# shell: "echo 'Done' >> /tmp/agent.log"
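The cancel-from-a-hook behavior described above can be pictured with a small stand-in registry. This is only a sketch of the pattern: `MiniHookContext` and `MiniHookRegistry` are hypothetical names, and the SDK's own `HookRegistry` and `HookContext` may behave differently (for example, they may support async handlers or priorities).

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class MiniHookContext:
    """Mutable context passed to every handler (hypothetical stand-in)."""
    event: str
    data: dict[str, Any] = field(default_factory=dict)
    cancelled: bool = False

    def cancel(self) -> None:
        self.cancelled = True


class MiniHookRegistry:
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[MiniHookContext], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[MiniHookContext], None]) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, **data: Any) -> MiniHookContext:
        ctx = MiniHookContext(event, dict(data))
        for handler in self._handlers[event]:
            handler(ctx)
            if ctx.cancelled:
                break  # later handlers are skipped once the action is cancelled
        return ctx


registry = MiniHookRegistry()
registry.on("tool.call.before", lambda ctx: ctx.cancel() if ctx.data["tool"] == "rm" else None)
blocked = registry.emit("tool.call.before", tool="rm")
allowed = registry.emit("tool.call.before", tool="ls")
```

The caller inspects `ctx.cancelled` after `emit` and skips the tool call when it is set.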
Rules / Instructions
Hierarchical instruction loading — like CLAUDE.md or .cursorrules. Global > project > directory.
from curio_agent_sdk import InstructionLoader
# Auto-loads AGENT.md and .agent/rules.md from global, project, and cwd
agent = Agent.builder() \
.instructions(InstructionLoader()) \
.build()
# Or manual
agent = Agent.builder() \
.instructions("Always respond in JSON format.") \
.instructions_file("./AGENT.md") \
.build()
# Dynamic injection mid-session
agent.add_instructions("From now on, prefer short answers.")
Skills
Packaged, reusable agent capabilities — bundle prompts, tools, and hooks into named skills.
from curio_agent_sdk import Skill
commit_skill = Skill(
name="commit",
description="Create well-formatted git commits",
system_prompt="When committing, analyze changes...",
tools=[git_status, git_diff, git_add, git_commit],
)
agent = Agent.builder() \
.skill(commit_skill) \
.skill(Skill.from_directory("./skills/review-pr")) \
.build()
# Invoke a skill
result = await agent.invoke_skill("commit", "Commit the auth changes")
# Activate/deactivate skills mid-run
await agent.arun("Plan the feature", active_skills=["planning"])
Subagents / Multi-Agent Orchestration
Spawn specialized subagents, run them in the background, or hand off conversations.
from curio_agent_sdk import Agent, SubagentConfig
agent = Agent.builder() \
.model("anthropic:claude-sonnet-4-6") \
.subagent("researcher", SubagentConfig(
system_prompt="Research specialist",
tools=[web_search, fetch_page],
model="openai:gpt-4o",
)) \
.subagent("coder", SubagentConfig(
system_prompt="Expert programmer",
tools=[read_file, edit_file, run_tests],
inherit_memory=True,
)) \
.build()
# Subagents are available as tools — the parent agent can spawn them
# Or programmatically:
result = await agent.spawn_subagent("researcher", "Find papers on transformers")
task_id = await agent.spawn_subagent_background("coder", "Implement the feature")
result = await agent.get_subagent_result(task_id)
# Handoff conversation to another agent
await agent.handoff(other_agent, "Continue this analysis")
Plan Mode & Todos
Plan-then-execute workflows with task tracking and approval gates.
agent = Agent.builder() \
.model("openai:gpt-4o") \
.tools([read_file, edit_file, run_tests]) \
.plan_mode(read_only_tool_names=["read_file"]) \
.build()
# Agent can enter plan mode (restricts to read-only tools),
# design a plan, exit with plan for approval, then execute.
# Todos are tracked as part of agent state and persisted in checkpoints.
if agent.is_awaiting_plan_approval():
    plan = agent.get_plan()
    print(plan)
Structured Output
Get validated Pydantic models back from agent runs.
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    url: str
    summary: str

result = await agent.arun(
    "Find 3 papers on transformers",
    response_format=list[SearchResult],
)
for paper in result.parsed_output:
    print(f"{paper.title}: {paper.url}")
Memory System
9 memory types with pluggable strategies for injection, saving, and querying.
from curio_agent_sdk import (
MemoryManager, ConversationMemory, CompositeMemory,
VectorMemory, KeyValueMemory, WorkingMemory,
EpisodicMemory, GraphMemory, SelfEditingMemory, FileMemory,
)
# Simple conversation memory
agent = Agent(memory_manager=MemoryManager(memory=ConversationMemory(max_entries=50)), ...)
# Composite memory with multiple backends
memory = CompositeMemory({
"conversation": ConversationMemory(max_entries=50),
"knowledge": KeyValueMemory(),
"semantic": VectorMemory(persist_path="./vectors"),
})
# MemGPT/Letta-style self-editing memory (agent manages its own memory via tools)
memory = SelfEditingMemory()
agent = Agent(
memory_manager=MemoryManager(memory=memory),
tools=memory.get_tools(), # core_memory_read/write, archival_search/insert
...
)
# File-based persistent memory (Claude Code style)
memory = FileMemory(base_path="./memory", namespace="project-x")
# Pluggable strategies
from curio_agent_sdk.memory.manager import (
    SaveSummaryStrategy, UserMessageInjection, AdaptiveTokenQuery,
)
manager = MemoryManager(
memory=memory,
injection_strategy=UserMessageInjection(),
save_strategy=SaveSummaryStrategy(summarize_fn=my_summarizer),
query_strategy=AdaptiveTokenQuery(),
)
Memory types: ConversationMemory (sliding window), VectorMemory (semantic search), KeyValueMemory, CompositeMemory (combine backends), WorkingMemory (ephemeral scratchpad), EpisodicMemory (temporal), GraphMemory (entity-relationship), SelfEditingMemory (MemGPT-style), FileMemory (persistent files).
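As a rough picture of the sliding-window behavior attributed to `ConversationMemory(max_entries=50)`, the sketch below keeps only the most recent entries using a bounded deque. `SlidingWindowMemory` is a hypothetical class written for this example and says nothing about how the SDK stores messages internally.

```python
from collections import deque


class SlidingWindowMemory:
    """Keep only the newest `max_entries` messages (hypothetical stand-in)."""

    def __init__(self, max_entries: int) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._entries: deque[dict[str, str]] = deque(maxlen=max_entries)

    def add(self, role: str, content: str) -> None:
        self._entries.append({"role": role, "content": content})

    def messages(self) -> list[dict[str, str]]:
        return list(self._entries)


memory = SlidingWindowMemory(max_entries=3)
for i in range(5):
    memory.add("user", f"m{i}")
recent = [m["content"] for m in memory.messages()]
```

After five inserts with a window of three, only the last three messages remain.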
Sessions / Conversations
Persistent multi-turn conversations with session management.
from curio_agent_sdk import SessionManager, InMemorySessionStore
session_mgr = SessionManager(InMemorySessionStore())
agent = Agent.builder().session_manager(session_mgr).model("openai:gpt-4o").build()
session = await session_mgr.create(agent.agent_id)
result = await agent.arun("Hello!", session_id=session.id)
# Messages are automatically persisted
result = await agent.arun("Follow up question", session_id=session.id)
# Agent has full conversation history
MCP (Model Context Protocol)
Connect to MCP servers for dynamic tool discovery — supports stdio and HTTP transports, Cursor/Claude-style config, and env var credential resolution.
# From URL
agent = Agent.builder() \
.mcp_server("stdio://npx -y @modelcontextprotocol/server-filesystem /path") \
.mcp_server("http://localhost:8080") \
.build()
# From config (Cursor/Claude style) with credential resolution
agent = Agent.builder() \
.mcp_server_config({
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "$GITHUB_TOKEN"},
}) \
.build()
# From config file (mcpServers format)
agent = Agent.builder() \
.mcp_servers_from_file("mcp.json") \
.build()
# MCP tools are discovered at startup
async with agent:
    result = await agent.arun("List my GitHub repos")
Connectors
Pluggable connector framework for external services. Implement the Connector ABC and tools are auto-discovered at startup.
from curio_agent_sdk import Connector

class MyAPIConnector(Connector):
    name = "my-api"

    async def connect(self, credentials=None): ...
    async def disconnect(self): ...
    def get_tools(self): return [self.query_api, self.submit_data]
    ...
agent = Agent.builder() \
.connector(MyAPIConnector(api_key="$MY_API_KEY")) \
.build()
Long-Running Tasks
Background execution, pause/resume, progress tracking, and concurrency limits.
from curio_agent_sdk import TaskManager
task_mgr = TaskManager(max_concurrent=2)
# Submit background task
task_id = await task_mgr.submit(agent, "Comprehensive analysis of X")
# Track progress
task_mgr.on_progress(task_id, lambda rid, i, max_i: print(f"{i}/{max_i}"))
# Pause and resume
await task_mgr.pause(task_id)
await task_mgr.resume(task_id)
# Wait for completion
result = await task_mgr.wait(task_id, timeout=600)
# Crash recovery — find and resume interrupted runs
recovered = await task_mgr.recover_incomplete(agent, "Continue task")
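A concurrency limit such as `max_concurrent=2` is commonly enforced with a semaphore. The sketch below shows that general technique with plain `asyncio`; `run_limited` is a hypothetical helper, and whether `TaskManager` uses a semaphore internally is an assumption.

```python
import asyncio


async def run_limited(jobs, max_concurrent: int):
    """Run zero-argument coroutine functions with at most `max_concurrent` active."""
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def guarded(job):
        nonlocal running, peak
        async with semaphore:  # waits here while the limit is reached
            running += 1
            peak = max(peak, running)
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return list(results), peak


async def double(i: int) -> int:
    await asyncio.sleep(0.01)
    return i * 2


results, peak = asyncio.run(run_limited([lambda i=i: double(i) for i in range(5)], 2))
```

`peak` records the highest number of jobs that were active at once, which makes the limit easy to check in a test.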
Permissions & Sandboxing
Composable permission policies with file system and network sandboxing.
from curio_agent_sdk import (
AllowAll, AskAlways, AllowReadsAskWrites,
CompoundPolicy, FileSandboxPolicy, NetworkSandboxPolicy,
)
agent = Agent.builder() \
.permissions(CompoundPolicy([
AllowReadsAskWrites(),
FileSandboxPolicy(["/workspace", "/tmp"]),
NetworkSandboxPolicy(["https://api.github.com/*"]),
])) \
.build()
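The core check behind a file sandbox like `FileSandboxPolicy(["/workspace", "/tmp"])` is path containment after normalization. A minimal sketch of that check, assuming POSIX-style paths; `is_path_allowed` is a hypothetical helper and not the SDK's implementation:

```python
from pathlib import Path


def is_path_allowed(candidate: str, allowed_roots: list[str]) -> bool:
    """Return True if `candidate` resolves to a location inside an allowed root."""
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # tricks are normalized away before the containment test.
    resolved = Path(candidate).resolve()
    return any(resolved.is_relative_to(Path(root).resolve()) for root in allowed_roots)


roots = ["/workspace", "/tmp"]
inside = is_path_allowed("/workspace/src/app.py", roots)
escaped = is_path_allowed("/workspace/../etc/passwd", roots)
```

Comparing resolved path components, rather than string prefixes, also keeps a sibling such as `/workspace-evil` from passing as `/workspace`.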
Middleware
Pipeline-based middleware for cross-cutting concerns.
from curio_agent_sdk import (
CostTracker, GuardrailsMiddleware, RateLimitMiddleware,
)
agent = Agent(
middleware=[
CostTracker(
budget=10.00,
alert_thresholds=[0.5, 0.8, 0.95],
on_threshold=lambda pct, cost: notify(f"At {pct*100}%: ${cost:.2f}"),
),
GuardrailsMiddleware(
block_patterns=[r"(?i)password"],
block_input_patterns=[r"(?i)ignore previous"],
block_prompt_injection=True,
),
RateLimitMiddleware(requests_per_minute=60),
],
...
)
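To make the `budget` and `alert_thresholds` parameters concrete, here is a small sketch of how threshold alerts could be fired exactly once each as spend accumulates. `BudgetTracker` and `BudgetExceeded` are hypothetical names; the real `CostTracker` derives cost from LLM usage, which this sketch does not model.

```python
class BudgetExceeded(RuntimeError):
    """Raised in this sketch when spend passes the budget."""


class BudgetTracker:
    def __init__(self, budget: float, alert_thresholds: list[float], on_threshold) -> None:
        self.budget = budget
        self.spent = 0.0
        self._pending = sorted(alert_thresholds)  # fractions of the budget, lowest first
        self._on_threshold = on_threshold

    def record(self, cost: float) -> None:
        self.spent += cost
        # Fire every threshold crossed by this update, each at most once.
        while self._pending and self.spent >= self._pending[0] * self.budget:
            self._on_threshold(self._pending.pop(0), self.spent)
        if self.spent > self.budget:
            raise BudgetExceeded(f"spent {self.spent:.2f} of {self.budget:.2f}")


alerts: list[tuple[float, float]] = []
tracker = BudgetTracker(10.0, [0.8, 0.5], lambda pct, cost: alerts.append((pct, cost)))
tracker.record(4.0)  # below every threshold
tracker.record(5.0)  # crosses 50% and 80% in one step
```

A single large charge can cross several thresholds, so the loop drains all of them instead of checking only the next one.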
Streaming
async for event in agent.astream("Tell me about quantum computing"):
    if event.type == "text_delta":
        print(event.text, end="", flush=True)
    elif event.type == "tool_call_start":
        print(f"\n[Calling {event.tool_name}...]")
    elif event.type == "thinking":
        print(f"[Thinking: {event.text}]")
    elif event.type == "done":
        print("\n[Done]")
Event Bus
Distributed event streaming with pub/sub, replay, and dead letter queues.
from curio_agent_sdk import InMemoryEventBus
bus = InMemoryEventBus()
agent = Agent.builder().event_bus(bus).model("openai:gpt-4o").build()
# Subscribe with glob patterns
await bus.subscribe("tool.call.*", lambda e: print(f"Tool: {e.data}"))
await bus.subscribe("*.error", lambda e: print(f"Error: {e.data}"))
await bus.subscribe("*", audit_logger)
# Replay events from a timestamp
async for event in bus.replay(start_time, pattern="llm.call.*"):
    print(event.to_dict())
# Dead letter inspection
for entry in bus.dead_letters:
    print(f"Failed: {entry.handler} - {entry.error}")
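The glob subscriptions shown above (`tool.call.*`, `*.error`, `*`) can be approximated with the standard `fnmatch` module. This is a synchronous sketch of the matching idea only; `GlobBus` is a hypothetical class, and the SDK's event bus is async and also handles replay and dead letters.

```python
from fnmatch import fnmatchcase


class GlobBus:
    """Tiny pub/sub that routes events by glob pattern (hypothetical stand-in)."""

    def __init__(self) -> None:
        self._subscriptions: list[tuple[str, object]] = []

    def subscribe(self, pattern: str, handler) -> None:
        self._subscriptions.append((pattern, handler))

    def publish(self, name: str, data: dict) -> int:
        delivered = 0
        for pattern, handler in self._subscriptions:
            # fnmatchcase is case-sensitive and lets "*" span dots.
            if fnmatchcase(name, pattern):
                handler(name, data)
                delivered += 1
        return delivered


bus = GlobBus()
seen: list[str] = []
bus.subscribe("tool.call.*", lambda name, data: seen.append(f"tool:{name}"))
bus.subscribe("*.error", lambda name, data: seen.append(f"error:{name}"))
bus.subscribe("*", lambda name, data: seen.append(f"all:{name}"))
count = bus.publish("tool.call.error", {})
```

An event named `tool.call.error` matches all three patterns, so every subscriber receives it.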
Plugins
Distributable plugin packages with auto-discovery via entry points.
from curio_agent_sdk import Plugin

class MyPlugin(Plugin):
    name = "my-plugin"
    version = "1.0.0"

    def register(self, builder):
        builder.tool(my_tool)
        builder.hook("agent.run.before", my_hook)
agent = Agent.builder() \
.plugin(MyPlugin()) \
.discover_plugins() \
.build()
Checkpointing & Resume
from curio_agent_sdk import FileStateStore
agent = Agent(state_store=FileStateStore("./state"), checkpoint_interval=1, ...)
result = await agent.arun("Long task...")
result = await agent.arun("Continue...", resume_from=result.run_id)
Component Lifecycle
All stateful components implement Component with startup(), shutdown(), and health_check(). The agent manages lifecycle automatically.
async with agent:
    result = await agent.arun("Do something")
    health = await agent.runtime.system_health()
    # {"healthy": True, "components": {"MemoryManager": True, "MCPBridge": True, ...}}
CLI Harness
Build interactive CLI agents with streaming, commands, and session persistence.
from curio_agent_sdk import AgentCLI
cli = AgentCLI(agent)
cli.register_command("/deploy", deploy_handler)
await cli.run_interactive() # REPL with /help, /clear, /status, /sessions, /skills, /exit
Testing
Comprehensive testing utilities: mocking, record/replay, snapshots, benchmarks, coverage, and evals.
from curio_agent_sdk.testing import (
    MockLLM, AgentTestHarness, ToolTestKit,
    RecordingMiddleware, ReplayLLMClient,
    BenchmarkSuite, AgentEvalSuite, AgentCoverageTracker,
    tool_call_response,
)
# Mock-based testing
mock = MockLLM()
mock.add_response(tool_call_response("calculate", {"expression": "2+2"}))
mock.add_text_response("2 + 2 = 4")
harness = AgentTestHarness(agent, llm=mock)
result = harness.run_sync("What is 2+2?")
assert result.status == "completed"
assert harness.tool_calls == [("calculate", {"expression": "2+2"})]
# Tool-level testing
kit = ToolTestKit()
kit.mock_tool("read_file", returns="file content")
kit.assert_tool_called("read_file", path="test.py")
kit.assert_call_order(["read_file", "write_file"])
# Record/replay (capture real runs, replay deterministically)
recorder = RecordingMiddleware()
recorder.save("tests/fixtures/run.json")
replay = ReplayLLMClient.from_file("tests/fixtures/run.json")
# Evals and regression detection
eval_suite = AgentEvalSuite(agent=agent, dataset=[...], metrics=[...])
results = await eval_suite.run()
# Pytest fixtures (add to conftest.py)
# pytest_plugins = ["curio_agent_sdk.testing.fixtures"]
# Provides: mock_llm, agent_test_harness, tool_test_kit, in_memory_state_store, etc.
Reliability & Production
Circuit Breakers & Tiered Routing
from curio_agent_sdk import TieredRouter, FallbackToLowerTier
router = TieredRouter(
tier1=["groq:llama-3.1-8b-instant"],
tier2=["openai:gpt-4o"],
tier3=["anthropic:claude-sonnet-4-6"],
degradation_strategy=FallbackToLowerTier(),
)
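The router example does not show the circuit breaker itself, so here is a generic sketch of the pattern: after a run of failures the breaker opens and rejects calls until a cool-down has passed. `SimpleCircuitBreaker` and its parameters are hypothetical; the SDK's `CircuitBreaker` in `resilience/circuit_breaker.py` may expose a different interface.

```python
import time


class SimpleCircuitBreaker:
    """Open after N consecutive failures; allow a trial call after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: once the cool-down has elapsed, let trial calls through.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


now = [0.0]  # injectable fake clock so the example does not sleep
breaker = SimpleCircuitBreaker(failure_threshold=2, reset_after=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
open_state = breaker.allow()
now[0] = 10.0
half_open_state = breaker.allow()
```

A router would check `allow()` before calling a provider and skip to the next model while the breaker is open.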
Request Deduplication
from curio_agent_sdk import LLMClient
client = LLMClient(router=router, dedup_enabled=True, dedup_ttl=30.0)
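One plausible reading of `dedup_enabled=True, dedup_ttl=30.0` is that identical requests issued within the TTL share a single underlying call. The sketch below shows that idea with plain `asyncio`; `DedupCache`, its key scheme, and the choice to also cache failures are assumptions, not the SDK's documented behavior.

```python
import asyncio
import time


class DedupCache:
    """Share one in-flight or recent result per key for `ttl` seconds."""

    def __init__(self, ttl: float, clock=time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[float, asyncio.Future]] = {}

    async def call(self, key: str, factory):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or now - entry[0] >= self._ttl:
            # First caller (or expired entry): start the real request once.
            entry = (now, asyncio.ensure_future(factory()))
            self._entries[key] = entry
        # Concurrent and later callers await the same future.
        return await entry[1]


calls = [0]
now = [0.0]


async def fake_request() -> int:
    calls[0] += 1
    await asyncio.sleep(0)
    return calls[0]


async def demo():
    cache = DedupCache(ttl=30.0, clock=lambda: now[0])
    a, b = await asyncio.gather(cache.call("k", fake_request), cache.call("k", fake_request))
    c = await cache.call("k", fake_request)
    now[0] = 31.0  # move past the TTL so the next call is issued again
    d = await cache.call("k", fake_request)
    return a, b, c, d


outcome = asyncio.run(demo())
```

In a real client the key would typically be a hash of the model name and request payload.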
Credential Management
from curio_agent_sdk.credentials import VaultCredentialResolver, AWSSecretsResolver
vault = VaultCredentialResolver("https://vault:8200", token="...")
aws = AWSSecretsResolver(region="us-east-1")
Observability
Hook-based observability consumers for unified event-driven monitoring.
from curio_agent_sdk import TracingConsumer, LoggingConsumer, PersistenceConsumer, PrometheusExporter
agent = Agent.builder() \
.hook("llm.call.after", TracingConsumer(tracer)) \
.hook("llm.call.after", LoggingConsumer(logger)) \
.hook("*", PrometheusExporter()) \
.build()
Persistence & Audit Logs
from curio_agent_sdk.persistence import SQLitePersistence, PostgresPersistence, InMemoryPersistence
from curio_agent_sdk.persistence import register_audit_hooks
persistence = SQLitePersistence("./audit.db")
register_audit_hooks(agent.hook_registry, persistence)
Tier System
Three tiers for different task complexities with automatic failover:
| Tier | Purpose | Default Models |
|---|---|---|
| tier1 | Fast, simple tasks | gpt-4o-mini, llama-3.1-8b |
| tier2 | Balanced quality/speed | gpt-4o, claude-sonnet-4-6 |
| tier3 | High-quality output | claude-sonnet-4-6, gpt-4o |
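The failover described above can be sketched as a nested loop: try each model in the requested tier, then step down to the next tier. The ordering (tier3 to tier2 to tier1) and the `call_with_fallback` helper are assumptions chosen to mirror the name `FallbackToLowerTier`; the SDK's router may apply different rules.

```python
def call_with_fallback(tiers: dict[str, list[str]], start: str, call_model):
    """Try models in `start` tier first, then in each lower tier."""
    order = ["tier3", "tier2", "tier1"]  # assumed degradation order
    failed = []
    for tier in order[order.index(start):]:
        for model in tiers.get(tier, []):
            try:
                return model, call_model(model)
            except Exception:
                failed.append(model)  # remember the failure and keep going
    raise RuntimeError(f"all models failed: {failed}")


tiers = {
    "tier1": ["groq:llama-3.1-8b-instant"],
    "tier2": ["openai:gpt-4o"],
    "tier3": ["anthropic:claude-sonnet-4-6"],
}


def fake_call(model: str) -> str:
    # Simulated outage for one provider, for illustration only.
    if model.startswith("anthropic:"):
        raise ConnectionError("provider unavailable")
    return f"ok from {model}"


used_model, reply = call_with_fallback(tiers, "tier3", fake_call)
```

With the tier3 provider down, the request degrades to the tier2 model instead of failing outright.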
Running Tests
make test # All tests (except live and slow)
make test-unit # Unit tests only
make test-integration # Integration tests
make test-e2e # End-to-end tests
make test-perf # Performance/stress tests
make test-live # Live API tests (requires API keys)
make test-all # Everything
make test-cov # With coverage report
make test-fast # Fast unit tests, stop on first failure
License
Apache License 2.0