Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints
Open Agent SDK
Build production-ready AI agents in minutes using your own hardware
What you can build:
- Copy editors that analyze manuscripts and track writing patterns
- Git commit generators that write meaningful commit messages
- Market analyzers that research competitors and summarize findings
- Code reviewers, data analysts, research assistants, and more
Why local?
- No API costs - use your hardware, not OpenAI's
- Privacy - your data never leaves your machine
- Control - pick your model (Qwen, Llama, Mistral, etc.)
How fast? From zero to working agent in under 5 minutes. Familiar patterns (inspired by Claude SDK), batteries-included features (streaming, tools, hooks, auto-execution), and production-ready quality.
Overview
Open Agent SDK provides a clean, streaming API for working with OpenAI-compatible local model servers. It follows patterns similar to popular SDKs and works with LM Studio, Ollama, llama.cpp, and any OpenAI-compatible endpoint, complete with streaming, tool call aggregation, hooks, and automatic tool execution.
Supported Providers
Supported (OpenAI-Compatible Endpoints)
- LM Studio - http://localhost:1234/v1
- Ollama - http://localhost:11434/v1
- llama.cpp server - OpenAI-compatible mode
- vLLM - OpenAI-compatible API
- Text Generation WebUI - OpenAI extension
- Any OpenAI-compatible local endpoint
- Local gateways proxying cloud models - e.g., Ollama or custom gateways that route to cloud providers
Not Supported (Use Official SDKs)
- Claude/OpenAI direct - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway
- Cloud provider SDKs - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)
Quick Start
Installation
pip install open-agent-sdk
For development:
git clone https://github.com/slb350/open-agent-sdk.git
cd open-agent-sdk
pip install -e .
Simple Query (LM Studio)
import asyncio
from open_agent import query, AgentOptions

async def main():
    options = AgentOptions(
        system_prompt="You are a professional copy editor",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1",
        max_turns=1,
        temperature=0.1
    )

    # query() returns an async generator of messages
    result = query(prompt="Analyze this text...", options=options)
    response_text = ""
    async for msg in result:
        if hasattr(msg, 'content'):
            for block in msg.content:
                if hasattr(block, 'text'):
                    response_text += block.text
    print(response_text)

asyncio.run(main())
Multi-Turn Conversation (Ollama)
import asyncio
from open_agent import Client, AgentOptions, TextBlock, ToolUseBlock
from open_agent.config import get_base_url

def run_my_tool(name: str, params: dict) -> dict:
    # Replace with your tool execution logic
    return {"result": f"stub output for {name}"}

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant",
        model="kimi-k2:1t-cloud",  # Use your available Ollama model
        base_url=get_base_url(provider="ollama"),
        max_turns=10
    )

    async with Client(options) as client:
        await client.query("What's the capital of France?")
        async for msg in client.receive_messages():
            if isinstance(msg, TextBlock):
                print(f"Assistant: {msg.text}")
            elif isinstance(msg, ToolUseBlock):
                print(f"Tool used: {msg.name}")
                tool_result = run_my_tool(msg.name, msg.input)
                # add_tool_result() is async since v0.2.4
                await client.add_tool_result(msg.id, tool_result)

asyncio.run(main())
See examples/tool_use_agent.py for progressively richer patterns (manual loop, helper function, and reusable agent class) demonstrating add_tool_result() in context.
Function Calling with Tools
Define tools using the @tool decorator for clean, type-safe function calling:
from open_agent import tool, Client, AgentOptions, TextBlock, ToolUseBlock

# Define tools
@tool("get_weather", "Get current weather", {"location": str, "units": str})
async def get_weather(args):
    return {
        "temperature": 72,
        "conditions": "sunny",
        "units": args["units"]
    }

@tool("calculate", "Perform calculation", {"a": float, "b": float, "op": str})
async def calculate(args):
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}
    result = ops[args["op"]](args["a"], args["b"])
    return {"result": result}

# Enable automatic tool execution (recommended)
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=True,   # Tools execute automatically
    max_tool_iterations=10     # Safety limit for tool loops
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")
    # Simply iterate - tools execute automatically!
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            print(f"Tool called: {block.name}")
        elif isinstance(block, TextBlock):
            print(f"Response: {block.text}")
Advanced: Manual Tool Execution
For custom execution logic or result interception:
# Disable auto-execution
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=False  # Manual mode
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            # You execute the tool manually
            tool = {"calculate": calculate, "get_weather": get_weather}[block.name]
            result = await tool.execute(block.input)
            # Return result to agent
            await client.add_tool_result(block.id, result)

    # Continue conversation
    await client.query("")
Key Features:
- Automatic execution (v0.3.0+) - Tools run automatically with safety limits
- Type-safe schemas - Simple Python types (str, int, float, bool) or full JSON Schema
- OpenAI-compatible - Works with any OpenAI function calling endpoint
- Clean decorator API - Similar to Claude SDK's @tool
- Hook integration - PreToolUse/PostToolUse hooks work in both modes
See examples/calculator_tools.py and examples/simple_tool.py for complete examples.
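The safety limit behind max_tool_iterations can be pictured as a bounded loop. The sketch below is not the SDK's implementation: `fake_model`, `looping_model`, `run_tool_loop`, and the message dict shapes are hypothetical stand-ins built on the standard library only, and raising at the cap is an assumption made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable

ToolFn = Callable[[dict], Awaitable[Any]]


async def add(args: dict) -> dict:
    """Toy tool: add two numbers."""
    return {"result": args["a"] + args["b"]}


async def fake_model(history: list[dict]) -> dict:
    """Hypothetical model: requests one tool call, then answers in text."""
    tool_results = [m for m in history if m["role"] == "tool"]
    if not tool_results:
        return {"type": "tool_use", "name": "add", "input": {"a": 25, "b": 17}}
    return {"type": "text", "text": f"The answer is {tool_results[-1]['content']['result']}"}


async def looping_model(history: list[dict]) -> dict:
    """Hypothetical misbehaving model that never stops calling tools."""
    return {"type": "tool_use", "name": "add", "input": {"a": 1, "b": 1}}


async def run_tool_loop(prompt: str, tools: dict[str, ToolFn], model, max_tool_iterations: int = 10) -> str:
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_tool_iterations):
        reply = await model(history)
        if reply["type"] == "text":
            return reply["text"]  # final answer ends the loop
        # Execute the requested tool and feed the result back to the model
        result = await tools[reply["name"]](reply["input"])
        history.append({"role": "tool", "name": reply["name"], "content": result})
    # Assumption: this sketch raises at the cap; the SDK may behave differently.
    raise RuntimeError("max_tool_iterations reached without a final answer")


answer = asyncio.run(run_tool_loop("What's 25 + 17?", {"add": add}, fake_model))
print(answer)  # The answer is 42
```

Without the cap, a model like `looping_model` would keep requesting tools indefinitely, which is the situation the limit guards against.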
Context Management
Local models have fixed context windows (typically 8k-32k tokens). The SDK provides opt-in utilities for manual history management: no silent mutations, you stay in control.
Token Estimation & Truncation
from open_agent import Client, AgentOptions
from open_agent.context import estimate_tokens, truncate_messages

async with Client(options) as client:
    # Long conversation...
    for i in range(50):
        await client.query(f"Question {i}")
        async for msg in client.receive_messages():
            pass

    # Check token usage
    tokens = estimate_tokens(client.history)
    print(f"Context size: ~{tokens} tokens")

    # Manually truncate when needed
    if tokens > 28000:
        client.message_history = truncate_messages(client.history, keep=10)
Recommended Patterns
1. Stateless Agents (Best for single-task agents):
# Process each task independently - no history accumulation
for task in tasks:
    async with Client(options) as client:
        await client.query(task)
        # Client disposed, fresh context for next task
2. Manual Truncation (At natural breakpoints):
from open_agent.context import truncate_messages
async with Client(options) as client:
    for task in tasks:
        await client.query(task)
        # Truncate after each major task
        client.message_history = truncate_messages(client.history, keep=5)
3. External Memory (RAG-lite for research agents):
# Store important facts in database, keep conversation context small
database = {}
async with Client(options) as client:
    await client.query("Research topic X")
    # Save response to database (extract_facts and response are placeholders)
    database["topic_x"] = extract_facts(response)
    # Clear history, query database when needed
    client.message_history = truncate_messages(client.history, keep=0)
Why Manual?
The SDK intentionally does not auto-compact history because:
- Domain-specific needs: Copy editors need different strategies than research agents
- Token accuracy varies: Each model family has different tokenizers
- Risk of breaking context: Silently removing messages could break tool chains
- Natural limits exist: Compaction doesn't bypass model context windows
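To make the "breaking tool chains" risk concrete, here is a minimal sketch of a keep-the-last-N truncation that avoids leaving an orphaned tool result at the start of the window. `keep_recent` is a hypothetical helper, not the SDK's truncate_messages, and the OpenAI-style `role` keys on each message dict are an assumption.

```python
def keep_recent(messages: list[dict], keep: int) -> list[dict]:
    """Keep system messages plus the last `keep` other messages."""
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    # rest[-0:] would return everything, so handle keep == 0 explicitly
    tail = rest[-keep:] if keep > 0 else []
    # A tool result whose originating assistant tool call was cut off
    # would confuse the model, so drop leading tool messages.
    while tail and tail[0].get("role") == "tool":
        tail = tail[1:]
    return system + tail


history = [
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "42"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
]

trimmed = keep_recent(history, keep=4)
print([m["role"] for m in trimmed])  # ['system', 'assistant', 'user', 'assistant']
```

A naive slice of the last four messages would have started with the tool result for `call_1` while dropping the assistant message that requested it.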
Installing Token Estimation
For better token estimation accuracy (optional):
pip install open-agent-sdk[context] # Adds tiktoken
Without tiktoken, falls back to character-based approximation (~75-85% accurate).
See examples/context_management.py for complete patterns and usage.
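A character-based approximation can be as simple as dividing the total character count by an average characters-per-token ratio. The sketch below is an illustration only: `rough_token_estimate` is a hypothetical name, and the ratio of 4 is a common rule of thumb for English text, not necessarily the constant the SDK uses.

```python
import math


def rough_token_estimate(messages: list[dict], chars_per_token: float = 4.0) -> int:
    """Estimate tokens from the character length of each message's content."""
    total_chars = sum(len(str(m.get("content", ""))) for m in messages)
    return math.ceil(total_chars / chars_per_token)


sample = [
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 10},
]
print(rough_token_estimate(sample))  # 13
```

Because tokenizers differ between model families, an estimate like this is best treated as a trigger for truncation with some safety margin rather than an exact count.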
Lifecycle Hooks
Monitor and control agent behavior at key execution points with Pythonic lifecycle hooks, with no subprocess overhead or JSON protocols.
Quick Example
from open_agent import (
    AgentOptions, Client, ToolUseBlock,
    PreToolUseEvent, PostToolUseEvent, UserPromptSubmitEvent,
    HookDecision,
    HOOK_PRE_TOOL_USE, HOOK_POST_TOOL_USE, HOOK_USER_PROMPT_SUBMIT
)

# Security gate - block dangerous operations
async def security_gate(event: PreToolUseEvent) -> HookDecision | None:
    if event.tool_name == "delete_file":
        return HookDecision(
            continue_=False,
            reason="Delete operations require approval"
        )
    return None  # Allow by default

# Audit logger - track all tool executions
async def audit_logger(event: PostToolUseEvent) -> None:
    print(f"Tool executed: {event.tool_name} -> {event.tool_result}")
    return None

# Input sanitizer - validate user prompts
async def sanitize_input(event: UserPromptSubmitEvent) -> HookDecision | None:
    if "DELETE" in event.prompt.upper():
        return HookDecision(
            continue_=False,
            reason="Dangerous keywords detected"
        )
    return None

# Register hooks in AgentOptions
# (my_file_tool and my_search_tool stand for your own @tool-decorated functions)
options = AgentOptions(
    system_prompt="You are a helpful assistant",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[my_file_tool, my_search_tool],
    hooks={
        HOOK_PRE_TOOL_USE: [security_gate],
        HOOK_POST_TOOL_USE: [audit_logger],
        HOOK_USER_PROMPT_SUBMIT: [sanitize_input],
    }
)
tools_by_name = {t.name: t for t in (my_file_tool, my_search_tool)}

async with Client(options) as client:
    await client.query("Write to /etc/config")  # UserPromptSubmit fires
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):  # PreToolUse fires
            result = await tools_by_name[block.name].execute(block.input)
            await client.add_tool_result(block.id, result)  # PostToolUse fires
Hook Types
PreToolUse - Fires before tool execution (or yielding to user)
- Block operations: Return HookDecision(continue_=False, reason="...")
- Modify inputs: Return HookDecision(modified_input={...}, reason="...")
- Allow: Return None
PostToolUse - Fires after tool result added to history
- Observational only (tool already executed)
- Use for audit logging, metrics, result validation
- Return None (decision ignored for PostToolUse)
UserPromptSubmit - Fires before sending prompt to API
- Block prompts: Return HookDecision(continue_=False, reason="...")
- Modify prompts: Return HookDecision(modified_prompt="...", reason="...")
- Allow: Return None
Common Patterns
Pattern 1: Redirect to Sandbox
async def redirect_to_sandbox(event: PreToolUseEvent) -> HookDecision | None:
    """Redirect file operations to safe directory."""
    if event.tool_name == "file_writer":
        path = event.tool_input.get("path", "")
        if not path.startswith("/tmp/"):
            safe_path = f"/tmp/sandbox/{path.lstrip('/')}"
            return HookDecision(
                modified_input={"path": safe_path, "content": event.tool_input.get("content", "")},
                reason="Redirected to sandbox"
            )
    return None
Pattern 2: Compliance Audit Log
from datetime import datetime

audit_log = []

async def compliance_logger(event: PostToolUseEvent) -> None:
    """Log all tool executions for compliance."""
    audit_log.append({
        "timestamp": datetime.now(),
        "tool": event.tool_name,
        "input": event.tool_input,
        "result": str(event.tool_result)[:100],
        "user": get_current_user()  # placeholder for your own user lookup
    })
    return None
Pattern 3: Safety Instructions
async def add_safety_warning(event: UserPromptSubmitEvent) -> HookDecision | None:
    """Add safety instructions to risky prompts."""
    if "write" in event.prompt.lower() or "delete" in event.prompt.lower():
        safe_prompt = event.prompt + " (Please confirm this is safe before proceeding)"
        return HookDecision(
            modified_prompt=safe_prompt,
            reason="Added safety warning"
        )
    return None
Hook Execution Flow
- Hooks run sequentially in the order registered
- First non-None decision wins (short-circuit behavior)
- Hooks run inline on event loop (spawn tasks for heavy work)
- Works with both Client and query() function
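The sequential, first-decision-wins behaviour described above can be sketched in a few lines. This is an illustration under assumptions, not the SDK's dispatcher: `Decision` is a hypothetical stand-in for HookDecision, events are plain dicts, and `run_hooks` is an invented name.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class Decision:  # hypothetical stand-in for HookDecision
    continue_: bool = True
    reason: Optional[str] = None


Hook = Callable[[dict], Awaitable[Optional[Decision]]]


async def run_hooks(hooks: list[Hook], event: dict) -> Optional[Decision]:
    """Run hooks in registration order; the first non-None decision wins."""
    for hook in hooks:
        decision = await hook(event)
        if decision is not None:
            return decision  # short-circuit: later hooks never run
    return None  # no hook objected, so the default is to allow


calls: list[str] = []


async def observer(event: dict) -> Optional[Decision]:
    calls.append("observer")
    return None


async def gate(event: dict) -> Optional[Decision]:
    calls.append("gate")
    if event["tool_name"] == "delete_file":
        return Decision(continue_=False, reason="Delete operations require approval")
    return None


async def never_reached(event: dict) -> Optional[Decision]:
    calls.append("never_reached")
    return Decision()


result = asyncio.run(run_hooks([observer, gate, never_reached], {"tool_name": "delete_file"}))
print(result.continue_, calls)  # False ['observer', 'gate']
```

One practical consequence of short-circuiting is that registration order matters: purely observational hooks placed after a blocking gate will not see the events that the gate rejects.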
Breaking Change (v0.2.4)
Client.add_tool_result() is now async to support PostToolUse hooks:
# Old (v0.2.3 and earlier)
client.add_tool_result(tool_id, result)
# New (v0.2.4+)
await client.add_tool_result(tool_id, result)
Why Hooks?
- Security gates: Block dangerous operations before they execute
- Audit logging: Track all tool executions for compliance
- Input validation: Sanitize user prompts before processing
- Monitoring: Observe agent behavior in production
- Control flow: Modify tool inputs or redirect operations
See examples/hooks_example.py for 4 comprehensive patterns (security, audit, sanitization, combined).
Interrupt Capability
Cancel long-running operations cleanly without corrupting client state. Perfect for timeouts, user cancellations, or conditional interruptions.
Quick Example
from open_agent import Client, AgentOptions
import asyncio

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant.",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1"
    )

    async with Client(options) as client:
        await client.query("Write a detailed 1000-word essay...")

        # Timeout after 5 seconds
        try:
            async def collect_messages():
                async for block in client.receive_messages():
                    print(block.text, end="", flush=True)

            await asyncio.wait_for(collect_messages(), timeout=5.0)
        except asyncio.TimeoutError:
            await client.interrupt()  # Clean cancellation
            print("\nOperation timed out!")

        # Client is still usable after interrupt
        await client.query("Short question?")
        async for block in client.receive_messages():
            print(block.text)
Common Patterns
1. Timeout-Based Interruption
try:
    await asyncio.wait_for(process_messages(client), timeout=10.0)
except asyncio.TimeoutError:
    await client.interrupt()
    print("Operation timed out")
2. Conditional Interruption
# Stop if response contains specific content
full_text = ""
async for block in client.receive_messages():
    full_text += block.text
    if "error" in full_text.lower():
        await client.interrupt()
        break
3. User Cancellation (from separate task)
async def stream_task():
    await client.query("Long task...")
    async for block in client.receive_messages():
        print(block.text, end="")

async def cancel_button_task():
    await asyncio.sleep(2.0)   # User waits 2 seconds
    await client.interrupt()   # User clicks cancel

# Run both concurrently
await asyncio.gather(stream_task(), cancel_button_task())
4. Interrupt During Auto-Execution
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[slow_tool, fast_tool],
    auto_execute_tools=True,
    max_tool_iterations=10
)

async with Client(options) as client:
    await client.query("Use tools...")
    tool_count = 0
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            tool_count += 1
            if tool_count >= 2:
                await client.interrupt()  # Stop after 2 tools
                break
How It Works
When you call client.interrupt():
- Active stream closure - HTTP stream closed immediately (not just a flag)
- Clean state - Client remains in valid state for reuse
- Partial output - Text blocks flushed to history, incomplete tools skipped
- Idempotent - Safe to call multiple times
- Concurrent-safe - Can be called from separate asyncio tasks
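The timeout pattern and the "partial output" point can be tried without a model server. The sketch below uses only asyncio: `fake_stream` is a hypothetical stand-in for a model stream, and the code shows how chunks received before a timeout remain available. It does not reproduce what client.interrupt() does internally.

```python
import asyncio


async def fake_stream(delays):
    """Hypothetical stand-in for a model stream: one chunk per delay."""
    for i, delay in enumerate(delays):
        await asyncio.sleep(delay)
        yield f"chunk{i}"


async def collect_with_timeout(delays, timeout):
    received = []

    async def consume():
        async for chunk in fake_stream(delays):
            received.append(chunk)

    try:
        await asyncio.wait_for(consume(), timeout=timeout)
        return received, False
    except asyncio.TimeoutError:
        # Chunks appended before the cancellation are still in `received`.
        return received, True


# Two chunks arrive immediately; the third would take 5 seconds.
partial, timed_out = asyncio.run(collect_with_timeout([0, 0, 5], timeout=0.2))
print(partial, timed_out)  # ['chunk0', 'chunk1'] True
```

With a real client, the `except` branch is where `await client.interrupt()` would go, as in the patterns above.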
Example
See examples/interrupt_demo.py for 5 comprehensive patterns:
- Timeout-based interruption
- Conditional interruption
- Auto-execution interruption
- Concurrent interruption (simulated cancel button)
- Interrupt and retry
Practical Examples
We've included two production-ready agents that demonstrate real-world usage:
Git Commit Agent
Analyzes your staged git changes and writes professional commit messages following conventional commit format.
# Stage your changes
git add .
# Run the agent
python examples/git_commit_agent.py
# Output:
# Found staged changes in 3 file(s)
# Analyzing changes and generating commit message...
#
# Suggested commit message:
# feat(auth): Add OAuth2 integration with refresh tokens
#
# - Implement token refresh mechanism
# - Add secure cookie storage for tokens
# - Update login flow to support OAuth2 providers
# - Add tests for token expiration handling
Features:
- Analyzes diff to determine commit type (feat/fix/docs/etc)
- Writes clear, descriptive commit messages
- Interactive mode: accept, edit, or regenerate
- Follows conventional commit standards
Log Analyzer Agent
examples/log_analyzer_agent.py
Intelligently analyzes application logs to identify patterns, errors, and provide actionable insights.
# Analyze a log file
python examples/log_analyzer_agent.py /var/log/app.log
# Analyze with a specific time window
python examples/log_analyzer_agent.py app.log --since "2025-10-15T00:00:00" --until "2025-10-15T12:00:00"
# Interactive mode for drilling down
python examples/log_analyzer_agent.py app.log --interactive
Features:
- Automatic error pattern detection
- Time-based analysis (peak error times)
- Root cause suggestions
- Interactive mode for investigating specific issues
- Supports multiple log formats (JSON, Apache, syslog, etc)
- Time range filtering with --since/--until
Sample Output:
Log Summary:
  Total entries: 45,231
  Errors: 127 (0.3%)
  Warnings: 892
Top Error Patterns:
  - Connection Error: 67 occurrences
  - NullPointerException: 23 occurrences
  - Timeout Error: 19 occurrences
Peak error time: 2025-10-15T14:00:00
  Errors in that hour: 43
ANALYSIS REPORT:
Main Issues (Priority Order):
1. Database connection pool exhaustion during peak hours
2. Unhandled null values in user authentication flow
3. External API timeouts affecting payment processing
Recommendations:
1. Increase connection pool size from 10 to 25
2. Add null checks in AuthService.validateUser() method
3. Implement circuit breaker for payment API with 30s timeout
Why These Examples?
These agents demonstrate:
- Practical Value: Solve real problems developers face daily
- Tool Integration: Show how to integrate with system commands (git, file I/O)
- Multi-turn Conversations: Interactive modes for complex analysis
- Structured Output: Parse and format LLM responses for actionable results
- Privacy-First: Keep your code and logs local while getting AI assistance
Configuration
Open Agent SDK uses config helpers to provide flexible configuration via environment variables, provider shortcuts, or explicit parameters:
Environment Variables (Recommended)
export OPEN_AGENT_BASE_URL="http://localhost:1234/v1"
export OPEN_AGENT_MODEL="qwen/qwen3-30b-a3b-2507"
from open_agent import AgentOptions
from open_agent.config import get_model, get_base_url
# Config helpers read from environment
options = AgentOptions(
    system_prompt="...",
    model=get_model(),        # Reads OPEN_AGENT_MODEL
    base_url=get_base_url()   # Reads OPEN_AGENT_BASE_URL
)
Provider Shortcuts
from open_agent import AgentOptions
from open_agent.config import get_base_url

# Use built-in defaults for common providers
options = AgentOptions(
    system_prompt="...",
    model="llama3.1:70b",
    base_url=get_base_url(provider="ollama")  # -> http://localhost:11434/v1
)
Available providers: lmstudio, ollama, llamacpp, vllm
Fallback Values
# Provide fallbacks when env vars not set
options = AgentOptions(
    system_prompt="...",
    model=get_model("qwen2.5-32b-instruct"),     # Fallback model
    base_url=get_base_url(provider="lmstudio")   # Fallback URL
)
Configuration Priority:
- Environment variable (default behaviour)
- Fallback value passed to the config helper
- Provider default (for base_url only)
Need to force a specific model even when OPEN_AGENT_MODEL is set? Call get_model("model-name", prefer_env=False) to ignore the environment variable for that lookup.
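The priority order above can be expressed as a small resolver. This is a sketch under assumptions rather than the code in open_agent/config.py: `resolve_base_url`, its parameters, and the `env` argument are hypothetical, and only the two default URLs documented earlier in this README are included.

```python
import os
from typing import Mapping, Optional

# Only the defaults stated in this README; other providers are omitted here.
PROVIDER_DEFAULTS = {
    "lmstudio": "http://localhost:1234/v1",
    "ollama": "http://localhost:11434/v1",
}


def resolve_base_url(
    fallback: Optional[str] = None,
    provider: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Resolve in priority order: environment, explicit fallback, provider default."""
    env = os.environ if env is None else env
    value = env.get("OPEN_AGENT_BASE_URL")
    if value:
        return value          # 1. environment variable wins
    if fallback:
        return fallback       # 2. fallback passed by the caller
    if provider:
        return PROVIDER_DEFAULTS.get(provider.lower())  # 3. provider default
    return None


print(resolve_base_url(provider="ollama", env={}))  # http://localhost:11434/v1
```

Passing `env` explicitly, instead of always reading `os.environ`, keeps the function easy to test without mutating the process environment.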
Benefits:
- Switch between dev/prod by changing environment variables
- No hardcoded URLs or model names
- Per-agent overrides when needed
See docs/configuration.md for complete guide.
Why Not Just Use OpenAI Client?
Without open-agent-sdk (raw OpenAI client):
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
response = await client.chat.completions.create(
    model="qwen2.5-32b-instruct",
    messages=[{"role": "system", "content": system_prompt},
              {"role": "user", "content": user_prompt}],
    stream=True
)
async for chunk in response:
    # Complex parsing of chunks
    # Extract delta content
    # Handle tool calls manually
    # Track conversation state yourself
    ...
With open-agent-sdk:
from open_agent import query, AgentOptions

options = AgentOptions(
    system_prompt=system_prompt,
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1"
)
result = query(prompt=user_prompt, options=options)
async for msg in result:
    # Clean message types (TextBlock, ToolUseBlock)
    # Automatic streaming and tool call handling
    ...
Value: Familiar patterns + Less boilerplate + Easy migration
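To give a sense of the boilerplate involved, here is a minimal sketch of tool call aggregation over streamed chunks. OpenAI-compatible servers typically stream a tool call as fragments keyed by `index`, with the `arguments` JSON split across chunks. The function below works on plain dicts for illustration; `aggregate_tool_calls` is a hypothetical name and this is not the SDK's ToolCallAggregator.

```python
import json


def aggregate_tool_calls(deltas: list[dict]) -> list[dict]:
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        for part in delta.get("tool_calls") or []:
            slot = calls.setdefault(part["index"], {"id": None, "name": "", "arguments": ""})
            if part.get("id"):
                slot["id"] = part["id"]
            fn = part.get("function") or {}
            slot["name"] += fn.get("name") or ""
            slot["arguments"] += fn.get("arguments") or ""  # JSON arrives in pieces
    return [
        {"id": c["id"], "name": c["name"], "input": json.loads(c["arguments"] or "{}")}
        for _, c in sorted(calls.items())
    ]


stream = [
    {"content": "Let me calculate that. "},
    {"tool_calls": [{"index": 0, "id": "call_1", "function": {"name": "calculate", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '{"a": 25, '}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '"b": 17, "op": "+"}'}}]},
]

print(aggregate_tool_calls(stream))
# [{'id': 'call_1', 'name': 'calculate', 'input': {'a': 25, 'b': 17, 'op': '+'}}]
```

A production version would also need to handle malformed or truncated JSON, which is the kind of case the ToolUseError message type exists for.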
Why Not LangChain?
Open Agent SDK and LangChain serve different needs:
Open Agent SDK is a focused library (~900 LOC) specifically for streaming conversations with local OpenAI-compatible models. Clean API, minimal dependencies (openai + pydantic), read the entire codebase in 10 minutes.
LangChain is a comprehensive framework (100k+ LOC) for building AI applications with 300+ integrations, RAG pipelines, document loaders, vector databases, and complex orchestration.
When to Use Each
Use Open Agent SDK when:
- Running local models (LM Studio, Ollama, llama.cpp)
- You want claude-agent-sdk style ergonomics
- You need minimal dependencies and fast install
- Building focused agents (copy editor, log analyzer, commit writer)
- You prefer readable code over framework abstractions
Use LangChain when:
- You need RAG with vector databases (Pinecone, Chroma, etc.)
- You want pre-built integrations (Google Search, document loaders, etc.)
- Building complex multi-agent orchestration systems
- Your team already knows LangChain
Philosophy: Open Agent SDK is "do one thing well" (like Flask), LangChain is "batteries included" (like Django). Both are excellent tools for their respective use cases.
API Reference
AgentOptions
class AgentOptions:
    system_prompt: str                          # System prompt
    model: str                                  # Model name (required)
    base_url: str                               # OpenAI-compatible endpoint URL (required)
    tools: list[Tool] = []                      # Tool instances for function calling
    hooks: dict[str, list[HookHandler]] = None  # Lifecycle hooks for monitoring/control
    auto_execute_tools: bool = False            # Enable automatic tool execution (v0.3.0+)
    max_tool_iterations: int = 5                # Max tool calls per query in auto mode
    max_turns: int = 1                          # Max conversation turns
    max_tokens: int | None = 4096               # Tokens to generate (None uses provider default)
    temperature: float = 0.7                    # Sampling temperature
    timeout: float = 60.0                       # Request timeout in seconds
    api_key: str = "not-needed"                 # Most local servers don't need this
Note: Use config helpers (get_model(), get_base_url()) for environment variable and provider support.
query()
Simple single-turn query function.
async def query(prompt: str, options: AgentOptions) -> AsyncGenerator
Returns an async generator yielding messages.
Client
Multi-turn conversation client with tool monitoring.
async with Client(options: AgentOptions) as client:
    await client.query(prompt: str)
    async for msg in client.receive_messages():
        # Process messages
        ...
Message Types
- TextBlock - Text content from model
- ToolUseBlock - Tool calls from model (has id, name, input fields)
- ToolResultBlock - Tool execution results to send back to model
- ToolUseError - Tool call parsing error (malformed JSON, missing fields)
- AssistantMessage - Full message wrapper
Tool System
@tool(name: str, description: str, input_schema: dict)
async def my_tool(args: dict) -> Any:
    """Tool handler function"""
    return result

# Tool class
class Tool:
    name: str
    description: str
    input_schema: dict[str, type] | dict[str, Any]
    handler: Callable[[dict], Awaitable[Any]]

    async def execute(self, arguments: dict) -> Any
    def to_openai_format(self) -> dict
Schema formats:
- Simple: {"param": str, "count": int} - All parameters required
- JSON Schema: Full schema with type, properties, required, etc.
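The relationship between the two schema formats can be illustrated with a small converter. This is a sketch, not the SDK's to_openai_format(): `simple_to_json_schema` and `to_openai_tool` are hypothetical names, and the type mapping covers only the four simple types listed in this README.

```python
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def simple_to_json_schema(simple: dict[str, type]) -> dict:
    """Expand {"param": type} into JSON Schema with every parameter required."""
    return {
        "type": "object",
        "properties": {name: {"type": TYPE_NAMES[tp]} for name, tp in simple.items()},
        "required": list(simple),
    }


def to_openai_tool(name: str, description: str, simple: dict[str, type]) -> dict:
    """Wrap the schema in the OpenAI function-calling tool envelope."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": simple_to_json_schema(simple),
        },
    }


schema = simple_to_json_schema({"location": str, "units": str})
print(schema["required"])  # ['location', 'units']
```

When some parameters should be optional, the simple form is no longer enough and a full JSON Schema with an explicit `required` list is the format to use.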
Hooks System
# Event types
@dataclass
class PreToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class PostToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_result: Any
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class UserPromptSubmitEvent:
    prompt: str
    history: list[dict[str, Any]]

# Hook decision
@dataclass
class HookDecision:
    continue_: bool = True
    modified_input: dict[str, Any] | None = None
    modified_prompt: str | None = None
    reason: str | None = None

# Hook handler signature
HookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]

# Hook constants
HOOK_PRE_TOOL_USE = "pre_tool_use"
HOOK_POST_TOOL_USE = "post_tool_use"
HOOK_USER_PROMPT_SUBMIT = "user_prompt_submit"
Hook behavior:
- Return None to allow by default
- Return HookDecision(continue_=False) to block
- Return HookDecision(modified_input={...}) to modify (PreToolUse)
- Return HookDecision(modified_prompt="...") to modify (UserPromptSubmit)
- Raise exception to abort entirely
Recommended Models
Local models (LM Studio, Ollama, llama.cpp):
- GPT-OSS-120B - Best in class for speed and quality
- Qwen 3 30B - Excellent instruction following, good for most tasks
- GPT-OSS-20B - Solid all-around performance
- Mistral 7B - Fast and efficient for simple agents
Cloud-proxied via local gateway (Ollama cloud provider, custom gateway):
- kimi-k2:1t-cloud - Tested and working via Ollama gateway
- deepseek-v3.1:671b-cloud - High-quality reasoning model
- qwen3-coder:480b-cloud - Code-focused models
- Your base_url still points to localhost gateway (e.g., http://localhost:11434/v1)
- Gateway handles authentication and routing to cloud provider
- Useful when you need larger models than your hardware can run locally
Architecture guidance:
- Prefer MoE (Mixture of Experts) models over dense when available - significantly faster
- Start with 7B-30B models for most agent tasks - they're fast and capable
- Test models with your specific use case - the LLM landscape changes rapidly
Project Structure
open-agent-sdk/
├── open_agent/
│   ├── __init__.py            # query, Client, AgentOptions exports
│   ├── client.py              # Streaming query(), Client, tool helper
│   ├── config.py              # Env/provider helpers
│   ├── context.py             # Token estimation and truncation utilities
│   ├── hooks.py               # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── tools.py               # Tool decorator and schema conversion
│   ├── types.py               # Dataclasses for options and blocks
│   └── utils.py               # OpenAI client + ToolCallAggregator
├── docs/
│   ├── configuration.md
│   ├── provider-compatibility.md
│   └── technical-design.md
├── examples/
│   ├── git_commit_agent.py    # Practical: Git commit message generator
│   ├── log_analyzer_agent.py  # Practical: Log file analyzer
│   ├── calculator_tools.py    # Function calling with @tool decorator
│   ├── simple_tool.py         # Minimal tool usage example
│   ├── tool_use_agent.py      # Complete tool use patterns
│   ├── context_management.py  # Manual history management patterns
│   ├── hooks_example.py       # Lifecycle hooks patterns (security, audit, sanitization)
│   ├── interrupt_demo.py      # Interrupt capability patterns (timeout, conditional, concurrent)
│   ├── simple_lmstudio.py     # Basic usage with LM Studio
│   ├── ollama_chat.py         # Multi-turn chat example
│   ├── config_examples.py     # Configuration patterns
│   └── simple_with_env.py     # Environment variable config
├── tests/
│   ├── integration/           # Integration-style tests using fakes
│   │   └── test_client_behaviour.py  # Streaming, multi-turn, tool flow coverage
│   ├── test_agent_options.py
│   ├── test_auto_execution.py # Automatic tool execution
│   ├── test_client.py
│   ├── test_config.py
│   ├── test_context.py        # Context utilities (token estimation, truncation)
│   ├── test_hooks.py          # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── test_interrupt.py      # Interrupt capability (timeout, concurrent, reuse)
│   ├── test_query.py
│   ├── test_tools.py          # Tool decorator and schema conversion
│   └── test_utils.py
├── CHANGELOG.md
├── pyproject.toml
└── README.md
Examples
Practical Agents (Production-Ready)
- git_commit_agent.py - Analyzes git diffs and writes professional commit messages
- log_analyzer_agent.py - Parses logs, finds patterns, suggests fixes with interactive mode
- tool_use_agent.py - Complete tool use patterns: manual, helper, and agent class
Core SDK Usage
- simple_lmstudio.py - Minimal streaming query with hard-coded config (simplest quickstart)
- simple_with_env.py - Using environment variables with config helpers and fallbacks
- config_examples.py - Comprehensive reference: provider shortcuts, priority, and all config patterns
- ollama_chat.py - Multi-turn chat loop with Ollama, including tool-call logging
- context_management.py - Manual history management patterns (stateless, truncation, token monitoring, RAG-lite)
- hooks_example.py - Lifecycle hooks patterns (security gates, audit logging, input sanitization, combined)
Integration Tests
Located in tests/integration/:
- test_client_behaviour.py - Fake AsyncOpenAI client covering streaming, multi-turn history, and tool-call flows without hitting real servers
Development Status
Released v0.1.0 - Core functionality is complete and available on PyPI. Multi-turn conversations, tool monitoring, and streaming are fully implemented.
Roadmap
- Project planning and architecture
- Core query() and Client class
- Tool monitoring + Client.add_tool_result() helper
- Tool use example (examples/tool_use_agent.py)
- PyPI release - Published as open-agent-sdk
- Provider compatibility matrix expansion
- Additional agent examples
Tested Providers
- Ollama - Fully validated with kimi-k2:1t-cloud (cloud-proxied model)
- LM Studio - Fully validated with qwen/qwen3-30b model
- llama.cpp - Fully validated with TinyLlama 1.1B model
See docs/provider-compatibility.md for detailed test results.
Documentation
- docs/technical-design.md - Architecture details
- docs/configuration.md - Configuration guide
- docs/provider-compatibility.md - Provider test results
- examples/ - Usage examples
Testing
Integration-style tests run entirely against lightweight fakes, so they are safe to execute locally and in pre-commit:
python -m pytest tests/integration
Add -k or a specific path when you want to target a subset of the unit tests (tests/test_client.py, etc.). If you use a virtual environment, prefix commands with ./venv/bin/python -m.
Pre-commit Hooks
Install hooks once per clone:
pip install pre-commit
pre-commit install
Running pre-commit run --all-files will execute formatting checks and the integration tests (python -m pytest tests/integration) before you push changes.
Requirements
- Python 3.10+
- openai 1.0+ (for AsyncOpenAI client)
- pydantic 2.0+ (for types, optional)
- Some servers require a dummy api_key; set any non-empty string if needed
License
MIT License - see LICENSE for details.
Acknowledgments
- API design inspired by claude-agent-sdk
- Built for local/open-source LLM enthusiasts
Status: Alpha - API stabilizing, feedback welcome
Star โญ this repo if you're building AI agents with local models!