Generic AI agent gateway with MCP tool support and streaming
ai-agent-gateway
Deploy AI agents as production services.
Other frameworks help you define what your agent does. This handles everything around it — the HTTP server, session management, SSE streaming, tool dispatch, human-in-the-loop approval, and code execution sandboxing.
Start with a system prompt. Add MCP tools, local Python tools, skills, and code execution as you need them. Requires Python >= 3.10.
Install + Quick Start
create_agent() is the fastest path to a working agent server. It uses Anthropic by default, and you can switch to OpenAI with provider="openai". Use create_gateway_app() when you need lower-level runtime control.
Install the package and uvicorn:
pip install "ai-agent-gateway[anthropic]" uvicorn
export ANTHROPIC_API_KEY="your-anthropic-api-key"
For OpenAI instead:
pip install "ai-agent-gateway[openai]" uvicorn
export OPENAI_API_KEY="your-openai-api-key"
Create agent.py:
from agent_gateway import create_agent
app = create_agent("You are a concise research assistant.")
Run the server:
uvicorn agent:app --reload --port 8000
Create a session token:
SESSION_TOKEN=$(curl -s http://127.0.0.1:8000/api/chat/init \
  -H 'Content-Type: application/json' \
  -d '{"api_key":"local-demo-key"}' \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["session_token"])')
Chat with the agent:
curl -N http://127.0.0.1:8000/api/chat \
  -H "Authorization: Bearer $SESSION_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Give me three bullet points on why SSE is useful for chat UIs."}]}'
You will get an SSE stream like:
data: {"type":"text_delta","text":"- SSE lets the server push tokens as they are generated.\n"}
data: {"type":"text_delta","text":"- The browser can render partial output without polling.\n"}
data: {"type":"stream_complete","usage":{"input_tokens":...,"output_tokens":...}}
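A client has to split that stream into events and stitch the text deltas back together. The following is a minimal sketch using only the standard library, assuming every event arrives as a single `data:` line carrying a JSON object, as in the sample above. The helper names `iter_sse_events` and `collect_text` are hypothetical and not part of the package, and the token counts in the sample are made up.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect_text(lines: Iterable[str]) -> tuple[str, dict]:
    """Join text_delta payloads and return them with the final usage dict."""
    parts: list[str] = []
    usage: dict = {}
    for event in iter_sse_events(lines):
        if event.get("type") == "text_delta":
            parts.append(event.get("text", ""))
        elif event.get("type") == "stream_complete":
            usage = event.get("usage", {})
            break  # nothing meaningful follows the completion event
    return "".join(parts), usage


# Sample lines shaped like the stream above; the token counts are invented.
sample = [
    'data: {"type":"text_delta","text":"- SSE lets the server push tokens.\\n"}',
    "",
    'data: {"type":"text_delta","text":"- No polling needed.\\n"}',
    "",
    'data: {"type":"stream_complete","usage":{"input_tokens":12,"output_tokens":20}}',
]
text, usage = collect_text(sample)
print(text, usage)
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body. Other event types listed under Features (tool calls, approval requests, retries) would need their own branches.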
Full 5-minute walkthrough: Quickstart
Features
- FastAPI server factory with /api/chat/init, /api/chat, /api/chat/tool-result, /api/chat/tool-approval, and /api/health
- SSE event stream for text deltas, thinking deltas, tool calls, approval requests, tool output chunks, retries, and completion
- JWT sessions with scoped approvals and isolated code execution directories
- MCP tool discovery from inline config or ~/.claude.json
- Local Python tool handlers with the same dispatch loop as MCP tools
- Code execution with Docker preferred and subprocess fallback
- Markdown skill files (prompt + config per task) and sub-agents via the built-in run_agent tool
- Anthropic and OpenAI providers through create_agent() or create_gateway_app()
- Headless execution via run_autonomous() for cron jobs and batch agents, with the same tool/MCP/skill infrastructure and no HTTP server
- Heartbeat loop via HeartbeatLoop for persistent agents that check in periodically with quiet suppression, active hours, and backoff
You bring your system prompt, your tools (MCP servers, local Python handlers, or both), and your runtime policy. The gateway handles everything else.
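The feature list mentions JWT sessions without describing the token layout. As a rough illustration of the general technique only, here is a standard-library sketch of HS256 signing and verification. The function names and the claim names (`sid`, `approved_tools`) are hypothetical, and nothing here is taken from the gateway's own implementation.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_session(claims: dict, secret: bytes) -> str:
    """Return header.payload.signature signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_session(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Check the signature and expiry, then return the claims."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) <= current:
        raise ValueError("token expired")
    return claims


# Hypothetical claims: a session id plus the tools this session has approved.
token = sign_session(
    {"sid": "demo", "approved_tools": ["summarize_csv"], "exp": 2000}, b"dev-secret"
)
print(verify_session(token, b"dev-secret", now=1000))
```

Carrying the approval scope inside the signed claims is one way to keep approvals tied to a single session. Whether the gateway does this is not stated here.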
Progressive Examples
Tier 1: System Prompt Only
from agent_gateway import create_agent
app = create_agent("You are a helpful assistant for spreadsheet users.")
Tier 2: Add MCP Tools
This uses an inline MCP server config. The example below assumes Node.js is installed because it runs an npx-based MCP server.
from agent_gateway import create_agent
app = create_agent(
    "You can inspect and edit files when needed.",
    mcp_servers={
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
        }
    },
)
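If you would rather keep server definitions in a JSON file than inline, a small loader can produce the same dictionary shape. This sketch assumes the file stores stdio servers under a top-level `mcpServers` key. That key name and the helper `load_mcp_servers` are assumptions for illustration, not documented behavior of this package.

```python
import json
from pathlib import Path


def load_mcp_servers(config_path: Path) -> dict:
    """Return a {name: {"command", "args"}} map from a JSON config, or {} if absent."""
    if not config_path.exists():
        return {}
    data = json.loads(config_path.read_text(encoding="utf-8"))
    servers = data.get("mcpServers", {})  # assumed key name
    # Keep only stdio-style entries that name a command to launch.
    return {
        name: {"command": spec["command"], "args": list(spec.get("args", []))}
        for name, spec in servers.items()
        if isinstance(spec, dict) and "command" in spec
    }


# Usage: the result has the same shape as the inline mcp_servers dict above.
mcp_servers = load_mcp_servers(Path("mcp.json"))  # hypothetical file name
print(mcp_servers)
```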
Tier 3: Add Local Tools
from agent_gateway import create_agent

async def summarize_csv(tool_input, **_kwargs):
    path = tool_input["path"]
    return {"summary": f"Would summarize {path}"}, None

app = create_agent(
    "Use the summarize_csv tool when the user asks for a file summary.",
    tool_handlers={"summarize_csv": summarize_csv},
    tool_definitions=[
        {
            "name": "summarize_csv",
            "description": "Summarize a CSV file on disk.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Path to the CSV file."}
                },
                "required": ["path"],
            },
        }
    ],
)
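The handler above is a placeholder. Because it is a plain async function, it can be filled in and exercised without starting the server. The sketch below keeps the same signature and the same two-element return value (the second element stays `None`, as in the example, since its meaning is not described here). The `error` key and the `demo` helper are hypothetical.

```python
import asyncio
import csv
import tempfile
from pathlib import Path


async def summarize_csv(tool_input, **_kwargs):
    """Count rows and list column names for the CSV at tool_input["path"]."""
    path = Path(tool_input["path"])
    if not path.is_file():
        return {"error": f"No such file: {path}"}, None  # hypothetical error shape
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)  # data rows only, header excluded
        columns = reader.fieldnames or []
    return {"columns": list(columns), "row_count": row_count}, None


def demo() -> dict:
    """Call the handler directly against a throwaway CSV file."""
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "sales.csv"
        csv_path.write_text("region,total\nnorth,10\nsouth,12\n", encoding="utf-8")
        result, _extra = asyncio.run(summarize_csv({"path": str(csv_path)}))
        return result


print(demo())  # {'columns': ['region', 'total'], 'row_count': 2}
```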
Tier 4: Add Code Execution and Skills
code_execution=True prefers Docker when available and falls back to local subprocess execution otherwise.
from agent_gateway import create_agent
app = create_agent(
    "Use code execution for calculations and run_agent for focused subtasks.",
    code_execution=True,   # Adds code_execute tool (Docker preferred, subprocess fallback)
    skills_dir="skills",   # Each .md file becomes a named skill for run_agent
)
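To make the "Docker preferred, subprocess fallback" idea concrete, here is a standard-library sketch of how such a choice could be made and what the fallback path might look like. It is not the gateway's implementation. `choose_backend` and `run_in_subprocess` are hypothetical names, and finding a `docker` binary on PATH does not guarantee that the daemon is running.

```python
import shutil
import subprocess
import sys
from typing import Callable, Optional


def choose_backend(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Prefer Docker when its CLI is on PATH, otherwise use a local subprocess."""
    return "docker" if which("docker") else "subprocess"


def run_in_subprocess(code: str, timeout: float = 10.0) -> dict:
    """Run a Python snippet in a child interpreter and capture its output."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}


print(choose_backend())
print(run_in_subprocess("print(1 + 1)"))
```

A plain subprocess shares the host filesystem and network, so it is not a sandbox in the way a container is. That difference is worth keeping in mind when the fallback path is the one in use.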
Tier 5: Headless / Autonomous
Run the same agent loop without an HTTP server — for cron jobs, batch tasks, or persistent daemons.
from agent_gateway import run_autonomous_sync, DeliveryConfig

# check_status is your own async tool handler (same shape as summarize_csv in Tier 3).
result = run_autonomous_sync(
    "You are an operations monitor. Call check_status before replying.",
    "Check the current status and send a summary.",
    tool_handlers={"check_status": check_status},
    tool_definitions=[...],  # elided: the matching definition for check_status
    state_dir="./state",
    delivery=DeliveryConfig(telegram_bot_token="...", telegram_chat_id="..."),
)
For persistent agents that check in periodically, wrap with HeartbeatLoop:
from functools import partial
from agent_gateway import run_autonomous, HeartbeatLoop, HeartbeatConfig

loop = HeartbeatLoop(
    run_fn=partial(run_autonomous, system_prompt="...", initial_message="...", ...),
    config=HeartbeatConfig(interval_seconds=1800, active_hours=(6, 22), timezone="America/New_York"),
    on_alert=my_delivery_callback,  # called only when agent has something to report
)
# Call from inside an async function, for example one started with asyncio.run().
await loop.start()
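The scheduling ideas behind `active_hours` and backoff can be sketched independently of the library. The version below assumes the window is start-inclusive and end-exclusive and that backoff doubles the interval per consecutive failure up to a cap. Both are assumptions, and `in_active_hours` and `next_delay` are hypothetical helpers rather than HeartbeatLoop internals.

```python
from datetime import datetime, timedelta, timezone


def in_active_hours(now: datetime, active_hours: tuple[int, int]) -> bool:
    """True when now.hour falls inside [start, end); handles windows past midnight."""
    start, end = active_hours
    if start <= end:
        return start <= now.hour < end
    return now.hour >= start or now.hour < end  # e.g. (22, 6) wraps overnight


def next_delay(
    interval_seconds: float, consecutive_failures: int, max_delay: float = 6 * 3600
) -> float:
    """Double the base interval per consecutive failure, capped at max_delay."""
    return min(interval_seconds * (2 ** consecutive_failures), max_delay)


# A fixed UTC-5 offset stands in for a named zone to keep the demo dependency-free.
utc_minus_5 = timezone(timedelta(hours=-5))
morning = datetime(2024, 1, 15, 7, 0, tzinfo=utc_minus_5)
late = datetime(2024, 1, 15, 23, 0, tzinfo=utc_minus_5)

print(in_active_hours(morning, (6, 22)))  # True
print(in_active_hours(late, (6, 22)))     # False
print(next_delay(1800, 2))                # 7200
```

For a named zone such as "America/New_York", converting first with `now.astimezone(zoneinfo.ZoneInfo(name))` gives the local hour and accounts for daylight saving.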
Graduate: Switch to create_gateway_app()
Use create_gateway_app() when you need custom approval logic, channel-aware runtimes, interceptors, multiple runtime profiles, or deeper production hooks.
from agent_gateway import (
    AnthropicProvider, ChatRuntime, GatewayServerConfig, create_gateway_app,
)

# Full control: custom providers, approval logic, channel routing, interceptors.
# See examples/07-full-production/ for the complete version.
app = create_gateway_app(
    GatewayServerConfig(
        build_chat_runtime=my_runtime_factory,
        default_provider=AnthropicProvider(),
    )
)
Runnable versions of these examples live in examples/.
Architecture
Three entry points share the same agent core:
create_agent()    --> FastAPI HTTP server  (interactive chat)
run_autonomous()  --> Headless one-shot    (cron / batch)
HeartbeatLoop     --> Persistent daemon    (periodic check-in)
        |
        v
AgentRunner (model loop: stream -> tool calls -> dispatch -> resume)
        |
        v
ToolDispatcher
  |-- interceptors (rate limits, custom policies)
  |-- approval check (session-scoped, HTTP only)
  |-- local Python handler
  |-- MCP server (stdio)
  |-- code_execute (Docker / subprocess)
  |-- run_agent (sub-agent with own runner)
        |
        v
EventLog --> SSE events (HTTP) or RunOutput (autonomous)
The same backend can serve multiple frontends. Pass context.channel to shape runtime behavior per client without rewriting the agent loop.
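The dispatch order in the diagram (interceptors, then the approval check, then the handler) can be illustrated with a toy dispatcher. This is a sketch of the pattern under stated assumptions, not the package's ToolDispatcher. The class `ToyDispatcher`, its fields, and the result dictionaries are all hypothetical.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Optional

Handler = Callable[[dict], Awaitable[Any]]
Interceptor = Callable[[str, dict], Optional[str]]  # returns a denial reason or None


@dataclass
class ToyDispatcher:
    handlers: dict[str, Handler]
    interceptors: list[Interceptor] = field(default_factory=list)
    needs_approval: set[str] = field(default_factory=set)
    approved: set[str] = field(default_factory=set)  # approvals granted this session

    async def dispatch(self, name: str, tool_input: dict) -> dict:
        # 1. Interceptors may veto the call (rate limits, custom policies).
        for check in self.interceptors:
            reason = check(name, tool_input)
            if reason is not None:
                return {"status": "denied", "reason": reason}
        # 2. Gated tools pause until the session has approved them.
        if name in self.needs_approval and name not in self.approved:
            return {"status": "approval_required", "tool": name}
        # 3. Route to the registered handler.
        handler = self.handlers.get(name)
        if handler is None:
            return {"status": "error", "reason": f"unknown tool: {name}"}
        return {"status": "ok", "result": await handler(tool_input)}


async def echo(tool_input: dict) -> str:
    return tool_input["text"]


def block_long_input(name: str, tool_input: dict) -> Optional[str]:
    return "input too long" if len(str(tool_input)) > 200 else None


dispatcher = ToyDispatcher(
    handlers={"echo": echo},
    interceptors=[block_long_input],
    needs_approval={"echo"},
)
print(asyncio.run(dispatcher.dispatch("echo", {"text": "hi"})))  # approval_required
dispatcher.approved.add("echo")
print(asyncio.run(dispatcher.dispatch("echo", {"text": "hi"})))  # ok
```

Running policy checks before the approval step means a call that would be rejected anyway never prompts the user.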
Comparison
| Category | ai-agent-gateway | LangGraph | LangChain | CrewAI | mcp-agent |
|---|---|---|---|---|---|
| Primary purpose | Deploying agents as services | Stateful workflow graphs | LLM app building blocks | Multi-agent role/task orchestration | MCP-centric workflow orchestration |
| Agent logic | Model-driven prompts with tools | Code-defined graph nodes and edges | Code-defined chains and agents | Code-defined crews and tasks | Code-defined workflows |
| Tool system | MCP-native plus local handlers | Bring your own adapters | Bring your own adapters | Custom tool abstractions | MCP-native |
| Server/runtime | FastAPI + SSE in core | Bring your own or LangGraph Platform | LangServe is separate | Bring your own | Bring your own |
| Sessions/auth | JWT sessions in core | Bring your own | Bring your own | Bring your own | Bring your own |
| Human approval | Built into tool dispatch | Available through interrupt/checkpoint patterns | Not a core runtime feature | Human-input patterns available | Not a core runtime feature |
| Best for | Shipping a chat-facing agent backend quickly | Explicit workflow control flow | Reusable LLM components | Team-style agent simulations | MCP-heavy automation flows |
When to use this package: you want users or clients talking to an agent over HTTP, and you do not want to build the session, SSE, approval, and tool-serving infrastructure yourself.
When not to use this package: you want explicit graph orchestration, or you are building a one-off notebook or script that does not need a reusable server runtime.
You can also combine them. For example, a LangGraph workflow can sit behind an ai-agent-gateway HTTP surface.
Documentation
License
MIT
File details
Details for the file ai_agent_gateway-0.5.3.tar.gz.
File metadata
- Download URL: ai_agent_gateway-0.5.3.tar.gz
- Upload date:
- Size: 114.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7e48117a23a8c1df29fa31120bfe67cbafa828aa9367b6948498950c8437d9f |
| MD5 | 1ca2464e6a880fb0805a144371ca6f84 |
| BLAKE2b-256 | 68ffdc35430f309dc7357049eec66491e8e608ff646e3784e10c7e4612bf04bd |
File details
Details for the file ai_agent_gateway-0.5.3-py3-none-any.whl.
File metadata
- Download URL: ai_agent_gateway-0.5.3-py3-none-any.whl
- Upload date:
- Size: 111.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a7ca71924f46eca7844a93d8b3e7751c18485157d1a2056bff0b2d33fe63519f |
| MD5 | 06a749efcfd552164ee32ae644ef415f |
| BLAKE2b-256 | 5ddb131fbce10c031cf5886aed0a84a00819475400d36e0bb1841beea4ec5b33 |