Generic AI agent gateway with MCP tool support and streaming

ai-agent-gateway

Deploy AI agents as production services.

Other frameworks help you define what your agent does. This handles everything around it — the HTTP server, session management, SSE streaming, tool dispatch, human-in-the-loop approval, and code execution sandboxing.

Start with a system prompt. Add MCP tools, local Python tools, skills, and code execution as you need them. Requires Python >= 3.10.

Install + Quick Start

create_agent() is the fastest path to a working agent server. It uses Anthropic by default, and you can switch to OpenAI with provider="openai". Use create_gateway_app() when you need lower-level runtime control.

Install the package and uvicorn:

pip install "ai-agent-gateway[anthropic]" uvicorn
export ANTHROPIC_API_KEY="your-anthropic-api-key"

For OpenAI instead:

pip install "ai-agent-gateway[openai]" uvicorn
export OPENAI_API_KEY="your-openai-api-key"

Create agent.py:

from agent_gateway import create_agent

app = create_agent("You are a concise research assistant.")

Run the server:

uvicorn agent:app --reload --port 8000

Create a session token:

SESSION_TOKEN=$(curl -s http://127.0.0.1:8000/api/chat/init \
  -H 'Content-Type: application/json' \
  -d '{"api_key":"local-demo-key"}' \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["session_token"])')
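The same init call can be made from Python. This is a minimal sketch using only the standard library, assuming the server from the previous step is listening on 127.0.0.1:8000. The helper names (build_init_request, parse_session_token, fetch_session_token) are illustrative and not part of the package.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: the local dev server started above


def build_init_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST to /api/chat/init with the same JSON body as the curl call."""
    body = json.dumps({"api_key": api_key}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat/init",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_session_token(response_body: bytes) -> str:
    """Extract session_token from the JSON response, like the python3 -c one-liner."""
    return json.loads(response_body)["session_token"]


def fetch_session_token(api_key: str = "local-demo-key") -> str:
    """Send the request and return the token; needs the server to be running."""
    with urllib.request.urlopen(build_init_request(api_key), timeout=10) as resp:
        return parse_session_token(resp.read())
```

Splitting request construction from the network call keeps the first two helpers testable without a live server.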

Chat with the agent:

curl -N http://127.0.0.1:8000/api/chat \
  -H "Authorization: Bearer $SESSION_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Give me three bullet points on why SSE is useful for chat UIs."}]}'

You will get an SSE stream like:

data: {"type":"text_delta","text":"- SSE lets the server push tokens as they are generated.\n"}

data: {"type":"text_delta","text":"- The browser can render partial output without polling.\n"}

data: {"type":"stream_complete","usage":{"input_tokens":...,"output_tokens":...}}
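A client can turn that stream back into text by reading the data: lines. The sketch below assumes only the two event shapes shown above (text_delta with a text field, stream_complete with usage). The function names are illustrative, and other event types are ignored.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line; skip blank lines and other fields."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue
        yield json.loads(line[len("data:"):].strip())


def collect_text(events: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    """Concatenate text_delta chunks and capture usage from stream_complete."""
    parts = []
    usage = None
    for event in events:
        if event.get("type") == "text_delta":
            parts.append(event["text"])
        elif event.get("type") == "stream_complete":
            usage = event.get("usage")
            break  # the stream is finished
    return "".join(parts), usage
```

This handles one JSON object per data: line, as in the sample. A general SSE client would also need to join multi-line data fields.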

Full 5-minute walkthrough: Quickstart

Features

  • FastAPI server factory with /api/chat/init, /api/chat, /api/chat/tool-result, /api/chat/tool-approval, and /api/health
  • SSE event stream for text deltas, thinking deltas, tool calls, approval requests, tool output chunks, retries, and completion
  • JWT sessions with scoped approvals and isolated code execution directories
  • MCP tool discovery from inline config or ~/.claude.json
  • Local Python tool handlers with the same dispatch loop as MCP tools
  • Code execution with Docker preferred and subprocess fallback
  • Markdown skill files (prompt + config per task) and sub-agents via the built-in run_agent tool
  • Anthropic and OpenAI providers through create_agent() or create_gateway_app()
  • Headless execution via run_autonomous() for cron jobs and batch agents — same tool/MCP/skill infrastructure, no HTTP server
  • Heartbeat loop via HeartbeatLoop for persistent agents that check in periodically with quiet suppression, active hours, and backoff

You bring your system prompt, your tools (MCP servers, local Python handlers, or both), and your runtime policy. The gateway handles the HTTP server, sessions, SSE streaming, tool dispatch, and approvals.

Progressive Examples

Tier 1: System Prompt Only

from agent_gateway import create_agent

app = create_agent("You are a helpful assistant for spreadsheet users.")

Tier 2: Add MCP Tools

This uses an inline MCP server config. The example below assumes Node.js is installed because it runs an npx-based MCP server.

from agent_gateway import create_agent

app = create_agent(
  "You can inspect and edit files when needed.",
  mcp_servers={
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
    }
  },
)

Tier 3: Add Local Tools

from agent_gateway import create_agent


async def summarize_csv(tool_input, **_kwargs):
  path = tool_input["path"]
  return {"summary": f"Would summarize {path}"}, None


app = create_agent(
  "Use the summarize_csv tool when the user asks for a file summary.",
  tool_handlers={"summarize_csv": summarize_csv},
  tool_definitions=[
    {
      "name": "summarize_csv",
      "description": "Summarize a CSV file on disk.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "Path to the CSV file."}
        },
        "required": ["path"],
      },
    }
  ],
)
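The handler above is a stub. A fuller version might look like the sketch below, which keeps the same signature and the same two-element return. The source does not say what the second element means, so it stays None here. The _summarize helper and the error-as-data convention are assumptions for illustration.

```python
import asyncio
import csv
from pathlib import Path


def _summarize(path: Path) -> dict:
    """Blocking helper: read the header row and count the data rows."""
    with path.open(newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return {"columns": header, "row_count": row_count}


async def summarize_csv(tool_input, **_kwargs):
    path = Path(tool_input["path"])
    if not path.is_file():
        # Report the problem as data instead of raising (a design choice, not a gateway rule).
        return {"error": f"No such file: {path}"}, None
    # Keep the blocking file read off the event loop.
    summary = await asyncio.to_thread(_summarize, path)
    return {"summary": summary}, None
```

Returning a structured error lets the model see what went wrong and recover, for example by asking the user for a different path.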

Tier 4: Add Code Execution and Skills

code_execution=True prefers Docker when available and falls back to local subprocess execution otherwise.

from agent_gateway import create_agent

app = create_agent(
  "Use code execution for calculations and run_agent for focused subtasks.",
  code_execution=True,       # Adds code_execute tool (Docker preferred, subprocess fallback)
  skills_dir="skills",       # Each .md file becomes a named skill for run_agent
)
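To make the "Docker preferred, subprocess fallback" idea concrete, here is a rough sketch of that kind of selection. It is not the gateway's implementation. choose_backend and run_in_subprocess are hypothetical names, and the check only looks for a docker binary on PATH.

```python
import shutil
import subprocess
import sys
from typing import Callable, Optional


def choose_backend(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return "docker" if a docker binary is on PATH, otherwise "subprocess"."""
    return "docker" if which("docker") else "subprocess"


def run_in_subprocess(code: str, timeout: float = 10.0) -> str:
    """Fallback path: run the snippet with the current interpreter and return stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout
```

A binary on PATH does not guarantee a running Docker daemon, so a production check would likely probe further. A plain subprocess also gives far weaker isolation than a container, which matters when the code comes from a model.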

Tier 5: Headless / Autonomous

Run the same agent loop without an HTTP server — for cron jobs, batch tasks, or persistent daemons.

from agent_gateway import run_autonomous_sync, DeliveryConfig

# check_status is your own tool handler, defined elsewhere
# (Tier 3 shows the handler and definition shapes).
result = run_autonomous_sync(
  "You are an operations monitor. Call check_status before replying.",
  "Check the current status and send a summary.",
  tool_handlers={"check_status": check_status},
  tool_definitions=[...],    # fill in with your tool schemas
  state_dir="./state",
  delivery=DeliveryConfig(telegram_bot_token="...", telegram_chat_id="..."),
)
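The snippet above leaves check_status and its definition to you. One hypothetical pairing, reusing the handler signature and schema layout from Tier 3, is sketched below. The disk-usage check, the 90 percent threshold, and the CHECK_STATUS_DEFINITION name are made up for illustration.

```python
import asyncio
import shutil


async def check_status(tool_input, **_kwargs):
    """Hypothetical handler: report disk usage for a path (default: current directory)."""
    usage = shutil.disk_usage(tool_input.get("path", "."))
    percent_used = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {"percent_used": percent_used, "ok": percent_used < 90}, None


# Hypothetical schema entry for the tool_definitions list.
CHECK_STATUS_DEFINITION = {
    "name": "check_status",
    "description": "Report disk usage for a path.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Directory to inspect."}
        },
        "required": [],
    },
}
```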

For persistent agents that check in periodically, wrap with HeartbeatLoop:

from functools import partial
from agent_gateway import run_autonomous, HeartbeatLoop, HeartbeatConfig

loop = HeartbeatLoop(
  # Add any other run_autonomous keyword arguments to the partial as needed.
  run_fn=partial(run_autonomous, system_prompt="...", initial_message="..."),
  config=HeartbeatConfig(interval_seconds=1800, active_hours=(6, 22), timezone="America/New_York"),
  on_alert=my_delivery_callback,  # your own callback; called only when the agent has something to report
)
await loop.start()  # call from inside an async function
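The scheduling ideas behind active hours and backoff can be sketched in a few lines. This is not HeartbeatLoop's code. It assumes active_hours=(6, 22) means local hours from 06:00 up to, but not including, 22:00, and that backoff doubles the interval after each consecutive failure. Both are guesses about semantics the source does not spell out.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def is_active(now: datetime, active_hours: Tuple[int, int]) -> bool:
    """True when now's hour is in [start, end); also handles windows that wrap midnight."""
    start, end = active_hours
    if start <= end:
        return start <= now.hour < end
    return now.hour >= start or now.hour < end


def next_delay(interval_seconds: float, consecutive_failures: int,
               cap_seconds: float = 6 * 3600) -> float:
    """Exponential backoff: double the interval per consecutive failure, up to a cap."""
    return min(interval_seconds * (2 ** consecutive_failures), cap_seconds)
```

Pass is_active a timezone-aware value in the configured zone, for example datetime.now(ZoneInfo("America/New_York")) with ZoneInfo imported from the standard zoneinfo module.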

Graduate: Switch to create_gateway_app()

Use create_gateway_app() when you need custom approval logic, channel-aware runtimes, interceptors, multiple runtime profiles, or deeper production hooks.

from agent_gateway import (
  AnthropicProvider, ChatRuntime, GatewayServerConfig, create_gateway_app,
)

# Full control: custom providers, approval logic, channel routing, interceptors.
# See examples/07-full-production/ for the complete version.
app = create_gateway_app(
  GatewayServerConfig(
    build_chat_runtime=my_runtime_factory,
    default_provider=AnthropicProvider(),
  )
)

Runnable versions of these examples live in examples/.

Architecture

Three entry points share the same agent core:

create_agent()         --> FastAPI HTTP server (interactive chat)
run_autonomous()       --> Headless one-shot (cron / batch)
HeartbeatLoop          --> Persistent daemon (periodic check-in)
        |
        v
    AgentRunner (model loop: stream -> tool calls -> dispatch -> resume)
        |
        v
    ToolDispatcher
        |-- interceptors (rate limits, custom policies)
        |-- approval check (session-scoped, HTTP only)
        |-- local Python handler
        |-- MCP server (stdio)
        |-- code_execute (Docker / subprocess)
        |-- run_agent (sub-agent with own runner)
        |
        v
    EventLog --> SSE events (HTTP) or RunOutput (autonomous)

The same backend can serve multiple frontends. Pass context.channel to shape runtime behavior per client without rewriting the agent loop.
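The loop and dispatch order in the diagram can be illustrated with a toy version. Everything here (ToyDispatcher, run_loop, the message shapes) is invented for the sketch and is far simpler than the real AgentRunner and ToolDispatcher. Handlers take only tool_input and return a bare result.

```python
import asyncio


class ToyDispatcher:
    """Illustrative only: run interceptors, then the approval check, then the handler."""

    def __init__(self, handlers, interceptors=(), needs_approval=(), approved=()):
        self.handlers = dict(handlers)
        self.interceptors = list(interceptors)
        self.needs_approval = set(needs_approval)
        self.approved = set(approved)

    async def dispatch(self, name, tool_input):
        for check in self.interceptors:
            check(name, tool_input)  # an interceptor raises to block the call
        if name in self.needs_approval and name not in self.approved:
            return {"status": "approval_required", "tool": name}
        if name not in self.handlers:
            return {"status": "error", "error": f"unknown tool {name}"}
        result = await self.handlers[name](tool_input)
        return {"status": "ok", "result": result}


async def run_loop(model_turn, dispatcher, user_message, max_turns=8):
    """Call the model, dispatch its tool calls, feed results back, stop on a text-only reply."""
    transcript = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = await model_turn(transcript)
        transcript.append({"role": "assistant", **reply})
        calls = reply.get("tool_calls", [])
        if not calls:
            return reply["text"], transcript
        for call in calls:
            outcome = await dispatcher.dispatch(call["name"], call["input"])
            transcript.append({"role": "tool", "name": call["name"], "content": outcome})
    raise RuntimeError("max_turns exceeded")
```

Because model_turn is just an async callable, a scripted fake can stand in for a provider when you want to exercise tool wiring without network calls.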

Comparison

| Category | ai-agent-gateway | LangGraph | LangChain | CrewAI | mcp-agent |
|---|---|---|---|---|---|
| Primary purpose | Deploying agents as services | Stateful workflow graphs | LLM app building blocks | Multi-agent role/task orchestration | MCP-centric workflow orchestration |
| Agent logic | Model-driven prompts with tools | Code-defined graph nodes and edges | Code-defined chains and agents | Code-defined crews and tasks | Code-defined workflows |
| Tool system | MCP-native plus local handlers | Bring your own adapters | Bring your own adapters | Custom tool abstractions | MCP-native |
| Server/runtime | FastAPI + SSE in core | Bring your own or LangGraph Platform | LangServe is separate | Bring your own | Bring your own |
| Sessions/auth | JWT sessions in core | Bring your own | Bring your own | Bring your own | Bring your own |
| Human approval | Built into tool dispatch | Available through interrupt/checkpoint patterns | Not a core runtime feature | Human-input patterns available | Not a core runtime feature |
| Best for | Shipping a chat-facing agent backend quickly | Explicit workflow control flow | Reusable LLM components | Team-style agent simulations | MCP-heavy automation flows |

When to use this package: you want users or clients talking to an agent over HTTP, and you do not want to build the session, SSE, approval, and tool-serving infrastructure yourself.

When not to use this package: you want explicit graph orchestration, or you are building a one-off notebook or script that does not need a reusable server runtime.

You can also combine them. For example, a LangGraph workflow can sit behind an ai-agent-gateway HTTP surface.

License

MIT
