Unified Agents SDK

Product-agnostic agent SDK with normalized provider adapters

One API. Every LLM. Any tool.
Normalize OpenAI, Anthropic, Gemini, Groq, DeepSeek, Mistral, xAI/Grok — and any OpenAI-compatible endpoint — behind a single REST/SSE interface with MCP, RAG, and bring-your-own-key support.



🤔 Why?

Every LLM provider speaks a different dialect. Tool calling, streaming, context — all incompatible. You end up writing provider-specific glue code everywhere.

Unified Agents SDK solves it once. Three ways to use it:

Before — provider-specific glue everywhere

# Different tool schema for every provider
openai_client.chat.completions.create(
    model="gpt-4o",
    tools=[{"type": "function", "function": {
        "name": "search", "parameters": {...}
    }}],
)

anthropic_client.messages.create(
    model="claude-opus-4-5",
    tools=[{"name": "search", "input_schema": {...}}],
)

groq_client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    tools=[{"type": "function", "function": {...}}],
)
# ...repeat for every provider, every format change

After — one interface, any provider

Mode 1 — CLI (no server, no code)

uag chat "Search and summarise" --profile claude
uag chat "Explain ML" --stream
uag providers

Mode 2 — Python (embed in your app, swap providers with one string)

import asyncio

from runtime.router import create_provider, resolve_provider_config
from config.settings import ProviderSettings, GatewaySettings
from core.agent_loop import AgentLoop
from core.types import NormalizedMessage, Role

async def main() -> None:
    provider_settings = ProviderSettings()
    gateway_settings = GatewaySettings()

    messages = [NormalizedMessage(role=Role.USER, content="Search and summarise")]

    # Switch provider by changing one argument — code is identical
    for profile in ["default", "claude", "gemini", "fast", "deep", "grok"]:
        cfg = resolve_provider_config(
            provider_settings, gateway_settings, profile=profile
        )
        loop = AgentLoop(provider=create_provider(cfg))
        response = await loop.run_conversation(messages)
        print(f"[{cfg.provider_name}] {response.messages[-1].content}")

asyncio.run(main())

Mode 3 — HTTP / REST (language-agnostic)

curl http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Search and summarise",
    "profile": "claude",
    "runtime": {"mcp_namespaces": ["search"]}
  }'

✨ What You Get

🔌 Provider Adapters

OpenAI · Anthropic · Gemini
Groq · DeepSeek · Mistral · xAI/Grok
Together · Ollama · Azure · any OAI endpoint

🛠️ MCP Native

Gateway-managed (any provider) or
server-side (OpenAI / Gemini / Groq / xAI).
Named presets, inline spec, or profile-bound.

💉 Context Injection

RAG · KV · static text · ContextForge
Injected before every model call.
Per-profile or per-request.

📡 SSE Streaming

Real-time token streaming
for every provider.
Same event shape, always.


🗺️ Feature Matrix

Every provider runs the same unified agent runtime (multi-hop tool loop, shared HTTP API, normalized types). Each column is one vendor integration through that runtime—not a comparison of who "has an agent." Rows below call out optional vendor-specific surfaces (built-in tools, how PDFs are passed, Mistral's separate Agents HTTP API, etc.).

Using OpenAI? Use the OpenAI Responses column (openai_responses adapter / "openai-r" profile). OpenAI's Responses API is a superset of Chat Completions and is recommended for all new projects—it adds built-in tools, MCP, stateful context, and better reasoning. The OAI-compat¹ column is the generic adapter for third-party OpenAI-compatible endpoints (Together AI, Ollama, Azure OpenAI, local models, etc.) that do not expose the Responses API.
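
For example, with the "openai-r" profile from the Agent profiles section below, a request opts into the Responses adapter by name alone:

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{"input": "Plan a short research outline on RAG.", "profile": "openai-r"}'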

Legend

Symbol Meaning
✅ Supported and wired in this SDK for that column's adapter.
— Genuine gap in this SDK for that column, or a real vendor limitation. See When the matrix shows a dash.
† Vendor-specific flow — see Footnotes (†) below the table.
‡ Automatic platform behaviour — no request change needed; see note below the table.
Feature OAI-compat¹ OpenAI Responses Anthropic Gemini Groq DeepSeek Mistral xAI/Grok
Sync chat ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅
SSE streaming ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅
Tool / function calling ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅
Multi-hop tool loops ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅
Context injection ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅
MCP tool auto-discovery ✅ ✅ ✅ ✅ ✅ ✅ — ✅
Server-side built-in tools — ✅ ✅ ✅ ✅ — ✅ ✅
Extended thinking / reasoning — ✅ ✅ ✅ ✅ ✅ ✅ ✅
Vision / multimodal input ✅ ✅ ✅ ✅ ✅ — ✅ ✅
Structured outputs (JSON schema) ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅
Prompt caching — ✅ ✅ ✅ ✅‡ ✅ — ✅
Citations — ✅ ✅ ✅† ✅ — — ✅
Document / PDF input — ✅† ✅ ✅† ✅ — ✅ —
Live web search — ✅ ✅ ✅ ✅ — ✅ ✅
Mistral Agents API (agent_id) — — — — — — ✅ —
Inline BYOK credentials ✅ ✅ ✅ ✅ ✅ ✅ ✅ ✅

¹ OAI-compat = openai_compatible adapter—the generic Chat Completions path for any OpenAI-format endpoint (Together AI, Ollama, Azure OpenAI, local models, etc.). For OpenAI itself, use the OpenAI Responses column.

‡ Groq prompt caching is automatic (prefix reuse, no request change needed). cached_tokens is reported in the usage response via prompt_tokens_details and x_groq DRAM/SRAM breakdown.
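
A minimal sketch of reading that counter from a response body; the exact nesting under usage on this gateway is an assumption, so check your actual payload:

# Hypothetical /agent-query response against a Groq profile (shape assumed)
response = {"usage": {"input_tokens": 900, "output_tokens": 40,
                      "prompt_tokens_details": {"cached_tokens": 768}}}
cached = response["usage"].get("prompt_tokens_details", {}).get("cached_tokens", 0)
print(f"cached prompt tokens: {cached}")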

Mistral Agents API (agent_id): This row is only Mistral's separate Agents HTTP API—optional agent_id routing to agents.complete / server-defined agents and built-in tools. All other providers still run full agent loops through their own chat or Responses-style APIs; they do not expose Mistral's branded Agents API, which is why the column is empty outside Mistral.

Footnotes (†)

Citations · Gemini (✅†): Grounding with Google Search returns grounding_metadata (web chunks, queries, support references). This SDK extracts and surfaces it in the usage response key grounding_metadata — see providers/gemini.py. It is citation-like grounding, not an inline text-annotation object like Anthropic's.

Document / PDF · Gemini (✅†): Upload files with the Gemini Files API, then include the file URI in user message parts (type: "file", uri, mime_type) — see _convert_user_content_parts in providers/gemini.py.

Document / PDF · OpenAI Responses (✅†): Pass {"type": "file_search", "vector_store_ids": [...]} in built_in_tools. This SDK wires it through _to_tools in providers/openai_responses.py. See OpenAI's file inputs guide for creating files and passing them inline.
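
A request-body sketch for that footnote, assuming built_in_tools rides in the options object the gateway forwards to the provider (the vector store ID is a placeholder):

POST /agent-query
{
  "profile": "openai-r",
  "input": "Summarise the refund policy PDF",
  "options": {
    "built_in_tools": [
      {"type": "file_search", "vector_store_ids": ["vs_123"]}
    ]
  }
}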

When the matrix shows a dash

Every — is a genuine gap in this SDK or a real vendor limitation.

Row · column Reason
MCP · Mistral Mistral supports MCP via RunContext + Agents SDK — a different API pattern. This gateway has not yet bridged it to the unified MCP_SERVERS / mcp_namespaces path.
Server-side built-in tools · OAI-compat¹ Hosted tools require the Responses API. The openai_compatible column is generic Chat Completions — bring your own function tools.
Server-side built-in tools · DeepSeek DeepSeek has no hosted tool service. All tools are user-supplied via function calling.
Extended thinking · OAI-compat¹ Reasoning parameters are wired only in the OpenAI Responses adapter. The generic openai_compatible column does not map them (though extra kwargs are forwarded).
Vision · DeepSeek The official DeepSeek API (deepseek-chat, deepseek-reasoner) does not support image input. DeepSeek VL is a separate product.
Prompt caching · OAI-compat¹ The openai_compatible adapter is generic Chat Completions. OpenAI's server-side prefix cache is implicit; no explicit cache-control API on this path.
Prompt caching · Mistral Mistral has no documented explicit prompt-cache API.
Citations · OAI-compat¹ Inline url_citation annotations require Responses + web search tool. Generic Chat Completions returns no citation annotation objects.
Citations · DeepSeek / Mistral Neither API returns structured citation annotation objects.
Document / PDF · OAI-compat¹ The generic openai_compatible Chat Completions path has no native document block. Use the openai_responses column with file_search (see † above).
Document / PDF · DeepSeek DeepSeek's chat API has no native document or PDF block.
Document / PDF · xAI xAI Grok has no documented inline file/PDF input on the Responses path used by this SDK.
Live web search · OAI-compat¹ Hosted web search requires the Responses API. Generic Chat Completions uses function tools you supply.
Live web search · DeepSeek No hosted search tool; implement search as a custom function.

Curated agent harnesses

These are additional surfaces beyond the core LLM columns above. They reuse the same UAG patterns: BaseProvider + NormalizedResponse / StreamEvent where a programmatic driver exists, ContextRegistry for on-disk rules, and GatewaySettings.MCP_SERVERS for HTTP/SSE MCP bridges.

Shared utilities (no duplicated walk logic): context/md_hierarchy.py, tools/mcp_config_loader.py, one AgentHarnessSettings class in config/settings.py, and optional AGENTS.md via context/agents_md.py.

MCP columns explained: Bridge = reads the harness's own config file at startup and adds those servers as named presets in MCP_SERVERS (per-request, not auto-connected). Gateway-managed = gateway connects and executes tools, works with any provider. Agent-managed = MCP config passed directly to the agent; the agent runs its own tool loop, gateway does not execute tools. See MCP — Model Context Protocol for the full explanation.

Harness Provider How MCP works Context bridge Notes
Claude Agent SDK claude_agent Agent-managed. Pass mcp_servers in options or AgentProfile.extra → forwarded to ClaudeAgentOptions. Claude Agent SDK subprocess connects and runs its own tool loop. Gateway ToolRegistry is ignored. CLAUDE.md / skills loaded by SDK when setting_sources set in extra pip install 'unified-agents-sdk[claude-agent]'; wraps claude_agent_sdk.query.
Gemini CLI — (no SDK) Bridge only. GEMINI_CLI_MCP_BRIDGE=true reads ~/.config/gemini/settings.json, adds HTTP/SSE servers to MCP_SERVERS as named presets. Use any standard provider with mcp_namespaces to call them. stdio-only servers in the config are skipped. GEMINI_CLI_MD_ENABLED, GEMINI_CLI_SKILLS_ENABLED register gemini_md / gemini_skills context sources No headless Gemini CLI API; context + MCP preset bridge only.
Cursor Cloud Agents cursor_cloud_agent Not supported. Cursor Cloud Agent REST API does not expose an MCP endpoint for callers. — REST job runner + webhook proxy. CURSOR_API_KEY, repository in AgentProfile.extra.
Codex CLI codex Inverted bridge. CODEX_MCP_ENABLED=true starts codex mcp-server (stdio subprocess) at bootstrap and loads its tools into the global ToolRegistry — making Codex tools available to other providers, not the Codex provider itself. Codex's own tool use is managed by Codex internally via ~/.codex/config.toml. AGENTS_MD_ENABLED; or pass project_doc path in extra codex -q subprocess by default; use_app_server=true for JSON-RPC mode.
Windsurf Cascade — (no SDK) Bridge only. WINDSURF_MCP_BRIDGE=true reads ~/.codeium/windsurf/mcp_config.json, adds HTTP/SSE servers to MCP_SERVERS as named presets. WINDSURF_RULES_ENABLED loads .windsurf/rules/ No headless Cascade API; context + MCP preset bridge only.
Cline — (no SDK) Not applicable. Cline is a VS Code extension; no MCP bridge or remote API. CLINE_RULES_ENABLED loads .clinerules Rules file bridge only.
GitHub Copilot copilot (preview) Bridge only. COPILOT_MCP_BRIDGE=true adds github_mcp preset pointing at https://api.githubcopilot.com/mcp with your GitHub token. Use any standard provider with mcp_namespaces: ["github_mcp"] to get repos/issues/PRs/code-search tools. copilot provider itself does not support tool calling (preview). — pip install 'unified-agents-sdk[copilot]'; token via COPILOT_GITHUB_TOKEN / GH_TOKEN.

Optional install groups: [claude-agent], [codex] (binary separate), [copilot].


📦 Dependencies

All constraints are declared in pyproject.toml. Upper bounds prevent silent breakage from future major-version API changes.

Provider SDKs

Provider Package Tested version Min required Adapter
OpenAI / OAI-compat openai 2.30.0 >=2.0,<3 openai_compatible, openai_responses, xai
Anthropic anthropic 0.86.0 >=0.86,<1 anthropic
Google Gemini google-genai 1.47.0 >=1.47,<2 gemini
Groq groq 1.0.0 >=1.0,<2 groq
Mistral mistralai 1.10.0 >=1.10,<2 mistral
MCP mcp 1.26.0 >=1.26,<2 All (tool auto-discovery)
Claude Agent SDK claude-agent-sdk optional [claude-agent] extra claude_agent provider
GitHub Copilot SDK github-copilot-sdk optional [copilot] extra copilot provider

Note: openai>=2.0 is required — the Responses API (client.responses.create) only exists in the v2 SDK. The xai adapter uses the same SDK pointed at https://api.x.ai/v1.
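
Illustratively, that is the standard OpenAI v2 SDK pattern with a different base URL (the adapter constructs roughly this client internally; shown only as a sketch):

from openai import OpenAI

client = OpenAI(api_key="xai-...", base_url="https://api.x.ai/v1")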

Framework & transport

Package Tested version Constraint Purpose
fastapi 0.135.2 >=0.115,<1 HTTP / SSE server
uvicorn[standard] 0.42.0 >=0.30,<1 ASGI server
httpx 0.28.1 >=0.27,<1 Async HTTP (OAI-compat, DeepSeek)
pydantic 2.12.5 >=2.0,<3 Data models & validation
pydantic-settings 2.13.1 >=2.0,<3 Env-based config
anyio 4.13.0 >=4.0,<5 Async primitives
typer 0.24.1 >=0.12,<1 CLI (uag serve)
rich 14.3.3 >=13.0,<15 CLI output formatting

🚀 Quick Start

Option A — pip install (recommended)

pip install unified-agents-sdk

Option B — from source

git clone https://github.com/PhilipAD/Unified-Agents-SDK.git
cd Unified-Agents-SDK
pip install -e ".[dev]"

Configure

cp .env.example .env
# Add at minimum one provider key:
#   OPENAI_API_KEY=sk-...

Start the server

uag serve
# or: uag serve --reload --port 8000

Server: http://localhost:8000 · Swagger UI: http://localhost:8000/docs

First call (HTTP)

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{"input": "What is the capital of France?"}' | python -m json.tool
{
  "output": "The capital of France is Paris.",
  "tool_traces": [],
  "usage": {"input_tokens": 14, "output_tokens": 9},
  "provider": "openai_compatible",
  "model": "gpt-4o",
  "warnings": [],
  "errors": []
}

First call (CLI — no server needed)

uag chat "What is the capital of France?"
uag chat "Explain quantum computing" --profile claude --stream
uag chat "2+2?" --json

๐Ÿ—๏ธ Architecture

  Your app / curl / Postman
         │
         ▼
  ┌────────────────────────────────────────────────────────┐
  │   POST /agent-query   ·   POST /agent-query/stream     │
  │          FastAPI HTTP + SSE layer  (api/http.py)       │
  └────────────────────────┬───────────────────────────────┘
                           │
               ┌───────────▼───────────┐
               │       AgentLoop       │  core/agent_loop.py
               │  1. Inject context    │
               │  2. Call provider     │
               │  3. Execute tools     │ ◄─── gateway executes
               │  4. Loop until done   │      MCP tools here
               └───────┬────────┬──────┘      (Path 1)
                       │        │
         ┌─────────────▼──┐  ┌──▼───────────────┐
         │  ToolRegistry  │  │ ContextRegistry  │
         │   tools/       │  │   context/       │
         │  • MCP tools   │◄─┤  • AGENTS.md     │
         │    (gateway-   │  │  • CLAUDE.md     │
         │     managed)   │  │  • rules files   │
         │  • HTTP tools  │  │  • RAG / HTTP    │
         └────────────────┘  └──────────────────┘
                       │
         ┌─────────────▼────────────────────────────────────────────┐
         │                  Provider Adapters                       │
         │  openai_compatible │ openai_responses │ anthropic        │
         │  gemini │ groq │ deepseek │ mistral │ xai                │
         └───────────────────────┬──────────────────────────────────┘
                                 │
                    ┌────────────▼────────────────┐
                    │  Provider-native MCP        │  (Path 2)
                    │  openai_responses / xai /   │
                    │  groq / gemini only         │
                    │  LLM backend calls MCP      │
                    │  server directly            │
                    └─────────────────────────────┘

  At startup (bootstrap.py):
  ┌──────────────────────────────────────────────────────────┐
  │  MCP Bridges (optional, read existing agent configs)     │
  │  GEMINI_CLI_MCP_BRIDGE  → reads settings.json            │
  │  WINDSURF_MCP_BRIDGE    → reads mcp_config.json          │  → merge into
  │  COPILOT_MCP_BRIDGE     → builds github_mcp preset       │    MCP_SERVERS
  │  CODEX_MCP_ENABLED      → starts codex mcp-server,       │    named presets
  │                           loads tools globally           │
  └──────────────────────────────────────────────────────────┘

Layer reference

Layer Package Responsibility
Config config/ Pydantic-settings: env-loaded profiles, MCP presets, context presets
Core types core/ Normalized messages, tool calls, responses, stream events
Providers providers/ 8 adapters: translate normalized types to/from each provider's native API
Tools tools/ Registry, MCP loader, inline MCP HTTP client
Context context/ Registry, ContextForge adapter, RAG/KV fetch
Runtime runtime/ Router, profile resolution, bootstrap, SSE helpers
API api/ FastAPI endpoints, dynamic registry composition

⚙️ Configuration

All settings are env vars (.env file supported). Full reference in .env.example.

Provider selection
# Built-in providers โ€” set whichever you use
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AI...
GROQ_API_KEY=gsk-...
DEEPSEEK_API_KEY=sk-ds-...
MISTRAL_API_KEY=...
XAI_API_KEY=xai-...

# Extra OpenAI-compatible providers (Together, Ollama, Azure…)
OPENAI_COMPATIBLE_PROVIDERS={"together":{"api_key":"...","base_url":"https://api.together.xyz/v1","model":"meta-llama/Llama-3.3-70B-Instruct-Turbo"}}
Agent profiles

Select provider + model + presets per request with a single "profile" field.

AGENT_PROFILES={
  "default":    {"provider_name": "openai_compatible"},
  "openai-r":   {"provider_name": "openai_responses", "model": "gpt-4o"},
  "claude":     {"provider_name": "anthropic", "model": "claude-opus-4-5"},
  "gemini":     {"provider_name": "gemini", "model": "gemini-2.5-pro"},
  "fast":       {"provider_name": "groq", "model": "llama-3.3-70b-versatile"},
  "deep":       {"provider_name": "deepseek", "model": "deepseek-reasoner"},
  "mistral":    {"provider_name": "mistral", "model": "mistral-large-latest"},
  "grok":       {"provider_name": "xai", "model": "grok-4.20-reasoning"},
  "researcher": {
    "provider_name": "openai_responses",
    "mcp_namespaces": ["search"],
    "context_names":  ["company_info"]
  }
}

Profile-level mcp_namespaces and context_names are automatically merged into every request using that profile — callers don't need to repeat them.
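
For example, this request picks up the search MCP tools and the company_info context purely from the researcher profile defined above:

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{"input": "Find recent papers on retrieval-augmented generation.", "profile": "researcher"}'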

Named MCP servers

Define once in .env, reference by name in API calls. Credentials never leave the server.

MCP_SERVERS={
  "search": {"url": "http://search-mcp.internal/mcp",
             "transport": "streamable_http",
             "headers": {"Authorization": "Bearer sk-xyz"}},
  "files":  {"url": "http://files-mcp.internal/sse", "transport": "sse"}
}
Named context sources
NAMED_CONTEXTS={
  "company_info": {"mode": "static",
                   "text": "We are Acme Corp, a global e-commerce platform."},
  "product_faq":  {"mode": "http", "source": "rag",
                   "url": "http://kb.internal/search",
                   "payload_template": {"query": "{input}"},
                   "max_chars": 4000}
}

📡 API Usage

Both endpoints accept the same JSON body.

Sync query

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Summarise the refund policy.",
    "profile": "default",
    "context": {"system_prompt": "You are a helpful support agent."},
    "runtime": {
      "context_names": ["company_info"],
      "mcp_namespaces": ["ticketing"]
    }
  }'

Streaming (SSE)

curl -N http://localhost:8000/agent-query/stream \
  -H "Content-Type: application/json" \
  -d '{"input": "Write a haiku about distributed systems."}'
event: chunk
data: {"type": "chunk", "delta": "Nodes whisper in time\n"}

event: chunk
data: {"type": "chunk", "delta": "Consensus blooms like spring rain\n"}

event: done
data: {"type": "done", "usage": {"input_tokens": 12, "output_tokens": 17}}

Inline MCP server (per-request)

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Create a GitHub issue for the pagination bug.",
    "runtime": {
      "mcp_servers": [{
        "url": "https://mcp.github.example.com/mcp",
        "namespace": "github",
        "transport": "streamable_http",
        "headers": {"Authorization": "Bearer ghp_..."}
      }]
    }
  }'

Inline HTTP tool

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What is the weather in London?",
    "runtime": {
      "tools": [{
        "name": "weather",
        "description": "Get current weather for a city",
        "json_schema": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        },
        "url": "https://api.weather.example.com/current",
        "method": "GET",
        "argument_mode": "query"
      }]
    }
  }'

Bring-your-own-key (BYOK)

Requires ALLOW_PER_REQUEST_PROVIDER_CREDENTIALS=true in .env.

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello!",
    "provider_credentials": {
      "api_key": "sk-user-supplied",
      "model": "gpt-4o-mini"
    }
  }'

🔌 MCP — Model Context Protocol

MCP is a standard for exposing tools to AI agents. An MCP server is a process (local or remote) that advertises a list of tools and executes them when called. This SDK wires MCP into the agent loop in two completely different ways depending on what you are using.


Two MCP paths

Path 1 — Gateway-managed MCP (works with every provider)

The gateway connects to MCP servers, discovers their tools, and passes them to the LLM as ordinary function definitions. When the LLM calls a tool, the gateway executes it and returns the result — the LLM never touches the MCP server directly.

Request → _compose_registries()
               ├─ InlineMCPClient.connect(url)    ← gateway connects
               ├─ list_tools()                    ← gateway discovers tools
               └─ ToolRegistry.register(...)      ← stored for this request

AgentLoop
    └─ provider.run(messages, tools=[ToolDefinition, ...])
         └─ LLM returns tool_call
              └─ AgentLoop executes it → result sent back to LLM → loop

This works with every provider (Anthropic, OpenAI, Gemini, Groq, Mistral, DeepSeek, xAI, any OAI-compatible endpoint) because from the provider's perspective it is just receiving a list of function definitions — it has no idea they came from MCP.

Three ways to attach MCP servers per request:

1. Named preset — define credentials once in .env, reference by name:

# .env
MCP_SERVERS={
  "search": {"url": "http://search-mcp.internal/mcp",
             "transport": "streamable_http",
             "headers": {"Authorization": "Bearer sk-..."}},
  "github": {"url": "https://api.githubcopilot.com/mcp",
             "headers": {"Authorization": "Bearer ghp_..."}}
}
POST /agent-query
{
  "input": "Search for recent papers on RAG",
  "runtime": { "mcp_namespaces": ["search"] }
}

2. Permanently bound to a profile — callers never have to specify it:

AGENT_PROFILES={
  "researcher": {
    "provider_name": "anthropic",
    "mcp_namespaces": ["search", "github"]
  }
}

Any request with "profile": "researcher" automatically gets those MCP tools. No mcp_namespaces needed in the request body.

3. Inline spec — full connection details in the request body, no pre-registration needed:

POST /agent-query
{
  "input": "Create a GitHub issue for the pagination bug",
  "runtime": {
    "mcp_servers": [{
      "url": "https://api.githubcopilot.com/mcp",
      "namespace": "github",
      "transport": "streamable_http",
      "headers": {"Authorization": "Bearer ghp_..."}
    }]
  }
}

Path 2 — Provider-native (server-side) MCP

Some providers support receiving MCP server specs and connecting to them themselves. The gateway is not involved in tool execution at all — the LLM backend calls the MCP server directly. You pass the MCP config in options per request:

POST /agent-query
{
  "profile": "openai-r",
  "input": "Search and summarise recent news on LLMs",
  "options": {
    "mcp_servers": [
      {
        "type": "mcp",
        "server_url": "https://my-search-mcp.com/mcp",
        "server_label": "search",
        "require_approval": "never",
        "headers": {"Authorization": "Bearer sk-..."}
      }
    ]
  }
}

Providers that support this and their specific field names:

Provider Profile options key Notes
OpenAI Responses openai-r mcp_servers Also supports connector_id, defer_loading
xAI / Grok grok mcp_servers Same wire format; no connector_id
Groq fast mcp_servers Routes through Groq's Responses API path
Gemini gemini mcp_servers Converted to genai_types.McpServer; streamable HTTP only

When to use Path 2 instead of Path 1: when you want the LLM provider's infrastructure to call the MCP server (lower latency for remote tools, no gateway round-trip per tool call), or when the MCP server requires direct auth that you do not want the gateway to proxy.


MCP bridges — what they are and why

Several curated agent harnesses (Gemini CLI, Windsurf, Codex, GitHub Copilot) each maintain their own MCP server configs in their own config files on disk. An MCP bridge is a bootstrap-time reader that parses those existing config files and translates them into named MCPServerPreset entries that the UAG gateway can use.

Why this exists: You may already have MCP servers configured in Windsurf or Gemini CLI. Bridges mean you do not have to re-enter the same server URLs and credentials into .env — the gateway reads the config files those agents already use and surfaces those servers through the unified MCP_SERVERS preset system.

What a bridge is not: A bridge does not automatically connect to anything at startup. It registers the MCP servers as named presets — callers still reference them by namespace in their requests. The connection itself is always per-request.

Enable bridges in .env:

# Parse ~/.config/gemini/settings.json and merge servers into MCP_SERVERS
GEMINI_CLI_MCP_BRIDGE=true

# Parse ~/.codeium/windsurf/mcp_config.json and merge servers into MCP_SERVERS
WINDSURF_MCP_BRIDGE=true

# Add GitHub's remote MCP server as a named preset (github_mcp)
COPILOT_MCP_BRIDGE=true
COPILOT_GITHUB_TOKEN=ghp_...    # or GH_TOKEN / GITHUB_TOKEN

# Start codex mcp-server (stdio) at startup and load tools globally
CODEX_MCP_ENABLED=true

Each bridge and what it does at startup:

Bridge flag Source file read Result
GEMINI_CLI_MCP_BRIDGE ~/.config/gemini/settings.json (or GEMINI_CLI_SYSTEM_CONFIG_DIR) Parses mcpServers, adds HTTP/SSE servers as named presets in MCP_SERVERS (stdio servers skipped — not remotely accessible)
WINDSURF_MCP_BRIDGE ~/.codeium/windsurf/mcp_config.json (or WINDSURF_MCP_CONFIG_PATH) Same pattern: HTTP/SSE servers added as named presets
COPILOT_MCP_BRIDGE No file — constructs preset from token + URL Adds a github_mcp named preset pointing at https://api.githubcopilot.com/mcp with your GitHub Bearer token
CODEX_MCP_ENABLED No file — starts subprocess Spawns codex mcp-server over stdio, loads its tools into the global ToolRegistry at startup — the only bridge that auto-connects

After bridges run, all merged servers are available as named presets just like any MCP_SERVERS entry:

{ "runtime": { "mcp_namespaces": ["github_mcp", "search"] } }

MCP for curated agent harnesses

Claude Agent, Codex CLI, and GitHub Copilot behave differently from standard providers. They each manage their own execution environments — the gateway's ToolRegistry is not forwarded to them.

Claude Agent SDK (claude_agent provider)

MCP servers are configured inside ClaudeAgentOptions and handled entirely by the Claude Agent SDK subprocess. The gateway's tool list is dropped with a warning if passed.

Configure via options in the request or permanently in AgentProfile.extra:

POST /agent-query
{
  "profile": "claude_agent",
  "input": "Search the web for recent AI safety papers",
  "options": {
    "mcp_servers": [
      {
        "type": "stdio",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-brave-search"],
        "env": {"BRAVE_API_KEY": "sk-..."}
      }
    ],
    "allowed_tools": ["mcp__brave-search__brave_web_search"],
    "permission_mode": "acceptEdits"
  }
}

Or permanently in .env:

AGENT_PROFILES={
  "claude_agent": {
    "provider_name": "claude_agent",
    "model": "claude-opus-4-5",
    "extra": {
      "mcp_servers": [{"type": "stdio", "command": "npx",
                       "args": ["-y", "@modelcontextprotocol/server-brave-search"],
                       "env": {"BRAVE_API_KEY": "sk-..."}}],
      "allowed_tools": ["mcp__brave-search__brave_web_search"]
    }
  }
}

The Claude Agent SDK connects to those MCP servers internally and runs the full tool-calling loop itself. The gateway receives only the final text output.

Codex CLI (codex provider)

Codex manages its own tools via its own config (~/.codex/config.toml) and sandbox. The gateway tool list is dropped with a warning.

Codex MCP works the other direction: when CODEX_MCP_ENABLED=true, the gateway starts codex mcp-server as a stdio subprocess at startup and loads the tools it exposes into the global ToolRegistry. Those tools then become available to other providers (Anthropic, OpenAI, etc.) — not to the Codex provider itself.

CODEX_MCP_ENABLED=true        # start codex mcp-server at startup
CODEX_BINARY=codex            # path to binary (default: codex)

After this, any standard provider can use Codex tools:

{
  "profile": "anthropic",
  "input": "Run the test suite and report failures",
  "runtime": { "use_global_tools": true }
}

GitHub Copilot (copilot provider)

The Copilot SDK provider is a technical preview and does not support tool calling. However, GitHub's own MCP server (https://api.githubcopilot.com/mcp) works fully via the gateway using Path 1 — the gateway connects to it and any standard provider can use GitHub's tools (repos, issues, PRs, code search):

COPILOT_MCP_BRIDGE=true
COPILOT_GITHUB_TOKEN=ghp_...
COPILOT_MCP_TOOLSETS=["repos","issues","pulls"]   # optional filter
{
  "profile": "anthropic",
  "input": "List open PRs in my org and summarise the review status",
  "runtime": { "mcp_namespaces": ["github_mcp"] }
}

Postman: The collection's folder 11 — MCP README scenarios contains saved bodies for Path 1 (github_mcp preset), Path 2 (options.mcp_servers on openai-r), Claude Agent MCP, and the Codex inverted-bridge note. Folders 6.8–6.10 cover md_hierarchy / md_files / md_glob dynamic contexts. See Postman Collection.


📋 Request Schema

Top-level fields
Field Type Required Description
input string yes User message / instruction
profile string no Agent profile name (default: "default")
agent_id string no Agent identifier; combined with profile for lookup
context object no system_prompt + any template variables
options object no Extra kwargs forwarded to provider (temperature, max_tokens, …)
runtime object no Per-request tool / context overrides
provider_credentials object no BYOK: api_key, model, base_url
runtime fields
Field Type Default Description
use_global_tools bool true Include globally registered tools
use_global_contexts bool true Include globally registered contexts
namespace string — Prefix for all tools/contexts registered in this request
mcp_namespaces string[] [] Keys from MCP_SERVERS in .env — connect per-request
context_names string[] [] Keys from NAMED_CONTEXTS in .env — inject per-request
mcp_servers object[] [] Inline MCP: url, namespace, transport, headers
tools object[] [] Inline HTTP tools: name, description, json_schema, url, method
contexts object[] [] Inline contexts: name, mode, text / url

Full schema and SSE event reference: docs/API_SPEC.md
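
Putting the schema together, one request can combine profile, system prompt, provider options, and runtime overrides (the namespaces here are illustrative):

curl -s http://localhost:8000/agent-query \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Triage the newest support tickets.",
    "profile": "default",
    "context": {"system_prompt": "You are a support triage agent."},
    "options": {"temperature": 0.2},
    "runtime": {
      "mcp_namespaces": ["ticketing"],
      "context_names": ["company_info"],
      "use_global_tools": false
    }
  }'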


📬 Postman Collection

Import postman/unified-agents-sdk.postman_collection.json for 62 ready-made requests across 11 folders. The collection info.description is kept aligned with this README (MCP two paths, bridges, dynamic markdown contexts, BYOK).

Folder Requests What it covers
1 — Basic Queries 3 Minimal, system prompt, agent_id targeting
2 — Provider Profiles 9 Default, Claude, Gemini, Groq, DeepSeek, options, openai-r, xAI Grok, Mistral
3 — Named Presets 5 MCP namespaces, context names, profile-baked presets, mixed inline
4 — Inline MCP Servers 5 Streamable HTTP, SSE, auth headers, multi-server, tenant namespace
5 — Inline HTTP Tools 5 POST/JSON, GET/query, multi-tool, auth, isolated
6 — Inline Contexts 10 Static, templates, RAG, GET, multi-context, ContextForge, isolated, md_hierarchy, md_files, md_glob
7 — Kitchen Sink 3 Full combined, sandboxed, DevOps (MCP + HTTP tool + RAG)
8 — Streaming (SSE) 5 Minimal, profile, context, MCP, warning event
9 — Error Cases 6 403 dynamic registration, unknown preset, MCP failure, context skip, bad key
10 — BYOK 7 api_key, model, base_url, Groq, Anthropic, stream, 403, empty credentials
11 — MCP README scenarios 4 github_mcp bridge preset, Path 2 options.mcp_servers (openai-r), Claude Agent mcp_servers, Codex inverted bridge (global codex.* tools)

Set the base_url collection variable to your server address. Use mcp_server_url, openai_key, anthropic_key, groq_key where the request body references {{…}}.


🔧 Extending the SDK

Add a new LLM provider
  1. Create providers/myprovider.py subclassing BaseProvider.
  2. Implement run() (sync) and stream() (async generator of StreamEvent).
  3. Register in runtime/router.py:
from providers.myprovider import MyProvider
PROVIDERS["myprovider"] = MyProvider
Add a new tool source

Register any async callable into ToolRegistry:

from tools.registry import ToolRegistry, ToolSource

# Any async callable whose keyword arguments match the JSON schema works
async def my_async_fn(q: str) -> str:
    return f"result for {q}"

registry = ToolRegistry()  # or reuse the registry the gateway composes per request
registry.register(
    name="my_tool",
    description="Does something useful",
    json_schema={"type": "object", "properties": {"q": {"type": "string"}}},
    source=ToolSource.PYTHON,
    handler=my_async_fn,
)
Add a new context source
from context.registry import ContextRegistry, ContextSource, RegisteredContext

async def fetch_my_context(**kwargs) -> str:
    return "relevant background information"

registry = ContextRegistry()  # or reuse the registry the gateway composes per request
registry.register(RegisteredContext(
    name="my_context",
    source=ContextSource.RAG,
    fetch=fetch_my_context,
))

🖥️ CLI Reference

After pip install, the uag command is available globally.

uag serve            Start the HTTP gateway
uag chat "prompt"    Send a query directly (no server needed)
uag providers        List registered providers and config status
uag serve
uag serve                          # defaults: 0.0.0.0:8000
uag serve --port 3000 --reload     # dev mode
uag serve --workers 4              # production
uag chat
uag chat "What is 2+2?"                         # default profile
uag chat "Explain ML" --profile claude           # specific provider
uag chat "Write a poem" --stream                 # stream tokens live
uag chat "Summarise this" --json                 # raw JSON output
uag chat "Be brief" --system "You are terse."    # custom system prompt
uag providers
uag providers
# Prints a table of all providers, their adapter class, env key, and whether configured

🧪 Running Tests

# Full suite — no live API keys needed
make test

# With coverage
make test-cov

# Or directly with pytest
pytest -q -m "not integration"

238 tests, all passing, all offline.


📁 Project Structure

unified-agents-sdk/
├── api/               FastAPI HTTP/SSE endpoints + dynamic registry composition
├── config/            Pydantic-settings: providers, profiles, MCP & context presets
├── context/           Context registry, ContextForge adapter
├── core/              Normalized types, agent loop, durable execution primitives
├── docs/              Architecture reference and API specification
├── postman/           Postman collection (62 requests, 11 folders)
├── providers/         OpenAI, OpenAI Responses, Anthropic, Gemini, Groq, DeepSeek, Mistral, xAI adapters
├── runtime/           Router, profile resolution, bootstrap, SSE helpers
├── tests/             pytest test suite (238 tests, all offline)
├── tools/             Tool registry, MCP loader, inline MCP HTTP client
├── cli.py             Typer CLI (uag serve / chat / providers)
├── main.py            Application entry point (uvicorn)
├── Makefile           Dev task runner (make test, make lint, make serve, ...)
├── .env.example       Fully documented environment variable reference
├── py.typed           PEP 561 typed package marker
└── pyproject.toml     Package metadata, build config, ruff + pytest config

🗺️ Roadmap

Version What Status
v0.1.0 Core gateway: 3 providers, tool loop, SSE, MCP, named presets, BYOK, 101 tests ✅ Shipped
v0.2.0 Full provider coverage: 8 dedicated adapters (OpenAI, OpenAI Responses, Anthropic, Gemini, Groq, DeepSeek, Mistral, xAI/Grok), extended thinking/reasoning, server-side tools, multimodal I/O, citations, 200 tests ✅ Shipped
v0.3.0 providers/_shared.py, OpenAI-compat usage normalisation (input_tokens/output_tokens), Gemini Vertex/http_options, MCP + Postman docs (62 requests), README MCP two-path guide ✅ Shipped
v0.4 Auth middleware, rate limiting, request logging 🔜 Planned
v0.5 Agent handoffs — native multi-agent delegation via call_agent meta-tool 🔜 Planned
v0.6 Durable execution — resume interrupted runs, persistent step records 💡 Exploring
v0.7 Provider marketplace — plug-in registry for community adapters 💡 Exploring
v1.0 Production-grade — auth, permissions, audit logs, HA deployment guide 💡 Exploring

🤝 Contributing

Contributions welcome! See CONTRIBUTING.md for dev setup, standards, and the PR process.

Ideas especially wanted:

  • ๐Ÿ”Œ New provider adapters โ€” Cohere, Bedrock, Azure OpenAI
  • ๐Ÿ› ๏ธ New tool sources โ€” gRPC, GraphQL, database queries
  • ๐Ÿ’‰ New context sources โ€” vector stores, custom KV, document stores
  • ๐Ÿ“– Documentation โ€” tutorials, examples, integration guides

🔒 Security

See SECURITY.md for the vulnerability disclosure policy.

Safe defaults: API keys load from .env server-side. BYOK (provider_credentials) and dynamic runtime registration require explicit opt-in via environment flags.


📄 License

MIT — see LICENSE.


Unified Agents SDK — One API. Every LLM. Any tool.

