Invocations protocol for Azure AI Hosted Agents

Azure AI Agent Server Invocations client library for Python

The azure-ai-agentserver-invocations package provides the invocation protocol endpoints for Azure AI Hosted Agent containers. It plugs into the azure-ai-agentserver-core host framework and supports two transports on the same host:

HTTP (invocations protocol) — POST /invocations, GET /invocations/{id}, POST /invocations/{id}/cancel, GET /invocations/docs/openapi.json.
WebSocket (invocations_ws protocol) — full-duplex streaming at /invocations_ws, registered with @app.ws_handler.

Getting started

Install the package

pip install azure-ai-agentserver-invocations

This automatically installs azure-ai-agentserver-core as a dependency.

Prerequisites

Python 3.10 or later

Key concepts

InvocationAgentServerHost

InvocationAgentServerHost is an AgentServerHost subclass that adds invocation protocol endpoints. It provides decorator methods for registering handler functions:

@app.invoke_handler — Required. Handles POST /invocations.
@app.get_invocation_handler — Optional. Handles GET /invocations/{id}.
@app.cancel_invocation_handler — Optional. Handles POST /invocations/{id}/cancel.
@app.ws_handler — Optional. Handles WebSocket connections at /invocations_ws.

Protocol endpoints

Method	Route	Required	Description
`POST`	`/invocations`	Yes	Execute the agent
`GET`	`/invocations/{invocation_id}`	No	Retrieve invocation status or result
`POST`	`/invocations/{invocation_id}/cancel`	No	Cancel a running invocation
`GET`	`/invocations/docs/openapi.json`	No	Serve the agent's OpenAPI 3.x spec
`WS`	`/invocations_ws`	No	Full-duplex WebSocket transport (`invocations_ws` protocol)

Request and response headers

The SDK automatically manages these headers on every invocation:

Header	Direction	Description
`x-agent-invocation-id`	Request & Response	Echoed if provided, otherwise a UUID is generated
`x-agent-session-id`	Response (POST only)	Resolved from `agent_session_id` query param, `FOUNDRY_AGENT_SESSION_ID` env var, or generated UUID
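
The echo-or-generate rule for x-agent-invocation-id can be pictured with a short sketch. It only illustrates the documented behaviour and is not the SDK's implementation; the function name resolve_invocation_id is hypothetical, and treating a blank header value as missing is an assumption.

```python
import uuid


def resolve_invocation_id(headers: dict[str, str]) -> str:
    """Return the caller-supplied invocation ID, or a fresh UUID if none was sent.

    Hypothetical helper that mirrors the documented rule, not the SDK's code.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    # Assumption: a blank value is treated the same as a missing header.
    supplied = normalised.get("x-agent-invocation-id", "").strip()
    return supplied or str(uuid.uuid4())
```

A caller that wants to correlate its own logs with the agent's can therefore send its own ID and expect to see the same value on the response.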

Session ID resolution

Session IDs group related invocations into a conversation. The SDK resolves the session ID in the following order:

1. agent_session_id query parameter on POST /invocations
2. FOUNDRY_AGENT_SESSION_ID environment variable
3. Auto-generated UUID

The resolved session ID is available in handler functions via request.state.session_id.
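
The resolution order above can be sketched as a plain function. This is a minimal illustration under stated assumptions, not the SDK's code: resolve_session_id is a hypothetical name, and treating empty strings as absent is an assumption.

```python
import uuid
from collections.abc import Mapping


def resolve_session_id(query_params: Mapping[str, str], environ: Mapping[str, str]) -> str:
    """Pick a session ID using the documented precedence (hypothetical helper)."""
    # 1. An explicit agent_session_id query parameter wins.
    session_id = query_params.get("agent_session_id")
    if session_id:
        return session_id
    # 2. Otherwise fall back to the FOUNDRY_AGENT_SESSION_ID environment variable.
    session_id = environ.get("FOUNDRY_AGENT_SESSION_ID")
    if session_id:
        return session_id
    # 3. Otherwise generate a fresh UUID.
    return str(uuid.uuid4())
```

Passing the environment in as a mapping (for example os.environ) keeps the function easy to exercise in tests without touching the real process environment.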

Handler access to SDK state

Inside handler functions, the SDK sets these attributes on request.state:

request.state.invocation_id — The invocation ID (echoed or generated).
request.state.session_id — The resolved session ID (POST /invocations only).
request.state.user_id — The per-user ID from x-agent-user-id (container protocol 2.0.0); use for per-user state.
request.state.call_id — The per-request call ID from x-agent-foundry-call-id (protocol 2.0.0); forward on outbound Foundry calls.

Distributed tracing

When tracing is enabled on the AgentServerHost, invocation spans are automatically created with GenAI semantic conventions:

Span name: invoke_agent {FOUNDRY_AGENT_NAME}:{FOUNDRY_AGENT_VERSION}
Span attributes: gen_ai.system, gen_ai.operation.name, gen_ai.response.id, gen_ai.agent.id, gen_ai.agent.name, gen_ai.agent.version, microsoft.session.id
Error tags: azure.ai.agentserver.invocations.error.code, .error.message
Baggage keys: azure.ai.agentserver.invocation_id, .session_id

Examples

Simple synchronous agent

from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.requests import Request
from starlette.responses import JSONResponse, Response

app = InvocationAgentServerHost()


@app.invoke_handler
async def handle(request: Request) -> Response:
    data = await request.json()
    return JSONResponse({"greeting": f"Hello, {data['name']}!"})

app.run()

Multi-user session (per-request call ID)

On container protocol 2.0.0 a single agent session can serve multiple users. Forwarding the per-request x-agent-foundry-call-id on outbound toolbox calls lets the tool server resolve which user made this request and act on their behalf. (x-agent-user-id is never forwarded; the tool resolves the user from the call ID server-side. Use request.state.user_id only for the container's own per-user state.)

import os

import httpx
from azure.ai.agentserver.core import get_request_context
from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.requests import Request
from starlette.responses import JSONResponse, Response

app = InvocationAgentServerHost()


@app.invoke_handler
async def handle(request: Request) -> Response:
    # platform_headers() echoes x-agent-foundry-call-id only (never x-agent-user-id).
    headers = get_request_context().platform_headers()

    # Toolbox / MCP — attach the call ID PER CALL (the MCP session is shared across users/turns).
    async with httpx.AsyncClient() as mcp:
        resp = await mcp.post(
            f"{os.environ['FOUNDRY_PROJECT_ENDPOINT']}/toolboxes/github/mcp",
            # get_agent_token() is a placeholder (not defined here) for a helper that returns the agent's managed-identity token.
            headers={"Authorization": f"Bearer {get_agent_token()}", **headers},
            json={"jsonrpc": "2.0", "method": "tools/call",
                  "params": {"name": "list_my_assigned_issues", "arguments": {}}},
        )
        # The toolbox resolved the caller from the call ID and returned THIS user's issues.

    return JSONResponse(resp.json())

app.run()

Long-running operations with polling

import asyncio
import json

from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.requests import Request
from starlette.responses import JSONResponse, Response

# In-flight background tasks and finished results, both keyed by invocation ID.
_tasks: dict[str, asyncio.Task] = {}
_results: dict[str, bytes] = {}

app = InvocationAgentServerHost()


async def do_work(invocation_id: str, data: dict) -> None:
    # Placeholder for the real long-running work; replace the sleep with your own logic.
    await asyncio.sleep(10)
    # Store the serialized result so GET /invocations/{id} can return it.
    _results[invocation_id] = json.dumps(
        {"invocation_id": invocation_id, "status": "completed", "input": data}
    ).encode()
    _tasks.pop(invocation_id, None)


@app.invoke_handler
async def handle(request: Request) -> Response:
    data = await request.json()
    invocation_id = request.state.invocation_id
    # Start the work in the background and return immediately.
    task = asyncio.create_task(do_work(invocation_id, data))
    _tasks[invocation_id] = task
    return JSONResponse({"invocation_id": invocation_id, "status": "running"})

@app.get_invocation_handler
async def get_invocation(request: Request) -> Response:
    invocation_id = request.state.invocation_id
    if invocation_id in _results:
        return Response(content=_results[invocation_id], media_type="application/json")
    return JSONResponse({"invocation_id": invocation_id, "status": "running"})

@app.cancel_invocation_handler
async def cancel_invocation(request: Request) -> Response:
    invocation_id = request.state.invocation_id
    if invocation_id in _tasks:
        _tasks[invocation_id].cancel()
        del _tasks[invocation_id]
        return JSONResponse({"invocation_id": invocation_id, "status": "cancelled"})
    return JSONResponse({"error": "not found"}, status_code=404)

app.run()
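
On the client side, a caller would typically poll GET /invocations/{id} until the invocation stops reporting "running". The sketch below shows one way to structure that loop. It is an assumption-laden illustration rather than a documented client API: poll_until_done is a hypothetical name, fetch_status stands for whatever function performs the HTTP GET and returns the parsed JSON body, and the "status" field follows the handlers above.

```python
import time
from collections.abc import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> dict:
    """Call fetch_status() until the status is no longer "running" or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status()
        if body.get("status") != "running":
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("invocation still running after timeout")
        time.sleep(interval)


# Demo with canned responses in place of real HTTP calls.
responses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
final = poll_until_done(lambda: next(responses), interval=0.01)
print(final["status"])  # prints completed
```

Taking the fetch function as a parameter keeps the loop independent of any particular HTTP library and makes the timeout behaviour easy to test.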

Streaming (Server-Sent Events)

import json

from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.requests import Request
from starlette.responses import Response, StreamingResponse

app = InvocationAgentServerHost()


@app.invoke_handler
async def handle(request: Request) -> Response:
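    async def generate():
        for word in ["Hello", " ", "world", "!"]:
            # Frame each chunk as a Server-Sent Event: a "data:" line followed by a blank line.
            payload = json.dumps({"delta": word})
            yield f"data: {payload}\n\n".encode()

    return StreamingResponse(generate(), media_type="text/event-stream")

app.run()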
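
A client consuming this stream needs to reassemble the deltas. The following sketch shows one possible approach. It is hypothetical client code, not part of the package: collect_deltas is an invented name, and the parser deliberately accepts both bare JSON lines and lines carrying the standard SSE "data:" prefix, since the exact framing is an assumption here.

```python
import json
from collections.abc import Iterable


def collect_deltas(lines: Iterable[bytes]) -> str:
    """Join the "delta" values found in a stream of JSON lines or SSE data lines."""
    parts: list[str] = []
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line:
            continue  # blank lines separate SSE events
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        parts.append(json.loads(line)["delta"])
    return "".join(parts)


# Canned stream mixing both framings, in place of a real HTTP response body.
stream = [
    b'data: {"delta": "Hello"}\n',
    b"\n",
    b'data: {"delta": " "}\n',
    b"\n",
    b'{"delta": "world"}\n',
    b'{"delta": "!"}\n',
]
print(collect_deltas(stream))  # prints Hello world!
```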

Multi-turn conversation

Use the agent_session_id query parameter to group invocations into a conversation:

# First turn
curl -X POST "http://localhost:8088/invocations?agent_session_id=session-abc" \
    -H "Content-Type: application/json" \
    -d '{"message": "My name is Alice"}'

# Second turn (same session)
curl -X POST "http://localhost:8088/invocations?agent_session_id=session-abc" \
    -H "Content-Type: application/json" \
    -d '{"message": "What is my name?"}'

The session ID is available in the handler via request.state.session_id.
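
The same two turns can be issued from Python. The sketch below only builds the requests with the standard library; build_turn_request is a hypothetical helper, and the host, port, and payload shape are taken from the curl commands above.

```python
import json
import urllib.parse
import urllib.request


def build_turn_request(base_url: str, session_id: str, message: str) -> urllib.request.Request:
    """Build a POST /invocations request bound to a session (hypothetical helper)."""
    query = urllib.parse.urlencode({"agent_session_id": session_id})
    return urllib.request.Request(
        url=f"{base_url}/invocations?{query}",
        data=json.dumps({"message": message}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


first = build_turn_request("http://localhost:8088", "session-abc", "My name is Alice")
second = build_turn_request("http://localhost:8088", "session-abc", "What is my name?")
# Send each with urllib.request.urlopen(request) once the agent is running locally.
```

Reusing the same session_id for both requests is what ties the turns together; whether the agent actually remembers the name depends on what the handler stores per session.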

Serving an OpenAPI spec

Pass an OpenAPI spec dict to enable the discovery endpoint at GET /invocations/docs/openapi.json:

app = InvocationAgentServerHost(openapi_spec={
    "openapi": "3.0.3",
    "info": {"title": "My Agent", "version": "1.0.0"},
    "paths": { ... },
})

WebSocket protocol (`invocations_ws`)

The same InvocationAgentServerHost object also exposes a WebSocket transport at /invocations_ws. Container authors do not install or import a second package — registering an @app.ws_handler is the only step. A multi-protocol agent shares one host, one session, and one process.

Quick start

from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.requests import Request
from starlette.responses import JSONResponse, Response
from starlette.websockets import WebSocket

app = InvocationAgentServerHost()


@app.invoke_handler                 # POST /invocations (HTTP)
async def invoke(request: Request) -> Response:
    payload = await request.json()
    return JSONResponse({"echo": payload})


@app.ws_handler                     # /invocations_ws (WebSocket)
async def ws(websocket: WebSocket) -> None:
    async for message in websocket.iter_text():
        await websocket.send_text(message)


app.run()

What the SDK does for `@app.ws_handler`

Registers /invocations_ws on the same Starlette host as /invocations and /readiness.
Calls await websocket.accept() before invoking your handler.
Runs WebSocket Ping/Pong keep-alive in the background. Keep-alive is disabled by default; setting the WS_KEEPALIVE_INTERVAL environment variable (auto-injected by AgentService into hosted-agent containers) to an interval in seconds enables it, and a value of 0 disables it. The underlying Hypercorn server sends the frames at the WebSocket protocol layer (RFC 6455 opcodes 0x9 and 0xA), which keeps the connection alive across upstream proxy and load-balancer idle timeouts without any extra application traffic.
Closes the connection cleanly on handler return (close code 1000) or maps an uncaught handler exception to close code 1011.
Emits a structured close-event log line carrying azure.ai.agentserver.invocations_ws.session_id, azure.ai.agentserver.invocations_ws.close_code, and azure.ai.agentserver.invocations_ws.duration_ms. The same fields are recorded as OpenTelemetry span attributes so the connection lifetime is visible end-to-end.
Inherits /readiness, OpenTelemetry export, graceful shutdown, and the x-platform-server identity header from azure-ai-agentserver-core.

Per-connection tracing

A WebSocket connection is wrapped by the SDK in a single connection-scoped websocket_session OpenTelemetry span. The span carries the GenAI semantic-convention attributes plus azure.ai.agentserver.invocations_ws.session_id, close_code, and duration_ms. Any child spans your handler opens — e.g. via opentelemetry.trace.get_tracer(...).start_as_current_span(...) — are automatically parented to the connection span.

Handler signature

The handler receives a Starlette WebSocket and returns None. The full WebSocket API — iter_text, iter_bytes, iter_json, send_text, send_bytes, send_json, close, headers, query_params, client, state — is available, so application protocols on top of invocations_ws are entirely under your control.
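
As an illustration of an application protocol layered on invocations_ws, the sketch below implements a tiny JSON echo protocol. It is an example under stated assumptions, not a prescribed design: the message shape and the close code 4400 are hypothetical, and FakeWebSocket is a stand-in that exposes only the three Starlette WebSocket methods the handler uses, so the code runs without the SDK. In a real agent the handler would be registered with @app.ws_handler.

```python
import asyncio


async def ws(websocket) -> None:
    """Echo {"type": "echo", "text": ...} messages; close with 4400 on anything else."""
    async for message in websocket.iter_json():
        if message.get("type") == "echo":
            await websocket.send_json({"type": "echo", "text": message.get("text", "")})
        else:
            # Application-defined close code (hypothetical value in the 4000-4999 range).
            await websocket.close(code=4400)
            return


class FakeWebSocket:
    """Minimal stand-in exposing the Starlette WebSocket methods used above."""

    def __init__(self, incoming: list[dict]) -> None:
        self._incoming = incoming
        self.sent: list[dict] = []
        self.close_code = None

    async def iter_json(self):
        for message in self._incoming:
            yield message

    async def send_json(self, data: dict) -> None:
        self.sent.append(data)

    async def close(self, code: int = 1000) -> None:
        self.close_code = code


fake = FakeWebSocket([{"type": "echo", "text": "hi"}, {"type": "unknown"}])
asyncio.run(ws(fake))
print(fake.sent, fake.close_code)
```

If the handler simply returns after the client disconnects, the SDK's normal close handling applies; the explicit close is only needed to signal an application-level condition.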

Reference: configuration

Environment variable	Default	Description
`WS_KEEPALIVE_INTERVAL`	unset (disabled)	Platform-injected WebSocket Ping interval, in seconds. `0` disables keep-alive. Surfaced on `app.config.ws_ping_interval` and wired into Hypercorn's `websocket_ping_interval` by `AgentServerHost`.
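
The semantics in the table can be sketched as a small parsing function. This is an illustration only, not how AgentServerHost actually reads the variable: keepalive_interval is a hypothetical name, and the handling of negative or unparsable values is an assumption.

```python
from collections.abc import Mapping
from typing import Optional


def keepalive_interval(environ: Mapping[str, str]) -> Optional[float]:
    """Return the ping interval in seconds, or None when keep-alive is disabled."""
    raw = environ.get("WS_KEEPALIVE_INTERVAL", "").strip()
    if not raw:
        return None  # unset: disabled by default
    try:
        seconds = float(raw)
    except ValueError:
        return None  # assumption: treat unparsable values as disabled
    # 0 disables keep-alive; negative values are treated the same way by assumption.
    return seconds if seconds > 0 else None
```

For local testing, calling keepalive_interval(os.environ) shows whether a given shell configuration would turn keep-alive on under these assumptions.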

Reference: close codes

Close code	Meaning
`1000`	Handler returned cleanly (normal close).
`1011`	Handler raised an unhandled exception (mapped by the SDK).
`4000`-`4999`	Application-defined codes (set by the handler via `await websocket.close(code=...)` — surfaced unchanged to the client).
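
A client or log processor can turn these codes into readable messages. The sketch below encodes only the rows of the table above; describe_close_code is a hypothetical helper, and the wording of the returned strings is illustrative.

```python
def describe_close_code(code: int) -> str:
    """Map a WebSocket close code to the meaning listed in the table (hypothetical helper)."""
    if code == 1000:
        return "normal close: handler returned cleanly"
    if code == 1011:
        return "handler raised an unhandled exception"
    if 4000 <= code <= 4999:
        return "application-defined code set by the handler"
    return "other close code (not covered by the table)"
```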

Troubleshooting

Reporting issues

To report an issue with the client library or to request additional features, please open an issue in the project's GitHub repository.

Next steps

Visit the Samples folder for complete working examples:

Sample	Description
simple_invoke_agent	Minimal synchronous request-response
async_invoke_agent	Long-running operations with polling and cancellation
ws_invoke_agent	Combined `POST /invocations` (HTTP) and `/invocations_ws` (WebSocket) host
ws_bidirectional_streaming_agent	Full-duplex `/invocations_ws` agent: concurrent token streams + mid-flight cancel (relies on the SDK's WS protocol Ping/Pong keep-alive, not application-level heartbeats)

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Release history

Version	Release date
1.0.0b6 (pre-release, this version)	Jun 28, 2026
1.0.0b5 (pre-release)	Jun 12, 2026
1.0.0b4 (pre-release)	May 21, 2026
1.0.0b3 (pre-release)	Apr 23, 2026
1.0.0b2 (pre-release)	Apr 19, 2026
1.0.0b1 (pre-release)	Apr 15, 2026
