
Python SDK for building servers implementing the Azure AI Responses protocol

Project description

Azure AI Agent Server Responses client library for Python

The azure-ai-agentserver-responses package provides the Responses protocol endpoints for Azure AI Hosted Agent containers. It plugs into the azure-ai-agentserver-core host framework and adds the full response lifecycle: create, stream (SSE), cancel, delete, replay, and input-item listing.

Getting started

Install the package

pip install azure-ai-agentserver-responses

This automatically installs azure-ai-agentserver-core as a dependency.

Prerequisites

  • Python 3.10 or later

Key concepts

ResponsesAgentServerHost

ResponsesAgentServerHost is an AgentServerHost subclass that adds Responses protocol endpoints. Register your handler with the @app.response_handler decorator:

@app.response_handler
def my_handler(
    request: CreateResponse, context: ResponseContext, cancellation_signal: asyncio.Event
):
    ...

Protocol endpoints

  • POST /responses: Create a new response
  • GET /responses/{response_id}: Get response state (JSON, or SSE replay via ?stream=true)
  • POST /responses/{response_id}/cancel: Cancel an in-flight response
  • DELETE /responses/{response_id}: Delete a stored response
  • GET /responses/{response_id}/input_items: List input items (paginated)
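For orientation, the routes above can be enumerated with a small helper. This is a hypothetical client-side utility, not part of the SDK; it only builds the method/path pairs the table describes:

```python
def response_routes(response_id: str) -> dict:
    # Build the Responses protocol routes for a given response ID,
    # mirroring the endpoint table above.
    base = f"/responses/{response_id}"
    return {
        "create": ("POST", "/responses"),
        "get": ("GET", base),
        "replay": ("GET", f"{base}?stream=true"),
        "cancel": ("POST", f"{base}/cancel"),
        "delete": ("DELETE", base),
        "input_items": ("GET", f"{base}/input_items"),
    }
```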

TextResponse

The simplest way to return text. Handles the full SSE lifecycle automatically (response.created → response.in_progress → message/content events → response.completed):

return TextResponse(context, request, text="Hello!")

For streaming, pass an async iterable to text:

async def tokens():
    for t in ["Hello", ", ", "world!"]:
        yield t

return TextResponse(context, request, text=tokens())
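TextResponse does this mapping for you; as a mental model, a hand-rolled sketch of how an async token iterable could be turned into delta events might look like the following. The event names follow the Responses SSE convention, but the dict shape is illustrative, not the SDK's:

```python
import asyncio

async def tokens():
    for t in ["Hello", ", ", "world!"]:
        yield t

async def to_delta_events(source):
    # Yield one delta event per chunk while accumulating the text,
    # then a final done event carrying the full string.
    parts = []
    async for chunk in source:
        parts.append(chunk)
        yield {"type": "response.output_text.delta", "delta": chunk}
    yield {"type": "response.output_text.done", "text": "".join(parts)}

async def main():
    return [event async for event in to_delta_events(tokens())]

events = asyncio.run(main())
```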

ResponseEventStream

Use ResponseEventStream when you need function calls, reasoning items, multiple output types, or fine-grained event control. Each yield maps 1:1 to an SSE event with zero bookkeeping:

stream = ResponseEventStream(response_id=context.response_id, request=request)
yield stream.emit_created()
yield stream.emit_in_progress()
yield from stream.output_item_message("Hello, world!")
yield stream.emit_completed()

Drop down to the builder API for full control over individual events:

message = stream.add_output_item_message()
yield message.emit_added()
text = message.add_text_content()
yield text.emit_added()
yield text.emit_delta("Hello!")
yield text.emit_text_done()
yield text.emit_done()
yield message.emit_done()

ResponseContext

The ResponseContext provides request-scoped state:

  • response_id: Unique ID for this response
  • is_shutdown_requested: Whether the server is draining
  • isolation: IsolationContext with user_key and chat_key for multi-tenant state partitioning
  • client_headers: Dictionary of x-client-* headers forwarded from the platform (keys normalized to lowercase)
  • query_parameters: Dictionary of query string parameters
  • get_input_items(): Load resolved input items as Item subtypes
  • get_input_text(): Extract all text content from input items as a single string
  • get_history(): Load conversation history items
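The header normalization that client_headers describes (x-client-* keys lowercased, values untouched) can be sketched with a plain function. This is an illustrative stand-in, not the SDK's implementation:

```python
def normalize_client_headers(raw_headers: dict) -> dict:
    # Keep only x-client-* headers and lowercase their keys;
    # header values are passed through unchanged.
    return {
        key.lower(): value
        for key, value in raw_headers.items()
        if key.lower().startswith("x-client-")
    }

headers = normalize_client_headers({
    "X-Client-Session": "abc",
    "Authorization": "Bearer ...",
    "x-client-LOCALE": "en-US",
})
```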

Streaming and background modes

The SDK automatically handles all combinations of stream and background flags:

  • Default — Run to completion, return final JSON response
  • Streaming — Pipe events as SSE in real-time, cancel on client disconnect
  • Background — Return immediately, handler runs in the background
  • Streaming + Background — SSE while connected, handler continues after disconnect
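The four combinations can be summarized as a simple dispatch on the two flags; this is a descriptive sketch of the behaviors listed above, not SDK code:

```python
def execution_mode(stream: bool, background: bool) -> str:
    # Map the stream/background flag combinations to the behaviors above.
    if stream and background:
        return "SSE while connected, handler continues after disconnect"
    if stream:
        return "SSE in real time, cancel on client disconnect"
    if background:
        return "return immediately, handler runs in the background"
    return "run to completion, return final JSON response"
```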

Response lifecycle

The library orchestrates the complete response lifecycle: created → in_progress → completed (or failed / cancelled). Cancellation, error handling, and terminal event guarantees are all managed automatically.
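Those guarantees amount to a small state machine: a response moves from created through in_progress to exactly one terminal state, and terminal states are final. A hypothetical sketch (not part of the SDK):

```python
TERMINAL = {"completed", "failed", "cancelled"}
ALLOWED = {
    "created": {"in_progress"} | TERMINAL,
    "in_progress": TERMINAL,
}

class ResponseLifecycle:
    """Tracks a response's state and rejects illegal transitions."""

    def __init__(self):
        self.state = "created"

    def transition(self, new_state: str) -> None:
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state
```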

For detailed handler implementation guidance, see docs/handler-implementation-guide.md.

Examples

Echo handler

import asyncio

from azure.ai.agentserver.responses import (
    CreateResponse,
    ResponseContext,
    ResponsesAgentServerHost,
    TextResponse,
)

app = ResponsesAgentServerHost()


@app.response_handler
async def handler(request: CreateResponse, context: ResponseContext, cancellation_signal: asyncio.Event):
    text = await context.get_input_text()
    return TextResponse(context, request, text=f"Echo: {text}")


app.run()

Function calling

import json

from azure.ai.agentserver.responses import ResponseEventStream

stream = ResponseEventStream(response_id=context.response_id, request=request)
yield stream.emit_created()
yield stream.emit_in_progress()

arguments = json.dumps({"location": "Seattle", "unit": "fahrenheit"})
yield from stream.output_item_function_call("get_weather", "call_001", arguments)

yield stream.emit_completed()

Reasoning + text message

stream = ResponseEventStream(response_id=context.response_id, request=request)
yield stream.emit_created()
yield stream.emit_in_progress()

yield from stream.output_item_reasoning_item("Let me think about this...")
yield from stream.output_item_message("Here is my answer.")

yield stream.emit_completed()

Configuration

from azure.ai.agentserver.responses import ResponsesAgentServerHost, ResponsesServerOptions

options = ResponsesServerOptions(
    default_model="gpt-4o",
    sse_keep_alive_interval_seconds=15,
    shutdown_grace_period_seconds=10,
)

app = ResponsesAgentServerHost(options=options)
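The sse_keep_alive_interval_seconds option suggests the host emits periodic keep-alives on idle SSE streams. A hand-rolled sketch of that behavior, assuming SSE comment lines (": ...") as the keep-alive payload; this is illustrative only, not the SDK's implementation:

```python
import asyncio

async def with_keepalive(events, interval: float):
    # Pump upstream events into a queue so a timeout on queue.get() can
    # trigger a keep-alive without cancelling the upstream generator.
    queue: asyncio.Queue = asyncio.Queue()
    done = object()

    async def pump():
        async for event in events:
            await queue.put(event)
        await queue.put(done)

    task = asyncio.create_task(pump())
    try:
        while True:
            try:
                item = await asyncio.wait_for(queue.get(), timeout=interval)
            except asyncio.TimeoutError:
                yield ": keep-alive"  # SSE comment line, ignored by clients
                continue
            if item is done:
                return
            yield item
    finally:
        task.cancel()

async def slow_events():
    yield "data: first"
    await asyncio.sleep(0.2)  # long idle gap to force keep-alives
    yield "data: second"

async def main():
    return [line async for line in with_keepalive(slow_events(), interval=0.05)]

lines = asyncio.run(main())
```

Queueing the upstream events (rather than wrapping \_\_anext\_\_ in wait_for directly) matters: a timeout that cancelled the upstream generator mid-await would terminate it and drop the remaining events.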

Troubleshooting

Common errors

  • 400 Bad Request: The request body failed validation. Check that optional fields such as model (when provided) are valid and that input items are well-formed.
  • 404 Not Found: The response ID does not exist or has expired past the configured TTL.
  • 400 Bad Request (cancel): The response was not created with background=true, or it has already reached a terminal state.

Reporting issues

To report an issue with the client library, or to request additional features, please open an issue in the repository on GitHub.

Next steps

Visit the Samples folder for complete working examples:

  • Getting Started: Minimal echo handler using TextResponse
  • Streaming Text Deltas: Token-by-token streaming with a configure callback
  • Full Control: Convenience, streaming, and builder — three ways to emit output
  • Function Calling: Two-turn function calling with convenience and builder variants
  • Conversation History: Multi-turn study tutor using context.get_history()
  • Multi-Output: Reasoning + message in a single response
  • Streaming Upstream: Forward to an upstream streaming LLM via the openai SDK
  • Non-Streaming Upstream: Forward to an upstream non-streaming LLM, emit items via builders
  • Image Generation: Image generation convenience, streaming partials, and full-control builder
  • Image Input: Receive images via URL, base64 data URL, or file ID
  • File Inputs: Receive files via base64 data URL, URL, or file ID
  • Annotations: Attach file_path, file_citation, and url_citation annotations
  • Structured Outputs: Return structured JSON as a structured_outputs item

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Download files

Source Distribution

azure_ai_agentserver_responses-1.0.0b5.tar.gz (410.9 kB)

Built Distribution

azure_ai_agentserver_responses-1.0.0b5-py3-none-any.whl (268.4 kB)

File details

Hashes for azure_ai_agentserver_responses-1.0.0b5.tar.gz:

  • SHA256: 0b627b810359c792ea7b6fa6782abaf6df32d9bc9e5a569ad722afcffd0ce8d9
  • MD5: 6f73ab2ea90669736d1db3d67b92516f
  • BLAKE2b-256: e6273ecb7fe704ff8764199bfbe4cc1e584a520a9affe042470d9d50b6e1e73a

Hashes for azure_ai_agentserver_responses-1.0.0b5-py3-none-any.whl:

  • SHA256: 4c2a6ab56e71eeb330aa52b7cb2cc71b8ec6b5bbe0e7dc84310f2c7fbda393a3
  • MD5: f4d14fdf1cbae87f142e7cfe980d12d5
  • BLAKE2b-256: 44911e5c0d7ce95ca8b022e69e4ca6b23e413fc2d57f0191429c4633e02213d2
