# Azure AI Agent Server Responses client library for Python
The `azure-ai-agentserver-responses` package provides the Responses protocol endpoints for Azure AI Hosted Agent containers. It plugs into the `azure-ai-agentserver-core` host framework and adds the full response lifecycle: create, stream (SSE), cancel, delete, replay, and input-item listing.
## Getting started

### Install the package

```bash
pip install azure-ai-agentserver-responses
```

This automatically installs `azure-ai-agentserver-core` as a dependency.
### Prerequisites

- Python 3.10 or later
## Key concepts

### ResponsesAgentServerHost

`ResponsesAgentServerHost` is an `AgentServerHost` subclass that adds the Responses protocol endpoints. Register your handler with the `@app.response_handler` decorator:

```python
@app.response_handler
def my_handler(
    request: CreateResponse, context: ResponseContext, cancellation_signal: asyncio.Event
):
    ...
```
### Protocol endpoints

| Method | Route | Description |
|---|---|---|
| POST | `/responses` | Create a new response |
| GET | `/responses/{response_id}` | Get response state (JSON, or SSE replay via `?stream=true`) |
| POST | `/responses/{response_id}/cancel` | Cancel an in-flight response |
| DELETE | `/responses/{response_id}` | Delete a stored response |
| GET | `/responses/{response_id}/input_items` | List input items (paginated) |
### TextResponse

`TextResponse` is the simplest way to return text. It handles the full SSE lifecycle automatically (`response.created` → `response.in_progress` → message/content events → `response.completed`):

```python
return TextResponse(context, request, text="Hello!")
```

For streaming, pass an async iterable as `text`:

```python
async def tokens():
    for t in ["Hello", ", ", "world!"]:
        yield t

return TextResponse(context, request, text=tokens())
```
### ResponseEventStream

Use `ResponseEventStream` when you need function calls, reasoning items, multiple output types, or fine-grained event control. Each `yield` maps 1:1 to an SSE event with zero bookkeeping:

```python
stream = ResponseEventStream(response_id=context.response_id, request=request)
yield stream.emit_created()
yield stream.emit_in_progress()
yield from stream.output_item_message("Hello, world!")
yield stream.emit_completed()
```

Drop down to the builder API for full control over individual events:

```python
message = stream.add_output_item_message()
yield message.emit_added()
text = message.add_text_content()
yield text.emit_added()
yield text.emit_delta("Hello!")
yield text.emit_text_done()
yield text.emit_done()
yield message.emit_done()
```
### ResponseContext

The `ResponseContext` provides request-scoped state:

| Property / Method | Description |
|---|---|
| `response_id` | Unique ID for this response |
| `is_shutdown_requested` | Whether the server is draining |
| `isolation` | `IsolationContext` with `user_key` and `chat_key` for multi-tenant state partitioning |
| `client_headers` | Dictionary of `x-client-*` headers forwarded from the platform (keys normalized to lowercase) |
| `query_parameters` | Dictionary of query string parameters |
| `get_input_items()` | Load resolved input items as `Item` subtypes |
| `get_input_text()` | Extract all text content from input items as a single string |
| `get_history()` | Load conversation history items |
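As one way the `isolation` keys might be used, the sketch below derives a storage key from `user_key` and `chat_key` so per-conversation state never leaks across tenants. The `IsolationContext` stand-in and the `storage_key` helper are assumptions for illustration, not SDK APIs:

```python
from dataclasses import dataclass

@dataclass
class IsolationContext:
    """Minimal stand-in mirroring the documented user_key/chat_key fields."""
    user_key: str
    chat_key: str

def storage_key(isolation: IsolationContext, namespace: str) -> str:
    """Partition stored state by tenant (user) and conversation (chat)."""
    return f"{namespace}:{isolation.user_key}:{isolation.chat_key}"

key = storage_key(IsolationContext(user_key="user-a", chat_key="chat-1"), "memory")
# key == "memory:user-a:chat-1"
```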
### Streaming and background modes

The SDK automatically handles all combinations of the `stream` and `background` flags:
- Default — Run to completion, return final JSON response
- Streaming — Pipe events as SSE in real-time, cancel on client disconnect
- Background — Return immediately, handler runs in the background
- Streaming + Background — SSE while connected, handler continues after disconnect
### Response lifecycle

The library orchestrates the complete response lifecycle: `created` → `in_progress` → `completed` (or `failed` / `cancelled`). Cancellation, error handling, and terminal-event guarantees are all managed automatically.
For detailed handler implementation guidance, see docs/handler-implementation-guide.md.
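The lifecycle above can be sketched as a small state machine. The states mirror the documented event names; the transition table itself is an assumption about the library's internals, shown only to make "terminal state" concrete:

```python
from enum import Enum

class ResponseState(str, Enum):
    CREATED = "created"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"

# Allowed transitions; terminal states have no outgoing edges.
TRANSITIONS = {
    ResponseState.CREATED: {ResponseState.IN_PROGRESS, ResponseState.FAILED, ResponseState.CANCELLED},
    ResponseState.IN_PROGRESS: {ResponseState.COMPLETED, ResponseState.FAILED, ResponseState.CANCELLED},
    ResponseState.COMPLETED: set(),
    ResponseState.FAILED: set(),
    ResponseState.CANCELLED: set(),
}

def is_terminal(state: ResponseState) -> bool:
    return not TRANSITIONS[state]

def can_transition(src: ResponseState, dst: ResponseState) -> bool:
    return dst in TRANSITIONS[src]
```

This is why cancelling an already-completed response returns 400: `completed` has no outgoing transitions.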
## Examples

### Echo handler

```python
import asyncio

from azure.ai.agentserver.responses import (
    CreateResponse,
    ResponseContext,
    ResponsesAgentServerHost,
    TextResponse,
)

app = ResponsesAgentServerHost()

@app.response_handler
async def handler(request: CreateResponse, context: ResponseContext, cancellation_signal: asyncio.Event):
    text = await context.get_input_text()
    return TextResponse(context, request, text=f"Echo: {text}")

app.run()
```
### Function calling

```python
import json

from azure.ai.agentserver.responses import ResponseEventStream

stream = ResponseEventStream(response_id=context.response_id, request=request)
yield stream.emit_created()
yield stream.emit_in_progress()

arguments = json.dumps({"location": "Seattle", "unit": "fahrenheit"})
yield from stream.output_item_function_call("get_weather", "call_001", arguments)

yield stream.emit_completed()
```
### Reasoning + text message

```python
stream = ResponseEventStream(response_id=context.response_id, request=request)
yield stream.emit_created()
yield stream.emit_in_progress()
yield from stream.output_item_reasoning_item("Let me think about this...")
yield from stream.output_item_message("Here is my answer.")
yield stream.emit_completed()
```
## Configuration

```python
from azure.ai.agentserver.responses import ResponsesAgentServerHost, ResponsesServerOptions

options = ResponsesServerOptions(
    default_model="gpt-4o",
    sse_keep_alive_interval_seconds=15,
    shutdown_grace_period_seconds=10,
)

app = ResponsesAgentServerHost(options=options)
```
## Troubleshooting

### Common errors

- 400 Bad Request: The request body failed validation. Check that optional fields such as `model` (when provided) are valid and that `input` items are well-formed.
- 404 Not Found: The response ID does not exist or has expired past the configured TTL.
- 400 Bad Request (cancel): The response was not created with `background=true`, or it has already reached a terminal state.
### Reporting issues

To report an issue with the client library or to request additional features, please open a GitHub issue in the project repository.
## Next steps

Visit the Samples folder for complete working examples:
| Sample | Description |
|---|---|
| Getting Started | Minimal echo handler using TextResponse |
| Streaming Text Deltas | Token-by-token streaming with configure callback |
| Full Control | Convenience, streaming, and builder — three ways to emit output |
| Function Calling | Two-turn function calling with convenience and builder variants |
| Conversation History | Multi-turn study tutor with context.get_history() |
| Multi-Output | Reasoning + message in a single response |
| Streaming Upstream | Forward to upstream streaming LLM via openai SDK |
| Non-Streaming Upstream | Forward to upstream non-streaming LLM, emit items via builders |
| Image Generation | Image gen convenience, streaming partials, and full-control builder |
| Image Input | Receive images via URL, base64 data URL, or file ID |
| File Inputs | Receive files via base64 data URL, URL, or file ID |
| Annotations | Attach file_path, file_citation, and url_citation annotations |
| Structured Outputs | Return structured JSON as a structured_outputs item |
- Handler implementation guide — Detailed reference for building handlers
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.