Azure Content Understanding integration for Microsoft Agent Framework.
Get Started with Azure Content Understanding in Microsoft Agent Framework
Install the package with pip (the --pre flag allows pip to select pre-release versions):
pip install agent-framework-azure-contentunderstanding --pre
Azure Content Understanding Integration
Prerequisites
Before using this package, you need an Azure Content Understanding resource:
- An active Azure subscription (create one for free)
- A Microsoft Foundry resource created in a supported region
- Default model deployments configured for your resource (GPT-4.1, GPT-4.1-mini, text-embedding-3-large)
Follow the prerequisites section in the Azure Content Understanding quickstart for setup instructions.
Introduction
The Azure Content Understanding integration provides a context provider that automatically analyzes file attachments (documents, images, audio, video) using Azure Content Understanding and injects structured results into the LLM context.
- Document & image analysis: State-of-the-art OCR with markdown extraction, table preservation, and structured field extraction — handles scanned PDFs, handwritten content, and complex layouts
- Audio & video analysis: Transcription, speaker diarization, and per-segment summaries
- Background processing: Configurable timeout with async background fallback for large files
- file_search integration: Optional vector store upload for token-efficient RAG on large documents
Learn more about Azure Content Understanding capabilities at https://learn.microsoft.com/azure/ai-services/content-understanding/
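The "configurable timeout with async background fallback" behavior listed above can be pictured with plain asyncio. The sketch below is only an illustration of the general pattern, not the provider's actual implementation: analyze_with_fallback and fake_analysis are hypothetical names, and the max_wait parameter is borrowed from the constructor argument shown in the usage example on the assumption that it is a number of seconds or None.

```python
import asyncio


async def analyze_with_fallback(work, max_wait):
    """Wait up to max_wait seconds for `work`; otherwise leave it running.

    Returns (result, task). result is None when the deadline passed first.
    max_wait=None waits for completion with no deadline.
    """
    task = asyncio.ensure_future(work)
    # asyncio.wait does not cancel the task on timeout (unlike wait_for),
    # so a slow analysis keeps running in the background.
    done, _pending = await asyncio.wait({task}, timeout=max_wait)
    if task in done:
        return task.result(), task
    return None, task


async def fake_analysis(delay, text):
    # Hypothetical stand-in for a slow extraction call; no service request is made.
    await asyncio.sleep(delay)
    return text


async def demo():
    fast, _ = await analyze_with_fallback(
        fake_analysis(0.01, "fast result"), max_wait=1.0
    )
    slow, pending = await analyze_with_fallback(
        fake_analysis(0.2, "slow result"), max_wait=0.05
    )
    late = await pending  # the background task still completes later
    return fast, slow, late


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The key design choice is asyncio.wait rather than asyncio.wait_for: the former leaves the task alive when the timeout expires, which is what makes a later pickup of the result possible.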
Basic Usage Example
See the samples directory which demonstrates:
- Single PDF upload and Q&A (01_document_qa)
- Multi-turn sessions with cached results (02_multi_turn_session)
- PDF + audio + video parallel analysis (03_multimodal_chat)
- Structured field extraction with prebuilt-invoice (04_invoice_processing)
- CU extraction + OpenAI vector store RAG (05_large_doc_file_search)
- Interactive web UI with DevUI (02-devui)
import asyncio

from agent_framework import Agent, AgentSession, Message, Content
from agent_framework.foundry import FoundryChatClient
from agent_framework.foundry import ContentUnderstandingContextProvider
from azure.identity import AzureCliCredential

credential = AzureCliCredential()

cu = ContentUnderstandingContextProvider(
    endpoint="https://my-resource.cognitiveservices.azure.com/",
    credential=credential,
    max_wait=None,  # block until CU extraction completes before sending to LLM
)

client = FoundryChatClient(
    project_endpoint="https://your-project.services.ai.azure.com",
    model="gpt-4.1",
    credential=credential,
)


async def main():
    async with cu:
        agent = Agent(
            client=client,
            name="DocumentQA",
            instructions="You are a helpful document analyst.",
            context_providers=[cu],
        )
        session = AgentSession()
        response = await agent.run(
            Message(role="user", contents=[
                Content.from_text("What's on this invoice?"),
                Content.from_uri(
                    "https://raw.githubusercontent.com/Azure-Samples/"
                    "azure-ai-content-understanding-assets/main/document/invoice.pdf",
                    media_type="application/pdf",
                    additional_properties={"filename": "invoice.pdf"},
                ),
            ]),
            session=session,
        )
        print(response.text)


asyncio.run(main())
Supported File Types
| Category | Types |
|---|---|
| Documents | PDF, DOCX, XLSX, PPTX, HTML, TXT, Markdown |
| Images | JPEG, PNG, TIFF, BMP |
| Audio | WAV, MP3, M4A, FLAC, OGG |
| Video | MP4, MOV, AVI, WebM |
For the complete list of supported file types and size limits, see Azure Content Understanding service limits.
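Before attaching a file, it can help to check it against the table above and to derive a media_type value for it. The following is a minimal sketch using only the standard library. The helper names category_for and media_type_for are hypothetical, and the extension spellings (for example .jpg, .tif, .md) are assumptions, since the table lists format names rather than extensions.

```python
import mimetypes
from pathlib import PurePosixPath

# Assumed extensions for the formats in the table; the grouping mirrors its rows.
SUPPORTED = {
    "Documents": {".pdf", ".docx", ".xlsx", ".pptx", ".html", ".txt", ".md"},
    "Images": {".jpg", ".jpeg", ".png", ".tiff", ".tif", ".bmp"},
    "Audio": {".wav", ".mp3", ".m4a", ".flac", ".ogg"},
    "Video": {".mp4", ".mov", ".avi", ".webm"},
}


def category_for(filename):
    """Return the table category for a filename, or None if it is not listed."""
    suffix = PurePosixPath(filename).suffix.lower()
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None


def media_type_for(filename):
    """Guess a MIME type from the filename using the stdlib registry."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type


if __name__ == "__main__":
    for name in ("invoice.pdf", "CALL.MP3", "archive.zip"):
        print(name, category_for(name), media_type_for(name))
```

This check is purely local and does not replace the service limits page, which remains the authority on what the service accepts.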
Environment Variables
The provider supports automatic endpoint resolution from environment variables. When endpoint is not passed to the constructor, it is loaded from AZURE_CONTENTUNDERSTANDING_ENDPOINT:
# Endpoint auto-loaded from AZURE_CONTENTUNDERSTANDING_ENDPOINT env var
cu = ContentUnderstandingContextProvider(credential=credential)
Set these in your shell or in a .env file:
AZURE_CONTENTUNDERSTANDING_ENDPOINT=https://your-cu-resource.cognitiveservices.azure.com/
AZURE_AI_PROJECT_ENDPOINT=https://your-project.services.ai.azure.com
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4.1
AzureCliCredential also requires an active Azure CLI session, so run az login before starting your application.
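The resolution order described in this section (explicit argument first, then the environment variable) can be sketched as below. This is an illustration of the documented precedence, not the provider's source code: resolve_endpoint is a hypothetical helper, and raising ValueError when neither source is set is an assumption about a reasonable failure mode.

```python
import os

ENDPOINT_VAR = "AZURE_CONTENTUNDERSTANDING_ENDPOINT"


def resolve_endpoint(endpoint=None, environ=None):
    """Return the explicit endpoint if given, else the environment value.

    `environ` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    resolved = endpoint or env.get(ENDPOINT_VAR)
    if not resolved:
        # Assumed behavior: fail early with a message naming both options.
        raise ValueError(f"Pass endpoint=... or set {ENDPOINT_VAR}")
    return resolved


if __name__ == "__main__":
    sample_env = {ENDPOINT_VAR: "https://your-cu-resource.cognitiveservices.azure.com/"}
    print(resolve_endpoint(environ=sample_env))
```

Running a check like this at startup surfaces a missing variable before any request is attempted.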
Next steps
- Explore the samples directory for complete code examples
- Read the Azure Content Understanding documentation for detailed service information
- Learn more about the Microsoft Agent Framework