Azure Cortex SDK
A Python SDK for the Azure Cortex ML API, covering machine learning operations such as environment management, model versioning, inference handling, and conversation management.
Features
Core ML Operations
- Environment Management: Create, update, and manage Azure ML environments with conda dependencies
- Environment Versioning: Register, archive, and set default environment versions
- Model Management: Upload, register, and manage ML models with metadata and tags
- Model Versioning: Register model versions from files or buffers with version control
- Endpoint Management: Create and manage online endpoints for model serving
- Deployment Operations: Deploy models with traffic allocation, scaling, and blue-green deployments
- Scoring Scripts: Manage Python scoring scripts for custom inference logic
- Secrets Management: Securely manage Azure Key Vault secrets for API keys and credentials
Inference & LLM
- Direct Inference: Perform real-time inference on deployed ML endpoints
- LLM Configuration: Configure OpenAI, Azure OpenAI, and other LLM providers
- LLM Inference: Perform chat completions with streaming support
- Message Management: Create and manage conversation messages with auto-inference
- Conversation Management: Handle multi-turn conversations with escalation and resolution
- Inference Tracking: List, filter, and annotate all inference requests
Developer Experience
- Authentication: OAuth2 client credentials flow with automatic token refresh
- Type Safety: Full type hints and Pydantic models for request/response validation
- Error Handling: Comprehensive exception handling with retry mechanisms
- File Operations: Support for large file uploads with streaming and buffer support
- Test Data Management: Built-in utilities for creating and cleaning up test data
- Pagination: Automatic pagination support for all list operations
- Filtering: Advanced filtering options for conversations, messages, and inferences
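
The "automatic token refresh" behavior described above can be pictured as a small cache around the OAuth2 client-credentials flow. The sketch below is my own illustration, not the SDK's actual implementation; the response fields (`access_token`, `expires_in`) are the standard client-credentials shapes, and `fake_fetch` stands in for a real token-endpoint call:

```python
import time

class TokenCache:
    """Cache a bearer token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, skew_seconds=60):
        self._fetch = fetch_token      # callable returning a token-response dict
        self._skew = skew_seconds      # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        if self._token is None or time.time() >= self._expires_at - self._skew:
            response = self._fetch()
            self._token = response["access_token"]
            self._expires_at = time.time() + response["expires_in"]
        return self._token

# Stub token endpoint for illustration: counts how often a fetch happens.
calls = {"n": 0}
def fake_fetch():
    calls["n"] += 1
    return {"access_token": f"tok-{calls['n']}", "expires_in": 3600}

cache = TokenCache(fake_fetch)
first, second = cache.get(), cache.get()
print(first, second, calls["n"])  # tok-1 tok-1 1 (second call hits the cache)
```

The skew avoids handing out a token that expires mid-request.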
Installation
```bash
pip install az-cortex-sdk
```
Quick Start
```python
from az_cortex_sdk import CortexClient

# Initialize client from environment variables
client = CortexClient.from_env()

# Or initialize with explicit credentials
client = CortexClient(
    base_url="https://api.az-dev.nearlyhuman.ai",
    client_id="your-client-id",
    client_secret="your-client-secret",
    tenant="demo",
)

# Check API health
health = client.health_check()
print(f"API Status: {health['status']}")

# List environments
environments = client.environments.list()
print(f"Found {len(environments.items)} environments")

# List models
models = client.models.list()
for model in models.items:
    print(f"Model: {model.name}")

# Perform LLM inference
response = client.llm.infer(
    config_name="my-llm-config",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(f"Response: {response.content}")
```
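
The `list()` calls above return paginated results (an `items` attribute plus `page`/`per_page` parameters, per the API reference below). A hedged sketch of draining every page, with a stub standing in for a real service call such as `client.environments.list`:

```python
from types import SimpleNamespace

def iter_all(list_fn, per_page=50):
    """Yield every item across pages until a short (or empty) page is returned."""
    page = 1
    while True:
        result = list_fn(page=page, per_page=per_page)
        yield from result.items
        if len(result.items) < per_page:
            break
        page += 1

# Stub standing in for a real list() call: 3 items, fetched 2 per page.
_data = ["env-a", "env-b", "env-c"]
def fake_list(page, per_page):
    start = (page - 1) * per_page
    return SimpleNamespace(items=_data[start:start + per_page])

print(list(iter_all(fake_list, per_page=2)))  # ['env-a', 'env-b', 'env-c']
```

The stopping rule (a page shorter than `per_page`) is an assumption; adapt it if the real response exposes a total count or next-page marker.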
Configuration
Set environment variables:
```bash
export CORTEX_API_BASE_URL="https://api.az-dev.nearlyhuman.ai"
export CORTEX_CLIENT_ID="your-client-id"
export CORTEX_CLIENT_SECRET="your-client-secret"
export CORTEX_TENANT="demo"
```
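
A plausible sketch of what `CortexClient.from_env()` does with these variables. The four variable names come from this README; the validation and error handling are my own assumptions, not the SDK's documented behavior:

```python
import os

# Maps constructor keyword -> environment variable (names per this README).
REQUIRED_VARS = {
    "base_url": "CORTEX_API_BASE_URL",
    "client_id": "CORTEX_CLIENT_ID",
    "client_secret": "CORTEX_CLIENT_SECRET",
    "tenant": "CORTEX_TENANT",
}

def load_cortex_config(env=None):
    """Collect client settings from an environment mapping, failing loudly."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in REQUIRED_VARS.items()}

# Example with an explicit mapping instead of the real process environment:
config = load_cortex_config({
    "CORTEX_API_BASE_URL": "https://api.az-dev.nearlyhuman.ai",
    "CORTEX_CLIENT_ID": "your-client-id",
    "CORTEX_CLIENT_SECRET": "your-client-secret",
    "CORTEX_TENANT": "demo",
})
print(config["tenant"])  # demo
```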
Documentation
Getting Started
- Getting Started Guide - Comprehensive setup and usage guide
- Examples - Practical usage examples
Core Guides
- API Guide - Endpoints, deployments, inference, messages, and conversations
- LLM Guide - LLM configuration, inference, and streaming
- Version Management - Environment and model version management
- Secrets Management - Azure Key Vault secrets management
Reference
- API Reference - Complete API documentation below
Examples
Environment & Model Management
```python
# Create and manage environments
client.environments.create(
    name="my-env",
    image="mcr.microsoft.com/azureml/minimal-ubuntu20.04-py38-cpu-inference:latest",
    conda_file="path/to/conda.yml",
    description="Production environment",
)

# Register environment version
client.environment_versions.register(
    environment_name="my-env",
    version="1.0.0",
    image="mcr.microsoft.com/azureml/minimal-ubuntu20.04-py38-cpu-inference:latest",
)

# Upload and create model
client.models.upload_and_create(
    file_path="./model.pkl",
    name="my-model",
    description="My ML model",
    tags={"framework": "sklearn"},
)

# Register model version from file
client.model_versions.register_from_file(
    file_path="./model.pkl",
    model_name="my-model",
    version="1.0.0",
    description="Initial release",
)
```
Endpoint & Deployment
```python
# Create endpoint
client.endpoints.create(name="my-endpoint")

# Create deployment
client.deployments.create(
    endpoint_name="my-endpoint",
    name="blue",
    model_name="my-model",
    model_version="1.0.0",
    environment_name="my-env",
    environment_version="1.0.0",
    instance_type="Standard_DS3_v2",
    scoring_script_name="score.py",
    instance_count=1,
    traffic_percentage=100,
)

# Update traffic allocation (blue-green deployment)
client.deployments.update_traffic(
    endpoint_name="my-endpoint",
    traffic_allocation={"blue": 90, "green": 10},
)

# Perform inference on endpoint
result = client.endpoints.infer(
    name="my-endpoint",
    inputs={"data": [[1, 2, 3, 4]]},
)
```
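
Azure ML online endpoints generally require the traffic percentages across an endpoint's deployments to sum to at most 100. A small pre-flight check (my own helper, not an SDK API) can catch a bad allocation before calling `update_traffic`:

```python
def validate_traffic(allocation):
    """Sanity-check a {deployment_name: percentage} mapping before sending it."""
    for name, pct in allocation.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"Deployment {name!r} has invalid traffic {pct}")
    total = sum(allocation.values())
    if total > 100:
        raise ValueError(f"Traffic percentages sum to {total}, expected <= 100")
    return allocation

print(validate_traffic({"blue": 90, "green": 10}))  # {'blue': 90, 'green': 10}
```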
LLM Configuration & Inference
```python
# Create LLM configuration for Azure OpenAI
client.llm.create(
    config_name="gpt4-prod",
    provider="azure_openai",
    model="gpt-4",
    api_key="your-api-key",
    endpoint="https://your-resource.openai.azure.com",
    deployment_name="gpt-4-deployment",
    default_parameters={
        "temperature": 0.7,
        "max_tokens": 500,
    },
)

# Perform LLM inference
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is machine learning?"},
]
response = client.llm.infer(config_name="gpt4-prod", messages=messages)
print(response.content)

# Stream LLM inference
for event in client.llm.infer_stream("gpt4-prod", messages):
    if event['type'] == 'chunk':
        print(event['content'], end='', flush=True)
    elif event['type'] == 'done':
        print(f"\nTokens used: {event['usage']}")
```
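
When you want the full response rather than incremental printing, the stream can be drained into one string plus the final usage report. The helper below is my own (not an SDK API); the event shapes (`'chunk'`/`'done'`, `content`, `usage`) follow the streaming example above, and `fake_events` stands in for `client.llm.infer_stream(...)`:

```python
def collect_stream(events):
    """Accumulate chunk contents and capture the final usage payload."""
    parts, usage = [], None
    for event in events:
        if event["type"] == "chunk":
            parts.append(event["content"])
        elif event["type"] == "done":
            usage = event.get("usage")
    return "".join(parts), usage

# Stub event stream standing in for a real infer_stream() call:
fake_events = [
    {"type": "chunk", "content": "Machine learning "},
    {"type": "chunk", "content": "is a field of AI."},
    {"type": "done", "usage": {"total_tokens": 12}},
]
text, usage = collect_stream(fake_events)
print(text)   # Machine learning is a field of AI.
print(usage)  # {'total_tokens': 12}
```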
Messages & Conversations
```python
# Create message with auto-inference
message = client.messages.create(
    endpoint_name="my-endpoint",
    text="Hello, I need help!",
    email="user@example.com",
    llm_config_name="gpt4-prod",
    auto_infer=True,
)

# List conversations with filters
conversations = client.conversations.list(
    endpoint_name="my-endpoint",
    is_escalated=False,
    is_resolved=False,
)

# Escalate a conversation (conversation_id comes from a listed conversation)
client.conversations.escalate(
    id=conversation_id,
    escalation_routes=["support@example.com"],
)

# Resolve conversation
client.conversations.resolve(id=conversation_id)
```
Inference Management
```python
# Perform direct inference
result = client.inferences.infer(
    endpoint_name="my-endpoint",
    inputs={"messages": [{"role": "user", "content": "Hello!"}]},
    tenant_key="customer-123",
)

# List and filter inferences
inferences = client.inferences.list(
    endpoint_name="my-endpoint",
    has_annotation=False,
    tenant_key="customer-123",
)

# Update an inference with an annotation (inference_id comes from a listed inference)
client.inferences.update(
    id=inference_id,
    annotation={"quality": "good", "category": "support"},
)
```
Secrets & Scoring Scripts
```python
# Manage Azure Key Vault secrets
client.secrets.create(name="api-key", value="secret-value")
client.secrets.update(name="api-key", value="new-value")
secrets = client.secrets.list()

# Create scoring script from file
client.scorings.create_from_file(
    name="score.py",
    file_path="./scoring/score.py",
)

# List scoring scripts
scripts = client.scorings.list()
```
Test Data Management
```python
from az_cortex_sdk import TestDataManager

test_manager = TestDataManager(client)

# Create test messages (automatically marked as test-generated)
test_message = test_manager.create_test_message(
    endpoint_name="my-endpoint",
    text="Test message",
    email="test@example.com",
)

# List only test data
test_conversations = client.conversations.list_test_conversations()
test_messages = client.messages.list_test_messages()
test_inferences = client.inferences.list_test_inferences()

# Clean up test data
cleanup_stats = test_manager.cleanup_all_test_data()
print(f"Cleaned up {cleanup_stats['total_deleted']} test items")
```
API Reference
Core Services
The SDK provides access to all Azure Cortex ML API endpoints through service classes:
Environment Management
`client.environments` - Manage Azure ML environments

| Method | Description | Parameters |
|---|---|---|
| `list(page, per_page)` | List all environments | `page` (int), `per_page` (int) |
| `get(name)` | Get environment by name | `name` (str) |
| `create(name, image, conda_file, description)` | Create new environment | `name`, `image`, `conda_file`, `description` |
| `update(name, image, conda_file, description)` | Update environment | `name`, `image`, `conda_file`, `description` |
| `delete(name)` | Delete environment | `name` (str) |
`client.environment_versions` - Manage environment versions

| Method | Description | Parameters |
|---|---|---|
| `list(environment_name, page, per_page)` | List versions for environment | `environment_name`, `page`, `per_page` |
| `register(environment_name, version, image, conda_file, description)` | Register new version | `environment_name`, `version`, `image`, `conda_file`, `description` |
| `archive(environment_name, version)` | Archive a version | `environment_name`, `version` |
| `set_default(environment_name, version)` | Set default version | `environment_name`, `version` |
Model Management
`client.models` - Manage Azure ML models

| Method | Description | Parameters |
|---|---|---|
| `list(page, per_page)` | List all models | `page` (int), `per_page` (int) |
| `get(name)` | Get model by name | `name` (str) |
| `create(name, description, tags)` | Create new model | `name`, `description`, `tags` |
| `upload_and_create(file_path, name, description, tags)` | Upload file and create model | `file_path`, `name`, `description`, `tags` |
| `upload_buffer_and_create(buffer, filename, name, description, tags)` | Upload buffer and create model | `buffer`, `filename`, `name`, `description`, `tags` |
| `delete(name)` | Delete model | `name` (str) |
`client.model_versions` - Manage model versions

| Method | Description | Parameters |
|---|---|---|
| `list(model_name, page, per_page)` | List versions for model | `model_name`, `page`, `per_page` |
| `register_from_file(file_path, model_name, version, description, tags)` | Register version from file | `file_path`, `model_name`, `version`, `description`, `tags` |
| `register_from_buffer(buffer, filename, model_name, version, description, tags)` | Register version from buffer | `buffer`, `filename`, `model_name`, `version`, `description`, `tags` |
| `archive(model_name, version)` | Archive a version | `model_name`, `version` |
| `set_default(model_name, version)` | Set default version | `model_name`, `version` |
Endpoint & Deployment Management
`client.endpoints` - Manage online endpoints

| Method | Description | Parameters |
|---|---|---|
| `list(page, per_page)` | List all endpoints | `page` (int), `per_page` (int) |
| `get(name)` | Get endpoint by name | `name` (str) |
| `create(name)` | Create new endpoint | `name` (str) |
| `update(name, identity)` | Update endpoint identity | `name`, `identity` (EndpointIdentity) |
| `delete(name)` | Delete endpoint | `name` (str) |
| `infer(name, inputs)` | Perform inference on endpoint | `name`, `inputs` (dict) |
`client.deployments` - Manage endpoint deployments

| Method | Description | Parameters |
|---|---|---|
| `list(endpoint_name, page, per_page)` | List deployments for endpoint | `endpoint_name`, `page`, `per_page` |
| `get(name, endpoint_name)` | Get deployment by name | `name`, `endpoint_name` |
| `create(endpoint_name, name, model_name, model_version, environment_name, environment_version, instance_type, scoring_script_name, instance_count, traffic_percentage)` | Create deployment | See deployment parameters |
| `delete(name, endpoint_name)` | Delete deployment | `name`, `endpoint_name` |
| `update_traffic(endpoint_name, traffic_allocation)` | Update traffic allocation | `endpoint_name`, `traffic_allocation` (dict) |
Inference & Messaging
`client.inferences` - Manage ML inferences

| Method | Description | Parameters |
|---|---|---|
| `list(endpoint_name, page, per_page, has_annotation, is_test_generated, ...)` | List inferences with filters | Multiple filter parameters |
| `get(id)` | Get inference by ID | `id` (str) |
| `update(id, annotation, escalation_routes)` | Update inference | `id`, `annotation`, `escalation_routes` |
| `infer(endpoint_name, inputs, tenant_key, ip_address, is_test_generated)` | Perform direct inference | `endpoint_name`, `inputs`, optional params |
| `list_test_inferences(...)` | List only test-generated inferences | Same as `list()` |
`client.messages` - Manage conversation messages

| Method | Description | Parameters |
|---|---|---|
| `list(endpoint_name, conversation_id, page, per_page, is_test_generated, ...)` | List messages with filters | Multiple filter parameters |
| `get(id)` | Get message by ID | `id` (str) |
| `create(endpoint_name, text, email, tenant_key, llm_config_name, conversation_id, ip_address, auto_infer, is_test_generated)` | Create message | `endpoint_name`, `text`, `email`, optional params |
| `update(id, annotation, escalation_routes)` | Update message | `id`, `annotation`, `escalation_routes` |
| `list_test_messages(...)` | List only test-generated messages | Same as `list()` |
`client.conversations` - Manage conversations

| Method | Description | Parameters |
|---|---|---|
| `list(endpoint_name, page, per_page, is_escalated, is_resolved, is_test_generated, ...)` | List conversations with filters | Multiple filter parameters |
| `get(id)` | Get conversation by ID | `id` (str) |
| `update(id, is_escalated, is_resolved, annotation, escalation_routes)` | Update conversation | `id`, optional params |
| `escalate(id, escalation_routes)` | Escalate conversation | `id`, `escalation_routes` |
| `resolve(id)` | Resolve conversation | `id` (str) |
| `list_test_conversations(...)` | List only test-generated conversations | Same as `list()` |
LLM Configuration & Inference
`client.llm` - Configure and use LLM providers

| Method | Description | Parameters |
|---|---|---|
| `list(page, per_page, is_archived)` | List LLM configurations | `page`, `per_page`, `is_archived` |
| `get(config_name)` | Get LLM config by name | `config_name` (str) |
| `create(config_name, provider, model, api_key, endpoint, deployment_name, default_parameters)` | Create LLM config | See LLM parameters |
| `update(config_name, provider, model, api_key, endpoint, deployment_name, default_parameters)` | Update LLM config | Same as `create()` |
| `archive(config_name)` | Archive LLM config | `config_name` (str) |
| `delete(config_name)` | Delete LLM config | `config_name` (str) |
| `infer(config_name, messages, parameters, is_test_generated)` | Perform LLM inference | `config_name`, `messages`, optional params |
| `infer_stream(config_name, messages, parameters, is_test_generated)` | Stream LLM inference | Same as `infer()` |
Secrets & Scoring Scripts
`client.secrets` - Manage Azure Key Vault secrets

| Method | Description | Parameters |
|---|---|---|
| `list(page, per_page)` | List all secrets | `page` (int), `per_page` (int) |
| `create(name, value)` | Create new secret | `name`, `value` |
| `update(name, value)` | Update secret value | `name`, `value` |
| `delete(name)` | Delete secret | `name` (str) |
`client.scorings` - Manage scoring scripts

| Method | Description | Parameters |
|---|---|---|
| `list(page, per_page)` | List all scoring scripts | `page` (int), `per_page` (int) |
| `get(name)` | Get scoring script by name | `name` (str) |
| `create(name, content)` | Create scoring script | `name`, `content` |
| `create_from_file(name, file_path)` | Create from file | `name`, `file_path` |
Test Data Management
The SDK includes built-in support for test data management:
- Automatic Filtering: By default, all `list()` methods hide test-generated data
- Test Data Creation: Use `TestDataManager` to create test data marked with `is_test_generated=True`
- Convenience Methods: Use `list_test_*()` methods to view only test data
- Cleanup Utilities: Built-in methods to clean up test data after testing
| Class | Description | Key Methods |
|---|---|---|
| `TestDataManager` | Test data utilities | `create_test_message()`, `cleanup_all_test_data()`, `count_test_data()` |
Test Data Filtering Options
All list methods support the `is_test_generated` parameter:
- `None` (default): Hide test-generated objects
- `'false'`: Explicitly hide test-generated objects
- `'true'`: Show only test-generated objects
- `'all'`: Show all objects regardless of test-generation status
Development
```bash
# Install development dependencies
pip install -e ".[dev]"

# Run unit tests only (no live API calls)
make test

# Run unit tests with coverage
make test-cov

# Run integration tests (requires live API access)
make test-integration

# Run all tests
make test-all
```
License
MIT License - see LICENSE file for details.
File details
Details for the file az_cortex_sdk-0.2.1.tar.gz.
File metadata
- Download URL: az_cortex_sdk-0.2.1.tar.gz
- Upload date:
- Size: 85.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `b82109e003dd673fa5b3ada260720959b3be8dfc849f4c66684b3234b390b494` |
| MD5 | `07667e0fe49f6b5d9af9144f6e4e1c3b` |
| BLAKE2b-256 | `d29f2c9ca69f983e52db5a26589ae9dbf247b73252d2ead304f6792313ce524b` |
File details
Details for the file az_cortex_sdk-0.2.1-py3-none-any.whl.
File metadata
- Download URL: az_cortex_sdk-0.2.1-py3-none-any.whl
- Upload date:
- Size: 67.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2436d12ad52526c5fc44067fefe6d901701bddde2e5af5bc2ac057a189f4450c` |
| MD5 | `363d40ee6e4271d097dd918694737190` |
| BLAKE2b-256 | `6611b799385f669eb3ad478505ce9213f041c7dcfde3995d34ef330c4c5e2ec2` |