Azure Cortex SDK

A Python SDK for interacting with the Azure Cortex ML API, providing seamless integration for machine learning operations including environment management, model versioning, inference handling, and conversation management.

Features

Core ML Operations

  • Environment Management: Create, update, and manage Azure ML environments with conda dependencies
  • Environment Versioning: Register, archive, and set default environment versions
  • Model Management: Upload, register, and manage ML models with metadata and tags
  • Model Versioning: Register model versions from files or buffers with version control
  • Endpoint Management: Create and manage online endpoints for model serving
  • Deployment Operations: Deploy models with traffic allocation, scaling, and blue-green deployments
  • Scoring Scripts: Manage Python scoring scripts for custom inference logic
  • Secrets Management: Securely manage Azure Key Vault secrets for API keys and credentials

Inference & LLM

  • Direct Inference: Perform real-time inference on deployed ML endpoints
  • LLM Configuration: Configure OpenAI, Azure OpenAI, and other LLM providers
  • LLM Inference: Perform chat completions with streaming support
  • Message Management: Create and manage conversation messages with auto-inference
  • Conversation Management: Handle multi-turn conversations with escalation and resolution
  • Inference Tracking: List, filter, and annotate all inference requests

Developer Experience

  • Authentication: OAuth2 client credentials flow with automatic token refresh
  • Type Safety: Full type hints and Pydantic models for request/response validation
  • Error Handling: Comprehensive exception handling with retry mechanisms
  • File Operations: Support for large file uploads with streaming and buffer support
  • Test Data Management: Built-in utilities for creating and cleaning up test data
  • Pagination: Automatic pagination support for all list operations
  • Filtering: Advanced filtering options for conversations, messages, and inferences

Installation

pip install az-cortex-sdk

Quick Start

from az_cortex_sdk import CortexClient

# Initialize client from environment variables
client = CortexClient.from_env()

# Or initialize with explicit credentials
client = CortexClient(
    base_url="https://api.az-dev.nearlyhuman.ai",
    client_id="your-client-id",
    client_secret="your-client-secret",
    tenant="demo"
)

# Check API health
health = client.health_check()
print(f"API Status: {health['status']}")

# List environments
environments = client.environments.list()
print(f"Found {len(environments.items)} environments")

# List models
models = client.models.list()
for model in models.items:
    print(f"Model: {model.name}")

# Perform LLM inference
response = client.llm.infer(
    config_name="my-llm-config",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(f"Response: {response.content}")
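The `list()` calls above return one page of results at a time (see the `page`/`per_page` parameters in the API Reference). A generic pager over any such method can be sketched as follows; this helper is an illustration, not part of the SDK, and it assumes each response exposes `.items` as in the Quick Start and that an empty page marks the end:

```python
def iter_all(list_fn, per_page=50):
    """Yield every item across all pages of a paginated list() call."""
    page = 1
    while True:
        response = list_fn(page=page, per_page=per_page)
        if not response.items:
            break
        yield from response.items
        page += 1
```

Usage would then look like `for model in iter_all(client.models.list): ...`.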

Configuration

Set environment variables:

export CORTEX_API_BASE_URL="https://api.az-dev.nearlyhuman.ai"
export CORTEX_CLIENT_ID="your-client-id"
export CORTEX_CLIENT_SECRET="your-client-secret"
export CORTEX_TENANT="demo"
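`CortexClient.from_env()` presumably reads these four variables. A hypothetical equivalent, shown only to make the contract explicit (the helper itself is not part of the SDK), would collect them and fail fast on anything missing:

```python
import os

# Variable names come from the export lines above.
REQUIRED_VARS = (
    "CORTEX_API_BASE_URL",
    "CORTEX_CLIENT_ID",
    "CORTEX_CLIENT_SECRET",
    "CORTEX_TENANT",
)

def load_cortex_config(env=None):
    """Collect the CORTEX_* settings, raising if any are unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "base_url": env["CORTEX_API_BASE_URL"],
        "client_id": env["CORTEX_CLIENT_ID"],
        "client_secret": env["CORTEX_CLIENT_SECRET"],
        "tenant": env["CORTEX_TENANT"],
    }
```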

Examples

Environment & Model Management

# Create and manage environments
client.environments.create(
    name="my-env",
    image="mcr.microsoft.com/azureml/minimal-ubuntu20.04-py38-cpu-inference:latest",
    conda_file="path/to/conda.yml",
    description="Production environment"
)

# Register environment version
client.environment_versions.register(
    environment_name="my-env",
    version="1.0.0",
    image="mcr.microsoft.com/azureml/minimal-ubuntu20.04-py38-cpu-inference:latest"
)

# Upload and create model
client.models.upload_and_create(
    file_path="./model.pkl",
    name="my-model",
    description="My ML model",
    tags={"framework": "sklearn"}
)

# Register model version from file
client.model_versions.register_from_file(
    file_path="./model.pkl",
    model_name="my-model",
    version="1.0.0",
    description="Initial release"
)

Endpoint & Deployment

# Create endpoint
client.endpoints.create(name="my-endpoint")

# Create deployment
client.deployments.create(
    endpoint_name="my-endpoint",
    name="blue",
    model_name="my-model",
    model_version="1.0.0",
    environment_name="my-env",
    environment_version="1.0.0",
    instance_type="Standard_DS3_v2",
    scoring_script_name="score.py",
    instance_count=1,
    traffic_percentage=100
)

# Update traffic allocation (blue-green deployment)
client.deployments.update_traffic(
    endpoint_name="my-endpoint",
    traffic_allocation={"blue": 90, "green": 10}
)

# Perform inference on endpoint
result = client.endpoints.infer(
    name="my-endpoint",
    inputs={"data": [[1, 2, 3, 4]]}
)
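Before calling `update_traffic`, a client-side sanity check that the allocation sums to 100 can catch a malformed request early. This check is purely illustrative, not an SDK feature:

```python
def validate_traffic(allocation):
    """Check a traffic allocation like {"blue": 90, "green": 10}.

    Returns the allocation unchanged if every percentage is non-negative
    and the total is exactly 100; raises ValueError otherwise.
    """
    if any(pct < 0 for pct in allocation.values()):
        raise ValueError("Traffic percentages must be non-negative")
    total = sum(allocation.values())
    if total != 100:
        raise ValueError(f"Traffic must sum to 100, got {total}")
    return allocation
```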

LLM Configuration & Inference

# Create LLM configuration for Azure OpenAI
client.llm.create(
    config_name="gpt4-prod",
    provider="azure_openai",
    model="gpt-4",
    api_key="your-api-key",
    endpoint="https://your-resource.openai.azure.com",
    deployment_name="gpt-4-deployment",
    default_parameters={
        "temperature": 0.7,
        "max_tokens": 500
    }
)

# Perform LLM inference
response = client.llm.infer(
    config_name="gpt4-prod",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is machine learning?"}
    ]
)
print(response.content)

# Stream LLM inference
messages = [{"role": "user", "content": "What is machine learning?"}]
for event in client.llm.infer_stream("gpt4-prod", messages):
    if event['type'] == 'chunk':
        print(event['content'], end='', flush=True)
    elif event['type'] == 'done':
        print(f"\nTokens used: {event['usage']}")
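When the full completion text is needed after streaming, the chunks can be accumulated in one pass. This helper assumes the event shapes shown above (`{'type': 'chunk' | 'done', ...}`) and is an illustration rather than part of the SDK:

```python
def collect_stream(events):
    """Assemble (full_text, usage) from streamed inference events."""
    parts, usage = [], None
    for event in events:
        if event["type"] == "chunk":
            parts.append(event["content"])
        elif event["type"] == "done":
            usage = event.get("usage")
    return "".join(parts), usage
```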

Messages & Conversations

# Create message with auto-inference
message = client.messages.create(
    endpoint_name="my-endpoint",
    text="Hello, I need help!",
    email="user@example.com",
    llm_config_name="gpt4-prod",
    auto_infer=True
)

# List conversations with filters
conversations = client.conversations.list(
    endpoint_name="my-endpoint",
    is_escalated=False,
    is_resolved=False
)

# Escalate conversation (conversation_id comes from a prior list() or get() call)
client.conversations.escalate(
    id=conversation_id,
    escalation_routes=["support@example.com"]
)

# Resolve conversation
client.conversations.resolve(id=conversation_id)

Inference Management

# Perform direct inference
result = client.inferences.infer(
    endpoint_name="my-endpoint",
    inputs={"messages": [{"role": "user", "content": "Hello!"}]},
    tenant_key="customer-123"
)

# List and filter inferences
inferences = client.inferences.list(
    endpoint_name="my-endpoint",
    has_annotation=False,
    tenant_key="customer-123"
)

# Update inference with annotation (inference_id comes from a prior list() or infer() call)
client.inferences.update(
    id=inference_id,
    annotation={"quality": "good", "category": "support"}
)

Secrets & Scoring Scripts

# Manage Azure Key Vault secrets
client.secrets.create(name="api-key", value="secret-value")
client.secrets.update(name="api-key", value="new-value")
secrets = client.secrets.list()

# Create scoring script from file
client.scorings.create_from_file(
    name="score.py",
    file_path="./scoring/score.py"
)

# List scoring scripts
scripts = client.scorings.list()

Test Data Management

from az_cortex_sdk import TestDataManager

test_manager = TestDataManager(client)

# Create test messages (automatically marked as test-generated)
test_message = test_manager.create_test_message(
    endpoint_name="my-endpoint",
    text="Test message",
    email="test@example.com"
)

# List only test data
test_conversations = client.conversations.list_test_conversations()
test_messages = client.messages.list_test_messages()
test_inferences = client.inferences.list_test_inferences()

# Clean up test data
cleanup_stats = test_manager.cleanup_all_test_data()
print(f"Cleaned up {cleanup_stats['total_deleted']} test items")

API Reference

Core Services

The SDK provides access to all Azure Cortex ML API endpoints through service classes:

Environment Management

client.environments - Manage Azure ML environments

| Method | Description | Parameters |
| --- | --- | --- |
| `list(page, per_page)` | List all environments | `page` (int), `per_page` (int) |
| `get(name)` | Get environment by name | `name` (str) |
| `create(name, image, conda_file, description)` | Create new environment | `name`, `image`, `conda_file`, `description` |
| `update(name, image, conda_file, description)` | Update environment | `name`, `image`, `conda_file`, `description` |
| `delete(name)` | Delete environment | `name` (str) |

client.environment_versions - Manage environment versions

| Method | Description | Parameters |
| --- | --- | --- |
| `list(environment_name, page, per_page)` | List versions for environment | `environment_name`, `page`, `per_page` |
| `register(environment_name, version, image, conda_file, description)` | Register new version | `environment_name`, `version`, `image`, `conda_file`, `description` |
| `archive(environment_name, version)` | Archive a version | `environment_name`, `version` |
| `set_default(environment_name, version)` | Set default version | `environment_name`, `version` |

Model Management

client.models - Manage Azure ML models

| Method | Description | Parameters |
| --- | --- | --- |
| `list(page, per_page)` | List all models | `page` (int), `per_page` (int) |
| `get(name)` | Get model by name | `name` (str) |
| `create(name, description, tags)` | Create new model | `name`, `description`, `tags` |
| `upload_and_create(file_path, name, description, tags)` | Upload file and create model | `file_path`, `name`, `description`, `tags` |
| `upload_buffer_and_create(buffer, filename, name, description, tags)` | Upload buffer and create model | `buffer`, `filename`, `name`, `description`, `tags` |
| `delete(name)` | Delete model | `name` (str) |

client.model_versions - Manage model versions

| Method | Description | Parameters |
| --- | --- | --- |
| `list(model_name, page, per_page)` | List versions for model | `model_name`, `page`, `per_page` |
| `register_from_file(file_path, model_name, version, description, tags)` | Register version from file | `file_path`, `model_name`, `version`, `description`, `tags` |
| `register_from_buffer(buffer, filename, model_name, version, description, tags)` | Register version from buffer | `buffer`, `filename`, `model_name`, `version`, `description`, `tags` |
| `archive(model_name, version)` | Archive a version | `model_name`, `version` |
| `set_default(model_name, version)` | Set default version | `model_name`, `version` |

Endpoint & Deployment Management

client.endpoints - Manage online endpoints

| Method | Description | Parameters |
| --- | --- | --- |
| `list(page, per_page)` | List all endpoints | `page` (int), `per_page` (int) |
| `get(name)` | Get endpoint by name | `name` (str) |
| `create(name)` | Create new endpoint | `name` (str) |
| `update(name, identity)` | Update endpoint identity | `name`, `identity` (EndpointIdentity) |
| `delete(name)` | Delete endpoint | `name` (str) |
| `infer(name, inputs)` | Perform inference on endpoint | `name`, `inputs` (dict) |

client.deployments - Manage endpoint deployments

| Method | Description | Parameters |
| --- | --- | --- |
| `list(endpoint_name, page, per_page)` | List deployments for endpoint | `endpoint_name`, `page`, `per_page` |
| `get(name, endpoint_name)` | Get deployment by name | `name`, `endpoint_name` |
| `create(endpoint_name, name, model_name, model_version, environment_name, environment_version, instance_type, scoring_script_name, instance_count, traffic_percentage)` | Create deployment | See deployment parameters |
| `delete(name, endpoint_name)` | Delete deployment | `name`, `endpoint_name` |
| `update_traffic(endpoint_name, traffic_allocation)` | Update traffic allocation | `endpoint_name`, `traffic_allocation` (dict) |

Inference & Messaging

client.inferences - Manage ML inferences

| Method | Description | Parameters |
| --- | --- | --- |
| `list(endpoint_name, page, per_page, has_annotation, is_test_generated, ...)` | List inferences with filters | Multiple filter parameters |
| `get(id)` | Get inference by ID | `id` (str) |
| `update(id, annotation, escalation_routes)` | Update inference | `id`, `annotation`, `escalation_routes` |
| `infer(endpoint_name, inputs, tenant_key, ip_address, is_test_generated)` | Perform direct inference | `endpoint_name`, `inputs`, optional params |
| `list_test_inferences(...)` | List only test-generated inferences | Same as `list()` |

client.messages - Manage conversation messages

| Method | Description | Parameters |
| --- | --- | --- |
| `list(endpoint_name, conversation_id, page, per_page, is_test_generated, ...)` | List messages with filters | Multiple filter parameters |
| `get(id)` | Get message by ID | `id` (str) |
| `create(endpoint_name, text, email, tenant_key, llm_config_name, conversation_id, ip_address, auto_infer, is_test_generated)` | Create message | `endpoint_name`, `text`, `email`, optional params |
| `update(id, annotation, escalation_routes)` | Update message | `id`, `annotation`, `escalation_routes` |
| `list_test_messages(...)` | List only test-generated messages | Same as `list()` |

client.conversations - Manage conversations

| Method | Description | Parameters |
| --- | --- | --- |
| `list(endpoint_name, page, per_page, is_escalated, is_resolved, is_test_generated, ...)` | List conversations with filters | Multiple filter parameters |
| `get(id)` | Get conversation by ID | `id` (str) |
| `update(id, is_escalated, is_resolved, annotation, escalation_routes)` | Update conversation | `id`, optional params |
| `escalate(id, escalation_routes)` | Escalate conversation | `id`, `escalation_routes` |
| `resolve(id)` | Resolve conversation | `id` (str) |
| `list_test_conversations(...)` | List only test-generated conversations | Same as `list()` |

LLM Configuration & Inference

client.llm - Configure and use LLM providers

| Method | Description | Parameters |
| --- | --- | --- |
| `list(page, per_page, is_archived)` | List LLM configurations | `page`, `per_page`, `is_archived` |
| `get(config_name)` | Get LLM config by name | `config_name` (str) |
| `create(config_name, provider, model, api_key, endpoint, deployment_name, default_parameters)` | Create LLM config | See LLM parameters |
| `update(config_name, provider, model, api_key, endpoint, deployment_name, default_parameters)` | Update LLM config | Same as `create()` |
| `archive(config_name)` | Archive LLM config | `config_name` (str) |
| `delete(config_name)` | Delete LLM config | `config_name` (str) |
| `infer(config_name, messages, parameters, is_test_generated)` | Perform LLM inference | `config_name`, `messages`, optional params |
| `infer_stream(config_name, messages, parameters, is_test_generated)` | Stream LLM inference | Same as `infer()` |

Secrets & Scoring Scripts

client.secrets - Manage Azure Key Vault secrets

| Method | Description | Parameters |
| --- | --- | --- |
| `list(page, per_page)` | List all secrets | `page` (int), `per_page` (int) |
| `create(name, value)` | Create new secret | `name`, `value` |
| `update(name, value)` | Update secret value | `name`, `value` |
| `delete(name)` | Delete secret | `name` (str) |

client.scorings - Manage scoring scripts

| Method | Description | Parameters |
| --- | --- | --- |
| `list(page, per_page)` | List all scoring scripts | `page` (int), `per_page` (int) |
| `get(name)` | Get scoring script by name | `name` (str) |
| `create(name, content)` | Create scoring script | `name`, `content` |
| `create_from_file(name, file_path)` | Create from file | `name`, `file_path` |

Test Data Management

The SDK includes built-in support for test data management:

  • Automatic Filtering: By default, all list() methods hide test-generated data
  • Test Data Creation: Use TestDataManager to create test data marked with is_test_generated=True
  • Convenience Methods: Use list_test_*() methods to view only test data
  • Cleanup Utilities: Built-in methods to clean up test data after testing

| Class | Description | Key Methods |
| --- | --- | --- |
| `TestDataManager` | Test data utilities | `create_test_message()`, `cleanup_all_test_data()`, `count_test_data()` |

Test Data Filtering Options

All list methods support the is_test_generated parameter:

  • None (default): Hide test-generated objects
  • 'false': Explicitly hide test-generated objects
  • 'true': Show only test-generated objects
  • 'all': Show all objects regardless of test generation status
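The semantics above can be mirrored client-side over already-fetched records; this function is only an illustration of the documented filter behavior, not part of the SDK (the real filtering presumably happens server-side):

```python
def apply_test_filter(records, is_test_generated=None):
    """Filter dict records carrying an 'is_test_generated' flag."""
    if is_test_generated == "all":
        return list(records)
    if is_test_generated == "true":
        return [r for r in records if r.get("is_test_generated")]
    # None (the default) or 'false': hide test-generated objects
    return [r for r in records if not r.get("is_test_generated")]
```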

Development

# Install development dependencies
pip install -e ".[dev]"

# Run unit tests only (no live API calls)
make test

# Run unit tests with coverage
make test-cov

# Run integration tests (requires live API access)
make test-integration

# Run all tests
make test-all

License

MIT License - see LICENSE file for details.

