Azure OpenAI PTU cost advisor plugin for az-scout — compare PAYGO, PTU, and hybrid deployment strategies
Project description
az-scout-plugin-aoai-ptu-advisor
Azure OpenAI PTU Cost Advisor — an az-scout plugin that compares PAYGO, PTU (Provisioned Throughput), and Hybrid deployment cost strategies for Azure OpenAI models.
What it does
Given your workload characteristics (average TPM, burst percentiles, usage hours), the plugin:
- Estimates monthly costs for PAYGO, PTU on-demand, PTU reserved, and hybrid (PTU base + PAYGO spillover)
- Sizes PTU capacity with correct minimum and increment rounding per model and deployment type
- Detects burst patterns and recommends hybrid when steady base + bursty overflow is cheaper
- Computes break-even TPM thresholds between PAYGO and PTU
- Fetches live pricing from the Azure Retail Prices API with fallback to catalog defaults
- Supports price overrides for enterprise/negotiated rates
Inspiration
This plugin is inspired by the public project azureptucalc by Ricardo Martins. The functional goals are the same — compare Azure OpenAI deployment cost strategies — but the implementation is completely native to az-scout:
- No React — vanilla JavaScript frontend
- No Vercel proxy — direct Azure Retail Prices API calls from the Python backend
- No external dependencies — no database, no KQL, no cache layer
- MCP tools — AI agents can call the calculator directly
- Plugin architecture — auto-discovered by az-scout at startup
Installation
# Install the plugin (az-scout must be installed first)
uv pip install az-scout-plugin-aoai-ptu-advisor
# Or for development
cd az-scout-plugin-aoai-ptu-advisor
uv sync --group dev
uv pip install -e .
# Start az-scout — the plugin is auto-discovered
az-scout web
API Routes
All routes are under /plugins/aoai-ptu-advisor/.
GET /models
Returns supported Azure OpenAI models with PTU specifications.
GET /pricing?region=eastus&model=gpt-4o&deployment_type=Global
Returns resolved PAYGO and PTU pricing for a model/region/deployment type.
POST /analyze
Main analysis endpoint. Request body:
{
"region": "eastus",
"model": "gpt-4o",
"deployment_type": "Global",
"average_tpm": 50000,
"p95_tpm": 80000,
"p99_tpm": 120000,
"input_ratio": 0.5,
"hours_per_day": 24,
"days_per_month": 30,
"reserved_ptu": false,
"spillover_enabled": true,
"paygo_overrides": {
"input_per_1m": 2.50,
"output_per_1m": 10.00
}
}
Response includes: cost estimates for each strategy, PTU sizing, break-even analysis, recommendation with rationale, warnings, and assumptions.
POST /analyze-from-timeseries
Accepts raw TPM time-series data (JSON array of {timestamp, tpm} objects), computes percentiles automatically, and runs the analysis.
{
"region": "eastus",
"model": "gpt-4o",
"datapoints": [
{"timestamp": "2025-01-01T00:00:00Z", "tpm": 50000},
{"timestamp": "2025-01-01T01:00:00Z", "tpm": 120000},
{"timestamp": "2025-01-01T02:00:00Z", "tpm": 30000}
]
}
MCP Tools
Available in the az-scout AI chat:
| Tool | Description |
|---|---|
list_aoai_models |
List supported models with PTU specs |
get_aoai_pricing |
Get resolved pricing for a model/region |
analyze_aoai_ptu_cost |
Full cost comparison analysis |
Supported Models
Model catalog is fetched dynamically from the Microsoft PTU onboarding documentation (cached 24h, static fallback on failure). Currently includes GPT-5.x, GPT-4.1, GPT-4o, o-series, DeepSeek, and Llama models.
Azure Auto-fill
The plugin can populate workload inputs directly from your Azure environment:
- Select your subscription and Azure OpenAI resource
- Click "Auto-fill from Azure"
- The plugin queries Azure Monitor Logs (KQL) and the ARM model capacity API
- Form fields are populated with live metrics (avg TPM, P99 TPM, recommended PTU)
Azure routes
| Route | Method | Description |
|---|---|---|
/azure/openai-resources |
GET | List Azure OpenAI resources in a subscription |
/azure/kql-autofill |
POST | Query Azure Monitor Logs for TPM/PTU metrics |
/azure/model-capacity |
POST | Check available PTU capacity (preview API) |
/azure/autofill-all |
POST | Orchestrate KQL + capacity in one call |
Required Azure permissions
- Reader on the subscription (for resource discovery)
- Log Analytics Reader on the Log Analytics workspace (for KQL queries)
- Reader on the Cognitive Services account (for model capacity API)
Prerequisites for KQL autofill
Azure Monitor metrics must be exported to a Log Analytics workspace via diagnostic settings on the Azure OpenAI resource. If not configured, the plugin will return a clear warning.
Model capacity API
The capacity API (Microsoft.CognitiveServices/locations/{location}/modelCapacities) is a preview API (2024-04-01-preview). It returns location-based capacity data. The deployment type (Global/DataZone/Regional) is used as a UI filter over SKU names.
Limitations
- Pricing accuracy: Live prices from the Azure Retail Prices API may not reflect negotiated enterprise rates. Use price overrides for accurate estimates.
- No persistent storage: All calculations are stateless — no database or cache.
- Throughput-per-PTU values: Fetched dynamically from Microsoft documentation; may change as models evolve.
- KQL autofill: Requires diagnostic settings configured on the Azure OpenAI resource to export metrics to Log Analytics.
- Capacity API: Preview API — results may be incomplete or change without notice.
Quality Checks
uv run ruff check src/ tests/
uv run ruff format --check src/ tests/
uv run mypy src/
uv run pytest
License
Disclaimer
This tool is not affiliated with Microsoft. Pricing, throughput, and capacity information are indicative and not a guarantee. PTU throughput-per-unit values are based on published documentation and may change. Always validate with actual Azure pricing and your account team for enterprise rates.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file az_scout_plugin_aoai_ptu_advisor-2026.3.31.tar.gz.
File metadata
- Download URL: az_scout_plugin_aoai_ptu_advisor-2026.3.31.tar.gz
- Upload date:
- Size: 157.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
95cc5de2a5a29fb49b607c274a995740c73b6adb2a3dd156a2d752caa4244106
|
|
| MD5 |
0f4c196b403e887adc0e2100cdac63d2
|
|
| BLAKE2b-256 |
499a3a387f1447f2d168fa988e7fa098809f8d05d45e31612ec7cc9f79f23d4d
|
Provenance
The following attestation bundles were made for az_scout_plugin_aoai_ptu_advisor-2026.3.31.tar.gz:
Publisher:
publish.yml on az-scout/az-scout-plugin-aoai-ptu-advisor
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
az_scout_plugin_aoai_ptu_advisor-2026.3.31.tar.gz -
Subject digest:
95cc5de2a5a29fb49b607c274a995740c73b6adb2a3dd156a2d752caa4244106 - Sigstore transparency entry: 1203303185
- Sigstore integration time:
-
Permalink:
az-scout/az-scout-plugin-aoai-ptu-advisor@174312e927bf330bb8e7d3635d75d27b98439d56 -
Branch / Tag:
refs/tags/v2026.03.31 - Owner: https://github.com/az-scout
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@174312e927bf330bb8e7d3635d75d27b98439d56 -
Trigger Event:
push
-
Statement type:
File details
Details for the file az_scout_plugin_aoai_ptu_advisor-2026.3.31-py3-none-any.whl.
File metadata
- Download URL: az_scout_plugin_aoai_ptu_advisor-2026.3.31-py3-none-any.whl
- Upload date:
- Size: 51.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d6b80d2da08fc152df28c3930eaf903b8fe61a1373d38efb16e35b3143a85ef6
|
|
| MD5 |
695c870543d0e138493ef98ed3efead2
|
|
| BLAKE2b-256 |
1976fcc5715f4b62552ee66ac754da7d05823c0dda920c93a9bac583a6d2fbc8
|
Provenance
The following attestation bundles were made for az_scout_plugin_aoai_ptu_advisor-2026.3.31-py3-none-any.whl:
Publisher:
publish.yml on az-scout/az-scout-plugin-aoai-ptu-advisor
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
az_scout_plugin_aoai_ptu_advisor-2026.3.31-py3-none-any.whl -
Subject digest:
d6b80d2da08fc152df28c3930eaf903b8fe61a1373d38efb16e35b3143a85ef6 - Sigstore transparency entry: 1203303193
- Sigstore integration time:
-
Permalink:
az-scout/az-scout-plugin-aoai-ptu-advisor@174312e927bf330bb8e7d3635d75d27b98439d56 -
Branch / Tag:
refs/tags/v2026.03.31 - Owner: https://github.com/az-scout
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@174312e927bf330bb8e7d3635d75d27b98439d56 -
Trigger Event:
push
-
Statement type: