Azure provider backend plugin for mngr
mngr Azure Provider [experimental]
Azure provider backend plugin for mngr. Runs agents in Docker containers on Azure Virtual Machines.
This plugin is experimental: it has not been exercised in a production setting at the same scale as
mngr_modal or mngr_vultr. The shared mngr_vps_docker machinery underneath it is well-tested, but Azure-specific defaults and the role/permission set may change. Treat the security defaults (see "Azure-specific configuration" below) as a starting point: review the NSG ingress CIDRs, image choice, VM size, and auto_shutdown_seconds before pointing this at production resources.
See mngr_vps_docker for the base architecture and shared infrastructure.
Setup
Credentials are resolved exclusively via Azure's DefaultAzureCredential — they
are deliberately not configurable in mngr.toml (matching the Modal / AWS / GCP
provider convention). Any of the following works:
- az login (developer laptop): the credential transparently uses your Azure CLI session
- Service principal env vars: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET (CI)
- A managed identity (when running on an Azure VM / Container App)
The subscription is resolved automatically from your az login — after az login
(and optionally az account set --subscription <id>), --provider azure works
with no config at all, the same way the GCP provider uses your active gcloud
project. Resolution order: providers.azure.subscription_id in config >
AZURE_SUBSCRIPTION_ID env var > the Azure CLI's active subscription.
So a [providers.azure] block is entirely optional. Configure one only to pin a
non-default subscription or override defaults:
[providers.azure]
backend = "azure"
subscription_id = "00000000-0000-0000-0000-000000000000" # optional; defaults to your `az` active subscription
default_region = "westus"
default_vm_size = "Standard_B2s" # 2 vCPU / 4GB; B-series is quota-friendly on new subs
# One-off infrastructure names (created by `mngr azure prepare`)
resource_group = "mngr"
vnet_name = "mngr-vnet"
subnet_name = "mngr-subnet"
nsg_name = "mngr-nsg"
# Inbound CIDRs for tcp/22 and the container SSH port on the NSG. Defaults to
# the wide-open '0.0.0.0/0' (fail-open, matching the AWS / GCP providers; a
# warning is logged -- tighten for production). SSH auth is key-only (passwords
# disabled), so 0.0.0.0/0 exposes the port but not a usable login. Use a tight
# range like ['203.0.113.4/32'], or [] for no SSH allow rule (the NSG default
# deny then leaves instances unreachable from outside the vnet).
allowed_ssh_cidrs = ["203.0.113.4/32"]
# Optional OS disk sizing
os_disk_size_gb = 30
os_disk_type = "StandardSSD_LRS"
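The subscription resolution order described above (config value, then AZURE_SUBSCRIPTION_ID, then the Azure CLI's active subscription) can be sketched as a simple fallback chain. This is an illustration only, not the plugin's actual code: the function name resolve_subscription_id and the cli_lookup callback are hypothetical, and the CLI lookup is passed in as a callable so the sketch does not depend on how mngr queries az.

```python
from typing import Callable, Mapping, Optional


def resolve_subscription_id(
    config_value: Optional[str],
    env: Mapping[str, str],
    cli_lookup: Callable[[], Optional[str]],
) -> str:
    """Return the first subscription ID found, in the documented priority order.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    # 1. providers.azure.subscription_id from the config file wins outright.
    if config_value:
        return config_value
    # 2. Next, the AZURE_SUBSCRIPTION_ID environment variable.
    env_value = env.get("AZURE_SUBSCRIPTION_ID")
    if env_value:
        return env_value
    # 3. Finally, whatever the Azure CLI reports as the active subscription.
    cli_value = cli_lookup()
    if cli_value:
        return cli_value
    raise LookupError("no Azure subscription found: run `az login` or set subscription_id")
```

Calling it with os.environ and a callback that returns None models a machine with no az session: the function then raises instead of silently picking a subscription.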
One-time setup: mngr azure prepare
Azure nests every resource in a resource group, and a fresh subscription has no
default vnet. mngr azure prepare does the one-time privileged setup: it
registers the Microsoft.Compute / Microsoft.Network / Microsoft.Storage
resource providers and creates the resource group, vnet, subnet, and NSG (tagged
managed-by=mngr). After it succeeds, mngr create --provider azure needs only
VM/NIC/IP-create permissions, not the network-management permissions that build the
vnet/subnet/NSG — it just resolves the existing subnet, so you can run it with
limited credentials.
mngr azure prepare --allowed-ssh-cidr 203.0.113.4/32
Like AWS and GCP, prepare is fail-open: with no --allowed-ssh-cidr it falls
back to the provider config's allowed_ssh_cidrs (default 0.0.0.0/0, open to
the internet) and logs a warning prompting you to tighten it. SSH auth is
key-only (passwords disabled), so an open NSG exposes the port but not a usable
login. Setting allowed_ssh_cidrs = [] opts out entirely: the NSG is created
with no SSH allow rule, so its default-deny leaves instances unreachable from
outside the vnet.
Idempotent — re-running is a no-op when everything already exists.
prepare and cleanup read their defaults from your [providers.<name>]
settings.toml block, selected with --provider (default azure), so the
resource group / vnet / subnet / NSG land with the same names the runtime mngr create --provider <name> path will resolve. CLI flags override the resolved
config, which in turn overrides class defaults. For example, with a
[providers.azure-west] block pinning default_region = "westus",
resource_group = "mngr-westus", and allowed_ssh_cidrs = ["203.0.113.4/32"]:
mngr azure prepare --provider azure-west # uses that block's region / RG / CIDRs, no flags needed
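The precedence described above (CLI flags over the selected provider block, over class defaults) amounts to a layered lookup. The sketch below shows one way to express it with collections.ChainMap. Everything named here is hypothetical: resolve_prepare_settings is not a real mngr function, flags are keyed by the config key names for simplicity, and in CLASS_DEFAULTS only the 0.0.0.0/0 CIDR default is stated in this README (the resource group value is borrowed from the example config).

```python
from collections import ChainMap
from typing import Any, Mapping

# Illustrative defaults. Only the open CIDR default is documented above;
# the resource group name is an assumption taken from the example config.
CLASS_DEFAULTS: dict[str, Any] = {
    "resource_group": "mngr",
    "allowed_ssh_cidrs": ["0.0.0.0/0"],
}


def resolve_prepare_settings(
    cli_flags: Mapping[str, Any],
    provider_block: Mapping[str, Any],
    class_defaults: Mapping[str, Any] = CLASS_DEFAULTS,
) -> dict[str, Any]:
    """Merge the three layers; earlier layers shadow later ones."""
    # Drop flags the user did not pass (None) so they do not mask lower layers.
    passed = {key: value for key, value in cli_flags.items() if value is not None}
    # ChainMap looks keys up left to right: flags, then config block, then defaults.
    return dict(ChainMap(passed, dict(provider_block), dict(class_defaults)))
```

With the azure-west block from the example and no flags, the result carries that block's region, resource group, and CIDRs; passing a CIDR flag replaces only the CIDR list.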
Teardown: mngr azure cleanup
The safe inverse of prepare. Deletes the mngr-owned resource group (cascading
its vnet/subnet/NSG), but refuses while any mngr-managed VM still exists in
the group (destroy those first with mngr destroy <agent>), and only deletes a
group it owns (tagged managed-by=mngr). Idempotent.
mngr azure cleanup
Quota note
New pay-as-you-go subscriptions start with low or zero vCPU quota per region
and per VM family. The default Standard_B2s (B-series) is the family most
likely to have nonzero quota; if mngr create fails with a quota error, request
an increase in the Azure portal (Subscriptions → Usage + quotas) or pick a region
with available quota (az vm list-usage --location westus -o table).
Multiple regions
Each provider instance is bound to a single region (and resource group). To work across regions, configure one instance per region and pick the right one at create time:
[providers.azure-west]
backend = "azure"
subscription_id = "..."
default_region = "westus"
resource_group = "mngr-westus"
allowed_ssh_cidrs = ["203.0.113.4/32"]
[providers.azure-east]
backend = "azure"
subscription_id = "..."
default_region = "eastus"
resource_group = "mngr-eastus"
allowed_ssh_cidrs = ["203.0.113.4/32"]
mngr azure prepare --provider azure-west # reads region / RG / CIDRs from [providers.azure-west]
mngr create my-west-agent --provider azure-west
Usage
mngr create my-agent --provider azure
mngr create my-agent --provider azure -b --azure-vm-size=Standard_D2s_v5 -b --azure-region=eastus
mngr create my-agent --provider azure -b --azure-spot # run on Azure Spot capacity
mngr list
mngr exec my-agent "echo hello"
mngr stop my-agent
mngr start my-agent
mngr destroy my-agent
mngr stop stops the container and then deallocates the VM, which actually
halts compute billing (an OS-level shutdown would only power it off — "Stopped
(not deallocated)" — and keep billing); the OS disk and all state persist, so a
paused agent costs only disk storage. mngr start re-allocates it. The public IP
is static, so it and the SSH host keys survive the stop (no known_hosts rebind on
resume). A deallocated VM still shows in mngr list and resolves by name (offline
discovery via VM tags). mngr destroy deletes the VM, and the NIC, public IP and
OS disk are reaped automatically via their delete_option=Delete (no orphaned
resources).
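The billing distinction above is easy to get wrong, so here is a small sketch that maps a VM's instance-view status codes to whether compute is still billed. The PowerState/... strings are the status codes Azure reports in a VM's instance view; the helper names are hypothetical and the sketch ignores transitional states such as starting or deallocating.

```python
from enum import Enum
from typing import Iterable, Optional


class PowerState(str, Enum):
    RUNNING = "PowerState/running"
    STOPPED = "PowerState/stopped"  # OS-level shutdown: "Stopped (not deallocated)"
    DEALLOCATED = "PowerState/deallocated"


def bills_compute(state: PowerState) -> bool:
    """Compute is billed unless the VM has been deallocated."""
    return state is not PowerState.DEALLOCATED


def power_state_from_statuses(status_codes: Iterable[str]) -> Optional[PowerState]:
    """Pick the power state out of a list of instance-view status codes.

    Returns None when no known PowerState/... code is present (for example
    while the VM is in a transitional state this sketch does not model).
    """
    for code in status_codes:
        if code.startswith("PowerState/"):
            try:
                return PowerState(code)
            except ValueError:
                return None
    return None
```

A VM that was shut down from inside the guest reports PowerState/stopped, for which bills_compute returns True; only PowerState/deallocated returns False.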
If a mngr create fails after the public IP + NIC are provisioned but before
the VM (e.g. an Azure SkuNotAvailable capacity error), those are cleaned up:
immediately when possible, or otherwise reclaimed later by mngr gc, which
also runs after every mngr destroy. (Azure reserves the NIC for the would-be VM
for 180s, so immediate deletion can be briefly blocked.) A SkuNotAvailable error
means the chosen VM size has no capacity in the region right now; pick another size with
-b --azure-vm-size=... or another region.
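The rollback behaviour described above (delete the NIC and public IP immediately if possible, otherwise leave them for a later GC pass) follows a common pattern. The sketch below illustrates that pattern against an in-memory stand-in; it is not the plugin's code. create_host, NicReservedError, FakeClient, and the pending_gc list are all hypothetical names, and a real GC would rediscover orphans from Azure rather than from a list.

```python
class NicReservedError(Exception):
    """Stand-in for Azure refusing to delete a NIC still reserved for a VM."""


class FakeClient:
    """In-memory stand-in used only to exercise the rollback path."""

    def __init__(self, vm_fails: bool, nic_reserved: bool) -> None:
        self.vm_fails = vm_fails
        self.nic_reserved = nic_reserved
        self.resources: set[str] = set()

    def create_public_ip(self, name: str) -> str:
        self.resources.add(f"{name}-ip")
        return f"{name}-ip"

    def create_nic(self, name: str, ip_id: str) -> str:
        self.resources.add(f"{name}-nic")
        return f"{name}-nic"

    def create_vm(self, name: str, nic_id: str) -> str:
        if self.vm_fails:
            raise RuntimeError("SkuNotAvailable")
        self.resources.add(f"{name}-vm")
        return f"{name}-vm"

    def delete_nic(self, nic_id: str) -> None:
        if self.nic_reserved:
            raise NicReservedError(nic_id)
        self.resources.discard(nic_id)

    def delete_public_ip(self, ip_id: str) -> None:
        self.resources.discard(ip_id)


def create_host(client: FakeClient, name: str, pending_gc: list[str]) -> str:
    """Create IP, NIC, then VM; on VM failure roll back or defer to GC."""
    ip_id = client.create_public_ip(name)
    nic_id = client.create_nic(name, ip_id)
    try:
        return client.create_vm(name, nic_id)
    except Exception:
        leftovers = [nic_id, ip_id]
        try:
            # The NIC goes first because the public IP is attached to it.
            client.delete_nic(nic_id)
            leftovers.remove(nic_id)
            client.delete_public_ip(ip_id)
            leftovers.remove(ip_id)
        except NicReservedError:
            pass  # Deletion is briefly blocked; a later GC pass retries.
        pending_gc.extend(leftovers)
        raise  # The original create error still reaches the caller.
```

The original error is re-raised in both cases, so the caller sees the SkuNotAvailable failure regardless of whether cleanup completed immediately.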
How it works
- Per-host create: a Standard-SKU static public IP + a NIC bound to the prepared subnet + a VM. The OS disk, NIC, and public IP are all created with delete_option=Delete, so deleting the VM cascades all four; destroy is a single VM delete.
- SSH keys are injected inline at VM create (os_profile.linux_configuration.ssh); Azure has no per-key resource. Cloud-init also forwards the key into root's authorized_keys, so mngr's root SSH works.
- Image: Debian 12 by default (matching the other mngr providers; runs cloud-init with the Azure datasource, so the shared mngr_vps_docker bootstrap works unchanged). Configurable via image_publisher / image_offer / image_sku / image_version.
- No snapshot workflow: the Azure client exposes no managed-disk-snapshot surface (the speculative create_snapshot / list_snapshots / delete_snapshot client methods are not part of VpsClientInterface). Restore from a fresh mngr create instead.
- Spot (--azure-spot): priority=Spot, eviction_policy=Delete, max_price=-1. The VM is evicted only on capacity, and deleted (not stopped) on eviction, matching AWS spot's terminate-on-reclaim.
- VMs are tagged mngr-provider, mngr-host-id, mngr-created-at, managed-by=mngr, and mngr-host-name; discovery filters the resource group's VM list by mngr-provider. Per-agent records are mirrored into VM tags (mngr-agent-<id>-<field>) so a deallocated VM still lists its agents and resolves by name; offline discovery reconstructs deallocated/stopped VMs from those tags (the VM list is fetched with expand=instanceView to read power state).
- Stop/start = deallocate/start: mngr stop deallocates the VM (virtual_machines.begin_deallocate) to halt compute billing; mngr start re-allocates it (begin_start). The static public IP and on-disk SSH host keys persist, so resume needs no IP/known_hosts fixup. Mirrors mngr_aws / mngr_gcp; the shared mngr_vps_docker base is untouched.
- Idle self-deallocate (managed identity): each VM is created with a system-assigned managed identity. The in-container idle watcher touches a sentinel; a host-side systemd path unit runs a script that uses the VM's IMDS token to call the ARM deallocate API on itself (the only in-guest way to halt Azure compute billing; an OS shutdown does not). mngr azure prepare creates a least-privilege custom role (mngr-self-deallocate, just Microsoft.Compute/virtualMachines/deallocate/action + read), and each VM gets a role assignment scoped to itself. Graceful fallback: if the operator lacks Microsoft.Authorization/roleAssignments / roleDefinitions write (Owner / User Access Administrator), the role steps are skipped with a clear warning and idle self-deallocate is disabled; on a refused deallocate the in-VM script just logs and exits (it does not poweroff, since an Azure OS shutdown would only strand the VM unreachable while it keeps billing). mngr stop / start still deallocate normally, and remain the only way to halt billing on such a host.
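The mngr-agent-<id>-<field> tag scheme mentioned in the list above can be illustrated with a round-trip encoder and decoder. This is a sketch under assumptions, not the plugin's implementation: the function names and the example fields are hypothetical, and the decoder assumes field names contain no hyphen (agent IDs may), which is what makes splitting on the last hyphen unambiguous.

```python
AGENT_TAG_PREFIX = "mngr-agent-"


def agent_record_to_tags(agent_id: str, record: dict[str, str]) -> dict[str, str]:
    """Flatten one agent record into VM tags, one tag per field."""
    return {f"{AGENT_TAG_PREFIX}{agent_id}-{field}": value for field, value in record.items()}


def agents_from_tags(tags: dict[str, str]) -> dict[str, dict[str, str]]:
    """Rebuild per-agent records from a VM's tags, ignoring unrelated tags."""
    agents: dict[str, dict[str, str]] = {}
    for key, value in tags.items():
        if not key.startswith(AGENT_TAG_PREFIX):
            continue
        # Split on the LAST hyphen: assumes field names have no hyphen.
        agent_id, sep, field = key[len(AGENT_TAG_PREFIX):].rpartition("-")
        if not sep or not agent_id or not field:
            continue  # Malformed key; skip rather than guess.
        agents.setdefault(agent_id, {})[field] = value
    return agents
```

Because the records live on the VM resource itself, they remain readable while the VM is deallocated. Azure limits how many tags a resource can carry, so a real implementation would need to keep per-agent records compact.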
Auto-shutdown and cost safety
Two independent mechanisms:
- Idle self-deallocate (the primary, cost-parity path): an idle agent deallocates its own VM via its managed identity (see "How it works"), genuinely halting compute billing, even if the orchestrating mngr process is gone. Requires the operator to have granted the role assignment (otherwise it is disabled and only mngr stop halts billing; an in-VM OS shutdown does not).
- auto_shutdown_seconds schedules cloud-init shutdown -P +N as a coarse time cap. Caveat (Azure specific): this OS-level shutdown alone leaves the VM "Stopped (not deallocated)", which still bills for compute. For test isolation the real backstop is the session-end orphan scanner in conftest.py, which force-deletes any VM tagged mngr-pytest-launched older than the TTL.
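To make the idle self-deallocate path more concrete, here is a rough sketch of the two HTTP requests such a host-side script would need: one to the instance metadata service (IMDS) for a managed-identity token, and one to the ARM deallocate endpoint. It is not the script the plugin ships. The function names are hypothetical, the api-version values are assumptions to check against current Azure documentation, and the subscription, resource group, and VM name are taken as parameters rather than discovered.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

IMDS = "http://169.254.169.254/metadata"
ARM = "https://management.azure.com"


def imds_token_request() -> urllib.request.Request:
    """Build the IMDS request for a managed-identity token scoped to ARM."""
    query = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": ARM + "/"})
    # IMDS rejects requests that lack the Metadata header.
    return urllib.request.Request(f"{IMDS}/identity/oauth2/token?{query}", headers={"Metadata": "true"})


def deallocate_request(
    subscription_id: str,
    resource_group: str,
    vm_name: str,
    token: str,
    api_version: str = "2024-03-01",  # Assumed version; verify before use.
) -> urllib.request.Request:
    """Build the ARM POST that deallocates one VM."""
    url = (
        f"{ARM}/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Compute/virtualMachines/{vm_name}/deallocate"
        f"?api-version={api_version}"
    )
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"Authorization": f"Bearer {token}"}
    )


def self_deallocate(subscription_id: str, resource_group: str, vm_name: str) -> bool:
    """Fetch a token and request deallocation; log and give up on refusal."""
    try:
        with urllib.request.urlopen(imds_token_request(), timeout=10) as resp:
            token = json.load(resp)["access_token"]
        request = deallocate_request(subscription_id, resource_group, vm_name, token)
        with urllib.request.urlopen(request, timeout=30):
            return True
    except urllib.error.URLError as exc:
        # Matches the documented fallback: log and exit, never power off.
        print(f"self-deallocate failed: {exc}")
        return False
```

If the role assignment is missing, ARM answers the POST with an authorization error, which surfaces here as an HTTPError (a URLError subclass) and is only logged.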
Future improvements
- Custom-image baking (skip the per-create cloud-init Docker install).
- Azure Resource Graph for cross-region listing.