Agent Sandbox: sandbox execution isolation for AI agents

Agent Sandbox

Public Preview — execution isolation for AI agents with policy-driven resource limits, tool proxies, network enforcement, and filesystem checkpointing. Ships three interchangeable backends behind the same SandboxProvider ABC.

Part of the Agent Governance Toolkit.

Providers at a glance

Provider	Isolation primitive	Best for	Extra
`DockerSandboxProvider`	Hardened OCI container (runc, auto-upgrades to gVisor / Kata)	Local dev, CI, self-hosted runners	`agt-sandbox[docker]`
`HyperLightSandboxProvider`	KVM / mshv / WHP micro-VM via hyperlight-sandbox	Sub-millisecond cold start, per-call VM isolation	`agt-sandbox[hyperlight]`
`ACASandboxProvider`	Azure Container Apps sandbox (managed)	Production, multi-tenant, no infra to run	`agt-sandbox[azure]` + the early-access SDK wheel

All three implement the same async + sync API (create_session, execute_code, destroy_session, plus *_async variants) and consume the same PolicyDocument for resource caps, network allowlists, and tool allowlists.
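The shared-interface idea can be sketched with a toy abstract base class. This is a minimal illustration, not the real SandboxProvider definition: ToyProvider, ToyHandle, EchoProvider, and run_once are hypothetical names, the toy execute_code returns a plain string rather than the result object used elsewhere in this README, and delegating the async variant to a worker thread is an assumption about one possible design.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from uuid import uuid4


@dataclass
class ToyHandle:
    """Hypothetical stand-in for the handle returned by create_session."""
    agent_id: str
    session_id: str


class ToyProvider(ABC):
    """Illustrative base class; NOT the real SandboxProvider."""

    @abstractmethod
    def create_session(self, agent_id: str) -> ToyHandle: ...

    @abstractmethod
    def execute_code(self, agent_id: str, session_id: str, code: str) -> str: ...

    @abstractmethod
    def destroy_session(self, agent_id: str, session_id: str) -> None: ...

    # Assumed design: the async variant runs the sync method on a worker thread.
    async def create_session_async(self, agent_id: str) -> ToyHandle:
        return await asyncio.to_thread(self.create_session, agent_id)


class EchoProvider(ToyProvider):
    """In-memory backend that only echoes the code it is given."""

    def __init__(self) -> None:
        self.sessions: set[str] = set()

    def create_session(self, agent_id: str) -> ToyHandle:
        handle = ToyHandle(agent_id, uuid4().hex)
        self.sessions.add(handle.session_id)
        return handle

    def execute_code(self, agent_id: str, session_id: str, code: str) -> str:
        if session_id not in self.sessions:
            raise KeyError(f"unknown session {session_id}")
        return f"[{agent_id}] would run: {code}"

    def destroy_session(self, agent_id: str, session_id: str) -> None:
        self.sessions.discard(session_id)


def run_once(provider: ToyProvider, agent_id: str, code: str) -> str:
    """Caller code depends only on the abstract interface."""
    handle = provider.create_session(agent_id)
    try:
        return provider.execute_code(agent_id, handle.session_id, code)
    finally:
        provider.destroy_session(agent_id, handle.session_id)
```

Because run_once only touches the abstract methods, swapping EchoProvider for another ToyProvider subclass would not change the calling code, which is the property the three real providers are described as sharing.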

Installation

# Everything (Docker + Hyperlight + policy engine):
pip install "agt-sandbox[full]"

# Pick what you need:
pip install "agt-sandbox[docker]"
pip install "agt-sandbox[hyperlight]"
pip install "agt-sandbox[azure,policy]"

The Azure data-plane SDK ships as an early-access wheel — pin the URL:

pip install https://github.com/microsoft/azure-container-apps/releases/download/python-sdk-v0.1.0b1-early-access/azure_containerapps_sandbox-0.1.0b1-py3-none-any.whl

Quick start (all three providers)

from agent_sandbox import (
    DockerSandboxProvider,
    HyperLightSandboxProvider,
    ACASandboxProvider,
)

# Pick one:
provider = DockerSandboxProvider()
# provider = HyperLightSandboxProvider(backend="wasm")
# provider = ACASandboxProvider(
#     resource_group="my-rg", sandbox_group="agents",
#     region="eastus2", disk="python-3.13",
#     ensure_group_location="eastus2",
# )

handle = provider.create_session("agent-1")
out = provider.execute_code("agent-1", handle.session_id, "print('hello')")
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)

1. `DockerSandboxProvider` — local hardened containers

Each agent session runs in its own container with capabilities dropped, no privilege escalation, a read-only root filesystem, a non-root user, and no network by default.

import asyncio
from agent_sandbox import (
    DockerSandboxProvider,
    IsolationRuntime,
    SandboxConfig,
)

async def run_agent_task():
    provider = DockerSandboxProvider(
        image="python:3.12-slim",
        runtime=IsolationRuntime.AUTO,   # auto-upgrade to gVisor / Kata
    )
    config = SandboxConfig(
        timeout_seconds=30,
        memory_mb=256,
        cpu_limit=0.5,
        network_enabled=False,
        read_only_fs=True,
    )

    session = await provider.create_session_async("research-agent", config=config)
    try:
        execution = await provider.execute_code_async(
            "research-agent", session.session_id,
            "import json, math; print(json.dumps([math.sqrt(x) for x in range(5)]))",
        )
        print(execution.result.stdout)

        checkpoint = provider.save_state(
            "research-agent", session.session_id, "after-step-1",
        )
        print(f"Checkpoint saved: {checkpoint.image_tag}")
    finally:
        await provider.destroy_session_async("research-agent", session.session_id)

asyncio.run(run_agent_task())

What the Docker sandbox enforces

Control	Default
Linux capabilities	All dropped (`--cap-drop=ALL`)
Privilege escalation	Blocked (`--security-opt=no-new-privileges`)
Root filesystem	Read-only
Container user	`nobody` (UID 65534)
PID limit	256
Network	Disabled unless explicitly allowed
Runtime	`runc` (auto-upgrades to gVisor or Kata when available)
State	`save_state` / `restore_state` via image commit
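For readers who want to see what these defaults look like at the Docker CLI level, the sketch below builds a roughly equivalent docker run argument list. It is an assumption-laden illustration: ToyLimits and docker_run_args are hypothetical helpers, and the provider may well configure containers through a different mechanism (for example the Docker SDK) rather than these exact flags.

```python
from dataclasses import dataclass


@dataclass
class ToyLimits:
    """Hypothetical mirror of the limits shown in the table above."""
    memory_mb: int = 256
    cpu_limit: float = 0.5
    pids_limit: int = 256
    network_enabled: bool = False
    read_only_fs: bool = True


def docker_run_args(image: str, limits: ToyLimits) -> list[str]:
    """Build a `docker run` argv that approximates the table's defaults."""
    args = [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                    # drop all Linux capabilities
        "--security-opt=no-new-privileges",  # block privilege escalation
        "--user", "65534:65534",             # run as nobody
        "--pids-limit", str(limits.pids_limit),
        "--memory", f"{limits.memory_mb}m",
        "--cpus", str(limits.cpu_limit),
    ]
    if limits.read_only_fs:
        args.append("--read-only")           # read-only root filesystem
    if not limits.network_enabled:
        args += ["--network", "none"]        # no network by default
    args.append(image)
    return args


default_args = docker_run_args("python:3.12-slim", ToyLimits())
```

Printing " ".join(default_args) gives a command line that can be compared against the table row by row.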

2. `HyperLightSandboxProvider` — micro-VM isolation

Backed by the upstream hyperlight-sandbox runtime. Each session is a fresh micro-VM on KVM (Linux), mshv (Azure HCL), or WHP (Windows) — typical cold start is well under a millisecond. Tools are registered as host functions and invoked synchronously from the guest, gated by the session's policy.tool_allowlist.

from agent_sandbox import HyperLightSandboxProvider

def fetch_arxiv(query: str) -> str:
    return f"<results for {query}>"

provider = HyperLightSandboxProvider(
    backend="wasm",                 # or "hyperlightjs" / "nanvix"
    module="python_guest",          # only meaningful for backend="wasm"
    tools={"fetch_arxiv": fetch_arxiv},
)

if not provider.is_available():
    raise SystemExit(f"Hyperlight unavailable: {provider.unavailable_reason}")

handle = provider.create_session("agent-1")
out = provider.execute_code(
    "agent-1", handle.session_id,
    "print(fetch_arxiv('cs.CL'))",
)
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)

Notes:

Each session owns one OS thread that is the sole code path touching its Sandbox — required by the upstream runtime.
provider.is_available() probes for a hypervisor; if none is present (e.g. on macOS hosts without WHP / KVM passthrough), it returns False and provider.unavailable_reason describes why.
Only tools listed in a session's policy.tool_allowlist are exposed to that session's guest; the rest stay host-side.
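The first and third notes can be illustrated with a small stand-alone sketch. It is not the provider's implementation: ToySession is a hypothetical class, there is no guest VM involved, and a single-worker ThreadPoolExecutor is simply one convenient way to guarantee that every call for a session runs on the same OS thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToySession:
    """Hypothetical session: one owning thread, allowlisted tools only."""

    def __init__(self, tools, tool_allowlist):
        # max_workers=1 means every submitted call runs on the same thread.
        self._executor = ThreadPoolExecutor(max_workers=1)
        # Tools outside the allowlist are never exposed to this session.
        self.exposed = {
            name: fn for name, fn in tools.items() if name in tool_allowlist
        }

    def call_tool(self, name, *args):
        if name not in self.exposed:
            raise PermissionError(f"tool {name!r} is not in the allowlist")
        return self._executor.submit(self._run, name, args).result()

    def _run(self, name, args):
        # Return the worker thread id alongside the tool result.
        return threading.get_ident(), self.exposed[name](*args)

    def close(self):
        self._executor.shutdown(wait=True)


def fetch_arxiv(query: str) -> str:
    return f"<results for {query}>"


def delete_files(path: str) -> str:
    return f"deleted {path}"


ALL_TOOLS = {"fetch_arxiv": fetch_arxiv, "delete_files": delete_files}

session = ToySession(ALL_TOOLS, tool_allowlist=["fetch_arxiv"])
thread_a, result = session.call_tool("fetch_arxiv", "cs.CL")
thread_b, _ = session.call_tool("fetch_arxiv", "cs.LG")
session.close()
```

Here thread_a and thread_b are the same thread id, and a call to delete_files would raise PermissionError because it was registered on the host but not allowlisted for the session.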

3. `ACASandboxProvider` — Azure Container Apps

Runs each session inside a managed Azure Container Apps sandbox via the early-access azure-containerapps-sandbox Python SDK. The API is the same as for the other providers, so the rest of your code is unchanged.

pip install "agt-sandbox[azure,policy]"
pip install https://github.com/microsoft/azure-container-apps/releases/download/python-sdk-v0.1.0b1-early-access/azure_containerapps_sandbox-0.1.0b1-py3-none-any.whl

az login   # or use managed identity in hosted compute

from agent_sandbox import ACASandboxProvider

provider = ACASandboxProvider(
    resource_group="my-rg",          # must already exist
    sandbox_group="agents",          # auto-created if ensure_group_location is set
    region="eastus2",                # selects the data-plane endpoint
    subscription_id=None,            # falls back to AZURE_SUBSCRIPTION_ID env var
    disk="python-3.13",              # public disk image with python3 preinstalled
    ensure_group_location="eastus2", # create the sandbox group on first use
)

if not provider.is_available():
    raise SystemExit(f"ACA unavailable: {provider.unavailable_reason}")

handle = provider.create_session("agent-1")
out = provider.execute_code(
    "agent-1", handle.session_id, "print('hello azure')"
)
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
provider.close()

The provider holds one SandboxGroupClient per (resource_group, sandbox_group) pair and caches the per-sandbox SandboxClient returned by begin_create_sandbox().result(). When a PolicyDocument is supplied, network_allowlist is translated into a fail-closed egress policy (defaultAction: Deny + per-host Allow rules) and applied via SandboxClient.set_egress_policy. Set defaults.network_default: allow in the policy if you explicitly want the SDK's default-allow behaviour.
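The allowlist-to-egress translation described above can be sketched as a pure function. This is a guess at the shape, not the provider's code: build_egress_policy is a hypothetical helper, and apart from defaultAction with Deny and Allow values, the field names (rules, action, host) are assumptions rather than the SDK's actual schema.

```python
def build_egress_policy(network_allowlist, network_default="deny"):
    """Translate an allowlist into a fail-closed egress policy dict.

    Only `defaultAction` and the Deny/Allow values come from the README
    text; the `rules`, `action`, and `host` keys are illustrative.
    """
    if network_default == "allow":
        # Explicit opt-in to default-allow behaviour: no per-host rules needed.
        return {"defaultAction": "Allow", "rules": []}
    return {
        "defaultAction": "Deny",
        "rules": [{"action": "Allow", "host": host} for host in network_allowlist],
    }


egress = build_egress_policy(["api.openai.com", "*.github.com"])
```

An empty allowlist with the default setting yields a policy that denies all egress, which is the fail-closed behaviour the paragraph describes.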

A complete worked example (8 verified branches against live Azure — allow / policy-deny / egress-block / sanity / tool-allowed / tool-denied / remote-execution proof / egress audit) lives at examples/quickstart/aca_sandbox_test.py and reads its policy from examples/quickstart/policies/aca_research_agent.yaml.

Policy-driven configuration

All three providers consume the same agent_os.policies.PolicyDocument. Sandbox resource caps, network allowlists, and tool allowlists are native fields on the schema as of AGT 3.3, so policies live in YAML:

name: research-agent
version: "2"

defaults:
  action: allow
  max_cpu: 1.0
  max_memory_mb: 2048
  timeout_seconds: 90
  network_default: deny

network_allowlist:
  - api.openai.com
  - "*.github.com"

tool_allowlist:
  - fetch_arxiv

rules:
  - name: deny-shell-out
    condition: { field: code, operator: contains, value: subprocess }
    action: deny
    priority: 100
    message: "shell-out blocked by research-agent policy"

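from agent_os.policies import PolicyDocument

policy = PolicyDocument.from_yaml("policies/aca_research_agent.yaml")
# Run this inside a coroutine (`await` is not valid at module top level);
# `provider` is any of the three providers constructed in the sections above.
handle = await provider.create_session_async("agent-1", policy=policy)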
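To make the YAML above more concrete, here is a minimal sketch of how such a policy could be evaluated. It does not use agent_os and is not its evaluation engine: evaluate and host_allowed are hypothetical helpers, and the sketch assumes that a higher priority wins, that only the contains operator is needed, and that allowlist entries use shell-style wildcards.

```python
from fnmatch import fnmatchcase

# Plain-dict mirror of the relevant parts of the YAML policy above.
POLICY = {
    "defaults": {"action": "allow", "network_default": "deny"},
    "network_allowlist": ["api.openai.com", "*.github.com"],
    "rules": [
        {
            "name": "deny-shell-out",
            "condition": {"field": "code", "operator": "contains", "value": "subprocess"},
            "action": "deny",
            "priority": 100,
            "message": "shell-out blocked by research-agent policy",
        }
    ],
}


def evaluate(policy, context):
    """Return (action, message) from the highest-priority matching rule."""
    ordered = sorted(policy["rules"], key=lambda r: r["priority"], reverse=True)
    for rule in ordered:
        cond = rule["condition"]
        value = context.get(cond["field"], "")
        if cond["operator"] == "contains" and cond["value"] in value:
            return rule["action"], rule.get("message", "")
    # No rule matched: fall back to the policy default.
    return policy["defaults"]["action"], ""


def host_allowed(policy, host):
    """Check a hostname against the allowlist, then the network default."""
    if any(fnmatchcase(host, pattern) for pattern in policy["network_allowlist"]):
        return True
    return policy["defaults"]["network_default"] == "allow"


decision = evaluate(POLICY, {"code": "import subprocess; subprocess.run(['ls'])"})
```

Under these assumptions, code mentioning subprocess is denied with the rule's message, other code falls through to the default allow action, and api.github.com matches the wildcard entry while an unlisted host does not.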

Hardened sandbox image (minimal-PATH)

docker/Dockerfile.sandbox is an opt-in hardened variant of the default python:3.11-slim base. It pins PATH to a single explicit directory (/usr/local/sandbox-bin) containing only the binaries sandboxed code is allowed to invoke, and strips the execute bit off well-known network and infra CLIs (curl, wget, ssh, git, az, aws, gcloud, kubectl, terraform, helm, ansible, apt, dpkg, …) as a second-layer guarantee in case a caller goes through an absolute path.

This closes the gap that issue #2662 identifies: without a pinned PATH, a tool can invoke os.system('az account list') inside the sandbox and the attempt is not blocked or logged by AGT even though the network-egress policy would later refuse the call. The hardened image makes the attempt itself fail with "command not found".
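The effect of a pinned PATH can be reproduced outside Docker with a short sketch. It is only an analogy for what the hardened image does: make_sandbox_bin and resolve are hypothetical helpers, the stub executables are placeholders, and the example assumes a POSIX host where shutil.which checks the execute bit.

```python
import os
import shutil
import stat
import tempfile


def make_sandbox_bin(allowed_names):
    """Create a temp directory holding one executable stub per allowed name."""
    bin_dir = tempfile.mkdtemp(prefix="sandbox-bin-")
    for name in allowed_names:
        path = os.path.join(bin_dir, name)
        with open(path, "w") as fh:
            fh.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, stat.S_IRWXU)  # owner read/write/execute
    return bin_dir


def resolve(command, pinned_path):
    """Look up a command using ONLY the pinned directory as PATH."""
    return shutil.which(command, path=pinned_path)


sandbox_bin = make_sandbox_bin(["python3", "cat", "echo", "ls", "sleep"])
found_cat = resolve("cat", sandbox_bin)
found_az = resolve("az", sandbox_bin)
```

With the search path restricted to the single directory, found_cat points into sandbox_bin while found_az is None, which corresponds to the "command not found" outcome described above.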

# Build with the default allow-list (python3, cat, echo, ls, sleep).
docker build \
  -f agent-sandbox/docker/Dockerfile.sandbox \
  -t agt-sandbox/python-minimal-path:3.11 \
  agent-sandbox/docker

# Build with a custom allow-list — add only what the sandboxed workload
# actually needs. The full allow-list IS the new PATH; any binary not listed
# here is unreachable.
docker build \
  --build-arg ALLOWED_BIN_NAMES="python3 cat echo ls sleep grep sort uniq" \
  -f agent-sandbox/docker/Dockerfile.sandbox \
  -t agt-sandbox/python-minimal-path:3.11 \
  agent-sandbox/docker

Wire the image into DockerSandboxProvider via the existing image argument:

provider = DockerSandboxProvider(image="agt-sandbox/python-minimal-path:3.11")

For security-sensitive deployments, require the hardened image so the provider fails instead of silently falling back to python:3.11-slim when the local image is unavailable:

provider = DockerSandboxProvider(require_hardened_image=True)

Build the image before creating the provider. require_hardened_image=True cannot be combined with a custom image=.

To extend the allow-list permanently (rather than at docker build time), edit the ARG ALLOWED_BIN_NAMES= line in Dockerfile.sandbox and rebuild. The tests/test_docker_sandbox.py::TestMinimalPathSandboxImage smoke tests assert that the default allow-list cannot accidentally regress to include network or infra CLIs.

License

MIT

Release history

4.1.0 (this version): Jun 11, 2026
4.0.1: Jun 5, 2026
4.0.0: Jun 1, 2026

