Agent Sandbox: sandbox execution isolation for AI agents
Agent Sandbox
Public Preview — execution isolation for AI agents with policy-driven
resource limits, tool proxies, network enforcement, and filesystem
checkpointing. Ships three interchangeable backends behind the same
SandboxProvider ABC.
Part of the Agent Governance Toolkit.
Providers at a glance
| Provider | Isolation primitive | Best for | Extra |
|---|---|---|---|
| DockerSandboxProvider | Hardened OCI container (runc, auto-upgrades to gVisor / Kata) | Local dev, CI, self-hosted runners | agt-sandbox[docker] |
| HyperLightSandboxProvider | KVM / mshv / WHP micro-VM via hyperlight-sandbox | Sub-millisecond cold start, per-call VM isolation | agt-sandbox[hyperlight] |
| ACASandboxProvider | Azure Container Apps sandbox (managed) | Production, multi-tenant, no infra to run | agt-sandbox[azure] + the early-access SDK wheel |
All three implement the same async + sync API (create_session,
execute_code, destroy_session, plus *_async variants) and consume
the same PolicyDocument for resource caps, network allowlists, and
tool allowlists.
Installation
# Everything (Docker + Hyperlight + policy engine):
pip install "agt-sandbox[full]"
# Pick what you need:
pip install "agt-sandbox[docker]"
pip install "agt-sandbox[hyperlight]"
pip install "agt-sandbox[azure,policy]"
The Azure data-plane SDK ships as an early-access wheel — pin the URL:
pip install https://github.com/microsoft/azure-container-apps/releases/download/python-sdk-v0.1.0b1-early-access/azure_containerapps_sandbox-0.1.0b1-py3-none-any.whl
Quick start (all three providers)
from agent_sandbox import (
    DockerSandboxProvider,
    HyperLightSandboxProvider,
    ACASandboxProvider,
)

# Pick one:
provider = DockerSandboxProvider()
# provider = HyperLightSandboxProvider(backend="wasm")
# provider = ACASandboxProvider(
#     resource_group="my-rg", sandbox_group="agents",
#     region="eastus2", disk="python-3.13",
#     ensure_group_location="eastus2",
# )
handle = provider.create_session("agent-1")
out = provider.execute_code("agent-1", handle.session_id, "print('hello')")
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
1. DockerSandboxProvider — local hardened containers
Each agent session runs in its own container with capabilities dropped, no privilege escalation, a read-only root filesystem, a non-root user, and no network by default.
import asyncio

from agent_sandbox import (
    DockerSandboxProvider,
    IsolationRuntime,
    SandboxConfig,
)


async def run_agent_task():
    provider = DockerSandboxProvider(
        image="python:3.12-slim",
        runtime=IsolationRuntime.AUTO,  # auto-upgrade to gVisor / Kata
    )
    config = SandboxConfig(
        timeout_seconds=30,
        memory_mb=256,
        cpu_limit=0.5,
        network_enabled=False,
        read_only_fs=True,
    )
    session = await provider.create_session_async("research-agent", config=config)
    try:
        execution = await provider.execute_code_async(
            "research-agent", session.session_id,
            "import json, math; print(json.dumps([math.sqrt(x) for x in range(5)]))",
        )
        print(execution.result.stdout)

        # Snapshot the session so a later step can restore from this point.
        checkpoint = provider.save_state(
            "research-agent", session.session_id, "after-step-1",
        )
        print(f"Checkpoint saved: {checkpoint.image_tag}")
    finally:
        # Always tear the container down, even if execution failed.
        await provider.destroy_session_async("research-agent", session.session_id)


asyncio.run(run_agent_task())
What the Docker sandbox enforces
| Control | Default |
|---|---|
| Linux capabilities | All dropped (--cap-drop=ALL) |
| Privilege escalation | Blocked (--security-opt=no-new-privileges) |
| Root filesystem | Read-only |
| Container user | nobody (UID 65534) |
| PID limit | 256 |
| Network | Disabled unless explicitly allowed |
| Runtime | runc (auto-upgrades to gVisor or Kata when available) |
| State | save_state / restore_state via image commit |
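To make the table concrete, here is a rough sketch of how those defaults could be expressed as standard `docker run` flags. The flags themselves are ordinary Docker CLI options, but the mapping is an assumption for illustration: the provider may configure containers through a different mechanism, and `docker_run_args` is a hypothetical helper rather than part of the package. The memory and CPU values are taken from the SandboxConfig example above.

```python
def docker_run_args(image, memory_mb=256, cpu_limit=0.5, network_enabled=False):
    """Build an illustrative `docker run` argv that mirrors the defaults table."""
    args = [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                    # drop every Linux capability
        "--security-opt=no-new-privileges",  # block privilege escalation
        "--read-only",                       # read-only root filesystem
        "--user", "65534:65534",             # run as nobody
        "--pids-limit", "256",               # cap the number of processes
        "--memory", f"{memory_mb}m",
        "--cpus", str(cpu_limit),
    ]
    if not network_enabled:
        args += ["--network", "none"]        # no network unless explicitly allowed
    args.append(image)
    return args


print(" ".join(docker_run_args("python:3.12-slim")))
```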
2. HyperLightSandboxProvider — micro-VM isolation
Backed by the upstream hyperlight-sandbox
runtime. Each session is a fresh micro-VM on KVM (Linux), mshv (Azure
HCL), or WHP (Windows) — typical cold start is well under a millisecond.
Tools are registered as host functions and invoked synchronously from
the guest, gated by the session's policy.tool_allowlist.
from agent_sandbox import HyperLightSandboxProvider


def fetch_arxiv(query: str) -> str:
    return f"<results for {query}>"


provider = HyperLightSandboxProvider(
    backend="wasm",          # or "hyperlightjs" / "nanvix"
    module="python_guest",   # only meaningful for backend="wasm"
    tools={"fetch_arxiv": fetch_arxiv},
)
if not provider.is_available():
    raise SystemExit(f"Hyperlight unavailable: {provider.unavailable_reason}")

handle = provider.create_session("agent-1")
out = provider.execute_code(
    "agent-1", handle.session_id,
    "print(fetch_arxiv('cs.CL'))",
)
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
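The allowlist gating described above can be pictured as a simple filter over the registered tools. This is a sketch of the idea only, not the provider's actual code: `exposed_tools` and `delete_repo` are hypothetical names introduced for the example.

```python
def exposed_tools(registered, allowlist):
    """Return only the registered tools whose names appear in the allowlist."""
    allowed = set(allowlist)
    return {name: fn for name, fn in registered.items() if name in allowed}


def fetch_arxiv(query: str) -> str:
    return f"<results for {query}>"


def delete_repo(name: str) -> str:
    # Hypothetical tool that is registered on the host but not allowlisted.
    return f"deleted {name}"


registered = {"fetch_arxiv": fetch_arxiv, "delete_repo": delete_repo}
guest_tools = exposed_tools(registered, allowlist=["fetch_arxiv"])
print(sorted(guest_tools))
```

Under this model a tool missing from the allowlist is never handed to the guest at all, so guest code cannot call it by name.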
Notes:
- Each session owns one OS thread that is the sole code path touching its Sandbox, as required by the upstream runtime.
- provider.is_available() probes for a hypervisor and, if none is present, reports why through provider.unavailable_reason (e.g. on macOS hosts without WHP / KVM passthrough).
- Only tools listed in a session's policy.tool_allowlist are exposed to that session's guest; the rest stay host-side.
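The one-thread-per-session constraint in the first note can be illustrated with a single-worker executor. This is a sketch under the assumption that funnelling calls through one dedicated thread is an acceptable model of the constraint; `SessionWorker` is a hypothetical class, and the real provider may manage its threads differently.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SessionWorker:
    """Hypothetical wrapper: funnels every call for one session through one thread."""

    def __init__(self):
        # max_workers=1 means the executor never runs work on more than one thread.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def call(self, fn, *args):
        # Submit from any caller thread and block until the worker thread finishes.
        return self._executor.submit(fn, *args).result()

    def close(self):
        self._executor.shutdown(wait=True)


worker = SessionWorker()
# Ask the worker which thread it runs on, several times in a row.
thread_ids = {worker.call(threading.get_ident) for _ in range(5)}
worker.close()
print(len(thread_ids))
```

Every call lands on the same worker thread, which is distinct from the caller's thread, so state that must not migrate between threads stays confined to one.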
3. ACASandboxProvider — Azure Container Apps
Runs each session inside a managed Azure Container Apps sandbox via the
early-access azure-containerapps-sandbox Python SDK
(complete reference).
Same API as the other providers; the rest of your code is unchanged.
pip install "agt-sandbox[azure,policy]"
pip install https://github.com/microsoft/azure-container-apps/releases/download/python-sdk-v0.1.0b1-early-access/azure_containerapps_sandbox-0.1.0b1-py3-none-any.whl
az login # or use managed identity in hosted compute
from agent_sandbox import ACASandboxProvider

provider = ACASandboxProvider(
    resource_group="my-rg",            # must already exist
    sandbox_group="agents",            # auto-created if ensure_group_location is set
    region="eastus2",                  # selects the data-plane endpoint
    subscription_id=None,              # falls back to AZURE_SUBSCRIPTION_ID env var
    disk="python-3.13",                # public disk image with python3 preinstalled
    ensure_group_location="eastus2",   # create the sandbox group on first use
)
if not provider.is_available():
    raise SystemExit(f"ACA unavailable: {provider.unavailable_reason}")

handle = provider.create_session("agent-1")
out = provider.execute_code(
    "agent-1", handle.session_id, "print('hello azure')"
)
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
provider.close()
The provider holds one SandboxGroupClient per (resource_group, sandbox_group) pair and caches the per-sandbox SandboxClient returned
by begin_create_sandbox().result(). When a PolicyDocument is
supplied, network_allowlist is translated into a fail-closed egress
policy (defaultAction: Deny + per-host Allow rules) and applied via
SandboxClient.set_egress_policy. Set defaults.network_default: allow
in the policy if you explicitly want the SDK's default-allow behaviour.
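The allowlist-to-egress translation might look roughly like the sketch below. Only `defaultAction`, `Deny`, and `Allow` come from the description above; the `rules`, `host`, and `action` keys, the `build_egress_policy` and `host_allowed` helpers, and the use of shell-style wildcard matching are assumptions made for illustration and may not match the payload the SDK actually accepts.

```python
from fnmatch import fnmatch


def build_egress_policy(network_allowlist, network_default="deny"):
    """Translate an allowlist into a fail-closed egress policy (illustrative shape)."""
    if network_default == "allow":
        # Explicit opt-in to default-allow; no per-host rules are needed.
        return {"defaultAction": "Allow", "rules": []}
    return {
        "defaultAction": "Deny",
        "rules": [{"host": host, "action": "Allow"} for host in network_allowlist],
    }


def host_allowed(policy, host):
    """A matching Allow rule permits the host; otherwise the default action decides."""
    for rule in policy["rules"]:
        if rule["action"] == "Allow" and fnmatch(host, rule["host"]):
            return True
    return policy["defaultAction"] == "Allow"


policy = build_egress_policy(["api.openai.com", "*.github.com"])
print(policy["defaultAction"])
print(host_allowed(policy, "api.github.com"))
print(host_allowed(policy, "example.com"))
```

Fail-closed here means that an empty allowlist yields a policy that denies every host, rather than one that silently allows everything.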
A complete worked example (8 verified branches against live Azure —
allow / policy-deny / egress-block / sanity / tool-allowed /
tool-denied / remote-execution proof / egress audit) lives at
examples/quickstart/aca_sandbox_test.py
and reads its policy from
examples/quickstart/policies/aca_research_agent.yaml.
Policy-driven configuration
All three providers consume the same agent_os.policies.PolicyDocument.
Sandbox resource caps, network allowlists, and tool allowlists are
native fields on the schema as of AGT 3.3, so policies live in YAML:
name: research-agent
version: "2"
defaults:
  action: allow
  max_cpu: 1.0
  max_memory_mb: 2048
  timeout_seconds: 90
  network_default: deny
network_allowlist:
  - api.openai.com
  - "*.github.com"
tool_allowlist:
  - fetch_arxiv
rules:
  - name: deny-shell-out
    condition: { field: code, operator: contains, value: subprocess }
    action: deny
    priority: 100
    message: "shell-out blocked by research-agent policy"
from agent_os.policies import PolicyDocument

policy = PolicyDocument.from_yaml("policies/aca_research_agent.yaml")

# Call this from inside a coroutine; `provider` is any of the three providers above.
handle = await provider.create_session_async("agent-1", policy=policy)
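To show how a rule such as deny-shell-out could be applied to submitted code, here is a small standalone evaluator. It is a sketch, not the agent_os policy engine: `evaluate` is a hypothetical function, only the `contains` operator is handled, and the ordering (highest priority first, falling back to defaults.action) is an assumption.

```python
DEFAULTS = {"action": "allow"}
RULES = [
    {
        "name": "deny-shell-out",
        "condition": {"field": "code", "operator": "contains", "value": "subprocess"},
        "action": "deny",
        "priority": 100,
        "message": "shell-out blocked by research-agent policy",
    },
]


def evaluate(request, rules=RULES, defaults=DEFAULTS):
    """Return (action, message) for a request dict such as {"code": "..."}."""
    # Assumption: rules are checked from highest priority to lowest.
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        cond = rule["condition"]
        value = request.get(cond["field"], "")
        if cond["operator"] == "contains" and cond["value"] in value:
            return rule["action"], rule["message"]
    # No rule matched, so the default action applies.
    return defaults["action"], None


print(evaluate({"code": "import subprocess; subprocess.run(['ls'])"}))
print(evaluate({"code": "print(2 + 2)"}))
```

A substring check like this is easy to evade (for example by building the module name at runtime), which is one reason to pair policy rules with the isolation and egress controls described elsewhere in this document.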
Hardened sandbox image (minimal-PATH)
docker/Dockerfile.sandbox is an opt-in hardened variant of the default
python:3.11-slim base. It pins PATH to a single explicit directory
(/usr/local/sandbox-bin) containing only the binaries sandboxed code is
allowed to invoke, and strips the execute bit off well-known network and
infra CLIs (curl, wget, ssh, git, az, aws, gcloud, kubectl,
terraform, helm, ansible, apt, dpkg, …) as a second-layer guarantee
in case a caller goes through an absolute path.
This closes the gap that issue #2662
identifies: without a pinned PATH, a tool can invoke os.system('az account list')
inside the sandbox and the attempt is not blocked or logged by AGT even though
the network-egress policy would later refuse the call. The hardened image makes
the attempt itself fail with "command not found".
# Build with the default allow-list (python3, cat, echo, ls).
docker build \
  -f agent-sandbox/docker/Dockerfile.sandbox \
  -t agt-sandbox/python-minimal-path:3.11 \
  agent-sandbox/docker

# Build with a custom allow-list: add only what the sandboxed workload
# actually needs. The full allow-list IS the new PATH; any binary not listed
# here is unreachable.
docker build \
  --build-arg ALLOWED_BIN_NAMES="python3 cat echo ls grep sort uniq" \
  -f agent-sandbox/docker/Dockerfile.sandbox \
  -t agt-sandbox/python-minimal-path:3.11 \
  agent-sandbox/docker
Wire the image into DockerSandboxProvider via the existing image argument:
provider = DockerSandboxProvider(image="agt-sandbox/python-minimal-path:3.11")
To extend the allow-list permanently (rather than at docker build time),
edit the ARG ALLOWED_BIN_NAMES= line in Dockerfile.sandbox and rebuild.
The tests/test_docker_sandbox.py::TestMinimalPathSandboxImage smoke tests
assert that the default allow-list cannot accidentally regress to include
network or infra CLIs.
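The two layers of the hardened image (a pinned PATH plus stripped execute bits) can be simulated on any POSIX host without Docker. The sketch below uses a temporary directory as a stand-in for /usr/local/sandbox-bin and a hypothetical binary named `allowed-tool`; it only illustrates the lookup behaviour and does not reproduce what Dockerfile.sandbox actually does.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as sandbox_bin:
    # Stand-in for /usr/local/sandbox-bin containing a single allowed binary.
    allowed = os.path.join(sandbox_bin, "allowed-tool")
    with open(allowed, "w") as fh:
        fh.write("#!/bin/sh\necho ok\n")
    os.chmod(allowed, 0o755)

    # Layer 1: lookup restricted to the pinned directory.
    found_allowed = shutil.which("allowed-tool", path=sandbox_bin)
    found_blocked = shutil.which("az", path=sandbox_bin)

    # Layer 2: once the execute bit is stripped, the file no longer resolves
    # as a command even though it still exists on disk.
    os.chmod(allowed, 0o644)
    found_after_strip = shutil.which("allowed-tool", path=sandbox_bin)

print(found_allowed is not None, found_blocked, found_after_strip)
```

Only the allowlisted name resolves under the pinned path, and it stops resolving once its execute bit is removed, which mirrors the "command not found" outcome described above.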
License
MIT