Agent Sandbox plugin for Harbor — run Terminal-Bench / SWE-bench / Harbor benchmarks on Agent Sandbox pools
agent-sandbox-harbor
A Harbor environment plugin that runs Harbor benchmarks (Terminal-Bench, SWE-bench, custom datasets) on Agent Sandbox pre-warmed pools — no fork of Harbor required.
Highlights:
- Zero Harbor source changes. Plugs into Harbor via the official --environment-import-path extension point.
- Skips Template Build. Agent Sandbox uses a pre-warmed Pod pool with in-place image swap, so the per-task Template Build step that E2B / Novita require is replaced by a single POST /v1/sandboxes call.
- Internal-mirror friendly. A configurable image prefix rewrites docker.io/... to your private Distribution / Harbor registry.
- Bring-your-own image. An optional task-name → image map (AGBX_IMAGE_MAP) lets you run pre-built images for any dataset, including ones whose task.toml has no docker_image (e.g. SWE-bench, where the task is a Dockerfile).
Installation
pip install 'harbor[e2b]' agent-sandbox-harbor
The plugin pulls agent-sandbox-e2b as a hard
dependency (it calls patch_e2b() at import). harbor is an optional peer dependency, so the
package can be inspected / unit-tested without it; in real usage you install harbor[e2b]
yourself.
Quick start
# 1. Set credentials (one-off)
cat > agentbox.env <<'EOF'
E2B_API_KEY=agbx_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
E2B_DOMAIN=agent-sandbox-data-plane.example.com/agent-sandbox/api/data
E2B_API_URL=https://agent-sandbox-data-plane.example.com/agent-sandbox/api/e2b
AGBX_CLUSTER_ID=cluster-a
AGBX_POOL_NAME=terminal-bench-pool
AGBX_IMAGE_PREFIX=registry.internal/agent-sandbox
EOF
# 2. Run Harbor (use the plugin via the official --environment-import-path flag)
harbor run \
-d terminal-bench@2.0 \
-a oracle \
--environment-import-path agent_sandbox_harbor:AgentSandboxEnvironment \
-n 16 -y \
--env-file agentbox.env
Configuration
| Variable | Required | Description |
|---|---|---|
| E2B_API_KEY | yes | Agent Sandbox API key (agbx_...). |
| AGBX_POOL_NAME | yes | Pre-warmed pool name. |
| E2B_DOMAIN | no | Data-plane gateway, host[:port][/path]. Default is the in-cluster service. |
| E2B_API_URL | no | E2B-compatible control-plane URL, including scheme. |
| AGBX_CLUSTER_ID | no | Cluster id prefix (e.g. cluster-a). Omit for single-cluster setups. |
| AGBX_IMAGE_MAP | no | Path to a <task-name> <image> map file (one per line; = also accepted). If a task matches, that image is used verbatim. See Image selection. |
| AGBX_IMAGE_PREFIX | no | Mirror prefix applied to the task's docker_image (e.g. registry.internal/agent-sandbox). docker.io/ is stripped first. Not applied to AGBX_IMAGE_MAP values. |
| AGBX_IMAGE_TAG | no | Override the tag of the task's docker_image after rewriting. Not applied to AGBX_IMAGE_MAP values. |
| AGBX_HTTPS | no | true/false for the data-plane scheme (default true). |
| AGBX_STARTUP_TIMEOUT | no | Sandbox startup timeout, seconds (default 300). |
| AGBX_READY_TIMEOUT | no | Cold-image readiness ceiling, seconds (default 600). Large images (e.g. SWE-bench) may need more. |
e2b SDK ≥ 2.24: newer e2b SDKs reject non-e2b_ API keys client-side. Use agent-sandbox-e2b >= 0.0.4, whose patch_e2b() neutralizes that check so agbx_ keys work (needed when running on harbor >= 0.13, which pulls a newer e2b).
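To make the required/optional split and the documented defaults concrete, here is a minimal sketch of reading these variables from a mapping such as os.environ. The AgbxSettings class and load_settings function are hypothetical names used for illustration; the plugin's actual internals are not described here and may differ. Treating only the literal string "true" as enabling HTTPS is also an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgbxSettings:
    """Hypothetical container for the documented variables (illustration only)."""
    api_key: str
    pool_name: str
    domain: Optional[str]
    cluster_id: Optional[str]
    https: bool
    startup_timeout: int
    ready_timeout: int


def load_settings(env: Mapping[str, str]) -> AgbxSettings:
    """Read the documented variables, applying the defaults from the table."""
    required = ("E2B_API_KEY", "AGBX_POOL_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return AgbxSettings(
        api_key=env["E2B_API_KEY"],
        pool_name=env["AGBX_POOL_NAME"],
        domain=env.get("E2B_DOMAIN") or None,
        cluster_id=env.get("AGBX_CLUSTER_ID") or None,
        # Documented default is true; accepting only "true" is an assumption.
        https=env.get("AGBX_HTTPS", "true").strip().lower() == "true",
        # Documented defaults: 300 s startup, 600 s readiness ceiling.
        startup_timeout=int(env.get("AGBX_STARTUP_TIMEOUT", "300")),
        ready_timeout=int(env.get("AGBX_READY_TIMEOUT", "600")),
    )
```

Calling load_settings(os.environ) before a long benchmark run is one way to fail fast on a missing key or pool name instead of discovering it at the first sandbox creation.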
Image selection
The image for each task is chosen in this order:
- AGBX_IMAGE_MAP entry: if the file maps the task name (Harbor's environment_name, i.e. the task / instance id) to an image, that image is used verbatim. This is how you run datasets whose task.toml has no docker_image (e.g. SWE-bench): pre-build / mirror the images once, list them here.

      # <task-name> <image-ref>
      astropy__astropy-7606 registry.internal/agentbox/swebench/sweb.eval.x86_64.astropy_1776_astropy-7606:260328
      django__django-11265 registry.internal/agentbox/swebench/sweb.eval.x86_64.django_1776_django-11265:260328

- task.toml docker_image: if there's no map entry but the task sets [environment] docker_image (e.g. Terminal-Bench), that image is used, after optional AGBX_IMAGE_PREFIX / AGBX_IMAGE_TAG rewriting.
- Otherwise the task is rejected. This environment only runs pre-built images: it does not build images from a Dockerfile and does not mutate a running sandbox. Datasets that ship a Dockerfile (with extra RUN layers) must be built/mirrored ahead of time and listed in AGBX_IMAGE_MAP.
Example: SWE-bench (Dockerfile-based dataset)
# 1. Pre-build the images the dataset's Dockerfile would produce (base + your overlay),
# push them to your registry, and write a map file:
# astropy__astropy-7606 registry.internal/.../sweb.eval.x86_64.astropy_1776_astropy-7606:<tag>
# ...
# 2. Point the plugin at it and run:
harbor run \
-d swebench-verified@1.0 \
-a oracle \
--environment-import-path agent_sandbox_harbor:AgentSandboxEnvironment \
--env-file swebench.env # contains AGBX_IMAGE_MAP=swebench_image_map.txt
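Writing the map file by hand gets tedious for a full dataset, so a small generator can help. The sketch below assumes the image naming pattern visible in the two map entries shown earlier (sweb.eval.x86_64. followed by the instance id with __ replaced by _1776_); your own build pipeline may name images differently. The function names, registry path and tag are placeholders.

```python
from typing import Iterable


def swebench_image_ref(instance_id: str, registry: str, tag: str) -> str:
    """Build an image reference following the pattern in the example entries."""
    # Assumption: '__' in the instance id becomes '_1776_' in the image name.
    image_name = "sweb.eval.x86_64." + instance_id.replace("__", "_1776_")
    return f"{registry.rstrip('/')}/{image_name}:{tag}"


def render_image_map(instance_ids: Iterable[str], registry: str, tag: str) -> str:
    """Render '<task-name> <image-ref>' lines, one per instance."""
    lines = ["# <task-name> <image-ref>"]
    for instance_id in instance_ids:
        lines.append(f"{instance_id} {swebench_image_ref(instance_id, registry, tag)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    ids = ["astropy__astropy-7606", "django__django-11265"]
    text = render_image_map(ids, "registry.internal/agentbox/swebench", "260328")
    with open("swebench_image_map.txt", "w", encoding="utf-8") as handle:
        handle.write(text)
```

The output file name matches the AGBX_IMAGE_MAP value used in the example env file; only list images that have actually been pushed, since the plugin does not build anything itself.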
How it works
AgentSandboxEnvironment subclasses Harbor's E2BEnvironment and overrides three methods:
- _does_template_exist → always returns True
- _create_template → no-op
- _create_sandbox → calls AsyncSandbox.create(template="cluster::pool//image", secure=False, ...)
__init__ calls super().__init__() first, so Harbor's stock Dockerfile parsing still runs
(and sets self._workdir from the image's WORKDIR). The constructor then resolves the image
(see Image selection) and overrides self._template_name with the Agent
Sandbox pool shorthand cluster::pool//image.
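The override pattern can be illustrated without Harbor installed. In the sketch below, StubE2BEnvironment is a stand-in written only for this example; it is not Harbor's E2BEnvironment, and its constructor signature, start() method and async methods are assumptions. The sketch also covers only the cluster::pool//image form; the shorthand used when AGBX_CLUSTER_ID is omitted is not described here.

```python
import asyncio


class StubE2BEnvironment:
    """Stand-in base class (NOT Harbor's real class) so the sketch runs alone."""

    def __init__(self, image: str) -> None:
        self._template_name = "template-built-from-dockerfile"
        self._workdir = "/app"  # the real base derives this from WORKDIR
        self._image = image

    async def _does_template_exist(self) -> bool:
        return False

    async def _create_template(self) -> None:
        raise RuntimeError("a per-task template build would run here")

    async def _create_sandbox(self) -> dict:
        raise NotImplementedError

    async def start(self) -> dict:
        # Hypothetical lifecycle: build the template if needed, then create.
        if not await self._does_template_exist():
            await self._create_template()
        return await self._create_sandbox()


def pool_template(cluster_id: str, pool_name: str, image: str) -> str:
    """Format the pool shorthand cluster::pool//image."""
    return f"{cluster_id}::{pool_name}//{image}"


class SketchAgentSandboxEnvironment(StubE2BEnvironment):
    def __init__(self, image: str, cluster_id: str, pool_name: str) -> None:
        super().__init__(image)  # base setup first, so _workdir is kept
        self._template_name = pool_template(cluster_id, pool_name, image)

    async def _does_template_exist(self) -> bool:
        return True  # the pool is pre-warmed, nothing to build

    async def _create_template(self) -> None:
        return None  # no-op

    async def _create_sandbox(self) -> dict:
        # The real plugin calls AsyncSandbox.create(template=..., secure=False, ...);
        # this sketch returns the arguments instead of contacting a service.
        return {"template": self._template_name, "secure": False}


if __name__ == "__main__":
    env = SketchAgentSandboxEnvironment(
        "registry.internal/agent-sandbox/example:1", "cluster-a", "terminal-bench-pool"
    )
    print(asyncio.run(env.start()))
```

Because the existence check always succeeds, the base class's build path is never reached, which is how the Template Build step drops out without touching Harbor's source.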
At module import, patch_e2b() from
agent-sandbox-e2b redirects the e2b SDK to
your Agent Sandbox endpoints.
See INTEGRATION.md for full design notes, the --environment-import-path
mechanism explanation, and operational guidance.
Compatibility
Each release build is tested against the latest published versions of
harbor and e2b. The pinned upper bound in [project.optional-dependencies] is updated
automatically by the release CI to reflect the highest verified harbor version.
License
Apache 2.0.