A local-first session-based sandbox runtime for AI agents.

session-based-sandbox

Local-first sandbox runtime: each HTTP session maps to one temporary working directory. Run bash or Python steps with timeouts, route every step with an explicit sandbox_id, and tear the session down with DELETE.

Contributing: see CONTRIBUTING.md. License: MIT. Releases: see CHANGELOG.md. For the detailed MVP design aimed at implementers, see the Implementation specification below.

Quickstart

Requires Python 3.11+.

git clone <repository-url>
cd session-based-sandbox
python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -U pip setuptools wheel
pip install -e ".[dev]"

sbs run
# same server: session-based-sandbox run

Then open http://127.0.0.1:8000/docs for interactive API documentation.

To install the published package from PyPI:

pip install session-based-sandbox
sbs run

HTTP API (current behavior)

Default base URL: http://127.0.0.1:8000.

`POST /sessions`

Creates a session and its isolated working directory. Body: an empty JSON object ({}) is sufficient.

Response: {"session_id": "<uuid>"}

`POST /sessions/{session_id}/step`

JSON body (required fields):

| Field | Description |
| --- | --- |
| `sandbox_id` | Must equal `session_id` from the URL (explicit routing). |
| `type` | `"bash"` or `"python"`. |
| `payload` | Bash: `{"cmd": "<shell string>"}`. Python: `{"code": "<source passed to python -c>"}`. |

Response: {"output": "...", "error": "...", "exit_code": <int>}
If the step exceeds the configured wall-clock limit, exit_code is 124 and error describes the timeout.
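
The request and response shapes above can be sketched in a few lines of Python. The helper names `build_step_body` and `is_timeout` are hypothetical and not part of the package; only the field names and the 124 exit code come from this README.

```python
import json

TIMEOUT_EXIT_CODE = 124  # exit_code this README documents for a timed-out step


def build_step_body(session_id: str, step_type: str, payload: dict) -> str:
    """Return the JSON body for POST /sessions/{session_id}/step."""
    if step_type not in ("bash", "python"):
        raise ValueError(f"unsupported step type: {step_type!r}")
    # sandbox_id repeats the session_id from the URL (explicit routing).
    return json.dumps({"sandbox_id": session_id, "type": step_type, "payload": payload})


def is_timeout(response: dict) -> bool:
    """True when a step response carries the documented timeout exit code.

    A command can also exit with 124 on its own, so callers that need
    certainty should inspect the `error` text as well.
    """
    return response.get("exit_code") == TIMEOUT_EXIT_CODE
```

For example, `build_step_body(sid, "bash", {"cmd": "pwd"})` produces the same body as the curl smoke test in this README.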

`DELETE /sessions/{session_id}`

Returns 204 with no body and removes the session and its workdir. Later steps for that id return 404.

curl smoke (single-line friendly)

After sbs run, in another terminal:

BASE=http://127.0.0.1:8000 && SID=$(curl -sS -X POST "$BASE/sessions" | python3 -c "import sys,json; print(json.load(sys.stdin)['session_id'])") && curl -sS -X POST "$BASE/sessions/$SID/step" -H 'Content-Type: application/json' -d "{\"sandbox_id\":\"$SID\",\"type\":\"bash\",\"payload\":{\"cmd\":\"pwd\"}}" && echo && curl -sS -o /dev/null -w "DELETE %{http_code}\n" -X DELETE "$BASE/sessions/$SID"
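
The same create, step, delete flow can be sketched in Python with only the standard library. The names `http_transport` and `smoke` are illustrative assumptions, not package APIs; the transport is passed in as a parameter so the flow can be exercised without a running server.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple

# A transport takes (method, url, body) and returns (status_code, body_text).
Transport = Callable[[str, str, Optional[bytes]], Tuple[int, str]]


def http_transport(method: str, url: str, body: Optional[bytes]) -> Tuple[int, str]:
    """Send one request with urllib and return (status, text)."""
    req = urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read().decode("utf-8")


def smoke(base: str, send: Transport = http_transport) -> Tuple[dict, int]:
    """Create a session, run `pwd` in it, then delete it."""
    _, created = send("POST", f"{base}/sessions", b"{}")
    sid = json.loads(created)["session_id"]
    step = {"sandbox_id": sid, "type": "bash", "payload": {"cmd": "pwd"}}
    _, result = send("POST", f"{base}/sessions/{sid}/step", json.dumps(step).encode("utf-8"))
    status, _ = send("DELETE", f"{base}/sessions/{sid}", None)
    return json.loads(result), status
```

With the server running, `smoke("http://127.0.0.1:8000")` should return the step result and the DELETE status code, which this README documents as 204.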

CLI

Both console scripts call the same uvicorn app:

| Entry point | Notes |
| --- | --- |
| `sbs run` | Short alias. |
| `session-based-sandbox run` | Same behavior as `sbs`. |

Options: --host (default 127.0.0.1), --port (default 8000).

sbs run --help
session-based-sandbox run --host 127.0.0.1 --port 8001

Configuration

| Variable | Meaning |
| --- | --- |
| `SBS_STEP_TIMEOUT_SEC` | Max seconds per step (default 30, minimum 1). Read when the server process starts; restart after changing. |
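
One way such a setting could be read at startup is sketched below. The function name `read_step_timeout` is hypothetical, and two behaviors are assumptions that the README does not confirm: values below the minimum are clamped to 1, and unparseable values fall back to the default.

```python
import os
from typing import Mapping

DEFAULT_TIMEOUT_SEC = 30  # documented default
MIN_TIMEOUT_SEC = 1       # documented minimum


def read_step_timeout(environ: Mapping[str, str] = os.environ) -> int:
    """Return the per-step timeout in seconds from SBS_STEP_TIMEOUT_SEC."""
    raw = environ.get("SBS_STEP_TIMEOUT_SEC")
    if raw is None or not raw.strip():
        return DEFAULT_TIMEOUT_SEC
    try:
        value = int(raw)
    except ValueError:
        # Assumption: ignore unparseable input instead of failing startup.
        return DEFAULT_TIMEOUT_SEC
    # Assumption: clamp to the minimum instead of rejecting the value.
    return max(MIN_TIMEOUT_SEC, value)
```

Reading the value once and passing it down keeps the documented "restart after changing" behavior explicit.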

Safety

Isolation is temp directories + subprocesses, not containers or VMs. Host resources (CPU, disk, network) can still be affected by malicious or heavy workloads.
Steps execute as the same OS user as the server, using host Python and /bin/bash (where available).
Timeouts and DELETE attempt to terminate the child process; behavior under concurrent load is best-effort for this MVP.
There is no authentication. Prefer binding to 127.0.0.1 and do not expose the API to untrusted networks without a separate auth layer.

Development setup

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Repository scripts used in release checks:

bash scripts/verify_editable_install.sh
bash scripts/verify_cli_entrypoints.sh
bash scripts/verify_package_build.sh
bash scripts/verify_release_ready.sh

Testing

pytest tests/
pytest tests/ -v
pytest tests/ --cov=session_based_sandbox --cov-report=term-missing
pytest tests/system/test_cli_server_entrypoints.py -v

On GitHub, CI runs the same test suite on Python 3.11 and 3.12 for pushes/PRs to main or master (see .github/workflows/ci.yml).

Implementation specification (MVP design)

Project Goal

Build a production-quality MVP for a Session-Based Sandbox Runtime using Python + FastAPI.

This is a local-first execution runtime for AI agents that provides:

session-based execution
stateful sandbox environments
isolated runtime environments
explicit execution routing
deterministic behavior

This is NOT a full platform.

This is a minimal, robust, extensible open-source tool.

The goal is to let users or AI agents safely execute coding tasks, shell commands, and workflows inside isolated local sandboxes without polluting the host machine.

Product Context

User Scenario

A user wants an AI agent to help with:

coding
running scripts
installing packages
debugging
data analysis
executing terminal commands

But they do NOT want the agent directly operating inside the host machine environment.

They want a safer, simpler, more controllable local runtime.

User Pain Points

Pain Point 1

The user has no technical background.

They do not know what a sandbox is, but they want their AI agent to code for them.

Pain Point 2

The user has a technical background, but existing tools are too complex to configure.

They do not want to spend hours configuring Docker / infra / orchestration.

Pain Point 3

The user has a technical background, but they do not want agents running dangerous commands directly on their machine.

They want strong isolation and cleanup.

Pain Point 4

The user notices the agent repeatedly makes the same execution mistakes.

For example:

re-running failed scripts
repeating broken environment setup
retrying commands that already failed

They want session-based state and future persistent memory support.

What SBS Solves

sbs (session-based-sandbox) solves this by providing:

1. Easy Installation

Users can install via common package managers:

pip install session-based-sandbox

and later potentially:

npm install ...
brew install ...

(Phase 1 only requires Python packaging.)

2. Agent-Friendly Usage

Tools like:

ClaudeHub
Hermes
OpenHands
other coding agents

can read the SBS skill documentation and learn how to use it.

This makes SBS usable by both:

humans
AI agents

3. Safe Local Runtime

Users and agents can:

create isolated sessions
execute python/bash
preserve state inside the session
destroy the environment after completion

without polluting the host machine.

Core Design Principle

1 Session → 1 Sandbox

This is the most important architecture rule.

Correct Model

1 Session → 1 Sandbox

Meaning:

A single session owns exactly one sandbox.

Definitions

Session

Logical task lifecycle manager.

Responsible for:

lifecycle management
state management
step routing
execution history
cleanup trigger

Think of it as:

task controller

Sandbox

Actual execution environment.

Responsible for:

command execution
file isolation
subprocess management
cwd management
runtime isolation

Think of it as:

the actual worker machine

Important Rule

Session ≠ Sandbox

But:

Session owns exactly one Sandbox

which means:

1 Session → 1 Sandbox

Why This Rule Exists

Because shared environments cause chaos.

Bad example:

Session A:
pip install pandas==1.5

Session B:
pip install pandas==2.2

Result:

everything breaks

No determinism.

No safety.

No traceability.

No cleanup.

Benefits

Strong Isolation

Sessions do not affect each other.

Deterministic Behavior

Each task runs inside its own isolated environment.

Easy Cleanup

DELETE /sessions/{id}

removes the entire environment.

Stateful Execution

Same session can continue previous work.

Example:

Packages installed in a session yesterday are still available in that session today.

This is not stateless command execution.

This is stateful runtime execution.
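
A minimal sketch of what "stateful" can mean when every step runs as its own subprocess in the session's working directory: files written by one step are visible to the next, while interpreter variables are not. The helper `run_in` is hypothetical and stands in for the real step executor.

```python
import subprocess
import sys
import tempfile


def run_in(workdir: str, code: str) -> subprocess.CompletedProcess:
    """Run a Python snippet as a separate process with `workdir` as its cwd."""
    return subprocess.run(
        [sys.executable, "-c", code],
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=30,
    )


with tempfile.TemporaryDirectory(prefix="sbs-demo-") as workdir:
    # Step 1 writes a file; step 2 is a new process that reads it back.
    run_in(workdir, "from pathlib import Path; Path('notes.txt').write_text('hello')")
    second = run_in(workdir, "from pathlib import Path; print(Path('notes.txt').read_text())")

    # In-memory state does not carry over: `x` is undefined in the next process.
    run_in(workdir, "x = 41")
    fourth = run_in(workdir, "print(x + 1)")
```

Here `second.stdout` contains the text written by the first step, and `fourth` exits with a non-zero code because of the `NameError`.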

MVP Scope

Build ONLY the minimum production-quality MVP.

Do NOT build a platform.

Do NOT over-engineer.

Do NOT add future features.

Tech Stack

Use:

Python 3.11+
FastAPI
Uvicorn
Pydantic
Pytest

Optional:

asyncio
subprocess
tempfile
pathlib
uuid
signal
shutil
logging

Do NOT use:

Docker
Celery
Redis
PostgreSQL
SQLAlchemy
Kubernetes
RabbitMQ
external infra

Everything must run fully on localhost.

Required Features

Implement ONLY the following.

1. Session Lifecycle

Create Session

Endpoint

POST /sessions

Behavior

Must:

create a new session
generate unique session_id
create exactly one local sandbox
create isolated working directory using tempfile.mkdtemp()
set session status = ACTIVE

Return

{
  "session_id": "uuid"
}
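
The creation behavior listed above could be sketched as follows. `Session`, `SessionStatus`, `create_session`, and the in-memory `registry` dict are illustrative names under the stated requirements, not the package's actual classes.

```python
import shutil
import tempfile
import uuid
from dataclasses import dataclass
from enum import Enum


class SessionStatus(Enum):
    ACTIVE = "ACTIVE"
    CLOSED = "CLOSED"


@dataclass
class Session:
    session_id: str
    workdir: str  # the one sandbox directory owned by this session
    status: SessionStatus = SessionStatus.ACTIVE


def create_session(registry: dict[str, Session]) -> Session:
    """Create a session with a unique id and its own temp working directory."""
    session_id = str(uuid.uuid4())
    workdir = tempfile.mkdtemp(prefix="sbs-")
    session = Session(session_id=session_id, workdir=workdir)
    registry[session_id] = session
    return session


registry: dict[str, Session] = {}
first = create_session(registry)
second = create_session(registry)

# Demo cleanup only; a real server would do this when the session is closed.
for s in (first, second):
    shutil.rmtree(s.workdir, ignore_errors=True)
```

Each call produces a distinct id and a distinct directory, which is the "1 Session → 1 Sandbox" rule in its simplest form.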

2. Step Execution

Endpoint

POST /sessions/{session_id}/step

Required Step Schema

Every request MUST include:

{
  "sandbox_id": "session_id",
  "type": "python | bash",
  "payload": {}
}

Validation Rules

Must enforce:

sandbox_id is mandatory
sandbox_id MUST equal session_id
otherwise return validation error

No implicit routing allowed.

Explicit execution target only.
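
The routing rule can be expressed as a small validator. This sketch deliberately uses only the standard library; the tech stack lists Pydantic, and the real project may well validate with a Pydantic model instead. `Step`, `StepValidationError`, and `validate_step` are hypothetical names.

```python
from dataclasses import dataclass

# Step type -> payload key that must be present (from the step schema).
REQUIRED_PAYLOAD_KEY = {"python": "code", "bash": "cmd"}


class StepValidationError(ValueError):
    """Raised when a step body fails the routing or schema checks."""


@dataclass(frozen=True)
class Step:
    sandbox_id: str
    type: str
    payload: dict


def validate_step(session_id: str, body: dict) -> Step:
    """Check a decoded step body against the session id taken from the URL."""
    sandbox_id = body.get("sandbox_id")
    if not sandbox_id:
        raise StepValidationError("sandbox_id is mandatory")
    if sandbox_id != session_id:
        # No implicit routing: the body must name the same target as the URL.
        raise StepValidationError("sandbox_id must equal session_id")
    step_type = body.get("type")
    if not isinstance(step_type, str) or step_type not in REQUIRED_PAYLOAD_KEY:
        raise StepValidationError("type must be 'python' or 'bash'")
    payload = body.get("payload")
    key = REQUIRED_PAYLOAD_KEY[step_type]
    if not isinstance(payload, dict) or not isinstance(payload.get(key), str):
        raise StepValidationError(f"payload.{key} must be a string")
    return Step(sandbox_id=sandbox_id, type=step_type, payload=payload)
```

In an HTTP handler, a `StepValidationError` would be translated into a validation error response; the exact status code is left to the implementation.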

Supported Step Types

Python

{
  "sandbox_id": "session_id",
  "type": "python",
  "payload": {
    "code": "print(123)"
  }
}

Bash

{
  "sandbox_id": "session_id",
  "type": "bash",
  "payload": {
    "cmd": "ls -la"
  }
}

Execution Requirements

Execution must:

run inside that session’s isolated cwd
capture stdout
capture stderr
capture exit_code
enforce timeout

If timeout occurs:

kill process
return timeout error clearly

Return

{
  "output": "...",
  "error": "...",
  "exit_code": 0
}
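
The execution requirements above map closely onto `subprocess.run`. The sketch below is one possible shape, not the package's executor: `run_step` is a hypothetical name, and the 124 exit code for timeouts follows the HTTP API section of this README.

```python
import subprocess
import sys
import tempfile

TIMEOUT_EXIT_CODE = 124


def _as_text(data) -> str:
    """Normalize captured output, which may be None, bytes, or str."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_step(workdir: str, step_type: str, payload: dict, timeout_sec: float = 30.0) -> dict:
    """Run one step inside `workdir` and return output, error, and exit_code."""
    if step_type == "python":
        argv = [sys.executable, "-c", payload["code"]]
    elif step_type == "bash":
        argv = ["/bin/bash", "-c", payload["cmd"]]
    else:
        raise ValueError(f"unsupported step type: {step_type!r}")
    try:
        done = subprocess.run(
            argv, cwd=workdir, capture_output=True, text=True, timeout=timeout_sec
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the direct child before raising; processes
        # that the child spawned itself are not covered by this sketch.
        return {
            "output": _as_text(exc.stdout),
            "error": f"step timed out after {timeout_sec} seconds",
            "exit_code": TIMEOUT_EXIT_CODE,
        }
    return {"output": done.stdout, "error": done.stderr, "exit_code": done.returncode}


with tempfile.TemporaryDirectory(prefix="sbs-demo-") as workdir:
    result = run_step(workdir, "python", {"code": "print(123)"})
```

Killing a whole process tree on timeout usually needs a process group or similar mechanism, which this sketch leaves out.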

3. Close Session

Endpoint

DELETE /sessions/{session_id}

Behavior

Must:

mark session CLOSED
terminate alive subprocesses
delete temp working directory
block future execution for this session
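
The four close steps could be ordered as in the sketch below: mark the session closed first so no new step starts, then stop the child, then remove the directory. `Session`, `close_session`, `ensure_active`, and `SessionClosedError` are hypothetical names, and tracking a single `process` per session is a simplifying assumption.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Optional


class SessionClosedError(RuntimeError):
    """Raised when a step targets a session that has been closed."""


@dataclass
class Session:
    workdir: str
    status: str = "ACTIVE"
    process: Optional[subprocess.Popen] = None  # step currently running, if any


def ensure_active(session: Session) -> None:
    """Guard to call before executing any step."""
    if session.status != "ACTIVE":
        raise SessionClosedError("session is closed")


def close_session(session: Session, grace_sec: float = 2.0) -> None:
    session.status = "CLOSED"  # block future execution first
    proc = session.process
    if proc is not None and proc.poll() is None:
        proc.terminate()  # polite request
        try:
            proc.wait(timeout=grace_sec)
        except subprocess.TimeoutExpired:
            proc.kill()  # force kill if it ignores the request
            proc.wait()
    session.process = None
    shutil.rmtree(session.workdir, ignore_errors=True)


session = Session(workdir=tempfile.mkdtemp(prefix="sbs-demo-"))
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"], cwd=session.workdir)
session.process = child
close_session(session)
leftover = os.path.exists(session.workdir)
```

After `close_session` returns, the child has exited, the directory is gone, and `ensure_active(session)` raises.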

Failure Modes (Must Handle)

Must explicitly handle:

Sandbox Crash

Return structured execution error.

Infinite Loop

Use timeout + force kill.

Closed Session

Execution must be blocked.

Sandbox Isolation

No cross-session shared state.

Resource Cleanup

Must destroy resources after close.

No:

orphan subprocesses
leaked temp directories

Project Structure

Use exactly this structure:

session-based-sandbox/
│
├── session_based_sandbox/
│   ├── cli.py
│   ├── server.py
│   │
│   ├── runtime/
│   │   ├── runtime.py
│   │   ├── session.py
│   │   ├── executor.py
│   │   ├── router.py
│   │   └── state.py
│   │
│   ├── sandbox/
│   │   └── local.py
│   │
│   └── api/
│       ├── http.py
│       └── ws.py
│
├── tests/
│   ├── unit/
│   ├── integration/
│   ├── system/
│   └── failure_modes/
│
└── pyproject.toml

Logging

Use simple structured logs for:

session_created
step_received
step_started
step_finished
execution_failed
session_closed

Requirements:

keep logging simple
standard logging only
no tracing system
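
With the standard `logging` module, a structured event can be as simple as one JSON object per line. The logger name `"sbs"` and the helpers `format_event` and `log_event` are assumptions for illustration; only the event names come from the list above.

```python
import json
import logging

logger = logging.getLogger("sbs")  # hypothetical logger name


def format_event(event: str, **fields: object) -> str:
    """Render one event as a single JSON line with stable key order."""
    return json.dumps({"event": event, **fields}, sort_keys=True, default=str)


def log_event(event: str, **fields: object) -> None:
    """Emit the event through standard logging at INFO level."""
    logger.info(format_event(event, **fields))
```

For instance, `log_event("step_finished", session_id=sid, exit_code=0)` emits a single line that is easy to grep or parse, with no tracing system involved.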

Testing (Required)

Write real pytest tests.

No placeholder tests.

Tests must actually run.

Required Coverage

Must test:

session lifecycle
step routing correctness
sandbox isolation
timeout handling
crash handling
closed session execution blocked

Installation Requirements

Must support:

pip install -e .

and

pip install session-based-sandbox

Must work as:

local editable install
normal published package install

CLI Requirements

Must expose both commands:

session-based-sandbox run

and

sbs run

Both must start the same FastAPI server.

Default server:

http://127.0.0.1:8000

pyproject.toml Entry Points

Must define:

[project.scripts]
session-based-sandbox = "session_based_sandbox.cli:run"
sbs = "session_based_sandbox.cli:run"

No wrappers.

No extra launch layers.

Simple and explicit only.
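
A `cli.py` that satisfies both entry points might look like the sketch below. The import string `session_based_sandbox.server:app` is an assumption about where the FastAPI app lives, and the real package may parse arguments differently; the `--host` and `--port` defaults follow the CLI section of this README.

```python
import argparse
from typing import Optional, Sequence

APP_IMPORT = "session_based_sandbox.server:app"  # assumed location of the app


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="sbs")
    sub = parser.add_subparsers(dest="command", required=True)
    run_cmd = sub.add_parser("run", help="start the local HTTP server")
    run_cmd.add_argument("--host", default="127.0.0.1")
    run_cmd.add_argument("--port", type=int, default=8000)
    return parser


def run(argv: Optional[Sequence[str]] = None) -> None:
    """Entry point shared by `sbs` and `session-based-sandbox`."""
    args = build_parser().parse_args(argv)
    # Imported here so that argument parsing works even without uvicorn.
    import uvicorn

    uvicorn.run(APP_IMPORT, host=args.host, port=args.port)
```

Because both console scripts point at the same `run` function, there is no wrapper layer between the command and the server.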

Strong Constraints

Do NOT implement:

Docker sandbox
WebSocket streaming
persistent storage
distributed workers
tracing UI
SDK
auth system
user system
database
queue system
scheduler
background workers

These are future features.

They must be excluded.

Code Quality Rules

Code must be:

clean
typed
readable
maintainable
minimal
testable

Avoid:

giant files
hidden magic
unnecessary inheritance
speculative abstractions

Prefer:

explicit code
small modules
simple control flow

Deliverables

Must produce:

Full project code
All required tests
pyproject.toml
CLI runnable entrypoint
Proper package metadata for publishable installation

Must support:

pip install -e .
pip install session-based-sandbox

session-based-sandbox run
sbs run

Server must run at:

http://127.0.0.1:8000

Recommended Development Order

Build in this order:

1. Create project structure
2. pyproject.toml
3. cli.py
4. server.py
5. api/http.py skeleton
6. runtime/state.py
7. runtime/session.py
8. sandbox/local.py
9. runtime/executor.py
10. runtime/router.py
11. runtime/runtime.py
12. tests
13. local install validation
14. CLI validation
15. pytest validation

Final Requirement

This is the most important rule:

Build the MVP exactly.

Do not improve scope.

Do not add platform features.

Do not redesign architecture.

Strictly execute the specification.

Release: 0.1.0 (Apr 25, 2026)

