A local-first session-based sandbox runtime for AI agents.
session-based-sandbox
Local-first sandbox runtime: one HTTP session maps to one temp workdir, run bash or Python steps with timeouts, explicit sandbox_id routing, and DELETE to tear down.
Contributing: see CONTRIBUTING.md. License: MIT. Releases: see CHANGELOG.md. For the detailed MVP design aimed at implementers, see "Implementation specification (MVP design)" below.
Quickstart
Requires Python 3.11+.
git clone <repository-url>
cd session-based-sandbox
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -U pip setuptools wheel
pip install -e ".[dev]"
sbs run
# same server: session-based-sandbox run
Then open http://127.0.0.1:8000/docs for interactive API documentation.
When the package is published to PyPI:
pip install session-based-sandbox
sbs run
HTTP API (current behavior)
Default base URL: http://127.0.0.1:8000.
POST /sessions
Creates a session and isolated working directory. Body: empty JSON object {} is fine.
Response: {"session_id": "<uuid>"}
POST /sessions/{session_id}/step
JSON body (required fields):
| Field | Description |
|---|---|
| sandbox_id | Must equal session_id from the URL (explicit routing). |
| type | "bash" or "python". |
| payload | Bash: {"cmd": "<shell string>"}. Python: {"code": "<source passed to python -c>"}. |
Response: {"output": "...", "error": "...", "exit_code": <int>}
If the step exceeds the configured wall-clock limit, exit_code is 124 and error describes the timeout.
DELETE /sessions/{session_id}
204 with no body. Removes the session and its workdir. Later steps for that id return 404.
curl smoke (single-line friendly)
After sbs run, in another terminal:
BASE=http://127.0.0.1:8000 && SID=$(curl -sS -X POST "$BASE/sessions" | python3 -c "import sys,json; print(json.load(sys.stdin)['session_id'])") && curl -sS -X POST "$BASE/sessions/$SID/step" -H 'Content-Type: application/json' -d "{\"sandbox_id\":\"$SID\",\"type\":\"bash\",\"payload\":{\"cmd\":\"pwd\"}}" && echo && curl -sS -o /dev/null -w "DELETE %{http_code}\n" -X DELETE "$BASE/sessions/$SID"
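The same smoke sequence can be driven from Python using only the standard library. This is a minimal sketch, not part of the package: the helper names (`create_session_request`, `step_request`, `delete_request`, `send`, `smoke`) are hypothetical, while the endpoints, body fields, and default base URL are the ones documented in this section.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8000"  # default base URL from this README


def create_session_request(base: str = BASE) -> urllib.request.Request:
    """Build POST /sessions with an empty JSON object as the body."""
    return urllib.request.Request(
        f"{base}/sessions",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def step_request(session_id: str, step_type: str, payload: dict,
                 base: str = BASE) -> urllib.request.Request:
    """Build POST /sessions/{id}/step; sandbox_id repeats the session id."""
    body = {"sandbox_id": session_id, "type": step_type, "payload": payload}
    return urllib.request.Request(
        f"{base}/sessions/{session_id}/step",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_request(session_id: str, base: str = BASE) -> urllib.request.Request:
    """Build DELETE /sessions/{id}."""
    return urllib.request.Request(f"{base}/sessions/{session_id}", method="DELETE")


def send(request: urllib.request.Request):
    """Send a request; decode a JSON body, or return None when it is empty (204)."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw) if raw else None


def smoke(base: str = BASE) -> dict:
    """Create a session, run `pwd` in it, then tear it down."""
    session_id = send(create_session_request(base))["session_id"]
    result = send(step_request(session_id, "bash", {"cmd": "pwd"}, base))
    send(delete_request(session_id, base))
    return result
```

With the server running, `smoke()` should return the step response for `pwd`. A non-2xx status surfaces as `urllib.error.HTTPError`, so a step sent after DELETE would raise with code 404.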
CLI
Both console scripts call the same uvicorn app:
| Entry point | Notes |
|---|---|
| sbs run | Short alias. |
| session-based-sandbox run | Same behavior as sbs. |
Options: --host (default 127.0.0.1), --port (default 8000).
sbs run --help
session-based-sandbox run --host 127.0.0.1 --port 8001
Configuration
| Variable | Meaning |
|---|---|
| SBS_STEP_TIMEOUT_SEC | Max seconds per step (default 30, minimum 1). Read when the server process starts; restart after changing. |
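One way such a setting could be read at startup is sketched below. The function name `read_step_timeout` is hypothetical, and two behaviors are assumptions rather than documented facts: values below the minimum are clamped to 1 (instead of rejected), and non-integer values fall back to the default.

```python
import os
from collections.abc import Mapping

DEFAULT_STEP_TIMEOUT_SEC = 30  # documented default
MIN_STEP_TIMEOUT_SEC = 1       # documented minimum


def read_step_timeout(env: Mapping[str, str] = os.environ) -> int:
    """Return the per-step timeout in whole seconds."""
    raw = env.get("SBS_STEP_TIMEOUT_SEC")
    if raw is None or not raw.strip():
        return DEFAULT_STEP_TIMEOUT_SEC
    try:
        value = int(raw)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return DEFAULT_STEP_TIMEOUT_SEC
    # Assumption: too-small values are clamped rather than rejected.
    return max(MIN_STEP_TIMEOUT_SEC, value)


# Read once at import time, which matches "read when the server process starts".
STEP_TIMEOUT_SEC = read_step_timeout()
```

Because the value is captured once, changing the variable in a running shell has no effect until the server is restarted.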
Safety
- Isolation is temp directories + subprocesses, not containers or VMs. Host resources (CPU, disk, network) can still be affected by malicious or heavy workloads.
- Steps execute as the same OS user as the server, using host Python and /bin/bash (where available).
- Timeouts and DELETE attempt to terminate the child process; behavior under concurrent load is best-effort for this MVP.
- There is no authentication. Prefer binding to 127.0.0.1 and do not expose the API to untrusted networks without a separate auth layer.
Development setup
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
Repository scripts used in release checks:
bash scripts/verify_editable_install.sh
bash scripts/verify_cli_entrypoints.sh
bash scripts/verify_package_build.sh
bash scripts/verify_release_ready.sh
Testing
pytest tests/
pytest tests/ -v
pytest tests/ --cov=session_based_sandbox --cov-report=term-missing
pytest tests/system/test_cli_server_entrypoints.py -v
On GitHub, CI runs the same test suite on Python 3.11 and 3.12 for pushes/PRs to main or master (see .github/workflows/ci.yml).
Implementation specification (MVP design)
Project Goal
Build a production-quality MVP for a Session-Based Sandbox Runtime using Python + FastAPI.
This is a local-first execution runtime for AI agents that provides:
- session-based execution
- stateful sandbox environments
- isolated runtime environments
- explicit execution routing
- deterministic behavior
This is NOT a full platform.
This is a minimal, robust, extensible open-source tool.
The goal is to let users or AI agents safely execute coding tasks, shell commands, and workflows inside isolated local sandboxes without polluting the host machine.
Product Context
User Scenario
A user wants an AI agent to help with:
- coding
- running scripts
- installing packages
- debugging
- data analysis
- executing terminal commands
But they do NOT want the agent directly operating inside the host machine environment.
They want a safer, simpler, more controllable local runtime.
User Pain Points
Pain Point 1
The user has no technical background.
They do not know what a sandbox is, but they want their AI agent to code for them.
Pain Point 2
The user has a technical background, but existing tools are too complex to configure.
They do not want to spend hours configuring Docker / infra / orchestration.
Pain Point 3
The user has a technical background, but they do not want agents running dangerous commands directly on their machine.
They want strong isolation and cleanup.
Pain Point 4
The user notices the agent repeatedly makes the same execution mistakes.
For example:
- re-running failed scripts
- repeating broken environment setup
- retrying commands that already failed
They want session-based state and future persistent memory support.
What SBS Solves
sbs (session-based-sandbox) solves this by providing:
1. Easy Installation
Users can install via common package managers:
pip install session-based-sandbox
and later potentially:
npm install ...
brew install ...
(Phase 1 only requires Python packaging.)
2. Agent-Friendly Usage
Tools like:
- ClaudeHub
- Hermes
- OpenHands
- other coding agents
can read the SBS skill documentation and learn how to use it.
This makes SBS usable by both:
- humans
- AI agents
3. Safe Local Runtime
Users and agents can:
- create isolated sessions
- execute python/bash
- preserve state inside the session
- destroy the environment after completion
without polluting the host machine.
Core Design Principle
1 Session → 1 Sandbox
This is the most important architecture rule.
Correct Model
1 Session → 1 Sandbox
Meaning:
A single session owns exactly one sandbox.
Definitions
Session
Logical task lifecycle manager.
Responsible for:
- lifecycle management
- state management
- step routing
- execution history
- cleanup trigger
Think of it as:
task controller
Sandbox
Actual execution environment.
Responsible for:
- command execution
- file isolation
- subprocess management
- cwd management
- runtime isolation
Think of it as:
the actual worker machine
Important Rule
Session ≠ Sandbox
But:
Session owns exactly one Sandbox
which means:
1 Session → 1 Sandbox
Why This Rule Exists
Because shared environments cause chaos.
Bad example:
Session A:
pip install pandas==1.5
Session B:
pip install pandas==2.2
Result:
everything breaks
No determinism.
No safety.
No traceability.
No cleanup.
Benefits
Strong Isolation
Sessions do not affect each other.
Deterministic Behavior
Each task runs inside its own isolated environment.
Easy Cleanup
DELETE /sessions/{id}
removes the entire environment.
Stateful Execution
Same session can continue previous work.
Example:
yesterday installed packages
today still available
This is not stateless command execution.
This is stateful runtime execution.
MVP Scope
Build ONLY the minimum production-quality MVP.
Do NOT build a platform.
Do NOT over-engineer.
Do NOT add future features.
Tech Stack
Use:
- Python 3.11+
- FastAPI
- Uvicorn
- Pydantic
- Pytest
Optional:
- asyncio
- subprocess
- tempfile
- pathlib
- uuid
- signal
- shutil
- logging
Do NOT use:
- Docker
- Celery
- Redis
- PostgreSQL
- SQLAlchemy
- Kubernetes
- RabbitMQ
- external infra
Everything must run fully on localhost.
Required Features
Implement ONLY the following.
1. Session Lifecycle
Create Session
Endpoint
POST /sessions
Behavior
Must:
- create a new session
- generate unique session_id
- create exactly one local sandbox
- create isolated working directory using tempfile.mkdtemp()
- set session status = ACTIVE
Return
{
"session_id": "uuid"
}
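The creation steps listed under Behavior could be sketched as follows. This is an illustration under stated assumptions, not the package's code: `Session`, `SessionStatus`, `create_session`, the in-memory `registry` dict, and the `sbs-` directory prefix are all hypothetical names. Only `tempfile.mkdtemp()`, the UUID-style id, and the ACTIVE status come from the specification.

```python
import tempfile
import uuid
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class SessionStatus(str, Enum):
    ACTIVE = "ACTIVE"
    CLOSED = "CLOSED"


@dataclass
class Session:
    session_id: str
    workdir: Path  # the one sandbox directory owned by this session
    status: SessionStatus = SessionStatus.ACTIVE


def create_session(registry: dict[str, Session]) -> Session:
    """Create one session with exactly one fresh working directory."""
    session_id = str(uuid.uuid4())
    # mkdtemp creates a new directory readable and writable only by its creator.
    workdir = Path(tempfile.mkdtemp(prefix="sbs-"))
    session = Session(session_id=session_id, workdir=workdir)
    registry[session_id] = session
    return session
```

The HTTP handler would then return `{"session_id": session.session_id}`. Keeping the registry as a plain dict matches the spec's "no persistent storage" constraint.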
2. Step Execution
Endpoint
POST /sessions/{session_id}/step
Required Step Schema
Every request MUST include:
{
"sandbox_id": "session_id",
"type": "python | bash",
"payload": {}
}
Validation Rules
Must enforce:
- sandbox_id is mandatory
- sandbox_id MUST equal session_id
- otherwise return validation error
No implicit routing allowed.
Explicit execution target only.
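A minimal sketch of these validation rules, written with the standard library so it runs on its own. The real project lists Pydantic in its stack and may express this as a model instead; `Step`, `StepValidationError`, and `validate_step` are hypothetical names, and the per-type payload check (`code` for python, `cmd` for bash) is inferred from the payload shapes shown in this document.

```python
from dataclasses import dataclass
from typing import Any

# Step type -> payload key that must hold the source or command string.
REQUIRED_PAYLOAD_KEY = {"python": "code", "bash": "cmd"}


class StepValidationError(ValueError):
    """Raised when a step body breaks the explicit-routing rules."""


@dataclass(frozen=True)
class Step:
    sandbox_id: str
    type: str
    payload: dict[str, Any]


def validate_step(session_id: str, body: dict[str, Any]) -> Step:
    """Check a step body against the session id taken from the URL."""
    sandbox_id = body.get("sandbox_id")
    if not isinstance(sandbox_id, str) or not sandbox_id:
        raise StepValidationError("sandbox_id is mandatory")
    if sandbox_id != session_id:
        # No implicit routing: the body must name the same target as the URL.
        raise StepValidationError("sandbox_id must equal session_id")

    step_type = body.get("type")
    if not isinstance(step_type, str) or step_type not in REQUIRED_PAYLOAD_KEY:
        raise StepValidationError('type must be "python" or "bash"')

    payload = body.get("payload")
    key = REQUIRED_PAYLOAD_KEY[step_type]
    if not isinstance(payload, dict) or not isinstance(payload.get(key), str):
        raise StepValidationError(f"payload.{key} must be a string")
    return Step(sandbox_id=sandbox_id, type=step_type, payload=payload)
```

An HTTP layer would typically translate `StepValidationError` into a 4xx validation response; the exact status code is not stated in this document.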
Supported Step Types
Python
{
"sandbox_id": "session_id",
"type": "python",
"payload": {
"code": "print(123)"
}
}
Bash
{
"sandbox_id": "session_id",
"type": "bash",
"payload": {
"cmd": "ls -la"
}
}
Execution Requirements
Execution must:
- run inside that session’s isolated cwd
- capture stdout
- capture stderr
- capture exit_code
- enforce timeout
If timeout occurs:
- kill process
- return timeout error clearly
Return
{
"output": "...",
"error": "...",
"exit_code": 0
}
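The execution requirements map closely onto `subprocess.run`. The sketch below is one possible shape, not the project's executor: `run_step` and `_as_text` are hypothetical names, using `sys.executable` for "host Python" is an assumption, and the wording of the timeout message is invented for illustration. The exit code 124 on timeout follows the HTTP API section.

```python
import subprocess
import sys
from pathlib import Path

TIMEOUT_EXIT_CODE = 124  # documented exit_code for a timed-out step


def _as_text(value) -> str:
    """Normalize partial output, which may be None, bytes, or str."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return value


def run_step(workdir: Path, step_type: str, payload: dict,
             timeout_sec: float = 30) -> dict:
    """Run one step inside the session's working directory."""
    if step_type == "python":
        argv = [sys.executable, "-c", payload["code"]]
    elif step_type == "bash":
        argv = ["/bin/bash", "-c", payload["cmd"]]
    else:
        raise ValueError(f"unsupported step type: {step_type!r}")

    try:
        done = subprocess.run(
            argv,
            cwd=workdir,          # isolated cwd for this session
            capture_output=True,  # capture stdout and stderr
            text=True,
            timeout=timeout_sec,  # run() kills the child when this expires
        )
    except subprocess.TimeoutExpired as exc:
        return {
            "output": _as_text(exc.stdout),
            "error": f"step timed out after {timeout_sec} seconds",
            "exit_code": TIMEOUT_EXIT_CODE,
        }
    return {"output": done.stdout, "error": done.stderr, "exit_code": done.returncode}
```

Note that `subprocess.run` only kills the direct child on timeout; processes that the child spawned itself may need separate handling.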
3. Close Session
Endpoint
DELETE /sessions/{session_id}
Behavior
Must:
- mark session CLOSED
- terminate alive subprocesses
- delete temp working directory
- block future execution for this session
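The four close steps could be sketched like this. All names (`Session`, `SessionClosedError`, `close_session`, `ensure_active`, the `children` list, and the `grace_sec` parameter) are hypothetical; the ordering, marking the session CLOSED before killing anything, is a design assumption so that no new step can start while cleanup is in progress.

```python
import shutil
import subprocess
from dataclasses import dataclass, field
from pathlib import Path


class SessionClosedError(RuntimeError):
    """Raised when a step targets a session that no longer accepts work."""


@dataclass
class Session:
    session_id: str
    workdir: Path
    status: str = "ACTIVE"
    children: list[subprocess.Popen] = field(default_factory=list)


def close_session(session: Session, grace_sec: float = 2.0) -> None:
    """Mark CLOSED, stop live subprocesses, then delete the workdir."""
    session.status = "CLOSED"  # block new steps first
    for child in session.children:
        if child.poll() is None:      # still running
            child.terminate()         # polite request
            try:
                child.wait(timeout=grace_sec)
            except subprocess.TimeoutExpired:
                child.kill()          # force kill after the grace period
                child.wait()
    session.children.clear()
    shutil.rmtree(session.workdir, ignore_errors=True)


def ensure_active(session: Session) -> None:
    """Guard to call before executing any step."""
    if session.status != "ACTIVE":
        raise SessionClosedError(f"session {session.session_id} is closed")
```

In the HTTP API described earlier, a step sent to a deleted session returns 404, so a handler would map this guard failure (or a missing registry entry) to that status.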
Failure Modes (Must Handle)
Must explicitly handle:
Sandbox Crash
Return structured execution error.
Infinite Loop
Use timeout + force kill.
Closed Session
Execution must be blocked.
Sandbox Isolation
No cross-session shared state.
Resource Cleanup
Must destroy resources after close.
No:
- orphan subprocesses
- leaked temp directories
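Killing only the direct child can still leave orphans when a step spawns its own subprocesses. One common technique, shown here as a POSIX-only sketch, is to start each step in its own process group and kill the whole group on timeout. Whether the project does this is not stated; `run_in_own_group` is a hypothetical helper, and `os.killpg` is not available on Windows.

```python
import os
import signal
import subprocess

TIMEOUT_EXIT_CODE = 124


def run_in_own_group(argv: list[str], cwd, timeout_sec: float):
    """Run argv in a new process group; kill the whole group on timeout."""
    proc = subprocess.Popen(
        argv,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        start_new_session=True,  # child becomes leader of a new session and group
    )
    try:
        out, err = proc.communicate(timeout=timeout_sec)
        return out, err, proc.returncode
    except subprocess.TimeoutExpired:
        try:
            # The group id equals the child's pid because it leads the group.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group already exited between the timeout and the kill
        out, err = proc.communicate()  # reap the child and drain the pipes
        message = f"step timed out after {timeout_sec} seconds"
        return out, (err + "\n" + message).strip(), TIMEOUT_EXIT_CODE
```

This sketch only cleans up on timeout. A step that exits normally but leaves a background process behind would need an extra sweep at session close.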
Project Structure
Use exactly this structure:
session-based-sandbox/
│
├── session_based_sandbox/
│ ├── cli.py
│ ├── server.py
│ │
│ ├── runtime/
│ │ ├── runtime.py
│ │ ├── session.py
│ │ ├── executor.py
│ │ ├── router.py
│ │ └── state.py
│ │
│ ├── sandbox/
│ │ └── local.py
│ │
│ └── api/
│ ├── http.py
│ └── ws.py
│
├── tests/
│ ├── unit/
│ ├── integration/
│ ├── system/
│ └── failure_modes/
│
└── pyproject.toml
Logging
Use simple structured logs for:
- session_created
- step_received
- step_started
- step_finished
- execution_failed
- session_closed
Requirements:
- keep logging simple
- standard logging only
- no tracing system
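With standard logging only, a structured log can be as simple as one JSON object per line. The sketch below assumes a logger named `session_based_sandbox` and a hypothetical `log_event` helper; the six event names are the ones listed in this section, and rejecting unknown names is an added assumption.

```python
import json
import logging

logger = logging.getLogger("session_based_sandbox")  # assumed logger name

# Event names taken from the list in this section.
EVENTS = frozenset({
    "session_created",
    "step_received",
    "step_started",
    "step_finished",
    "execution_failed",
    "session_closed",
})


def log_event(event: str, **fields: object) -> None:
    """Emit one JSON line describing a lifecycle event."""
    if event not in EVENTS:
        raise ValueError(f"unknown log event: {event!r}")
    record = {"event": event, **fields}
    # default=str keeps non-JSON values (paths, UUIDs) from raising.
    logger.info(json.dumps(record, sort_keys=True, default=str))
```

Calling `logging.basicConfig(level=logging.INFO)` once at startup is enough to see lines such as `log_event("step_finished", session_id=sid, exit_code=0)` on stderr.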
Testing (Required)
Write real pytest tests.
No placeholder tests.
Tests must actually run.
Required Coverage
Must test:
- session lifecycle
- step routing correctness
- sandbox isolation
- timeout handling
- crash handling
- closed session execution blocked
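As an illustration of what a "real" isolation test might look like, the sketch below checks file-level isolation between two working directories using only the standard library. It does not exercise the project's own API; `run_python` is a hypothetical helper, and the tests use `tempfile` directly so they run under pytest or as plain function calls.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(workdir: Path, code: str) -> subprocess.CompletedProcess:
    """Run a Python snippet with workdir as its current directory."""
    return subprocess.run(
        [sys.executable, "-c", code],
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=30,
    )


def test_workdirs_are_isolated() -> None:
    """A file written via a relative path in A must not appear in B."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        dir_a, dir_b = Path(a), Path(b)
        write = run_python(dir_a, "open('marker.txt', 'w').write('from A')")
        assert write.returncode == 0
        assert (dir_a / "marker.txt").read_text() == "from A"
        assert not (dir_b / "marker.txt").exists()


def test_state_persists_within_one_workdir() -> None:
    """A later step in the same directory sees earlier files."""
    with tempfile.TemporaryDirectory() as a:
        dir_a = Path(a)
        run_python(dir_a, "open('state.txt', 'w').write('42')")
        second = run_python(dir_a, "print(open('state.txt').read())")
        assert second.stdout.strip() == "42"
```

These checks cover relative-path file state only. As the Safety section notes, steps run as the server's OS user, so absolute paths outside the workdir are not blocked by this model.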
Installation Requirements
Must support:
pip install -e .
and
pip install session-based-sandbox
Must work as:
- local editable install
- normal published package install
CLI Requirements
Must expose both commands:
session-based-sandbox run
and
sbs run
Both must start the same FastAPI server.
Default server:
http://127.0.0.1:8000
pyproject.toml Entry Points
Must define:
[project.scripts]
session-based-sandbox = "session_based_sandbox.cli:run"
sbs = "session_based_sandbox.cli:run"
No wrappers.
No extra launch layers.
Simple and explicit only.
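Since both scripts point at `session_based_sandbox.cli:run`, that one function has to parse the `run` subcommand itself. A minimal sketch with `argparse` follows. The use of argparse, the `build_parser` helper, and the app import string `session_based_sandbox.server:app` are assumptions; only the `run` subcommand, the `--host` and `--port` options, and their defaults come from this document.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for `sbs run [--host HOST] [--port PORT]`."""
    parser = argparse.ArgumentParser(prog="sbs")
    commands = parser.add_subparsers(dest="command", required=True)
    run_cmd = commands.add_parser("run", help="start the local HTTP server")
    run_cmd.add_argument("--host", default="127.0.0.1")
    run_cmd.add_argument("--port", type=int, default=8000)
    return parser


def run(argv: list[str] | None = None) -> None:
    """Console-script entry point shared by both command names."""
    args = build_parser().parse_args(argv)
    # Imported lazily so the parser can be tested without server dependencies.
    import uvicorn

    # Assumption: server.py exposes the FastAPI instance as `app`.
    uvicorn.run("session_based_sandbox.server:app", host=args.host, port=args.port)
```

Binding to 127.0.0.1 by default keeps the unauthenticated API off the network unless the user opts in with `--host`.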
Strong Constraints
Do NOT implement:
- Docker sandbox
- WebSocket streaming
- persistent storage
- distributed workers
- tracing UI
- SDK
- auth system
- user system
- database
- queue system
- scheduler
- background workers
These are future features.
They must be excluded.
Code Quality Rules
Code must be:
- clean
- typed
- readable
- maintainable
- minimal
- testable
Avoid:
- giant files
- hidden magic
- unnecessary inheritance
- speculative abstractions
Prefer:
- explicit code
- small modules
- simple control flow
Deliverables
Must produce:
- Full project code
- All required tests
- pyproject.toml
- CLI runnable entrypoint
- Proper package metadata for publishable installation
Must support:
pip install -e .
pip install session-based-sandbox
session-based-sandbox run
sbs run
Server must run at:
http://127.0.0.1:8000
Recommended Development Order
Build in this order:
1. Create project structure
2. pyproject.toml
3. cli.py
4. server.py
5. api/http.py skeleton
6. runtime/state.py
7. runtime/session.py
8. sandbox/local.py
9. runtime/executor.py
10. runtime/router.py
11. runtime/runtime.py
12. tests
13. local install validation
14. CLI validation
15. pytest validation
Final Requirement
This is the most important rule:
Build the MVP exactly.
Do not expand scope.
Do not add platform features.
Do not redesign architecture.
Strictly execute the specification.