FastMCP server: run Python in auditable Marimo notebooks
marimo-sandbox
A FastMCP server that runs Python code inside auditable Marimo
notebooks. Every execution is saved as a human-readable .py file you can open,
inspect, and re-run at any time.
Why
When an AI agent (Claude Code, etc.) runs Python on your behalf, you get back stdout and maybe a traceback. You can't see the full code in context, can't re-run it, can't modify it interactively.
marimo-sandbox fixes this by wrapping every execution in a Marimo notebook:
- Auditable: the exact code that ran is saved as a .py file alongside its output
- Viewable: marimo edit <notebook> opens it in the browser with reactive cells
- Re-runnable: the notebook is standalone; python notebook.py works without the server
- Persistent: all runs stored in SQLite with stdout, stderr, status, code hash, and artifacts
- Safe: static risk analysis runs before every execution; critical patterns can require approval
Install
pip install marimo-sandbox
# or with uv:
uv pip install marimo-sandbox
Requires Python 3.11+ and marimo:
pip install marimo
Add to Claude Code
claude mcp add marimo-sandbox -- python -m marimo_sandbox
Or with uv:
claude mcp add marimo-sandbox -- uvx marimo-sandbox
Set a custom data directory (where notebooks and the database are stored):
claude mcp add marimo-sandbox \
-e MARIMO_SANDBOX_DIR=/your/preferred/path \
-- python -m marimo_sandbox
Tools (17)
run_python
Run Python code and get back results + a notebook you can open.
code Python source to execute
description Short label for this run (shown in list_runs)
timeout_seconds Max execution time (default 60)
sandbox Run in Docker with --network=none (default False)
packages PyPI packages to install before running (e.g. ["pandas", "httpx"])
dry_run If True, return static risk analysis only — do not execute (default False)
require_approval If True, block execution when critical risk patterns are found (default False)
Returns: run_id, status, stdout, stderr, error, duration_ms, notebook_path,
view_command, code_hash, artifacts, and optionally risk_findings, packages_installed,
freeze.
Packages are installed via uv pip install when uv is available, falling back to pip. A
full pip freeze snapshot is captured after installation and stored with the run.
Structured outputs
Your code can expose typed data to agents via the __outputs__ dict:
import pandas as pd
df = pd.read_csv("data.csv")
__outputs__["summary"] = df.describe().to_dict()
__outputs__["row_count"] = len(df)
Retrieve these values later with get_run_outputs.
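As a rough illustration of the idea, the sketch below pre-seeds an __outputs__ dict, executes a code string, and keeps only JSON-serializable values. It is not marimo-sandbox's implementation (the server runs code inside a generated notebook cell); run_with_outputs is a hypothetical helper, and dropping non-serializable values is an assumption.

```python
import json


def run_with_outputs(code: str) -> dict:
    """Execute code with a pre-seeded __outputs__ dict and return its JSON-safe entries."""
    namespace = {"__outputs__": {}}
    exec(code, namespace)  # illustration only: no isolation of any kind
    safe = {}
    for key, value in namespace["__outputs__"].items():
        try:
            json.dumps(value)  # keep only values that survive JSON encoding
        except (TypeError, ValueError):
            continue  # assumption: silently drop anything else
        safe[key] = value
    return safe


outputs = run_with_outputs(
    '__outputs__["row_count"] = 3\n__outputs__["handle"] = object()'
)
print(outputs)  # {'row_count': 3}
```

Restricting values to JSON-friendly types (numbers, strings, lists, dicts) keeps them easy to store and return to an agent.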
Static risk analysis
Every call to run_python runs an AST-based risk scan before execution. Findings appear
in risk_findings in the response. Use dry_run=True to get the analysis without running the code.
risk_findings severity tiers:
critical subprocess calls, os.system/popen, eval/exec/compile
high dangerous imports (os, subprocess, socket, requests, …)
medium open() with write/append mode
low os.environ[] access
Use require_approval=True to block execution when critical patterns are found. The response
will include an approval_token — pass it to approve_run to proceed.
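To make the AST-based approach concrete, here is a minimal sketch of such a scan using the standard ast module. It covers only part of the critical and high tiers; the rule sets, the scan_risks name, and the finding fields are assumptions for illustration, not the project's actual rules or response schema.

```python
import ast

# Hypothetical rule tables, loosely following the tiers listed above.
CRITICAL_CALLS = {"eval", "exec", "compile"}
CRITICAL_ATTRS = {("os", "system"), ("os", "popen")}
HIGH_IMPORTS = {"os", "subprocess", "socket", "requests"}


def scan_risks(code: str) -> list[dict]:
    """Return findings as dicts with severity, line, and a short message."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        # Collect top-level module names from both import forms.
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        for name in names:
            if name in HIGH_IMPORTS:
                findings.append(
                    {"severity": "high", "line": node.lineno, "message": f"import {name}"}
                )
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in CRITICAL_CALLS:
                findings.append(
                    {"severity": "critical", "line": node.lineno, "message": f"{func.id}()"}
                )
            elif (
                isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and (func.value.id, func.attr) in CRITICAL_ATTRS
            ):
                findings.append(
                    {
                        "severity": "critical",
                        "line": node.lineno,
                        "message": f"{func.value.id}.{func.attr}()",
                    }
                )
    return findings


findings = scan_risks("import os\nos.system('ls')\n")
needs_approval = any(f["severity"] == "critical" for f in findings)
```

Because the scan only parses the code and never executes it, it is safe to run on untrusted input. It is also easy to evade (for example via getattr or aliased imports), so a scan like this complements sandboxing rather than replacing it.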
approve_run
Confirm a blocked run and execute it. Tokens expire after 1 hour.
approval_token Token returned by run_python when status='awaiting_confirmation'
reason Optional note explaining the approval
list_pending_approvals
List all runs currently awaiting approval, including expiry status and critical finding count.
list_artifacts
List files created by a run's code (everything in the notebook directory except the notebook itself and the result sidecar).
run_id Run to inspect
Returns artifact_count and artifacts — each entry has path, size_bytes, extension.
read_artifact
Read the content of an artifact file. Path traversal is rejected. Large files are refused (default limit: 5 MB).
run_id Run that created the file
artifact_path Relative path from list_artifacts
max_size_bytes Size limit in bytes (default 5 000 000)
Returns content (str) for text files or content_base64 for binary files, plus
media_type, size_bytes, is_text.
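The checks described above might look roughly like the following sketch. It is a standalone function that takes the run directory directly (the real tool takes a run_id), and treating "decodes as UTF-8" as the test for a text file is an assumption.

```python
import base64
import mimetypes
from pathlib import Path


def read_artifact(run_dir: Path, artifact_path: str, max_size_bytes: int = 5_000_000) -> dict:
    """Read a file under run_dir, rejecting traversal and oversized files."""
    root = run_dir.resolve()
    target = (root / artifact_path).resolve()
    if not target.is_relative_to(root):  # catches "../" and absolute paths
        raise ValueError("path traversal rejected")
    size = target.stat().st_size
    if size > max_size_bytes:
        raise ValueError(f"file is {size} bytes; limit is {max_size_bytes}")
    data = target.read_bytes()
    media_type = mimetypes.guess_type(target.name)[0] or "application/octet-stream"
    result = {"media_type": media_type, "size_bytes": size}
    try:
        result["content"] = data.decode("utf-8")
        result["is_text"] = True
    except UnicodeDecodeError:
        # Binary data is returned base64-encoded so it survives JSON transport.
        result["content_base64"] = base64.b64encode(data).decode("ascii")
        result["is_text"] = False
    return result
```

Resolving both paths before comparing them means symlinks and ".." segments are normalized first, so a path that escapes the run directory is rejected before the file is ever opened.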
get_run_outputs
Retrieve the structured __outputs__ dict written by the run. Returns {} if the
run hasn't completed successfully or didn't populate __outputs__.
run_id Run to read outputs from
rerun
Re-execute a previous run's code by run_id, optionally with modifications.
run_id Run to re-execute
code Override the code (default: use original)
description Override the description (default: original + " (rerun)")
timeout_seconds Max execution time (default 60)
sandbox Run in Docker sandbox (default False)
packages PyPI packages to install (default: reuse original run's packages)
open_notebook
Open a previous run in Marimo's interactive editor.
run_id ID returned by run_python
port Local port for the editor (default 2718)
Returns a url to open in your browser. You can then edit cells and re-run them.
cancel_run
Cancel a run that is currently executing (async_mode=True). Sends SIGTERM to the
process and marks the run as cancelled in the database.
run_id The run to cancel (must have status 'running')
Returns success, run_id, pid — or error if the run is not found or not running.
list_environments
List cached virtual environments (hash-based venv cache).
Each environment corresponds to a unique set of packages. Environments are reused
automatically when run_python is called with the same package list.
Returns count and environments — each entry has env_hash, packages,
size_bytes, created_at, last_used_at.
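A hash-based cache key of this kind could be derived as in the sketch below. The normalization rules, the use of SHA-256, and the 16-character truncation are all assumptions; the project's actual env_hash scheme is not documented here.

```python
import hashlib


def env_hash(packages: list[str]) -> str:
    """Derive a stable cache key from a package list (order and case insensitive)."""
    normalized = sorted({p.strip().lower() for p in packages if p.strip()})
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()
    return digest[:16]  # truncated for readable directory names (arbitrary choice)
```

Sorting and de-duplicating before hashing means ["pandas", "httpx"] and ["httpx", "pandas"] map to the same environment, which is what allows reuse across runs.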
clean_environments
Delete cached virtual environments that haven't been used recently.
older_than_days Delete envs whose last_used_at is older than this many days (default 90)
Returns deleted_count, deleted_hashes, freed_bytes.
diff_runs
Compare two runs and explain what changed between them. By default compares run_id
against its parent run; supply compare_to to choose an explicit reference.
run_id The run to inspect (the "after" run)
compare_to ID of the reference run (the "before" run); defaults to run_id's parent
Returns run_a, run_b, relationship (parent_child / siblings / unrelated),
summary flags, code_diff (including diff_text), env_diff, artifact_diff,
output_diff, duration_diff, and a plain-English explanation.
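For the code portion of such a comparison, the standard difflib module is enough for a sketch. The code_diff function and its changed key are hypothetical; only diff_text is a field name taken from the description above.

```python
import difflib


def code_diff(before: str, after: str) -> dict:
    """Compare two code strings and return a changed flag plus unified diff text."""
    diff_lines = list(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile="run_a",
            tofile="run_b",
        )
    )
    return {"changed": bool(diff_lines), "diff_text": "".join(diff_lines)}
```

Identical inputs produce an empty diff, so the changed flag doubles as a cheap equality check.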
list_runs
List recent runs with status, description, and timestamp.
limit Max results (default 20)
status Filter: 'success', 'error', or 'pending'
offset Number of runs to skip for pagination (default 0)
Returns total, count, offset, and runs.
get_run
Full details of a specific run, including the four provenance fields stored per run:
code_hash, env_hash, freeze, and risk_findings.
run_id Run to look up
include_code Include submitted code (default True)
include_notebook_source Include full .py notebook source (default False)
delete_run
Remove a run's database record and its notebook files from disk.
run_id Run to delete
delete_files Also remove the notebook directory (default True)
purge_runs
Bulk-delete runs older than N days to reclaim disk space.
older_than_days Delete runs older than this many days (default 30)
delete_files Also remove notebook directories (default True)
dry_run Preview what would be deleted without deleting (default False)
When dry_run=False, returns deleted_runs, files_deleted, run_ids.
When dry_run=True, returns dry_run=True, would_delete_runs, run_ids.
check_setup
Verify marimo, Docker, and uv are available and show the data directory.
Notebooks
Generated notebooks live at:
~/.marimo-sandbox/notebooks/{run_id}/notebook.py
Open any of them directly:
marimo edit ~/.marimo-sandbox/notebooks/run_a1b2c3d4/notebook.py
Or run headlessly:
python ~/.marimo-sandbox/notebooks/run_a1b2c3d4/notebook.py
A result sidecar is written alongside the notebook on success:
~/.marimo-sandbox/notebooks/{run_id}/{run_id}_result.json
Any other files your code writes to disk are captured as artifacts and
accessible via list_artifacts / read_artifact.
Sandbox mode (Docker)
For untrusted code, run_python(sandbox=True) runs inside Docker with:
- --network=none: no outbound connections
- --memory=512m: memory cap
- --cpus=1: CPU cap
- --read-only: read-only root filesystem
- writable /sandbox mount for the notebook and result file
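Put together, the restrictions above correspond to a docker run invocation along these lines. This is a sketch: docker_command is a hypothetical helper, and the --rm flag and the in-container command are assumptions rather than the server's actual invocation.

```python
from pathlib import Path


def docker_command(run_dir: Path, image: str = "marimo-sandbox:latest") -> list[str]:
    """Build a docker run argument list with the restrictions listed above."""
    return [
        "docker", "run", "--rm",  # --rm is an assumption: discard the container afterwards
        "--network=none",         # no outbound connections
        "--memory=512m",          # memory cap
        "--cpus=1",               # CPU cap
        "--read-only",            # read-only root filesystem
        "-v", f"{run_dir.resolve()}:/sandbox",  # the only writable mount
        image,
        "python", "/sandbox/notebook.py",  # assumed entry point inside the container
    ]
```

Building the command as a list (rather than one shell string) lets it be passed to subprocess.run without shell quoting concerns.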
Build the sandbox image first:
docker build -f Dockerfile.sandbox -t marimo-sandbox:latest .
Add packages your code needs to Dockerfile.sandbox and rebuild.
Configuration
| Env var | Default | Description |
|---|---|---|
| MARIMO_SANDBOX_DIR | ~/.marimo-sandbox | Where notebooks and DB are stored |
| MARIMO_SANDBOX_DOCKER_IMAGE | marimo-sandbox:latest | Docker image for sandbox mode |
Notebook structure
Every generated notebook has four fixed cells:
| Cell | Purpose |
|---|---|
| __setup__ | Imports marimo, returns (mo,) |
| __context__ | Displays run metadata (description, run_id, timestamp) |
| __execution__ | Initialises __outputs__: dict = {}; runs your code; returns (sandbox_executed, __outputs__) |
| __record__ | Depends on sandbox_executed and __outputs__, so it only runs on success; writes the result sidecar with outputs |
The __record__ → __execution__ dependency means: if your code raises an
exception, __record__ never runs (Marimo's DAG won't execute a cell whose
dependencies failed). The executor detects the missing sidecar and reports an
error with the captured stderr.
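The "missing sidecar means failure" check can be sketched with a plain subprocess call. run_and_collect is a hypothetical helper, and the sidecar is assumed to be a JSON file; the real executor's result format and return shape may differ.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_and_collect(script: Path, sidecar: Path, timeout_seconds: int = 60) -> dict:
    """Run a script and treat a missing result sidecar as failure."""
    proc = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    if not sidecar.exists():
        # The final recording step never ran, so the user code must have failed.
        return {"status": "error", "stderr": proc.stderr}
    outputs = json.loads(sidecar.read_text(encoding="utf-8"))
    return {"status": "success", "outputs": outputs}
```

Keying success on the sidecar rather than on the exit code means a run counts as successful only if execution reached the final recording step.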
Development
# Install in editable mode with dev dependencies
uv pip install -e ".[dev]"
# Lint
ruff check src/ tests/
# Type check
mypy src/
# Unit tests (fast, no subprocess)
pytest tests/ -m "not slow" -v
# Integration tests (run real Marimo subprocesses)
pytest tests/ -m slow -v
Known limitations
- Top-level return statements in submitted code are rejected (they exit the cell function before the sentinel is set). Wrap such code in a function.
- sys.exit() in user code is detected and reported as an error.
- Generated notebooks always import marimo. If marimo is not installed in the execution environment, the notebook will fail.