
Project description

LLM-in-Sandbox

A lightweight framework that connects LLMs to a virtual computer (Docker-based sandbox) to build general-purpose agents.

Features:

  • 🌍 General-purpose: works beyond coding, including scientific reasoning, long-context understanding, video production, travel planning, and more
  • 🐳 Isolated execution environment via Docker containers
  • 🔌 Compatible with OpenAI, Anthropic, and self-hosted servers (vLLM, SGLang, etc.)
  • 📁 Flexible I/O: mount any input files, export any output files
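"Compatible with OpenAI, Anthropic, and self-hosted servers" means any backend speaking the standard `/v1/chat/completions` protocol can drive the agent. A minimal stdlib sketch of that request shape; the URL, model name, and key below are placeholders, and nothing is actually sent:

```python
import json
import urllib.request

# Placeholder endpoint and model name; adjust to your provider or local server.
base_url = "http://localhost:8000/v1"
payload = {
    "model": "qwen3_coder",
    "messages": [{"role": "user", "content": "write a hello world in python"}],
}

# Build (but do not send) the request shape OpenAI-compatible backends accept.
req = urllib.request.Request(
    f"{base_url}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",  # omit for unauthenticated local servers
    },
)
print(req.full_url)  # http://localhost:8000/v1/chat/completions
```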

Installation

Requirements: Python 3.10+, Docker

git clone https://github.com/llm-in-sandbox/llm-in-sandbox.git
cd llm-in-sandbox
pip install -e .

Docker Image

The default Docker image (cdx123/llm-in-sandbox:v0.1) will be automatically pulled when you first run the agent. The first run may take a minute to download the image (~400MB), but subsequent runs will start instantly.

Advanced: Build your own image

Modify the Dockerfile and build your own image:

llm-in-sandbox build
# Then use: --docker_image llm-in-sandbox:v0.1

Quick Start

LLM-in-Sandbox works with various LLM providers including OpenAI, Anthropic, and self-hosted servers (vLLM, SGLang, etc.).

Option 1: Cloud / API Services

llm-in-sandbox run \
    --query "write a hello world in python" \
    --llm_name "openai/gpt-5" \
    --llm_base_url "http://your-api-server/v1" \
    --api_key "your-api-key"

Option 2: Self-Hosted Models

Using a local vLLM server for Qwen3-Coder-30B-A3B-Instruct

1. Start vLLM server:

vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct \
    --served-model-name qwen3_coder \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder \
    --tensor-parallel-size 8

2. Run agent (in a new terminal once server is ready):

llm-in-sandbox run \
    --query "write a hello world in python" \
    --llm_name qwen3_coder \
    --llm_base_url "http://localhost:8000/v1"  \
    --temperature 0.7

Using a local SGLang server for DeepSeek-V3.2-Thinking

1. Start SGLang server:

python3 -m sglang.launch_server \
    --model-path "deepseek-ai/DeepSeek-V3.2" \
    --served-model-name "DeepSeek-V3.2" \
    --trust-remote-code \
    --tp-size 8 \
    --tool-call-parser deepseekv32 \
    --reasoning-parser deepseek-v3 \
    --host 0.0.0.0 \
    --port 5678

2. Run agent (in a new terminal once server is ready):

llm-in-sandbox run \
    --query "write a hello world in python" \
    --llm_name DeepSeek-V3.2 \
    --llm_base_url "http://0.0.0.0:5678/v1" \
    --extra_body '{"chat_template_kwargs": {"thinking": true}}'
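Note that `--extra_body` is parsed as JSON, so booleans must be lowercase (`true`, not Python's `True`). A quick check, plus a way to generate the string safely from a Python dict:

```python
import json

# Valid JSON uses lowercase booleans; json.loads rejects Python-style True.
extra_body = json.loads('{"chat_template_kwargs": {"thinking": true}}')
assert extra_body["chat_template_kwargs"]["thinking"] is True

# Serializing a Python dict avoids the pitfall entirely:
print(json.dumps({"chat_template_kwargs": {"thinking": True}}))
# {"chat_template_kwargs": {"thinking": true}}
```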

Parameters (Common)

| Parameter | Description | Default |
|-----------|-------------|---------|
| `--query` | Task for the agent | required |
| `--llm_name` | Model name | required |
| `--llm_base_url` | API endpoint URL | from `LLM_BASE_URL` env var |
| `--api_key` | API key (not needed for local servers) | from `OPENAI_API_KEY` env var |
| `--input_dir` | Input files folder to mount (optional) | None |
| `--output_dir` | Output folder for results | `./output` |
| `--docker_image` | Docker image to use | `cdx123/llm-in-sandbox:v0.1` |
| `--prompt_config` | Path to prompt template | `./config/general.yaml` |
| `--temperature` | Sampling temperature | 1.0 |
| `--max_steps` | Max conversation turns | 100 |
| `--extra_body` | Extra JSON body for LLM API calls | None |
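The env-var fallbacks for `--llm_base_url` and `--api_key` amount to a flag-first lookup. The `resolve` helper below is hypothetical, written only to illustrate that precedence, not the package's actual code:

```python
import os

def resolve(cli_value, env_var):
    # Hypothetical helper: an explicit CLI flag wins, else fall back to the env var.
    return cli_value if cli_value is not None else os.environ.get(env_var)

os.environ["LLM_BASE_URL"] = "http://localhost:8000/v1"

print(resolve(None, "LLM_BASE_URL"))                     # env var used
print(resolve("http://api.example/v1", "LLM_BASE_URL"))  # flag wins
```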

Run llm-in-sandbox run --help for all available parameters.

Output

Each run creates a timestamped folder:

output/2026-01-16_14-30-00/
├── files/
│   ├── answer.txt      # Final answer
│   └── hello_world.py  # Output file
└── trajectory.json     # Execution history
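Folder names follow the timestamp pattern shown above. The sketch below fakes a finished run in a temp directory to show how you might inspect one afterwards; the trajectory schema here is invented for illustration:

```python
import json
import pathlib
import tempfile
from datetime import datetime

# Recreate the layout above in a temp dir: <YYYY-MM-DD_HH-MM-SS>/files/...
stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
run_dir = pathlib.Path(tempfile.mkdtemp()) / stamp
(run_dir / "files").mkdir(parents=True)
(run_dir / "files" / "answer.txt").write_text("Hello, world!\n")
(run_dir / "trajectory.json").write_text(
    json.dumps([{"role": "user", "content": "write a hello world in python"}])
)

# Inspect the run: final answer plus the recorded conversation.
print((run_dir / "files" / "answer.txt").read_text().strip())
steps = json.loads((run_dir / "trajectory.json").read_text())
print(f"{len(steps)} step(s) recorded")
```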

More Examples

We provide examples across diverse non-coding domains: travel planning, video production, music composition, poster design, and more.

👉 See examples/README.md for the full list.

Contact Us

Daixuan Cheng: daixuancheng6@gmail.com
Shaohan Huang: shaohanh@microsoft.com

Acknowledgment

Our design draws on R2E-Gym, and we reused some of its code. Thanks for the great work!

Download files

Download the file for your platform.

Source Distribution

llm_in_sandbox-0.1.0.tar.gz (26.4 kB)


Built Distribution


llm_in_sandbox-0.1.0-py3-none-any.whl (31.2 kB)


File details

Details for the file llm_in_sandbox-0.1.0.tar.gz.

File metadata

  • Download URL: llm_in_sandbox-0.1.0.tar.gz
  • Size: 26.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for llm_in_sandbox-0.1.0.tar.gz
Algorithm Hash digest
SHA256 9d093b7b85e17a2cdf6f17a41d5bd25f302ef5d9608f92414231b7cf53e43b65
MD5 1535cf086895611841750b4b978d8436
BLAKE2b-256 18a41d510e782faa93dcc531632e6aa25944639e4df2789f5345b3ac157dadcd

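To check a downloaded archive against the SHA256 digest above, a streaming hash with the stdlib suffices. The temp file below is a stand-in; point the function at the real `.tar.gz` to verify it:

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=8192):
    # Stream the file so large archives need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()

# Stand-in file; use llm_in_sandbox-0.1.0.tar.gz for the real check.
fd, path = tempfile.mkstemp()
os.write(fd, b"demo")
os.close(fd)
print(sha256_of(path))
```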

File details

Details for the file llm_in_sandbox-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llm_in_sandbox-0.1.0-py3-none-any.whl
  • Size: 31.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for llm_in_sandbox-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1919486e15149ee30b7ed992b529ad6e31bff5d86128495ebcef6448014cdca8
MD5 4c57a280f43415d4801f1984384ef731
BLAKE2b-256 8e3a4b849a8864fad1067e175e2b10bad39fae5fa441f88f6359dd8499d204a9

