Local-first playground for experimenting with simple AI agents and sandboxed Python execution.
Agent Playground
Agent Playground is a local-first, lightweight, educational framework for understanding and debugging simple AI agents.
It combines four focused directions:
- specific multi-agent collaboration patterns
- readable agent framework code
- structured trace and timeline debugging
- sandboxed Python execution for model-generated code
The set of public concepts is intentionally small: Agent, ModelAdapter, Tool,
Sandbox, Trace, Run, and ReviewTeam.
What This Project Is
Agent Playground is built for two vertical slices:
- Single Agent + Python Sandbox + Trace: ask an agent to produce Python code, execute it in an isolated subprocess sandbox, and inspect the trace.
- Generate -> Review -> Refine: run a fixed three-agent collaboration pattern with a shared trace and a structured reviewer decision.
It is not a RAG system, long-term memory layer, arbitrary agent graph runtime, workflow DSL, plugin marketplace, hosted service, or web dashboard.
Architecture
flowchart TD
U["User / examples / CLI"] --> A["Agent.run(task)"]
A --> MA["ModelAdapter"]
FM["FakeModelAdapter"] -. tests .-> MA
OA["OpenAI-compatible adapter"] -. real model .-> MA
A --> PT["PythonTool"]
PT --> SB["Sandbox"]
SB --> SS["SubprocessSandbox"]
SB --> DS["DockerSandbox"]
PT --> OUT["ToolResult<br/>stdout / stderr / exit code / artifacts"]
A --> TR["TraceRecorder"]
MA --> TR
OUT --> TR
TR --> TJ["Trace JSON"]
TR --> TV["Timeline / events view"]
RT["ReviewTeam<br/>Generate -> Review -> Refine"] --> G["generator Agent"]
RT --> R["reviewer Agent<br/>JSON decision"]
RT --> RF["refiner Agent"]
G --> TR
R --> TR
RF --> TR
ReviewTeam is a fixed collaboration pattern over normal Agent instances.
It uses one shared trace so model calls, reviewer decisions, tool execution,
and sandbox policy metadata can be inspected together.
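As a rough illustration of the pattern, the sketch below runs a generate, review, refine loop over plain callables and appends every step to one shared list. It is not the ReviewTeam implementation: `review_loop`, `max_rounds`, the `"fail"` status value, and the event dictionaries are all assumptions made for this example; only the reviewer JSON fields mirror the ones used elsewhere in this README.

```python
import json


def review_loop(task, generate, review, refine, trace, max_rounds=2):
    """Run generate -> review -> refine, recording each step in one shared trace."""
    draft = generate(task)
    trace.append({"role": "generator", "output": draft})
    for round_no in range(1, max_rounds + 1):
        # The reviewer is expected to answer with a JSON decision.
        decision = json.loads(review(draft))
        trace.append({"role": "reviewer", "round": round_no, "decision": decision})
        if decision["status"] == "pass":
            return "pass", draft
        draft = refine(draft, decision)
        trace.append({"role": "refiner", "round": round_no, "output": draft})
    # Hypothetical status for "ran out of rounds without a pass".
    return "max_rounds", draft


# Stub reviewer: rejects the first draft, accepts the refined one.
replies = iter([
    '{"status": "fail", "issues": ["too short"], "suggestions": [], "reason": "expand"}',
    '{"status": "pass", "issues": [], "suggestions": [], "reason": "ok"}',
])

trace = []
status, output = review_loop(
    "Create a small solution and review it.",
    generate=lambda task: "initial answer",
    review=lambda draft: next(replies),
    refine=lambda draft, decision: draft + " (refined)",
    trace=trace,
)
```

Because all three roles write to the same list, the reviewer's decision and the refiner's response to it sit next to each other when the trace is read back.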
Installation
The package is published on PyPI as agent-playground; the Python import name is
agent_playground.
Use As A Package
Use this when you want Agent Playground as a dependency in another Python project.
If you cloned this repository to develop Agent Playground itself, skip this section and use Develop This Repository.
# uv
uv add agent-playground
# pip
python -m pip install agent-playground
Check the installed package:
python -c "import agent_playground; print(agent_playground.__version__)"
If your package mirror has not synchronized the latest release yet, use the official PyPI index:
# uv
uv add --index-url https://pypi.org/simple agent-playground
# pip
python -m pip install --index-url https://pypi.org/simple agent-playground
Develop This Repository
Use this when you cloned the GitHub repository and want to modify the framework, examples, tests, or documentation.
You do not need to install the PyPI package in this case. The commands below use the local source checkout.
git clone https://github.com/xiao-sober/Agent-Playground.git
cd Agent-Playground
With uv from the source checkout:
uv sync --extra dev
uv run pytest -q
uv run python examples/01_single_agent_python.py
uv run python examples/03_generate_review_refine.py
uv run python examples/06_docker_sandbox.py
uv run python examples/07_trace_readability.py
With pip from the source checkout:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -e ".[dev]"
python -m pytest -q
python examples/01_single_agent_python.py
python examples/03_generate_review_refine.py
python examples/06_docker_sandbox.py
python examples/07_trace_readability.py
Use Local Changes From Another Project
Use this only when a different local project should test unpublished changes from this checkout before you publish a new version.
# uv
uv add --editable G:\Agent-Playground
# pip
python -m pip install -e G:\Agent-Playground
Minimal Single Agent
from agent_playground import Agent, FakeModelAdapter, PythonTool, SubprocessSandbox
model = FakeModelAdapter("""```python
print("hello from sandbox")
```""")
agent = Agent(
    name="coder",
    model=model,
    tools=[PythonTool(SubprocessSandbox(timeout_seconds=3))],
)
run = agent.run("Write and run a tiny Python program.")
print(run.output)
run.trace.print_timeline()
run.trace.export_json("trace.json")
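The fake model reply above wraps its program in a fenced python block, so something has to pull the code out of the reply before it can be executed. The following is a minimal sketch of one way to do that with a regular expression. It is an assumption about the general technique, not the library's own parser; `extract_python` and `CODE_BLOCK` are hypothetical names.

```python
import re

# Built indirectly so this example can itself live inside a fenced block.
FENCE = "`" * 3

# Match the body of the first python-tagged fence, non-greedily.
CODE_BLOCK = re.compile(FENCE + r"python[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python(reply):
    """Return the body of the first python-fenced block, or None if absent."""
    match = CODE_BLOCK.search(reply)
    return match.group(1) if match else None


reply = f'{FENCE}python\nprint("hello from sandbox")\n{FENCE}'
code = extract_python(reply)
```

Returning None when no fence is found lets the caller decide whether a reply without code is an error or just a plain-text answer.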
Generate-Review-Refine
from agent_playground import Agent, FakeModelAdapter, ReviewTeam
generator = Agent("generator", FakeModelAdapter("initial answer"))
reviewer = Agent(
    "reviewer",
    FakeModelAdapter(
        '{"status": "pass", "issues": [], "suggestions": [], "reason": "ok"}'
    ),
)
refiner = Agent("refiner", FakeModelAdapter("refined answer"))
team = ReviewTeam(generator=generator, reviewer=reviewer, refiner=refiner)
run = team.run("Create a small solution and review it.")
print(run.status)
print(run.output)
run.trace.print_timeline()
Provider Config
Real model access is configured with a local provider JSON file. The project does
not automatically read .env; use the api_key field in the provider config.
Create a local config from the template:
Copy-Item config/providers.example.json config/providers.local.json
notepad config/providers.local.json
Expected structure:
{
  "providers": {
    "bailian": {
      "name": "bailian",
      "api_type": "openai",
      "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "api_key": "replace-with-your-api-key",
      "models": {
        "default": "qwen3.6-plus",
        "single_agent": "qwen3.6-plus",
        "generator": "qwen3.6-plus",
        "reviewer": "qwen3.6-plus",
        "refiner": "qwen3.6-plus"
      }
    }
  }
}
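To make the structure concrete, here is a small sketch of how a provider entry could be validated and a model alias resolved with the standard library. It does not reproduce the project's loader: `load_provider`, `resolve_model`, the required-field list, and the fallback to the "default" alias are assumptions for illustration.

```python
import json

# Trimmed copy of the structure above; the key is a placeholder, not a secret.
CONFIG_TEXT = """
{
  "providers": {
    "bailian": {
      "name": "bailian",
      "api_type": "openai",
      "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "api_key": "replace-with-your-api-key",
      "models": {"default": "qwen3.6-plus", "reviewer": "qwen3.6-plus"}
    }
  }
}
"""

# Assumed set of fields every provider entry must carry.
REQUIRED_FIELDS = ("api_type", "base_url", "api_key", "models")


def load_provider(config, name):
    """Return the named provider entry, or raise ValueError if it is unusable."""
    providers = config.get("providers", {})
    if name not in providers:
        raise ValueError(f"unknown provider: {name}")
    entry = providers[name]
    missing = [key for key in REQUIRED_FIELDS if key not in entry]
    if missing:
        raise ValueError(f"provider {name} is missing fields: {missing}")
    return entry


def resolve_model(entry, alias):
    """Map an alias such as "reviewer" to a concrete model name."""
    models = entry["models"]
    # Assumption: an unknown alias falls back to the "default" entry.
    return models.get(alias, models["default"])


config = json.loads(CONFIG_TEXT)
provider = load_provider(config, "bailian")
model_name = resolve_model(provider, "reviewer")
```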
Run Qwen/Bailian examples:
# uv
uv run python examples/04_qwen_single_agent.py
uv run python examples/05_qwen_generate_review_refine.py
# pip
python examples/04_qwen_single_agent.py
python examples/05_qwen_generate_review_refine.py
Or call a configured model from the CLI:
agent-playground run --provider bailian --model-alias single_agent "Write Python code that prints 1 + 1."
agent-playground run --provider bailian --model-alias single_agent --sandbox docker "Write Python code that prints 1 + 1."
Trace
Every run produces a structured Trace. It records model calls, tool calls,
stdout, stderr, exit codes, errors, artifacts, timeline events, team rounds,
agent roles, reviewer decisions, and non-secret sandbox policy metadata.
run.trace.print_timeline()
run.trace.export_json("trace.json")
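For readers who want a mental model of what a structured trace holds, the sketch below records events in order and serializes them to JSON. The schema is invented for this example (`MiniTrace`, `Event`, and the field names are hypothetical) and should not be read as the format that export_json actually writes.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class Event:
    kind: str   # e.g. "model_call" or "tool_call"
    agent: str  # which agent produced the event
    data: dict  # free-form payload: prompt, stdout, exit code, ...
    timestamp: float = field(default_factory=time.time)


@dataclass
class MiniTrace:
    events: list = field(default_factory=list)

    def record(self, kind, agent, **data):
        """Append one event; insertion order is the timeline order."""
        self.events.append(Event(kind, agent, data))

    def timeline(self):
        """One short line per event, numbered from 1."""
        return [
            f"{i:02d} {e.agent} {e.kind}" for i, e in enumerate(self.events, 1)
        ]

    def to_json(self):
        return json.dumps({"events": [asdict(e) for e in self.events]}, indent=2)


trace = MiniTrace()
trace.record("model_call", "coder", prompt="Write and run a tiny Python program.")
trace.record("tool_call", "coder", tool="python", stdout="hello\n", stderr="", exit_code=0)
lines = trace.timeline()
exported = json.loads(trace.to_json())
```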
Static HTML viewer:
from agent_playground import export_trace_html
export_trace_html(run.trace, "trace.html")
CLI trace views:
agent-playground trace traces/example.json
agent-playground trace traces/example.json --timeline
agent-playground trace traces/example.json --events
agent-playground trace traces/example.json --html trace.html
For a trace-focused example:
python examples/07_trace_readability.py
The HTML viewer is a single local file with inline CSS. It does not start a server and does not require a JavaScript framework.
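The idea of a self-contained viewer can be sketched in a few lines: build one HTML string with an inline style block and escape every value that came from a model or a tool. This is only an illustration of the approach, not the output of export_trace_html; `render_trace_html` and the row layout are hypothetical.

```python
import html


def render_trace_html(title, rows):
    """Render (agent, kind, detail) rows as one static page with inline CSS."""
    body = "\n".join(
        "<tr>"
        + "".join(f"<td>{html.escape(str(cell))}</td>" for cell in row)
        + "</tr>"
        for row in rows
    )
    safe_title = html.escape(title)
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{safe_title}</title>\n'
        "<style>table{border-collapse:collapse}"
        "td{border:1px solid #ccc;padding:4px}</style>\n"
        f"</head><body><h1>{safe_title}</h1>\n"
        f"<table>\n{body}\n</table></body></html>\n"
    )


page = render_trace_html("demo run", [("coder", "tool_call", "<b>hi</b>")])
```

Escaping matters here because tool output is untrusted text; without it, stdout containing markup would be interpreted by the browser.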
Sandbox
SubprocessSandbox runs generated Python in a separate process with an isolated
workspace, timeout, minimal environment, stdout/stderr capture, exit code
capture, and common write guards. Its policy metadata is recorded into trace
tool calls so the execution boundary is visible during debugging.
This isolation is intended for local experimentation; it is not a production-grade security sandbox.
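The ingredients listed above (separate process, throwaway workspace, timeout, minimal environment, captured output) can be sketched with the standard library alone. This is a simplified illustration rather than the SubprocessSandbox source; `run_python`, the result keys, and the choice of environment variables are assumptions, and the write guards are not modeled.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(code, timeout_seconds=3.0):
    """Run code in a child interpreter inside a temporary workspace."""
    with tempfile.TemporaryDirectory() as workspace:
        script = Path(workspace) / "main.py"
        script.write_text(code, encoding="utf-8")
        # Minimal environment: pass through only what the interpreter needs.
        env = {"PATH": os.environ.get("PATH", "")}
        if "SYSTEMROOT" in os.environ:  # required by Python on Windows
            env["SYSTEMROOT"] = os.environ["SYSTEMROOT"]
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workspace,
                env=env,
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return {"stdout": "", "stderr": "timed out",
                    "exit_code": None, "timed_out": True}
        return {"stdout": proc.stdout, "stderr": proc.stderr,
                "exit_code": proc.returncode, "timed_out": False}


result = run_python('print("hello from sandbox")')
```

A child process with a stripped environment limits accidental damage, but the code still runs as the same user on the same machine.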
DockerSandbox is available for local Docker-based experiments when Docker is
installed. It runs Python in python:3.11-slim by default, disables container
network access, applies memory/CPU/pids limits, and uses a read-only container
root filesystem with /workspace mounted for the task.
from agent_playground import DockerSandbox, PythonTool
tool = PythonTool(DockerSandbox(timeout_seconds=5))
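The restrictions described above map onto standard docker run flags. The sketch below only builds the argument list, so it can be inspected without Docker installed. It is not the DockerSandbox implementation: `docker_run_argv`, the specific limit values, and the main.py script name are assumptions; only the image name, the disabled network, the read-only root, and the /workspace mount come from the description.

```python
def docker_run_argv(workspace, image="python:3.11-slim",
                    memory="256m", cpus="1.0", pids_limit=64):
    """Build a docker run command line for one sandboxed task."""
    return [
        "docker", "run", "--rm",
        "--network", "none",              # no container network access
        "--memory", memory,               # assumed memory cap
        "--cpus", cpus,                   # assumed CPU cap
        "--pids-limit", str(pids_limit),  # assumed process-count cap
        "--read-only",                    # read-only container root filesystem
        "-v", f"{workspace}:/workspace",  # writable task directory
        "-w", "/workspace",
        image,
        "python", "/workspace/main.py",   # hypothetical script name
    ]


argv = docker_run_argv("/tmp/task")
```

Passing the list to subprocess.run with a timeout would execute it; keeping command construction separate makes the policy easy to log into a trace.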
Build Checks
# uv
uv run python -m build
uv run python -m twine check dist/*
# pip
python -m build
python -m twine check dist/*