Local-first playground for experimenting with simple AI agents and sandboxed Python execution.

Agent Playground

Agent Playground is a local-first, lightweight, educational framework for understanding and debugging simple AI agents.

It combines four focused directions:

specific multi-agent collaboration patterns
readable agent framework code
structured trace and timeline debugging
sandboxed Python execution for model-generated code

The public concepts are intentionally small: Agent, ModelAdapter, Tool, Sandbox, Trace, Run, and ReviewTeam.

What This Project Is

Agent Playground is built for two vertical slices:

Single Agent + Python Sandbox + Trace: ask an agent to produce Python code, execute it in an isolated subprocess sandbox, and inspect the trace.
Generate -> Review -> Refine: run a fixed three-agent collaboration pattern with a shared trace and a structured reviewer decision.

It is not a RAG system, long-term memory layer, arbitrary agent graph runtime, workflow DSL, plugin marketplace, hosted service, or web dashboard.

Architecture

flowchart TD
    U["User / examples / CLI"] --> A["Agent.run(task)"]

    A --> MA["ModelAdapter"]
    FM["FakeModelAdapter"] -. tests .-> MA
    OA["OpenAI-compatible adapter"] -. real model .-> MA

    A --> PT["PythonTool"]
    PT --> SB["Sandbox"]
    SB --> SS["SubprocessSandbox"]
    SB --> DS["DockerSandbox"]
    PT --> OUT["ToolResult<br/>stdout / stderr / exit code / artifacts"]

    A --> TR["TraceRecorder"]
    MA --> TR
    OUT --> TR
    TR --> TJ["Trace JSON"]
    TR --> TV["Timeline / events view"]

    RT["ReviewTeam<br/>Generate -> Review -> Refine"] --> G["generator Agent"]
    RT --> R["reviewer Agent<br/>JSON decision"]
    RT --> RF["refiner Agent"]
    G --> TR
    R --> TR
    RF --> TR

ReviewTeam is a fixed collaboration pattern over normal Agent instances. It uses one shared trace so model calls, reviewer decisions, tool execution, and sandbox policy metadata can be inspected together.
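The fixed pattern and its shared trace can be pictured with a small standalone sketch. It does not use agent_playground: `run_review_team`, the plain callables, the event dictionaries, and the "fail" status value are hypothetical names chosen for illustration, and the single refinement round is an assumption about how such a pattern could be wired.

```python
import json
from typing import Callable, Dict, List

Event = Dict[str, str]


def run_review_team(
    task: str,
    generate: Callable[[str], str],
    review: Callable[[str], str],
    refine: Callable[[str, dict], str],
    trace: List[Event],
) -> str:
    """Run generate -> review -> refine once, logging every step to one trace."""
    draft = generate(task)
    trace.append({"role": "generator", "output": draft})

    raw_decision = review(draft)
    trace.append({"role": "reviewer", "output": raw_decision})
    decision = json.loads(raw_decision)

    # A passing review returns the draft unchanged; anything else is refined.
    if decision.get("status") == "pass":
        return draft

    refined = refine(draft, decision)
    trace.append({"role": "refiner", "output": refined})
    return refined


shared_trace: List[Event] = []
result = run_review_team(
    "Create a small solution.",
    generate=lambda task: "initial answer",
    review=lambda draft: '{"status": "fail", "issues": ["too short"]}',
    refine=lambda draft, decision: draft + " (refined)",
    trace=shared_trace,
)
print(result)
print([event["role"] for event in shared_trace])
```

Because all three roles append to the same list, the order of events in the trace mirrors the order of the collaboration.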

Installation

The package name is agent-playground; the Python import name is agent_playground.

For development inside this repository, use either uv or pip.

With uv:

uv sync --extra dev
uv run pytest -q
uv run python examples/01_single_agent_python.py
uv run python examples/03_generate_review_refine.py
uv run python examples/06_docker_sandbox.py
uv run python examples/07_trace_readability.py

With pip:

python -m venv .venv
# Windows PowerShell; on macOS/Linux use: source .venv/bin/activate
.\.venv\Scripts\Activate.ps1
python -m pip install -e ".[dev]"
python -m pytest -q
python examples/01_single_agent_python.py
python examples/03_generate_review_refine.py
python examples/06_docker_sandbox.py
python examples/07_trace_readability.py

To use an unreleased checkout, another local project can install this package from the checkout path (G:\Agent-Playground is an example; substitute your own location):

# uv
uv add --editable G:\Agent-Playground

# pip
python -m pip install -e G:\Agent-Playground

To install a published release from PyPI:

# uv
uv add agent-playground

# pip
python -m pip install agent-playground

Minimal Single Agent

from agent_playground import Agent, FakeModelAdapter, PythonTool, SubprocessSandbox

model = FakeModelAdapter("""```python
print("hello from sandbox")
```""")

agent = Agent(
    name="coder",
    model=model,
    tools=[PythonTool(SubprocessSandbox(timeout_seconds=3))],
)

run = agent.run("Write and run a tiny Python program.")

print(run.output)
run.trace.print_timeline()
run.trace.export_json("trace.json")
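The fake model above replies with a fenced Python block, so something has to pull the source out of that reply before it can run. The source does not show how the library does this; the following is a minimal sketch under that caveat, with `extract_python` and `CODE_BLOCK` as hypothetical names. The fence marker is built at runtime only so the example can be shown inside a fence itself.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, assembled at runtime

# Match an optional "python" tag, then capture everything up to the next fence.
CODE_BLOCK = re.compile(FENCE + r"(?:python)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python(reply: str) -> Optional[str]:
    """Return the body of the first fenced block in a model reply, or None."""
    match = CODE_BLOCK.search(reply)
    return match.group(1).strip() if match else None


reply = "Here you go:\n" + FENCE + 'python\nprint("hello from sandbox")\n' + FENCE
code = extract_python(reply)
print(code)
```

Returning None when no fence is found lets the caller decide whether to treat the whole reply as plain text or to report an error.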

Generate-Review-Refine

from agent_playground import Agent, FakeModelAdapter, ReviewTeam

generator = Agent("generator", FakeModelAdapter("initial answer"))
reviewer = Agent(
    "reviewer",
    FakeModelAdapter(
        '{"status": "pass", "issues": [], "suggestions": [], "reason": "ok"}'
    ),
)
refiner = Agent("refiner", FakeModelAdapter("refined answer"))

team = ReviewTeam(generator=generator, reviewer=reviewer, refiner=refiner)
run = team.run("Create a small solution and review it.")

print(run.status)
print(run.output)
run.trace.print_timeline()
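The reviewer in the example replies with a JSON object carrying status, issues, suggestions, and reason. One way such a reply could be validated before acting on it is sketched below. `ReviewDecision` and `parse_decision` are hypothetical names, and treating malformed JSON as a "fail" decision is an assumption, not documented library behavior.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewDecision:
    status: str
    issues: List[str] = field(default_factory=list)
    suggestions: List[str] = field(default_factory=list)
    reason: str = ""


def parse_decision(raw: str) -> ReviewDecision:
    """Parse a reviewer reply; treat malformed input as a failed review."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ReviewDecision(status="fail", reason=f"invalid JSON: {exc.msg}")
    if not isinstance(data, dict) or "status" not in data:
        return ReviewDecision(status="fail", reason="missing status field")
    return ReviewDecision(
        status=str(data["status"]),
        issues=list(data.get("issues", [])),
        suggestions=list(data.get("suggestions", [])),
        reason=str(data.get("reason", "")),
    )


decision = parse_decision(
    '{"status": "pass", "issues": [], "suggestions": [], "reason": "ok"}'
)
print(decision.status, decision.reason)
```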

Provider Config

Real model access is configured with a local provider JSON file. The project does not automatically read .env; use the api_key field in the provider config.

Create a local config from the template:

Copy-Item config/providers.example.json config/providers.local.json
notepad config/providers.local.json

Expected structure:

{
  "providers": {
    "bailian": {
      "name": "bailian",
      "api_type": "openai",
      "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "api_key": "replace-with-your-api-key",
      "models": {
        "default": "qwen3.6-plus",
        "single_agent": "qwen3.6-plus",
        "generator": "qwen3.6-plus",
        "reviewer": "qwen3.6-plus",
        "refiner": "qwen3.6-plus"
      }
    }
  }
}
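A config of this shape maps role aliases to model names per provider. The sketch below shows one way an alias could be resolved from it. `resolve_model` is a hypothetical helper, and falling back to the "default" entry when an alias is missing is an assumption about the intended behavior, not something the source states.

```python
import json

CONFIG_TEXT = """
{
  "providers": {
    "bailian": {
      "name": "bailian",
      "api_type": "openai",
      "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "api_key": "replace-with-your-api-key",
      "models": {"default": "qwen3.6-plus", "reviewer": "qwen3.6-plus"}
    }
  }
}
"""


def resolve_model(config: dict, provider: str, alias: str) -> str:
    """Look up a model alias, falling back to "default" (assumed behavior)."""
    try:
        models = config["providers"][provider]["models"]
    except KeyError as exc:
        raise ValueError(f"unknown provider or missing models: {provider}") from exc
    if alias in models:
        return models[alias]
    if "default" in models:
        return models["default"]
    raise ValueError(f"no model for alias {alias!r} and no default")


config = json.loads(CONFIG_TEXT)
print(resolve_model(config, "bailian", "reviewer"))
print(resolve_model(config, "bailian", "single_agent"))
```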

Run Qwen/Bailian examples:

# uv
uv run python examples/04_qwen_single_agent.py
uv run python examples/05_qwen_generate_review_refine.py

# pip
python examples/04_qwen_single_agent.py
python examples/05_qwen_generate_review_refine.py

Or call a configured model from the CLI:

agent-playground run --provider bailian --model-alias single_agent "Write Python code that prints 1 + 1."
agent-playground run --provider bailian --model-alias single_agent --sandbox docker "Write Python code that prints 1 + 1."

Trace

Every run produces a structured Trace. It records model calls, tool calls, stdout, stderr, exit codes, errors, artifacts, timeline events, team rounds, agent roles, reviewer decisions, and non-secret sandbox policy metadata.

run.trace.print_timeline()
run.trace.export_json("trace.json")
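To make the idea of a structured trace concrete, here is a minimal standalone sketch of an ordered event log with a timeline view and JSON export. `MiniTrace` and its field names are hypothetical; the real Trace records more kinds of data, and its schema is not shown in the source.

```python
import json
import time
from typing import Any, Dict, List


class MiniTrace:
    """Hypothetical stand-in for a trace: an ordered list of timestamped events."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []
        self._start = time.monotonic()

    def record(self, kind: str, **data: Any) -> None:
        # Store elapsed time since the trace started, in milliseconds.
        elapsed_ms = round((time.monotonic() - self._start) * 1000, 1)
        self.events.append({"kind": kind, "t_ms": elapsed_ms, **data})

    def timeline(self) -> List[str]:
        return [f"{e['t_ms']:>8.1f} ms  {e['kind']}" for e in self.events]

    def to_json(self) -> str:
        return json.dumps({"events": self.events}, indent=2)


trace = MiniTrace()
trace.record("model_call", agent="coder", prompt="Write a tiny program.")
trace.record("tool_call", tool="python", exit_code=0, stdout="hello\n", stderr="")
print("\n".join(trace.timeline()))
print(trace.to_json())
```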

Static HTML viewer:

from agent_playground import export_trace_html

export_trace_html(run.trace, "trace.html")

CLI trace views:

agent-playground trace traces/example.json
agent-playground trace traces/example.json --timeline
agent-playground trace traces/example.json --events
agent-playground trace traces/example.json --html trace.html

For a trace-focused example:

python examples/07_trace_readability.py

The HTML viewer is a single local file with inline CSS. It does not start a server and does not require a JavaScript framework.
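A single-file viewer of this kind can be produced with string formatting alone. The sketch below is not the library's `export_trace_html`; `render_trace_html` and the `kind`/`detail` event keys are hypothetical, and the styling is arbitrary.

```python
import html
from typing import Dict, List


def render_trace_html(events: List[Dict[str, str]]) -> str:
    """Render events as one self-contained HTML page with inline CSS."""
    rows = "\n".join(
        "<tr><td>{}</td><td><pre>{}</pre></td></tr>".format(
            html.escape(event["kind"]), html.escape(event["detail"])
        )
        for event in events
    )
    return (
        "<!doctype html>\n<html><head><meta charset='utf-8'>\n"
        "<style>td{border:1px solid #ccc;padding:4px}pre{margin:0}</style>\n"
        "</head><body><table>\n" + rows + "\n</table></body></html>\n"
    )


page = render_trace_html(
    [
        {"kind": "model_call", "detail": "prompt: print 1 + 1"},
        {"kind": "tool_call", "detail": "stdout: <2>"},
    ]
)
print(page)
```

Escaping every field with `html.escape` matters here because trace content comes from model output and program stdout, either of which may contain markup.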

Sandbox

SubprocessSandbox runs generated Python in a separate process with an isolated workspace, timeout, minimal environment, stdout/stderr capture, exit code capture, and common write guards. Its policy metadata is recorded into trace tool calls so the execution boundary is visible during debugging.

This isolation is meant for local experimentation; it is not a production-grade security sandbox.
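The core of a subprocess-based runner can be sketched with the standard library. This is an illustration only, not the SubprocessSandbox implementation: `run_in_subprocess` is a hypothetical name, and the sketch covers just the temporary workspace, timeout, and output capture. It omits environment trimming and write guards.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict


def run_in_subprocess(code: str, timeout_seconds: float = 3.0) -> Dict[str, object]:
    """Run code in a child interpreter inside a throwaway working directory."""
    with tempfile.TemporaryDirectory() as workspace:
        script = Path(workspace) / "main.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                # -I runs Python in isolated mode (ignores PYTHON* variables
                # and the user site-packages directory).
                [sys.executable, "-I", str(script)],
                cwd=workspace,
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising on timeout.
            return {"stdout": "", "stderr": "timeout", "exit_code": None}
    return {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "exit_code": proc.returncode,
    }


result = run_in_subprocess('print("hello from sandbox")')
print(result)
```

A child process started this way still has the same user permissions as the parent, which is one reason this level of isolation is unsuitable for untrusted code in production.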

DockerSandbox is available for local Docker-based experiments when Docker is installed. It runs Python in python:3.11-slim by default, disables container network access, applies memory/CPU/pids limits, and uses a read-only container root filesystem with /workspace mounted for the task.

from agent_playground import DockerSandbox, PythonTool

tool = PythonTool(DockerSandbox(timeout_seconds=5))
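The restrictions described above correspond to standard `docker run` flags. The sketch below only builds the argument list, so it runs without Docker installed. `docker_run_args` is a hypothetical helper, and the limit values (256m, 1 CPU, 64 pids) are illustrative assumptions rather than the library's defaults.

```python
from typing import List


def docker_run_args(
    workspace: str,
    image: str = "python:3.11-slim",
    memory: str = "256m",
    cpus: str = "1",
    pids_limit: int = 64,
) -> List[str]:
    """Build a `docker run` argv for a locked-down, one-shot Python container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",              # no network access
        "--memory", memory,               # memory cap
        "--cpus", cpus,                   # CPU cap
        "--pids-limit", str(pids_limit),  # process-count cap
        "--read-only",                    # read-only root filesystem
        "-v", f"{workspace}:/workspace",  # task directory mounted into the container
        "-w", "/workspace",               # run from the mounted directory
        image,
        "python", "main.py",
    ]


args = docker_run_args("/tmp/task")
print(" ".join(args))
```

Passing the resulting list to `subprocess.run` with a timeout would execute `main.py` from the mounted workspace, assuming Docker is available and the image has been pulled.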

Build Checks

# uv
uv run python -m build
uv run python -m twine check dist/*

# pip
python -m build
python -m twine check dist/*

Release history

0.1.1: May 22, 2026
0.1.0: May 22, 2026 (this version)

Download files

Source Distribution

agent_playground-0.1.0.tar.gz (37.6 kB), uploaded May 22, 2026, Source

Built Distribution

agent_playground-0.1.0-py3-none-any.whl (36.8 kB), uploaded May 22, 2026, Python 3

File details

Details for the file agent_playground-0.1.0.tar.gz.

File metadata

Download URL: agent_playground-0.1.0.tar.gz
Upload date: May 22, 2026
Size: 37.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.9

File hashes

Hashes for agent_playground-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ba7a65c6b772975fcaba0f0f3959a09e46834f7dc259ecdfbcae99ee66e5518d`
MD5	`57e99acaf050593f2b54800fc0605e97`
BLAKE2b-256	`ee0c8abaf4c84920940f7635e47a0537964dc38e1d98e7c9b3c540cf8cd0f963`

File details

Details for the file agent_playground-0.1.0-py3-none-any.whl.

File metadata

Download URL: agent_playground-0.1.0-py3-none-any.whl
Upload date: May 22, 2026
Size: 36.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.9

File hashes

Hashes for agent_playground-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`71fa987bcb707be732c82f6c7be043786b21a25a411dd818d367d64d48306b92`
MD5	`bf465b7a597fa201c53529eb19d73001`
BLAKE2b-256	`af41994bf3727bacd246a7da8fc351750078f80a01e77869b91b8a260844c60b`
