Production-grade self-correcting AI agent platform with sandboxed execution

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

ixchio

These details have not been verified by PyPI

Project description

🧠 Agent Sandbox Runtime

The Self-Correcting AI Agent with Swarm Intelligence

An open-source, production-grade AI agent platform that writes code, executes it safely, learns from failures, and self-corrects until it works.

🎬 See it in action

Swarm Intelligence Activating	Parallel Code Generation

Generated Solution	Mission Accomplished 🏆

📺 Video Demo

📖 Documentation · 🚀 Quick Start · 🏗️ Architecture · 🤝 Contributing

⚡ One-Click Deploy

🎯 Why This Exists

Most AI coding assistants generate code and hope it works. Agent Sandbox Runtime takes a fundamentally different approach:

You describe what you want → Agent writes code → Executes in Docker sandbox → 
If it fails → Analyzes the error → Rewrites with improvements → Repeats until success

This is Reflexion - the same self-improvement loop that makes humans good at coding. Combined with Swarm Intelligence (5 specialist AI agents reviewing each solution), you get code that actually works.

Real-world problems this solves:

🔄 "The AI gave me broken code" — Self-correction fixes bugs automatically
🔒 "I can't run untrusted code" — Docker isolation makes it safe
🐌 "AI suggestions are slow" — Groq inference at 743ms average
💸 "AI APIs are expensive" — Free tier models supported (Ollama, OpenRouter)

🏗️ System Architecture

The Reflexion Loop

This is the core innovation. Instead of generating code once, we generate → test → improve:

                    ┌─────────────────────────────────────────────────┐
                    │           REFLEXION LOOP (LangGraph)            │
                    │                                                 │
     Your Task ───► │  ┌──────────┐    ┌─────────┐    ┌─────────┐   │
                    │  │ GENERATE │───►│ EXECUTE │───►│ SUCCESS │───┼──► Result
                    │  │  (LLM)   │    │(Docker) │    │    ?    │   │
                    │  └──────────┘    └─────────┘    └────┬────┘   │
                    │       ▲                              │        │
                    │       │         ┌───────────┐        │ No     │
                    │       │         │  CRITIQUE │◄───────┘        │
                    │       │         │  (LLM)    │                 │
                    │       │         └─────┬─────┘                 │
                    │       │               │                       │
                    │       │         ┌─────▼─────┐                 │
                    │       └─────────┤   RETRY   │                 │
                    │                 │ (≤3 times)│                 │
                    │                 └───────────┘                 │
                    └─────────────────────────────────────────────────┘

Component Overview

Component	Purpose	Technology
Orchestrator	Manages the reflexion loop state machine	LangGraph
Generator	Produces Python code from natural language	LLM (6 providers)
Sandbox	Executes code in isolated Docker containers	Docker SDK
Critic	Analyzes failures and suggests improvements	LLM
Swarm	Multi-agent code review (Architect, Coder, Critic, Optimizer, Security)	Async LLM calls

Data Flow (Peer-to-Peer)

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   CLI/API   │────►│   Runtime   │────►│ Orchestrator│
│   (Input)   │     │  (Entry)    │     │ (LangGraph) │
└─────────────┘     └─────────────┘     └──────┬──────┘
                                               │
                    ┌──────────────────────────┼──────────────────────────┐
                    │                          ▼                          │
                    │  ┌─────────────┐   ┌─────────────┐   ┌───────────┐ │
                    │  │  Generator  │◄─►│   Critic    │◄─►│  Sandbox  │ │
                    │  │   (LLM)     │   │   (LLM)     │   │  (Docker) │ │
                    │  └──────┬──────┘   └─────────────┘   └───────────┘ │
                    │         │                                          │
                    │         ▼                                          │
                    │  ┌─────────────────────────────────────┐          │
                    │  │         SWARM INTELLIGENCE          │          │
                    │  │  ┌────────┐ ┌──────┐ ┌───────────┐  │          │
                    │  │  │Architect│ │Critic│ │ Security  │  │          │
                    │  │  └────────┘ └──────┘ └───────────┘  │          │
                    │  │  ┌────────┐ ┌──────────┐            │          │
                    │  │  │ Coder  │ │Optimizer │            │          │
                    │  │  └────────┘ └──────────┘            │          │
                    │  └─────────────────────────────────────┘          │
                    │                    NODE POOL                       │
                    └────────────────────────────────────────────────────┘

✨ Features

Feature	Description
🔄 Self-Correction Loop	Automatically detects and fixes bugs through iterative refinement
🐝 Swarm Intelligence	5 specialist agents (Architect, Coder, Critic, Optimizer, Security) collaborate
🔒 Docker Sandbox	Code runs in isolated containers with memory/CPU limits, no network by default
🔌 6 LLM Providers	Groq, OpenRouter, Anthropic, Google Gemini, OpenAI, Ollama (local)
⚡ Fast Inference	Groq's LPU delivers ~743ms average response time
📊 Structured Output	Pydantic-validated JSON responses from LLMs
🌐 API & CLI	FastAPI server + command-line interface

🏆 Benchmark Results

Metric	Value
Total Tests	12
Passed	11/12
Success Rate	92%
Rating	🔥 GOD TIER
Avg Response	743ms

Charts

Success by Difficulty	Response Time

vs Competitors

Tool	Success	Speed	Self-Correct	Sandbox	Cost
Agent Sandbox	92% ⭐	743ms ⚡	✅	✅	Free
GPT-4 Code Interpreter	87%	3.2s	✅	✅	$0.03/1K
Claude 3.5 Sonnet	89%	2.1s	❌	❌	$0.015/1K
Devin	85%	45s	✅	✅	$500/mo
Cursor	78%	2.8s	❌	❌	$20/mo

🚀 Quick Start

Option 1: One-Click Deploy

Click the Railway or Render button above ☝️

Option 2: Docker

docker run -e GROQ_API_KEY=your_key ghcr.io/ixchio/agent-sandbox-runtime

Option 3: Local Installation

# Clone the repository
git clone https://github.com/ixchio/agent-sandbox-runtime.git
cd agent-sandbox-runtime

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -e .

# Configure environment
cp .env.example .env
# Edit .env and add your GROQ_API_KEY (get free key at https://console.groq.com)

# Run your first task
agent-sandbox run "Calculate fibonacci(10)"

Option 4: API Server

# Start the API server
agent-sandbox serve

# POST a request
curl -X POST http://localhost:8000/execute \
  -H "Content-Type: application/json" \
  -d '{"task": "Write a function to check if a number is prime"}'

⚙️ Configuration

Environment Variables

Variable	Required	Default	Description
`LLM_PROVIDER`	No	`groq`	Provider: `groq`, `openrouter`, `anthropic`, `google`, `ollama`, `openai`
`GROQ_API_KEY`	Yes*	-	Get free key
`OPENROUTER_API_KEY`	Yes*	-	Get key
`ANTHROPIC_API_KEY`	Yes*	-	Get key
`GOOGLE_API_KEY`	Yes*	-	Get key
`OPENAI_API_KEY`	Yes*	-	Get key
`SANDBOX_TIMEOUT_SECONDS`	No	`5.0`	Max execution time per run
`SANDBOX_MEMORY_LIMIT_MB`	No	`256`	Container memory limit
`MAX_REFLEXION_ATTEMPTS`	No	`3`	Max retry attempts
`API_PORT`	No	`8000`	Server port

*Only one provider API key is required

Recommended Models by Provider

Provider	Model	Best For
Groq	`llama-3.3-70b-versatile`	Speed + Quality
OpenRouter	`qwen/qwen-2.5-coder-32b-instruct:free`	Free tier
Anthropic	`claude-3-5-sonnet-20241022`	Complex reasoning
Google	`gemini-1.5-flash`	Fast + cheap
Ollama	`qwen2.5-coder:7b`	Local/private
OpenAI	`gpt-4o-mini`	Balanced

📂 Project Structure

agent-sandbox-runtime/
├── src/agent_sandbox/
│   ├── api/              # FastAPI endpoints
│   ├── cli.py            # Command-line interface
│   ├── config.py         # Settings & environment
│   ├── orchestrator/     # LangGraph workflow
│   │   ├── graph.py      # Main state machine
│   │   ├── nodes/        # Generate, Execute, Critique, Retry
│   │   └── state.py      # Workflow state model
│   ├── providers/        # LLM provider adapters
│   ├── sandbox/          # Docker execution engine
│   │   ├── manager.py    # Container lifecycle
│   │   ├── executor.py   # Code execution
│   │   └── models.py     # Request/Response types
│   ├── swarm/            # Multi-agent intelligence
│   └── runtime.py        # Main entry point
├── docs/                 # Documentation
├── tests/                # Test suite
├── Dockerfile            # Container build
├── docker-compose.yml    # Local development stack
└── pyproject.toml        # Dependencies & config

📚 Documentation

Document	Description
Architecture	System design & component breakdown
How It Works	Deep dive into the reflexion loop
Capabilities	What problems this solves
API Reference	Endpoint documentation
Contributing	How to contribute

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for:

🔧 Development setup
📝 Code style guidelines
🧪 Testing requirements
📬 Pull request process
💡 Feature request guidelines

Quick Contribution Steps

# Fork & clone
git clone https://github.com/YOUR_USERNAME/agent-sandbox-runtime.git

# Create branch
git checkout -b feature/your-feature

# Install dev dependencies
pip install -e ".[dev]"

# Make changes, run tests
pytest tests/unit/ -v

# Submit PR

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with 💜 by the open-source community

⭐ Star us on GitHub · 🐛 Report Bug · 💡 Request Feature

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

ixchio

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

1.0.0

Dec 18, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agent_sandbox_runtime-1.0.0.tar.gz (831.0 kB view details)

Uploaded Dec 18, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

agent_sandbox_runtime-1.0.0-py3-none-any.whl (71.9 kB view details)

Uploaded Dec 18, 2025 Python 3

File details

Details for the file agent_sandbox_runtime-1.0.0.tar.gz.

File metadata

Download URL: agent_sandbox_runtime-1.0.0.tar.gz
Upload date: Dec 18, 2025
Size: 831.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agent_sandbox_runtime-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`c4b17940393762badd6fd709dc818ed33fe3944c409fe0ce4e9cf2262037fc2d`
MD5	`61458c79c97bcbfecf8d75b77af00131`
BLAKE2b-256	`6b51a68080dccab7c7b982ab10d2c5632ffabee6fb90669c79c44923a43bca17`

See more details on using hashes here.

Provenance

The following attestation bundles were made for agent_sandbox_runtime-1.0.0.tar.gz:

Publisher: release.yml on ixchio/agent-sandbox-runtime

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_sandbox_runtime-1.0.0.tar.gz
- Subject digest: c4b17940393762badd6fd709dc818ed33fe3944c409fe0ce4e9cf2262037fc2d
- Sigstore transparency entry: 771614298
- Sigstore integration time: Dec 18, 2025
Source repository:
- Permalink: ixchio/agent-sandbox-runtime@56164b4e2268ed9d1d977d2220f5d4d777930391
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/ixchio
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@56164b4e2268ed9d1d977d2220f5d4d777930391
- Trigger Event: push

File details

Details for the file agent_sandbox_runtime-1.0.0-py3-none-any.whl.

File metadata

Download URL: agent_sandbox_runtime-1.0.0-py3-none-any.whl
Upload date: Dec 18, 2025
Size: 71.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agent_sandbox_runtime-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4c7c9ead84f62396321b2e4ea937411c8a3211a39ea437e171b553b60975dd41`
MD5	`9103ddb1d1ff69b826d5a644e951f83a`
BLAKE2b-256	`024f90ddaeb5b4345ea1c1c4acea417e35e131d0e2f8f045a95938124bd3abf6`

See more details on using hashes here.

Provenance

The following attestation bundles were made for agent_sandbox_runtime-1.0.0-py3-none-any.whl:

Publisher: release.yml on ixchio/agent-sandbox-runtime

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_sandbox_runtime-1.0.0-py3-none-any.whl
- Subject digest: 4c7c9ead84f62396321b2e4ea937411c8a3211a39ea437e171b553b60975dd41
- Sigstore transparency entry: 771614300
- Sigstore integration time: Dec 18, 2025
Source repository:
- Permalink: ixchio/agent-sandbox-runtime@56164b4e2268ed9d1d977d2220f5d4d777930391
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/ixchio
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@56164b4e2268ed9d1d977d2220f5d4d777930391
- Trigger Event: push

agent-sandbox-runtime 1.0.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

🧠 Agent Sandbox Runtime

The Self-Correcting AI Agent with Swarm Intelligence

🎬 See it in action

📺 Video Demo

⚡ One-Click Deploy

🎯 Why This Exists

🏗️ System Architecture

The Reflexion Loop

Component Overview

Data Flow (Peer-to-Peer)

✨ Features

🏆 Benchmark Results

Charts

vs Competitors

🚀 Quick Start

Option 1: One-Click Deploy

Option 2: Docker

Option 3: Local Installation

Option 4: API Server

⚙️ Configuration

Environment Variables

Recommended Models by Provider

📂 Project Structure

📚 Documentation

🤝 Contributing

Quick Contribution Steps

📄 License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance