Synth Environments

Synthetic Environments for Long-Horizon Language Agents

A framework for building and managing synthetic environments for training and evaluating long-horizon language agents.

🎯 Key Features

🔄 Snapshotting & Reproducibility - Full state capture and replay
🏗️ Statefulness First - Built-in state management across environments
🔌 Consistent APIs - Unified interface for all environment types
📊 Observability - Built-in tracing and monitoring
🌐 HTTP Access - RESTful API for remote training and evaluation
📚 Curriculum Learning - Configurable filtering and progression
🛠️ Agent Tools - Simple abstractions for agent-environment interaction
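
To make the snapshotting and replay idea concrete, here is a minimal sketch using a toy environment. `CounterEnv`, `snapshot`, and `restore` are hypothetical names chosen for illustration; they are not taken from the synth_env API, which may expose state capture differently.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class CounterEnv:
    """Hypothetical toy environment; not part of synth_env."""

    position: int = 0
    history: list = field(default_factory=list)

    def step(self, delta: int) -> int:
        # Apply the action and record it so the trajectory can be inspected.
        self.position += delta
        self.history.append(delta)
        return self.position

    def snapshot(self) -> dict:
        # Deep-copy so later steps cannot mutate the saved state.
        return copy.deepcopy({"position": self.position, "history": self.history})

    @classmethod
    def restore(cls, snap: dict) -> "CounterEnv":
        # Build a fresh environment from a previously captured snapshot.
        return cls(position=snap["position"], history=list(snap["history"]))


env = CounterEnv()
env.step(2)
saved = env.snapshot()

env.step(5)  # the original environment moves on

replayed = CounterEnv.restore(saved)
replayed.step(5)  # replaying the same action from the snapshot

print(env.position, replayed.position)
```

Because the snapshot is a deep copy, the saved state stays fixed while the original environment keeps stepping, and applying the same action to the restored copy leads to the same position.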

🚀 Quick Start

Installation

pip install synth-env

Basic Usage

from synth_env import Environment

# Create environment
env = Environment("sokoban")

# Run agent (`agent` is your own policy object with an act(state) method; define it before this loop)
state = env.reset()
while not env.done:
    action = agent.act(state)
    state = env.step(action)

Running Evaluation Scripts

The framework includes ReAct agent evaluation scripts for testing language models on various environments. These scripts provide comprehensive metrics and shaped rewards for training.

Prerequisites

Start the synth service on port 8901:

# In your service directory
python -m uvicorn main:app --host 0.0.0.0 --port 8901

Ensure your model provider (OpenAI, Anthropic, etc.) is reachable and its API credentials are configured.
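
Before launching an evaluation script, it can help to confirm that something is listening on port 8901. The sketch below only checks that a TCP connection can be opened, using the standard library; it assumes the service runs on the local machine and does not call any synth_env endpoint. The function name `service_is_reachable` is illustrative.

```python
import socket


def service_is_reachable(host: str = "127.0.0.1", port: int = 8901, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if service_is_reachable():
        print("Service is accepting connections on port 8901")
    else:
        print("Nothing is listening on port 8901; start the service first")
```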

TicTacToe Evaluation

cd Environments
uvpm synth_env.examples.tictactoe.agent_demos.test_tictactoe_react_agent

Features:

Tests strategic gameplay against random opponent
Provides win/loss/draw statistics
Validates coordinate parsing and legal moves
Supports multiple models (gpt-4.1-mini, o3, etc.)
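
As an illustration of what coordinate parsing and legal-move validation can involve, here is a minimal sketch. The coordinate format (a row letter A to C followed by a column digit 1 to 3) and the board representation are assumptions made for this example; the evaluation script may use a different scheme.

```python
from typing import Optional

ROWS = "ABC"  # assumed row labels
COLS = "123"  # assumed column labels
EMPTY = " "


def parse_coordinate(text: str) -> Optional[tuple[int, int]]:
    """Return (row, col) for input such as 'B2', or None if it is malformed."""
    cleaned = text.strip().upper()
    if len(cleaned) != 2 or cleaned[0] not in ROWS or cleaned[1] not in COLS:
        return None
    return ROWS.index(cleaned[0]), COLS.index(cleaned[1])


def is_legal_move(board: list[list[str]], text: str) -> bool:
    """A move is legal if it parses and targets an empty cell."""
    cell = parse_coordinate(text)
    if cell is None:
        return False
    row, col = cell
    return board[row][col] == EMPTY


board = [
    ["X", " ", " "],
    [" ", "O", " "],
    [" ", " ", " "],
]
print(is_legal_move(board, "a2"), is_legal_move(board, "A1"), is_legal_move(board, "D4"))
```

Rejecting malformed input before checking the board keeps model output errors (bad format) separate from game errors (occupied cell), which is useful when reporting statistics.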

NetHack Evaluation

cd Environments
uvpm synth_env.examples.nethack.agent_demos.test_nethack_react_agent

Features:

Comprehensive dungeon exploration evaluation
26+ shaped reward signals for training
BALROG scoring system integration
Progress bars for multi-trajectory runs
Separates relevant vs. irrelevant metrics
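
One common way to combine several shaped reward signals is a weighted sum in which metrics without a weight are ignored. The sketch below illustrates that idea only. The signal names and weights are hypothetical and are not the 26+ signals used by the NetHack script.

```python
def shaped_reward(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of reward signals; signals without a weight contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in signals.items())


# Hypothetical weights: only these metrics are treated as relevant.
weights = {"depth_reached": 1.0, "gold_collected": 0.01, "turns_survived": 0.001}

# Hypothetical per-episode measurements; "messages_seen" has no weight and is ignored.
signals = {"depth_reached": 3, "gold_collected": 50, "turns_survived": 1000, "messages_seen": 12}

total = shaped_reward(signals, weights)
print(round(total, 3))
```

Keeping the weights in a separate dictionary makes the split between relevant and irrelevant metrics explicit and easy to adjust between runs.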

Sokoban Evaluation

cd Environments
uvpm synth_env.examples.sokoban.agent_demos.test_sokoban_react_agent

Features:

Classic puzzle-solving evaluation
Box-pushing logic validation
Step efficiency analysis
Multiple difficulty levels
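
Box-pushing validation comes down to one rule: the player may step into a box only if the cell beyond it is free. The sketch below shows that check on a small grid. The tile symbols, the grid layout, and the `can_move` helper are assumptions for illustration and do not reflect the Sokoban environment's internal representation.

```python
WALL, BOX, GOAL, FLOOR = "#", "$", ".", " "

UP, DOWN, LEFT, RIGHT = (-1, 0), (1, 0), (0, -1), (0, 1)


def can_move(grid: list[str], player: tuple[int, int], direction: tuple[int, int]) -> bool:
    """Return True if the player may step in `direction`, pushing at most one box.

    Assumes the grid is fully enclosed by walls, so lookups never leave the grid.
    """
    dr, dc = direction
    r, c = player[0] + dr, player[1] + dc
    target = grid[r][c]
    if target == WALL:
        return False
    if target == BOX:
        # A box can be pushed only onto free floor or a goal cell.
        beyond = grid[r + dr][c + dc]
        return beyond in (FLOOR, GOAL)
    return True


grid = [
    "#######",
    "#  $ .#",
    "# $#  #",
    "#######",
]

print(can_move(grid, (1, 2), RIGHT))  # box with free floor behind it
print(can_move(grid, (2, 1), RIGHT))  # box with a wall behind it
```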

Configuration

Edit the script configuration at the top of each file:

MODEL_NAME = "gpt-4.1-mini"  # or "o3", "claude-sonnet-4", etc.
NUM_INSTANCES = 5            # Number of test episodes
MAX_TURNS = 100              # Maximum steps per episode
DIFFICULTY = "beginner"      # Environment-specific difficulty

All scripts provide detailed rubric results, progress metrics, and shaped rewards suitable for reinforcement learning applications.
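
If you prefer to validate these settings before a run instead of editing bare constants, one option is to group them in a small dataclass. This is a sketch under the assumption that the four values above are the only settings; `EvalConfig` and `total_turn_budget` are hypothetical names and are not part of the scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalConfig:
    """Hypothetical container mirroring the four script-level constants."""

    model_name: str = "gpt-4.1-mini"
    num_instances: int = 5
    max_turns: int = 100
    difficulty: str = "beginner"

    def __post_init__(self) -> None:
        # Fail early on values that would produce an empty or endless run.
        if self.num_instances < 1:
            raise ValueError("num_instances must be at least 1")
        if self.max_turns < 1:
            raise ValueError("max_turns must be at least 1")

    @property
    def total_turn_budget(self) -> int:
        # Upper bound on environment steps across all episodes.
        return self.num_instances * self.max_turns


config = EvalConfig()
print(config.model_name, config.total_turn_budget)
```

The turn budget gives a quick upper bound on how many model calls a run can make, which is useful for estimating cost before switching to a larger model.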

Development Setup

# Clone repository
git clone https://github.com/your-org/synth-env.git
cd synth-env

# Install dependencies
uv sync

# Run tests
python dev/update_readme_metrics.py --fast

🎮 Supported Environments

Environment	Status	Description
Sokoban	✅ Stable	Classic puzzle game for planning
Hendrycks Math	✅ Stable	Mathematical reasoning tasks
Crafter	✅ Stable	Minecraft-like survival environment
Verilog	🔄 Beta	Hardware description language tasks
Red Team	🚧 Development	Security testing scenarios
SWE-Bench	🚧 Development	Software engineering tasks

📖 Documentation

API Reference - Complete API documentation
Environment Guide - Detailed environment descriptions
Contributing - Development setup and guidelines

🔧 Development

Health Check

# Check codebase health
python scripts/check_health.py

Testing

# Fast tests (~3 seconds)
python dev/update_readme_metrics.py --fast

# Full test suite
python dev/update_readme_metrics.py

Code Quality

# Format code
ruff format .

# Check linting
ruff check .

# Type checking
uvx ty check

Release

# Increment version and publish
python scripts/release.py

# Dry run
python scripts/release.py --dry-run

Pre-Merge Checklist

Before creating a PR, see dev/pr_checklist.md for the complete checklist.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for:

Development setup
Code style guidelines
Testing requirements
Pull request process

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Special thanks to the research teams at DeepMind, Ragen AI, and other contributors to the environments included in this framework.

⚠️ Development Status: This project is under active development. While stable environments are production-ready, newer environments may have breaking changes.
