Horizons AI - Advanced Reinforcement Learning Environments

A framework for building and managing synthetic environments for training and evaluating long-horizon language agents.

🎯 Key Features

🔄 Snapshotting & Reproducibility - Full state capture and replay
🏗️ Statefulness First - Built-in state management across environments
🔌 Consistent APIs - Unified interface for all environment types
📊 Observability - Built-in tracing and monitoring
🌐 HTTP Access - RESTful API for remote training and evaluation
📚 Curriculum Learning - Configurable filtering and progression
🛠️ Agent Tools - Simple abstractions for agent-environment interaction

🚀 Quick Start

Installation

pip install horizons-ai

Basic Usage

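import horizons

# Create the environment
env = horizons.Environment("sokoban")

# Run the agent until the episode ends.
# `agent` is not defined in this snippet: supply your own object with an act(state) method.
state = env.reset()
while not env.done:
    action = agent.act(state)
    state = env.step(action)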
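
The loop above needs an `agent` object with an `act(state)` method. A minimal sketch of one follows, assuming a small discrete action list. `RandomAgent`, `StubEnv`, and the action names are hypothetical stand-ins and not part of the horizons API; the stub only mirrors the `reset` / `step` / `done` shape used in the snippet so the example runs without the package installed.

```python
import random


class RandomAgent:
    """Hypothetical baseline agent: picks a uniformly random action."""

    def __init__(self, actions, seed=0):
        self.actions = list(actions)
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def act(self, state):
        # The state is ignored here; a real agent would condition on it.
        return self.rng.choice(self.actions)


class StubEnv:
    """Hypothetical stand-in mirroring the reset/step/done shape used above."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.steps = 0
        self.done = False

    def reset(self):
        self.steps = 0
        self.done = False
        return {"step": 0}

    def step(self, action):
        self.steps += 1
        self.done = self.steps >= self.max_steps
        return {"step": self.steps, "last_action": action}


def run_episode(env, agent):
    """Run one episode with the same loop as the Basic Usage snippet."""
    state = env.reset()
    while not env.done:
        action = agent.act(state)
        state = env.step(action)
    return state


ACTIONS = ["up", "down", "left", "right"]  # hypothetical action names
final_state = run_episode(StubEnv(max_steps=5), RandomAgent(ACTIONS))
```

Swapping `StubEnv` for a real environment should only require that the environment expose the same three members; whether the actual `horizons.Environment` returns richer values from `step` is not stated in this README.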

Running Evaluation Scripts

The framework includes ReAct agent evaluation scripts for testing language models on various environments. These scripts provide comprehensive metrics and shaped rewards for training.

Prerequisites

Start the synth service on port 8901:

# In your service directory
python -m uvicorn main:app --host 0.0.0.0 --port 8901

Ensure the model provider you plan to evaluate (OpenAI, Anthropic, etc.) is reachable from the machine running the scripts.
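
Before launching an evaluation it can help to confirm that something is actually listening on port 8901. The sketch below uses only the Python standard library and checks TCP reachability, nothing more; it does not verify that the listener is the synth service. The host `127.0.0.1` and the helper name `is_port_open` are assumptions for illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is an assumption; use the host where the service actually runs.
    if is_port_open("127.0.0.1", 8901):
        print("Port 8901 is accepting connections.")
    else:
        print("Nothing is listening on port 8901; start the service first.")
```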

TicTacToe Evaluation

cd Environments
# `uvpm` is not a standard command; it is presumably a shell alias for `uv run python -m`
uvpm synth_env.examples.tictactoe.agent_demos.test_tictactoe_react_agent

Features:

Tests strategic gameplay against random opponent
Provides win/loss/draw statistics
Validates coordinate parsing and legal moves
Supports multiple models (gpt-4.1-mini, o3, etc.)
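
To make the coordinate-parsing, legal-move, and win/loss/draw items concrete, here is a minimal sketch of what such checks could look like. The `A1`-style coordinate format, the board representation, and all function names are hypothetical; the actual evaluation script may use a different scheme.

```python
from collections import Counter

ROWS = "ABC"  # hypothetical row labels
COLS = "123"  # hypothetical column labels


def parse_move(text):
    """Parse a coordinate such as 'B2' into (row, col) indices, or return None."""
    text = text.strip().upper()
    if len(text) != 2 or text[0] not in ROWS or text[1] not in COLS:
        return None
    return ROWS.index(text[0]), COLS.index(text[1])


def is_legal(board, move):
    """A move is legal when it parsed correctly and targets an empty cell."""
    if move is None:
        return False
    row, col = move
    return board[row][col] == " "


def summarize(outcomes):
    """Tally episode outcomes into win/loss/draw counts and a win rate."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {
        "win": counts["win"],
        "loss": counts["loss"],
        "draw": counts["draw"],
        "win_rate": counts["win"] / total if total else 0.0,
    }


board = [["X", " ", " "],
         [" ", "O", " "],
         [" ", " ", " "]]

legal = is_legal(board, parse_move("a2"))      # empty cell
occupied = is_legal(board, parse_move("B2"))   # already holds "O"
malformed = is_legal(board, parse_move("D9"))  # outside the board
stats = summarize(["win", "draw", "win", "loss"])
```

Rejecting malformed text before touching the board keeps parsing errors and illegal-move errors distinguishable, which is useful when diagnosing why a language model loses games.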

NetHack Evaluation

cd Environments
uvpm synth_env.examples.nethack.agent_demos.test_nethack_react_agent

Features:

Comprehensive dungeon exploration evaluation
26+ shaped reward signals for training
BALROG scoring system integration
Progress bars for multi-trajectory runs
Reports relevant and irrelevant metrics separately

Sokoban Evaluation

cd Environments
uvpm synth_env.examples.sokoban.agent_demos.test_sokoban_react_agent

Features:

Classic puzzle-solving evaluation
Box-pushing logic validation
Step efficiency analysis
Multiple difficulty levels
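
The box-pushing and step-efficiency items can be illustrated with the standard Sokoban rule that a box moves only into free floor. This is a sketch under stated assumptions: the grid symbols, the helper names, and the efficiency formula (reference solution length divided by steps taken, capped at 1) are chosen for illustration and are not taken from the evaluation script. Goal squares are omitted for brevity, and the level is assumed to be fully enclosed by walls.

```python
# Symbols used in this sketch: '#' wall, '$' box, '@' player, ' ' floor.
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def can_move(grid, player, direction):
    """Return True if the player may step (and push, if needed) in direction."""
    dr, dc = DIRECTIONS[direction]
    r, c = player[0] + dr, player[1] + dc  # cell the player would enter
    if grid[r][c] == "#":
        return False
    if grid[r][c] == "$":
        br, bc = r + dr, c + dc  # cell the box would enter
        # A box can be pushed only into free floor, never a wall or another box.
        return grid[br][bc] not in ("#", "$")
    return True


def step_efficiency(optimal_steps, steps_taken):
    """Ratio of a reference solution length to the steps actually used."""
    if steps_taken <= 0:
        return 0.0
    return min(1.0, optimal_steps / steps_taken)


grid = [
    "#####",
    "#@$ #",
    "#$  #",
    "#$  #",
    "#####",
]
player = (1, 1)

push_right = can_move(grid, player, "right")  # box has free floor behind it
push_down = can_move(grid, player, "down")    # box is blocked by another box
walk_up = can_move(grid, player, "up")        # wall directly above
```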

Configuration

Edit the script configuration at the top of each file:

MODEL_NAME = "gpt-4.1-mini"  # or "o3", "claude-sonnet-4", etc.
NUM_INSTANCES = 5            # Number of test episodes
MAX_TURNS = 100              # Maximum steps per episode
DIFFICULTY = "beginner"      # Environment-specific difficulty

All scripts provide detailed rubric results, progress metrics, and shaped rewards suitable for reinforcement learning applications.
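
As a rough picture of how `NUM_INSTANCES` and `MAX_TURNS` might bound an evaluation run, the sketch below runs a fixed number of seeded episodes, caps each at a turn limit, and aggregates simple metrics. `run_instance` is a hypothetical placeholder that simulates an outcome with a random number; it stands in for real agent play, and the metric names are assumptions rather than the scripts' actual output.

```python
import random
from statistics import mean

# Same names and values as the configuration constants above.
NUM_INSTANCES = 5
MAX_TURNS = 100


def run_instance(seed, max_turns):
    """Hypothetical episode: succeeds at a random turn or hits the turn cap."""
    rng = random.Random(seed)
    solve_turn = rng.randint(1, 2 * max_turns)  # stand-in for real agent play
    solved = solve_turn <= max_turns
    return {"seed": seed, "solved": solved, "turns": min(solve_turn, max_turns)}


def evaluate(num_instances, max_turns):
    """Run num_instances seeded episodes and aggregate simple metrics."""
    results = [run_instance(seed, max_turns) for seed in range(num_instances)]
    return {
        "episodes": len(results),
        "success_rate": mean(1.0 if r["solved"] else 0.0 for r in results),
        "mean_turns": mean(r["turns"] for r in results),
    }


report = evaluate(NUM_INSTANCES, MAX_TURNS)
print(report)
```

Seeding each instance by its index keeps the set of episodes identical across models, so differences in the report reflect the model rather than the draw of test cases.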

Development Setup

# Clone repository
git clone https://github.com/your-org/synth-env.git
cd synth-env

# Install dependencies
uv sync

# Run tests
python dev/update_readme_metrics.py --fast

🎮 Supported Environments

Environment	Status	Description
Sokoban	✅ Stable	Classic puzzle game for planning
Hendrycks Math	✅ Stable	Mathematical reasoning tasks
Crafter	✅ Stable	Minecraft-like survival environment
Verilog	🔄 Beta	Hardware description language tasks
Red Team	🚧 Development	Security testing scenarios
SWE-Bench	🚧 Development	Software engineering tasks

📖 Documentation

API Reference - Complete API documentation
Environment Guide - Detailed environment descriptions
Contributing - Development setup and guidelines

🔧 Development

Health Check

# Check codebase health
python scripts/check_health.py

Testing

# Fast tests (~3 seconds)
python dev/update_readme_metrics.py --fast

# Full test suite
python dev/update_readme_metrics.py

Code Quality

# Format code
ruff format .

# Check linting
ruff check .

# Type checking
uvx ty check

Release

# Increment version and publish
python scripts/release.py

# Dry run
python scripts/release.py --dry-run

Pre-Merge Checklist

Before creating a PR, see dev/pr_checklist.md for the complete checklist.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for:

Development setup
Code style guidelines
Testing requirements
Pull request process

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Special thanks to the research teams at DeepMind, Ragen AI, and other contributors to the environments included in this framework.

⚠️ Development Status: This project is under active development. While stable environments are production-ready, newer environments may have breaking changes.

Version 0.1.0, released Aug 20, 2025.

