
A Python library for reinforcement learning algorithms

Project description

🎯 NeatRL

A clean, modern Python library for reinforcement learning algorithms

NeatRL provides high-quality implementations of popular RL algorithms with a focus on simplicity, performance, and ease of use. Built with PyTorch and designed for both research and production use.

✨ Features

  • 📊 Experiment Tracking: Built-in support for Weights & Biases logging
  • 🎮 Gymnasium Compatible: Works with Gymnasium environments, with support for more planned
  • 🎯 Atari Support: Full support for Atari games with automatic CNN architectures
  • ⚡ Parallel Training: Vectorized environments for faster data collection
  • 🔧 Easy to Extend: Modular design for adding new algorithms
  • 📈 State-of-the-Art: Implements modern RL techniques and best practices
  • 🎥 Video Recording: Automatic video capture and WandB integration
  • 📉 Advanced Logging: Per-layer gradient monitoring and comprehensive metrics

🏗️ Supported Algorithms

Current Implementations

  • DQN (Deep Q-Network) - Classic value-based RL algorithm

    • Support for discrete action spaces
    • Experience replay and target networks
    • Atari preprocessing and frame stacking
  • Dueling DQN - Enhanced DQN with separate value and advantage streams

    • Improved learning stability
    • Better performance on complex environments
  • REINFORCE - Policy gradient method for discrete and continuous action spaces

    • Atari game support with automatic CNN architecture
    • Parallel environment training (n_envs support)
    • Continuous action space support
    • Episode-based Monte Carlo returns
    • Variance reduction through baseline subtraction
  • DDPG (Deep Deterministic Policy Gradient) - Actor-critic method for continuous action spaces

    • Deterministic policy gradient for continuous control
    • Experience replay and target networks
    • Ornstein-Uhlenbeck noise for exploration
    • Support for continuous action spaces
  • A2C (Advantage Actor-Critic) - Synchronous actor-critic algorithm

    • Synchronous version of A3C for stable training
    • Advantage function for reduced variance
    • Support for both discrete and continuous action spaces
    • Parallel environment training with vectorized environments
    • Monte Carlo returns for value estimation
  • PPO (Proximal Policy Optimization) - State-of-the-art policy gradient method with GAE

    • Full PPO implementation with Generalized Advantage Estimation (GAE)
    • Support for both discrete and continuous action spaces
    • Atari game support with automatic CNN architecture
    • Clipped surrogate objective for stable policy updates
    • Value function clipping and entropy regularization
    • Vectorized environments for parallel training
  • PPO-RND (Proximal Policy Optimization with Random Network Distillation) - State-of-the-art exploration method

    • Intrinsic motivation through novelty detection
    • Combined extrinsic and intrinsic rewards for better exploration
    • Support for both discrete and continuous action spaces
    • PPO with clipped surrogate objective
    • Vectorized environments for parallel training
    • Intrinsic reward normalization and advantage calculation
  • More algorithms coming soon...
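As an illustration of the dueling architecture described above, here is a minimal sketch of the value/advantage aggregation step in plain NumPy. This is the standard textbook formulation, not NeatRL's actual code; `dueling_q` is a hypothetical name used only for this example.

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    The mean advantage is subtracted so the decomposition is identifiable:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    (Illustrative sketch only -- not NeatRL's implementation.)
    """
    return value + advantages - advantages.mean()

# One state with three actions: the relative ordering of Q follows A.
q = dueling_q(1.0, np.array([0.5, -0.5, 0.0]))  # -> [1.5, 0.5, 1.0]
```

In the dueling network, the two streams share a common feature extractor; only this final aggregation differs from a vanilla DQN head.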
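Similarly, the Ornstein-Uhlenbeck exploration noise listed under DDPG can be sketched in a few lines. The update below is the standard Euler-Maruyama discretization; `ou_step` and its default parameters are illustrative assumptions, not NeatRL's internals.

```python
import numpy as np

def ou_step(x, theta=0.15, mu=0.0, sigma=0.2, dt=1e-2, rng=None):
    """One Euler-Maruyama step of an Ornstein-Uhlenbeck process:

        dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)

    The drift term pulls the noise back toward mu, producing temporally
    correlated exploration rather than independent Gaussian jitter.
    """
    rng = rng if rng is not None else np.random.default_rng()
    return x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

rng = np.random.default_rng(0)
x, trace = 0.0, []
for _ in range(1000):
    x = ou_step(x, rng=rng)  # noise added to the deterministic action
    trace.append(x)
```

Temporal correlation is why OU noise is popular for continuous control: consecutive actions are perturbed in a consistent direction, which explores momentum-driven dynamics better than white noise.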

📦 Installation

python -m venv neatrl-env
source neatrl-env/bin/activate 

pip install "neatrl[classic,box2d,atari]"

🚀 Quick Start

Train DQN on CartPole

from neatrl import train_dqn

model = train_dqn(
    env_id="CartPole-v1",
    total_timesteps=10000,
    seed=42
)
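DQN training like the call above relies on an experience replay buffer. Here is a minimal sketch of the idea in plain Python (the `ReplayBuffer` class is a generic illustration, not NeatRL's implementation):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size FIFO buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation of consecutive steps.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(250):
    buf.push((t, 0, 1.0, t + 1, False))  # only the last 100 transitions survive
batch = buf.sample(32)
```

Sampling uniformly from past experience is what lets DQN reuse data and train on decorrelated mini-batches, which stabilizes Q-learning with a neural network.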

Train PPO on Classic Control

from neatrl import train_ppo

model = train_ppo(
    env_id="CartPole-v1",
    total_timesteps=50000,
    n_envs=4,           # Parallel environments
    GAE=0.95,           # Generalized Advantage Estimation lambda
    clip_value=0.2,     # PPO clipping parameter
    use_wandb=True,     # Track with WandB
    seed=42
)
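The `GAE` argument above is the λ in Generalized Advantage Estimation. A minimal sketch of the backward recursion it names (standard GAE math; the function and its signature are illustrative, not NeatRL's API):

```python
def gae_advantages(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """Backward pass of Generalized Advantage Estimation.

    delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
    A_t     = delta_t + gamma * lam * (1 - done_t) * A_{t+1}
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value  # bootstrap value for the state after the rollout
    for t in reversed(range(len(rewards))):
        mask = 1.0 - float(dones[t])  # zero out bootstrapping at episode ends
        delta = rewards[t] + gamma * next_value * mask - values[t]
        gae = delta + gamma * lam * mask * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages

adv = gae_advantages([1.0, 1.0], [0.5, 0.5], last_value=0.5, dones=[False, False])
```

λ interpolates between the one-step TD advantage (λ=0, low variance, high bias) and full Monte Carlo returns minus the baseline (λ=1, high variance, low bias); 0.95 is a common middle ground.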

Train PPO on Atari

from neatrl import train_ppo_cnn

model = train_ppo_cnn(
    env_id="BreakoutNoFrameskip-v4",
    total_timesteps=100000,
    n_envs=8,           # More parallel environments for Atari
    atari_wrapper=True, # Automatic Atari preprocessing
    use_wandb=True,     # Track with WandB
    seed=42
)
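Both PPO examples rest on the clipped surrogate objective. A per-sample sketch of the standard PPO formula (illustrative only; NeatRL exposes the clip threshold via the `clip_value` argument shown above, but this function is not part of its API):

```python
import numpy as np

def clipped_surrogate(ratio, advantage, clip_value=0.2):
    """PPO objective per sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r is the probability ratio pi_new(a|s) / pi_old(a|s).

    Taking the minimum removes the incentive to move the ratio
    outside [1 - eps, 1 + eps], keeping policy updates conservative.
    """
    clipped = np.clip(ratio, 1.0 - clip_value, 1.0 + clip_value)
    return np.minimum(ratio * advantage, clipped * advantage)

# With a positive advantage, a ratio of 1.5 is capped at 1.2:
print(clipped_surrogate(1.5, 1.0))  # 1.2
```

Note the asymmetry: for negative advantages the minimum picks the *more* pessimistic of the two terms, so the objective never rewards pushing the ratio past the clip range in either direction.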

📚 Documentation

📖 Complete Documentation

The docs include:

  • Detailed usage examples
  • Hyperparameter tuning guides
  • Environment compatibility
  • Experiment tracking setup
  • Troubleshooting tips

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

git clone https://github.com/YuvrajSingh-mist/NeatRL.git
cd NeatRL
pip install -e ".[dev]"

For the complete changelog, see CHANGELOG.md.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ for the RL community



Download files

Download the file for your platform.

Source Distribution

neatrl-0.6.0.tar.gz (48.6 kB)

Uploaded Source

Built Distribution


neatrl-0.6.0-py3-none-any.whl (57.1 kB)

Uploaded Python 3

File details

Details for the file neatrl-0.6.0.tar.gz.

File metadata

  • Download URL: neatrl-0.6.0.tar.gz
  • Upload date:
  • Size: 48.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for neatrl-0.6.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | afa71a565295308ae07a2580ea9ff980ec634ccd307f6324a2103b36d6550090 |
| MD5 | c87565c3482fc07e3139a6d39c519ce3 |
| BLAKE2b-256 | c3600d08887b48d7d49173e45777252ef36bf1456765cc2a5a3164e42f1dddd5 |


File details

Details for the file neatrl-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: neatrl-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 57.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for neatrl-0.6.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | f18b6be2fcd60f3b0031ba5117b4da51c652d83cdb4ab0bf83bb27c9ba10732d |
| MD5 | 2d6a523167c3538c5d8d2ceec98624ff |
| BLAKE2b-256 | 136e2d10a4932c3529f708faa37d29e5bbf6149a0462ccea85806f96f731c88b |

