Open-source framework for Reinforcement Learning integrated with Large Language Models

Project description

RL-LLM Toolkit

Democratizing Reinforcement Learning with Large Language Models

License: MIT Python 3.10+

🚀 Overview

RL-LLM Toolkit is an open-source framework that integrates Reinforcement Learning with Large Language Models to build intelligent agents, aimed at beginners and researchers alike. By simulating human feedback with LLMs, it reduces RLHF annotation costs by up to 50% while maintaining training quality.

Key Features

  • 🎮 Gymnasium-Compatible Environments: Easy-to-use RL environments for games, finance, and robotics
  • 🤖 LLM-Powered Rewards: Generate dense rewards using local or API-based LLMs
  • 📊 State-of-the-Art Algorithms: PPO, DQN, and more with modular architecture
  • 🔧 Plug-and-Play Design: Swap algorithms, environments, and LLMs effortlessly
  • 📚 Educational Focus: Interactive Jupyter notebooks and comprehensive tutorials
  • 🌐 Hugging Face Integration: Share models and datasets with the community

🎯 Quick Start

# Install the toolkit
pip install rl-llm-toolkit

# Run a simple example
python -m rl_llm_toolkit.examples.cartpole

📦 Installation

From PyPI (Coming Soon)

pip install rl-llm-toolkit

From Source

git clone https://github.com/tonipcv/hugo.git
cd hugo
pip install -e .

Optional Dependencies

# For LLM integration
pip install -e ".[llm]"

# For development
pip install -e ".[dev]"

# For all features
pip install -e ".[all]"

💡 Usage Example

from rl_llm_toolkit import RLEnvironment, PPOAgent, LLMRewardShaper
from rl_llm_toolkit.llm import OllamaBackend

# Create environment
env = RLEnvironment("CartPole-v1")

# Set up LLM-based reward shaping
llm = OllamaBackend(model="llama3")
reward_shaper = LLMRewardShaper(llm, prompt_template="custom_template")

# Train agent
agent = PPOAgent(env, reward_shaper=reward_shaper)
agent.train(total_timesteps=100000)

# Evaluate
agent.evaluate(episodes=10, render=True)

๐Ÿ—๏ธ Architecture

rl-llm-toolkit/
├── rl_llm_toolkit/          # Core package
│   ├── agents/              # RL algorithms (PPO, DQN, etc.)
│   ├── environments/        # Custom environments
│   ├── llm/                 # LLM integrations
│   ├── rewards/             # Reward shaping utilities
│   ├── utils/               # Helper functions
│   └── cli/                 # Command-line tools
├── examples/                # Example scripts and notebooks
├── tests/                   # Test suite
└── docs/                    # Documentation
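The plug-and-play design implied by this layout — separate `agents/`, `environments/`, and `llm/` packages — typically rests on small shared interfaces: every LLM backend exposes the same method, so reward shapers and agents do not care whether Ollama or OpenAI is plugged in. A hedged sketch of that idea, with illustrative names rather than the toolkit's actual classes:

```python
from abc import ABC, abstractmethod


class LLMBackend(ABC):
    """Common interface implemented by every backend, local or API-based."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class EchoBackend(LLMBackend):
    """Trivial offline backend used here in place of Ollama or OpenAI."""
    def complete(self, prompt: str) -> str:
        # A real backend would query a model; this one returns a fixed score.
        return "score: 1.0"


def shape_reward(backend: LLMBackend, observation) -> float:
    """Ask whichever backend is plugged in for a score, parse it to a float."""
    reply = backend.complete(f"Rate this state: {observation}")
    return float(reply.split()[1])


bonus = shape_reward(EchoBackend(), [0.1, 0.2])
```

Swapping backends then means constructing a different `LLMBackend` subclass; nothing downstream changes.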

🎓 Examples

  • CartPole with LLM Feedback: Train a classic control agent with GPT-4 reward shaping
  • Crypto Trading Bot: Build a trading agent using historical data and LLM market analysis
  • Multi-Agent Game: Coordinate multiple agents in a competitive environment

๐Ÿค Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

📊 Roadmap

Now (0-3 months)

  • ✅ Core RL framework with PPO/DQN
  • ✅ Basic LLM integration (Ollama, OpenAI)
  • 🔄 Interactive examples and tutorials
  • 🔄 Comprehensive documentation

Next (3-6 months)

  • Offline RL support
  • Financial trading environments
  • Hugging Face model hub integration
  • Community leaderboards

Later (6-12 months)

  • Real-time collaboration features
  • Video reasoning integration
  • Advanced multi-agent systems
  • Research partnerships

📄 License

MIT License - see LICENSE for details.

๐Ÿ™ Acknowledgments

Inspired by projects like PufferLib, Neural MMO, and the broader open-source RL community.

📬 Contact


Star โญ this repo if you find it useful!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rl_llm_toolkit-0.2.0.tar.gz (101.6 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

rl_llm_toolkit-0.2.0-py3-none-any.whl (66.3 kB)

Uploaded Python 3

File details

Details for the file rl_llm_toolkit-0.2.0.tar.gz.

File metadata

  • Download URL: rl_llm_toolkit-0.2.0.tar.gz
  • Upload date:
  • Size: 101.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for rl_llm_toolkit-0.2.0.tar.gz
Algorithm Hash digest
SHA256 bff53b096b2f1b4df344f57397455ea788364b4eaacdf849933d8964be901c1a
MD5 cdefe9af2d82245d339018e1db3fc50b
BLAKE2b-256 6215b09fa5c021c5221a2ab6277817e6a003539fe004198711a1be48c8d461d4

See the PyPI documentation for more details on using hashes.
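The SHA256 digest above can be checked locally before installing from a downloaded archive. A small verification helper using only the standard library (the filename matches this listing; the assertion at the end is commented out because it assumes the sdist has already been downloaded):

```python
import hashlib

# SHA256 digest published for rl_llm_toolkit-0.2.0.tar.gz on this page.
EXPECTED_SHA256 = "bff53b096b2f1b4df344f57397455ea788364b4eaacdf849933d8964be901c1a"


def sha256_of(path, chunk_size=8192):
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, assuming the sdist sits in the current directory:
# assert sha256_of("rl_llm_toolkit-0.2.0.tar.gz") == EXPECTED_SHA256
```

Alternatively, pip's hash-checking mode (`--require-hashes` with a pinned requirements file) performs the same verification automatically at install time.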

File details

Details for the file rl_llm_toolkit-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: rl_llm_toolkit-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 66.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for rl_llm_toolkit-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5f53ec60604f9024017c0c392712701270ded6a3beec202395ad7917faf45e2b
MD5 3d9caa5e4a35a07c9c1d1814d6ebc9fc
BLAKE2b-256 13d0207d489f2e3b10f48de749a857f41d07455d6827ba8956a01b3123896af3

See the PyPI documentation for more details on using hashes.
