🔥 materl - A Declarative RL Library for Fast Experimentation
Install from PyPI with uv:

```shell
uv add materl
```
materl is a Reinforcement Learning library designed for rapid experimentation with language models. It combines a clean, declarative API with an accelerated backend, allowing you to focus on algorithm logic instead of boilerplate.
It's built for researchers who want to test a new reward function, tweak a loss calculation, or implement a novel algorithm quickly and efficiently.
✨ Philosophy: From Idea to Result, Faster
materl is built for iterating quickly. The design is centered on simplicity and performance at the point of experimentation.
- Declarative & Functional: Define your entire RL workflow as a series of functional steps. This makes experiments easy to read, modify, and reproduce.
- Performant by Default: The library is designed to be fast. Performance-critical sections are handled by an optimized backend, so you get great speed without writing low-level code.
- Minimalist API: The API is intentionally simple. Core concepts like `Agent`, `Recipe`, and `compile` are all you need to get started, reducing cognitive overhead.
🏗️ Architecture: Your Experiment as a Graph
The core of materl is its declarative, graph-based paradigm. A "recipe" is a Python function that defines the sequence of operations in your experiment.
- Agents: Simple wrappers around your models (e.g., from Hugging Face Transformers).
- Recipe: A function that describes the steps: generate text, calculate log-probabilities, compute rewards, and define the loss.
- Symbolic Graph: The recipe returns a lightweight data structure that represents your entire workflow.
- Compiler: The `compile()` function processes this graph and prepares it for execution.
- Execution: Calling `.run()` on the compiled graph executes the experiment.
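To make the graph-based flow concrete, here is a minimal, self-contained sketch of the pattern in plain Python. This is not materl's actual implementation — every name here (`SymbolicGraph`, `CompiledGraph`, `toy_recipe`, the lambda steps) is hypothetical — it only illustrates how a recipe can return a symbolic graph that a compiler later executes against a shared context:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SymbolicGraph:
    # An ordered list of named steps; nothing runs until .run() is called.
    steps: list = field(default_factory=list)

    def add(self, name: str, fn: Callable) -> "SymbolicGraph":
        self.steps.append((name, fn))
        return self

@dataclass
class CompiledGraph:
    graph: SymbolicGraph

    def run(self, **inputs) -> dict:
        # A context dict flows through every step, accumulating outputs.
        context = dict(inputs)
        for _name, fn in self.graph.steps:
            context.update(fn(context))
        return context

def compile(graph: SymbolicGraph) -> CompiledGraph:
    # In this toy version, "compiling" just wraps the graph for execution.
    return CompiledGraph(graph)

def toy_recipe() -> SymbolicGraph:
    # Declare the experiment as data: generate, reward, loss.
    g = SymbolicGraph()
    g.add("generate", lambda ctx: {"completions": [p + "..." for p in ctx["prompts"]]})
    g.add("reward", lambda ctx: {"rewards": [len(c) for c in ctx["completions"]]})
    g.add("loss", lambda ctx: {"loss": -sum(ctx["rewards"]) / len(ctx["rewards"])})
    return g

result = compile(toy_recipe()).run(prompts=["Hello, my name is"])
print(sorted(result.keys()))  # → ['completions', 'loss', 'prompts', 'rewards']
```

The key design idea this sketch captures: because the recipe only *describes* steps, the compiler is free to reorder, batch, or accelerate them before anything executes.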
🚀 A Simple DAPO Experiment
This example shows how to set up and run a DAPO experiment. The code reads like a description of the experimental procedure itself.
```python
from materl.agents import Agent
from materl.compiler import compile
from materl.config import GenerationConfig, DAPOConfig
from materl.recipes import dapo
import torch

# 1. Set up your models using the Agent wrapper
model_name = "gpt2"
device = "cuda" if torch.cuda.is_available() else "cpu"
policy_agent = Agent(model_name, trainable=True, device=device)
ref_agent = Agent(model_name, trainable=False, device=device)

# 2. Define your inputs and configurations
prompts = ["Hello, my name is", "What is the capital of France?"]
gen_config = GenerationConfig(max_completion_length=50)
algorithm_config = DAPOConfig(beta=0.1)

# 3. Use a recipe to create a symbolic graph of your experiment
symbolic_graph = dapo(
    policy=policy_agent,
    ref_policy=ref_agent,
    prompts=prompts,
    max_completion_length=gen_config.max_completion_length,
)

# 4. Compile the graph and run the experiment
compiled_graph = compile(symbolic_graph)
final_context = compiled_graph.run(
    policy=policy_agent,
    ref_policy=ref_agent,
    prompts=prompts,
    generation_config=gen_config,
    dapo_config=algorithm_config,
)

print("✅ DAPO experiment finished successfully!")
print(f"Final context keys: {list(final_context.keys())}")
```
🧪 Included Recipes
materl comes with several pre-built recipes to get you started:
- GRPO (Group Relative Policy Optimization)
- DAPO (Decoupled Advantage Policy Optimization)
- VAPO (Value-Aligned Policy Optimization)
You can find these in materl/recipes and see them in action in the examples/ directory. Creating your own recipe is as simple as writing a new Python function.
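A new experiment often starts with nothing more than a custom reward function. As an illustration only — the function name and signature below are hypothetical, not part of materl's API — here is the kind of self-contained, per-completion reward you might write before wiring it into a recipe of your own:

```python
def brevity_reward(prompts, completions, target_len=30):
    """Toy reward: prefer completions whose length is near target_len.

    Returns one scalar reward in [0, 1] per completion. In a recipe,
    a reward step like this would attach its outputs to the context
    for the downstream loss computation.
    """
    return [
        1.0 - min(abs(len(c) - target_len) / target_len, 1.0)
        for c in completions
    ]

rewards = brevity_reward(
    prompts=["Hello, my name is"],
    completions=["Hello, my name is GPT and I like brevity."],
)
```

Because the reward is an ordinary function of prompts and completions, you can unit-test it in isolation before running any training.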
🔭 Future Direction
Our goal is to make materl the best tool for applied RL research and fast prototyping. We plan to:
- Expand the Recipe Book: Add more state-of-the-art algorithms.
- Enhance Debugging Tools: Provide tools to inspect and visualize the computational graph.
- Broaden Hardware Support: Continue to optimize performance across a wider range of GPUs.
📄 License
This project is licensed under the MIT License. See the LICENSE file for details.
Ready to accelerate your RL training? Get started with materl today! 🚀
File details

Details for the file materl-0.1.1.tar.gz.

File metadata

- Download URL: materl-0.1.1.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.16

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `58c6721a15cb5b58b07920f029a783ef2ad81e1759ee278b423bbf8a936bcd84` |
| MD5 | `5431940125c793e45c971756143d0a45` |
| BLAKE2b-256 | `5b44d2319022a2745008544eca24573a76b677f3dabf95b220e3fb3de404d57a` |
File details

Details for the file materl-0.1.1-py3-none-any.whl.

File metadata

- Download URL: materl-0.1.1-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.16

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `f1bbcd088fb27f6dc2717e624ce8fed6a275d7175959f817e0cfac7729efedcf` |
| MD5 | `2da1b2e31562208da92970563169394a` |
| BLAKE2b-256 | `6e0c41550de8854df6d94df900c11b47441808d75acff2071763099eb30b99b6` |