
🔥 materl - A Declarative RL Library for Fast Experimentation

License: MIT

uv add materl
# or, with pip:
pip install materl

materl is a reinforcement learning (RL) library designed for rapid experimentation with language models. It combines a clean, declarative API with an accelerated backend, letting you focus on algorithm logic instead of boilerplate.

It's built for researchers who want to test a new reward function, tweak a loss calculation, or implement a novel algorithm quickly and efficiently.

✨ Philosophy: From Idea to Result, Faster

materl is built for iterating quickly. The design is centered on simplicity and performance at the point of experimentation.

  • Declarative & Functional: Define your entire RL workflow as a series of functional steps. This makes experiments easy to read, modify, and reproduce.
  • Performant by Default: The library is designed to be fast. Performance-critical sections are handled by an optimized backend, so you get great speed without writing low-level code.
  • Minimalist API: The API is intentionally simple. Core concepts like Agent, Recipe, and compile are all you need to get started, reducing cognitive overhead.

🏗️ Architecture: Your Experiment as a Graph

The core of materl is its declarative, graph-based paradigm. A "recipe" is a Python function that defines the sequence of operations in your experiment.

  1. Agents: Simple wrappers around your models (e.g., from Hugging Face Transformers).
  2. Recipe: A function that describes the steps: generate text, calculate log-probabilities, compute rewards, and define the loss.
  3. Symbolic Graph: The recipe returns a lightweight data structure that represents your entire workflow.
  4. Compiler: The compile() function processes this graph and prepares it for execution.
  5. Execution: Calling .run() on the compiled graph executes the experiment.
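As an illustration of the pattern only (not materl's actual internals), the five steps above can be sketched as a symbolic graph that is simply an ordered list of named operations, which a compiler then walks at run time, threading a shared context dict through each step:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class SymbolicGraph:
    """A recipe's return value: an ordered list of named operations."""
    steps: list[tuple[str, Callable[[dict], Any]]] = field(default_factory=list)

    def add(self, name: str, fn: Callable[[dict], Any]) -> "SymbolicGraph":
        self.steps.append((name, fn))
        return self

class CompiledGraph:
    def __init__(self, graph: SymbolicGraph):
        self.graph = graph

    def run(self, **inputs) -> dict:
        # Execute each step in order; each step reads from and writes to
        # a shared context dict, so later steps see earlier results.
        context = dict(inputs)
        for name, fn in self.graph.steps:
            context[name] = fn(context)
        return context

def compile(graph: SymbolicGraph) -> CompiledGraph:
    return CompiledGraph(graph)

# A toy "recipe": generate -> reward -> loss, on strings instead of tokens.
def toy_recipe() -> SymbolicGraph:
    g = SymbolicGraph()
    g.add("completions", lambda ctx: [p + " ..." for p in ctx["prompts"]])
    g.add("rewards", lambda ctx: [len(c) for c in ctx["completions"]])
    g.add("loss", lambda ctx: -sum(ctx["rewards"]) / len(ctx["rewards"]))
    return g

context = compile(toy_recipe()).run(prompts=["a", "bb"])
print(sorted(context.keys()))  # prompts, completions, rewards, loss
```

The key property this sketch shares with materl is that the recipe is pure data until `compile()` and `.run()` are called, which is what makes experiments easy to inspect and modify.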

🚀 A Simple DAPO Experiment

This example shows how to set up and run a DAPO experiment. The code reads like a description of the experimental procedure itself.

from materl.agents import Agent
from materl.compiler import compile
from materl.config import GenerationConfig, DAPOConfig
from materl.recipes import dapo
import torch

# 1. Set up your models using the Agent wrapper
model_name = "gpt2"
device = "cuda" if torch.cuda.is_available() else "cpu"
policy_agent = Agent(model_name, trainable=True, device=device)
ref_agent = Agent(model_name, trainable=False, device=device)

# 2. Define your inputs and configurations
prompts = ["Hello, my name is", "What is the capital of France?"]
gen_config = GenerationConfig(max_completion_length=50)
algorithm_config = DAPOConfig(beta=0.1)

# 3. Use a recipe to create a symbolic graph of your experiment
symbolic_graph = dapo(
    policy=policy_agent,
    ref_policy=ref_agent,
    prompts=prompts,
    max_completion_length=gen_config.max_completion_length,
)

# 4. Compile the graph and run the experiment
compiled_graph = compile(symbolic_graph)
final_context = compiled_graph.run(
    policy=policy_agent,
    ref_policy=ref_agent,
    prompts=prompts,
    generation_config=gen_config,
    dapo_config=algorithm_config,
)

print("✅ DAPO experiment finished successfully!")
print(f"Final context keys: {list(final_context.keys())}")

🧪 Included Recipes

materl comes with several pre-built recipes to get you started:

  • GRPO (Group Relative Policy Optimization)
  • DAPO (Decoupled Advantage Policy Optimization)
  • VAPO (Value-Aligned Policy Optimization)

You can find these in materl/recipes and see them in action in the examples/ directory. Creating your own recipe is as simple as writing a new Python function.
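As a concrete illustration of the kind of function a custom recipe might plug in, here is a toy reward that scores completions by how close their length is to a target. The exact signature materl expects is not documented here, so treat the parameter names and return type as assumptions:

```python
def length_penalty_reward(prompts: list[str], completions: list[str],
                          target_length: int = 50) -> list[float]:
    """Toy reward: 1.0 at the target character length, falling off
    linearly with distance and clamped at 0.0."""
    rewards = []
    for completion in completions:
        distance = abs(len(completion) - target_length)
        rewards.append(max(0.0, 1.0 - distance / target_length))
    return rewards

print(length_penalty_reward(["p", "p"], ["x" * 50, "x" * 25]))
```

Swapping a reward like this into a recipe, rather than editing a training loop, is the kind of one-function change the declarative design is meant to enable.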

🔭 Future Direction

Our goal is to make materl the best tool for applied RL research and fast prototyping. We plan to:

  • Expand the Recipe Book: Add more state-of-the-art algorithms.
  • Enhance Debugging Tools: Provide tools to inspect and visualize the computational graph.
  • Broaden Hardware Support: Continue to optimize performance across a wider range of GPUs.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


Ready to accelerate your RL training? Get started with materl today! 🚀
