
🔥 materl - A Declarative RL Library for Fast Experimentation

License: MIT

uv add materl
# or, with pip:
pip install materl

materl is a reinforcement learning (RL) library designed for rapid experimentation with language models. It combines a clean, declarative API with an accelerated backend, letting you focus on algorithm logic instead of boilerplate.

It's built for researchers who want to test a new reward function, tweak a loss calculation, or implement a novel algorithm quickly and efficiently.

✨ Philosophy: From Idea to Result, Faster

materl is built for iterating quickly. The design is centered on simplicity and performance at the point of experimentation.

  • Declarative & Functional: Define your entire RL workflow as a series of functional steps. This makes experiments easy to read, modify, and reproduce.
  • Performant by Default: The library is designed to be fast. Performance-critical sections are handled by an optimized backend, so you get great speed without writing low-level code.
  • Minimalist API: The API is intentionally simple. Core concepts like Agent, Recipe, and compile are all you need to get started, reducing cognitive overhead.

🏗️ Architecture: Your Experiment as a Graph

The core of materl is its declarative, graph-based paradigm. A "recipe" is a Python function that defines the sequence of operations in your experiment.

  1. Agents: Simple wrappers around your models (e.g., from Hugging Face Transformers).
  2. Recipe: A function that describes the steps: generate text, calculate log-probabilities, compute rewards, and define the loss.
  3. Symbolic Graph: The recipe returns a lightweight data structure that represents your entire workflow.
  4. Compiler: The compile() function processes this graph and prepares it for execution.
  5. Execution: Calling .run() on the compiled graph executes the experiment.
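As an illustration of the pattern only (not materl's actual internals), the five steps above can be sketched as a symbolic graph that is simply an ordered list of named operations, which a compiler then walks at run time, threading a shared context dict through each step:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class SymbolicGraph:
    """A recipe's return value: an ordered list of named operations."""
    steps: list[tuple[str, Callable[[dict], Any]]] = field(default_factory=list)

    def add(self, name: str, fn: Callable[[dict], Any]) -> "SymbolicGraph":
        self.steps.append((name, fn))
        return self

class CompiledGraph:
    def __init__(self, graph: SymbolicGraph):
        self.graph = graph

    def run(self, **inputs) -> dict:
        # Execute each step in order; each step reads from and writes to
        # a shared context dict, so later steps see earlier results.
        context = dict(inputs)
        for name, fn in self.graph.steps:
            context[name] = fn(context)
        return context

def compile(graph: SymbolicGraph) -> CompiledGraph:
    return CompiledGraph(graph)

# A toy "recipe": generate -> reward -> loss, on strings instead of tokens.
def toy_recipe() -> SymbolicGraph:
    g = SymbolicGraph()
    g.add("completions", lambda ctx: [p + " ..." for p in ctx["prompts"]])
    g.add("rewards", lambda ctx: [len(c) for c in ctx["completions"]])
    g.add("loss", lambda ctx: -sum(ctx["rewards"]) / len(ctx["rewards"]))
    return g

context = compile(toy_recipe()).run(prompts=["a", "bb"])
print(sorted(context.keys()))  # prompts, completions, rewards, loss
```

The key property this sketch shares with materl is that the recipe is pure data until `compile()` and `.run()` are called, which is what makes experiments easy to inspect and modify.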

🚀 A Simple DAPO Experiment

This example shows how to set up and run a DAPO experiment. The code reads like a description of the experimental procedure itself.

from materl.agents import Agent
from materl.compiler import compile
from materl.config import GenerationConfig, DAPOConfig
from materl.recipes import dapo
import torch

# 1. Set up your models using the Agent wrapper
model_name = "gpt2"
device = "cuda" if torch.cuda.is_available() else "cpu"
policy_agent = Agent(model_name, trainable=True, device=device)
ref_agent = Agent(model_name, trainable=False, device=device)

# 2. Define your inputs and configurations
prompts = ["Hello, my name is", "What is the capital of France?"]
gen_config = GenerationConfig(max_completion_length=50)
algorithm_config = DAPOConfig(beta=0.1)

# 3. Use a recipe to create a symbolic graph of your experiment
symbolic_graph = dapo(
    policy=policy_agent,
    ref_policy=ref_agent,
    prompts=prompts,
    max_completion_length=gen_config.max_completion_length,
)

# 4. Compile the graph and run the experiment
compiled_graph = compile(symbolic_graph)
final_context = compiled_graph.run(
    policy=policy_agent,
    ref_policy=ref_agent,
    prompts=prompts,
    generation_config=gen_config,
    dapo_config=algorithm_config,
)

print("✅ DAPO experiment finished successfully!")
print(f"Final context keys: {list(final_context.keys())}")

🧪 Included Recipes

materl comes with several pre-built recipes to get you started:

  • GRPO (Group Relative Policy Optimization)
  • DAPO (Decoupled Advantage Policy Optimization)
  • VAPO (Value-Aligned Policy Optimization)

You can find these in materl/recipes and see them in action in the examples/ directory. Creating your own recipe is as simple as writing a new Python function.
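As a concrete illustration of the kind of function a custom recipe might plug in, here is a toy reward that scores completions by how close their length is to a target. The exact signature materl expects is not documented here, so treat the parameter names and return type as assumptions:

```python
def length_penalty_reward(prompts: list[str], completions: list[str],
                          target_length: int = 50) -> list[float]:
    """Toy reward: 1.0 at the target character length, falling off
    linearly with distance and clamped at 0.0."""
    rewards = []
    for completion in completions:
        distance = abs(len(completion) - target_length)
        rewards.append(max(0.0, 1.0 - distance / target_length))
    return rewards

print(length_penalty_reward(["p", "p"], ["x" * 50, "x" * 25]))
```

Swapping a reward like this into a recipe, rather than editing a training loop, is the kind of one-function change the declarative design is meant to enable.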

🔭 Future Direction

Our goal is to make materl the best tool for applied RL research and fast prototyping. We plan to:

  • Expand the Recipe Book: Add more state-of-the-art algorithms.
  • Enhance Debugging Tools: Provide tools to inspect and visualize the computational graph.
  • Broaden Hardware Support: Continue to optimize performance across a wider range of GPUs.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


Ready to accelerate your RL training? Get started with materl today! 🚀
