GPU-Accelerable Multi-Objective Playground
Project description
MO-Playground: Massively Parallelized Multi-Objective Reinforcement Learning for Robotics
Neil Janwani, Ellen Novoseller, Vernon Lawhern, Maegan Tucker
https://arxiv.org/abs/2603.09237v1
MO-Playground is a collection of multi-objective environments built in JAX for GPU-accelerated multi-objective RL.
Note that due to double-blind requirements, moplayground's documentation page and pip-installable package are not yet available.
Prerequisites
The code was tested with:
- Ubuntu 22.04
- Python 3.12.12
- CUDA 13.0 (required for training policies; evaluation can run without a GPU)
Installation
Using the provided YAML files, create a new conda environment. If you want to enable GPU-based training, run
```
conda env create -f environment.yml
conda activate moplayground
```
If you just want to evaluate policies and explore the code (e.g., when running on a Mac), run
```
conda env create -f mac_environment.yml
conda activate moplayground
```
Finally, go to the project root and run
```
pip3 install -e .
```
to install the moplayground package.
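If you installed the GPU environment, you can quickly confirm that JAX sees your CUDA device before training. This optional sanity check is not part of the moplayground scripts; it only uses the standard JAX API:

```python
# Optional sanity check: confirm which devices the installed JAX build can see.
import jax

# On the GPU environment this should list a CUDA device (e.g. cuda:0);
# on the Mac/CPU environment it will list a CPU device instead.
print(jax.devices())
```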
Evaluation
Create an account at Weights & Biases (the process should be free); you'll be asked to paste your API key to get things to work. Educational accounts also receive some free storage, which can be useful if you're a student.
Next, pick an environment from the list below:
- cheetah
- hopper
- walker
- ant
- humanoid
- bruce

Then download your desired policy:
```
python3 -m scripts.download_model --env cheetah
```
Note that you can supply a desired save directory via --save_dir; the default is results/wandb-downloads.
Finally, you can run the policy via
```
python3 -m scripts.rollout_policy config_path
```
where config_path is the path to the config.yaml file saved with your model.
It will be at save_dir/env_name/config.yaml, where save_dir and env_name are as defined above.
Training
To train a policy on an existing environment, check out the configuration files in config/.
These files specify everything from model architecture and MORLAX parameters to reward and environment constants.
Choose the config file you want, edit the parameters to your liking, and run
```
python3 -m scripts.train config_path
```
where config_path is the path to the config of your choice.
If you have previously downloaded a policy, you can also reuse its config to run an identical training run on your system.
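If you prefer to tweak a config programmatically rather than by hand, a minimal sketch like the following works. The keys shown (num_envs, learning_rate) and the output path are purely illustrative assumptions; check the actual files in config/ for the real schema:

```python
# Illustrative only: load a downloaded config, tweak a couple of hypothetical
# fields, and save a copy to train from.
import yaml

# Default location of a downloaded cheetah config (see the Evaluation section).
with open("results/wandb-downloads/cheetah/config.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["num_envs"] = 4096        # hypothetical key: number of parallel environments
cfg["learning_rate"] = 3e-4   # hypothetical key: optimizer step size

with open("config/my_cheetah.yaml", "w") as f:
    yaml.safe_dump(cfg, f)

# Then train with:
#   python3 -m scripts.train config/my_cheetah.yaml
```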
Creating your own environment
To create a custom environment, check out how the cheetah environment works at src/moplayground/envs/dmcontrol/cheetah.py.
Your environment class will need to subclass MultiObjectiveBase, and you will also need to create a config.yaml
file for your environment that specifies the training parameters.
Note that support for custom (i.e., non-MuJoCo) dynamics is coming soon.
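As a rough illustration of that pattern, a custom environment might look like the sketch below. The import path, method name, state fields, and reward-dict convention are assumptions for illustration only; cheetah.py is the authoritative reference for the actual MultiObjectiveBase interface:

```python
# Sketch only: the base-class interface shown here is assumed, not taken from
# the moplayground source. Follow cheetah.py for the real method signatures.
import jax.numpy as jnp

from moplayground.envs import MultiObjectiveBase  # import path is an assumption


class MyEnv(MultiObjectiveBase):
    """Toy environment with two competing objectives."""

    def rewards(self, state, action):
        # Return each objective separately so the multi-objective algorithm
        # can trade them off, rather than summing them into one scalar here.
        forward_velocity = state.qvel[0]              # hypothetical state field
        energy_cost = jnp.sum(jnp.square(action))
        return {
            "run_forward": forward_velocity,
            "minimize_energy": -energy_cost,
        }
```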
Classic Environments
| Environment | Reward 1 | Reward 2 |
|---|---|---|
BRUCE Robotics Example
MO-Playground is demonstrated for the BRUCE humanoid robot, developed by Westwood Robotics.
The application features seven possible reward functions. Note that we combine base_xyz_tracking and base_quat_tracking to explore a 6-dimensional objective space.
| Reward Name | Description |
|---|---|
| gait_tracking | Track the reference joint-level trajectory |
| base_xyz_tracking | Track the base position associated with the reference trajectory |
| base_quat_tracking | Track the base orientation associated with the reference trajectory |
| arm_swinging | Maximize the amount of arm swing |
| arm_static | Minimize the amount of arm swing |
| minimize_energy | Minimize energy consumption |
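For intuition, a multi-objective policy is typically steered by weighting these reward terms against each other. The sketch below shows one simple linear scalarization over the BRUCE reward names; it is not taken from the moplayground source, and the actual training code (e.g., MORLAX's preference handling) may combine objectives differently:

```python
# Sketch only: linearly scalarize the BRUCE reward terms with preference weights.
import jax.numpy as jnp

REWARD_NAMES = [
    "gait_tracking",
    "base_xyz_tracking",
    "base_quat_tracking",
    "arm_swinging",
    "arm_static",
    "minimize_energy",
]

def scalarize(rewards: dict, weights: jnp.ndarray) -> jnp.ndarray:
    """Combine per-objective rewards into a single scalar via preference weights."""
    reward_vec = jnp.stack([rewards[name] for name in REWARD_NAMES])
    return jnp.dot(weights, reward_vec)

# Example preference: emphasize imitation, ignore arm style, lightly penalize energy.
weights = jnp.array([0.5, 0.2, 0.2, 0.0, 0.0, 0.1])
```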
Examples of Multi-Objective Policies
| Policy | Result |
|---|---|
| Balanced Reward | |
| Max Imitation | |
| Max Arm Swinging | |
| Max Smoothness | |
Citation
```
@article{janwani2026mo,
  title={MO-Playground: Massively Parallelized Multi-Objective Reinforcement Learning for Robotics},
  author={Janwani, Neil and Novoseller, Ellen and Lawhern, Vernon J and Tucker, Maegan},
  journal={arXiv preprint arXiv:2603.09237},
  year={2026}
}
```
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file moplayground-0.1.3.tar.gz.
File metadata
- Download URL: moplayground-0.1.3.tar.gz
- Upload date:
- Size: 59.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ebd4b2255349933cfe8b891cde32988bb0b4036c1d6be9636a0baff49502bff6 |
| MD5 | 456268b8913512127e1de3a0c6739249 |
| BLAKE2b-256 | 9533372494a5e62197e223267f7ed5c67eec8bea3b8c51facd9caeda2a6b76b8 |
File details
Details for the file moplayground-0.1.3-py3-none-any.whl.
File metadata
- Download URL: moplayground-0.1.3-py3-none-any.whl
- Upload date:
- Size: 72.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 84faed7e9e53b10c5dbedadccddf6ed8d92cd9bd551a744975b7c096a1cae6ce |
| MD5 | 32cc09927c71070de18818072de0bc62 |
| BLAKE2b-256 | fad77a2900a901393a25766b55648210e7c5c3b535d92b09cc111d88f9707626 |