
DEP-RL, a method for robust control of musculoskeletal systems.


DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems


Please read the docs.

This repo contains the code for the paper DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems, published at ICLR 2023 with review scores of 8, 8, 8 and 10 and a notable-top-25% rating. See here for videos.

The work was performed by Pierre Schumacher, Daniel F.B. Haeufle, Dieter Büchler, Syn Schmitt and Georg Martius.

If you just want to see the code for DEP, take a look at deprl/dep_controller.py, deprl/custom_agents.py and deprl/env_wrapper/wrappers.py.
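
For orientation: DEP continuously re-estimates a linear controller matrix C from short-time correlations between sensor changes and time-delayed sensor values, and acts through a squashed linear readout of the current sensors. Below is a deliberately simplified numpy sketch of this idea; all parameter names are illustrative, it assumes action and sensor dimensions coincide, and the actual implementation in deprl/dep_controller.py uses jax, an inverse model and a more careful normalization.

import numpy as np

def dep_action(obs_buffer, kappa=10.0, delay=1, eps=1e-6):
    """Schematic DEP step: estimate the controller matrix C from
    correlations between sensor changes and delayed sensor values,
    then act through a squashed linear readout. Illustration only."""
    x = np.asarray(obs_buffer)          # (T, n_sensors), most recent last
    dx = x[1:] - x[:-1]                 # sensor changes
    C = np.zeros((x.shape[1], x.shape[1]))
    for t in range(delay, len(dx)):
        C += np.outer(dx[t], x[t - delay])
    C /= np.linalg.norm(C) + eps        # crude normalization
    return np.tanh(kappa * C @ x[-1])   # action in [-1, 1]

Fed back into the plant, this feedback loop produces correlated, state-space covering exploration within seconds, rather than white noise around a mean action.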

Big update!

We now provide code for our newest preprint, Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models. With this work, we take a step towards natural movement generation with RL. Take a look at the docs for more information.

Abstract

Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast amount of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by our finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.

Installation

We provide a python package for easy installation:

pip install deprl

Instructions for installing from source, along with further details, can be found in the documentation.

Environments

The ICLR publication includes experiments with human arms:

pip install git+https://github.com/P-Schumacher/warmup.git

and a bipedal ostrich. The OstrichRL environment can be installed from here.

We also collaborated with groups that develop musculoskeletal control environments; for these we provide additional baselines as well as code from our latest preprints.

Hyfydy

We include several pre-trained baselines and configuration files to train the policies from our newest preprint. These allow you to train agents for natural walking and robust running tasks in Hyfydy with RL. We joined forces with Thomas Geijtenbeek (@tgeijten) to create a Python environment interface for Hyfydy. Take a look at sconegym!
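
As a rough sketch of how such a gym-style interface is typically used (the environment id below is a placeholder; see the sconegym documentation for the actual task names):

import gym
import sconegym  # registers the SCONE/Hyfydy environments

env = gym.make("sconewalk_h0918-v0")  # placeholder id, check sconegym for available tasks

obs = env.reset()
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()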

MyoLeg

If you are coming here for the MyoLeg, take a look at this tutorial. It will show you how to run the pre-trained baseline. We also explain how to train the walking agent in the MyoSuite documentation.
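As a minimal sketch of running a pre-trained baseline, following the loading helper described in our documentation (check the tutorial for the exact API and environment names):

import gym
import myosuite  # registers the Myo environments
import deprl

env = gym.make("myoLegWalk-v0")
policy = deprl.load_baseline(env)

obs = env.reset()
for _ in range(1000):
    action = policy(obs)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()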

This repository also includes the training files used for the MyoSuite baselines, as well as pretrained networks. Simply try training something:

python -m deprl.main experiments/myosuite_training_files/myoChaseTag.json

or render the pretrained baselines with:

python experiments/myosuite_training_files/render_baselines.py

You have to find your own reward function, of course. These files also require the installation of myosuite==2.1.5.

Experiments

The major experiments (humanreacher reaching and ostrich running) can be repeated with the config files. Simply run from the root folder:

python -m deprl.main experiments/ostrich_running_dep.json
python -m deprl.main experiments/humanreacher.json

to train an agent. Model checkpoints will be saved in the output directory. The progress can be monitored with:

python -m tonic.plot --path output/folder/

To execute a trained policy, use:

python -m deprl.play --path output/folder/

See the TonicRL documentation for details.

Be aware that ostrich training can be seed-dependent, as seen in the plots of the original publication.

Pure DEP

If you want to see pure DEP in action, run the following bash scripts after installing the ostrichrl and warmup environments.

bash play_files/play_dep_humanreacher.sh
bash play_files/play_dep_ostrich.sh
bash play_files/play_dep_dmcontrol_quadruped.sh

You might see more interesting ostrich behavior if you first disable episode resets in the ostrich environment.

Environments

The ostrich environment can be found here and is installed automatically via poetry.

The arm-environment warmup is also automatically installed by poetry and can be used like any other gym environment:

import gym
import warmup  # registers humanreacher-v0

env = gym.make("humanreacher-v0")

for ep in range(5):
    ep_steps = 0
    obs = env.reset()
    while True:
        obs, reward, done, info = env.step(env.action_space.sample())
        env.render()
        ep_steps += 1
        if done or (ep_steps >= env.max_episode_steps):
            break

The humanoid environments were simulated with SCONE. A ready-to-use RL package will be released in cooperation with GOATSTREAM at a later date.

Source Code Installation

We recommend an installation with poetry to ensure reproducibility. While TonicRL with PyTorch is used for the RL algorithms, DEP itself is implemented in jax. We strongly recommend using a GPU to speed up the computation of DEP; on systems without GPUs, give the tensorflow version of TonicRL a try! We also provide a requirements file for pip. Please check the instructions for the GPU and CPU versions of torch and jax in the documentation.

Pip

Just clone the repository and install locally:

git clone https://github.com/martius-lab/depRL.git
cd depRL
pip install -r requirements.txt
pip install -e ./

Poetry

  1. Make sure to install poetry and deactivate all virtual environments.
  2. Clone the repository:
git clone https://github.com/martius-lab/depRL
  3. Go to the root folder and run:
poetry install
poetry shell

That's it!

The build has been tested with:

Ubuntu 20.04 and Ubuntu 22.04
CUDA 12.0
poetry 1.4.0

Troubleshooting

  • A common error with poetry is a faulty interaction with the python keyring, resulting in a Failed to unlock the collection! error. Dependency resolution taking very long (more than 60 s) is caused by the same issue. If this happens, try appending
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring

to your bashrc. You can also try prepending it to the poetry command: PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring poetry install.

  • If you get an error related to your ptxas version, your cuda environment is not set up correctly and you should install the cuda-toolkit. If you don't have admin rights on your workstation, the easiest way is via conda. We recommend running
conda install -c conda-forge cudatoolkit-dev
  • In any other case, first try deleting the poetry.lock file and the virtual env .venv, then run poetry install again.

Feel free to open an issue if you encounter any problems.

Citation

If you find this repository useful, please consider giving it a star ⭐ and citing our papers using the following BibTeX entries.

@inproceedings{schumacher2023:deprl,
  title = {DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems},
  author = {Schumacher, Pierre and Haeufle, Daniel F.B. and B{\"u}chler, Dieter and Schmitt, Syn and Martius, Georg},
  booktitle = {Proceedings of the Eleventh International Conference on Learning Representations (ICLR)},
  month = may,
  year = {2023},
  url = {https://openreview.net/forum?id=C-xa_D3oTj6}
}

@misc{schumacher2023natural,
  title = {Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models},
  author = {Pierre Schumacher and Thomas Geijtenbeek and Vittorio Caggiano and Vikash Kumar and Syn Schmitt and Georg Martius and Daniel F. B. Haeufle},
  year = {2023},
  eprint = {2309.02976},
  archivePrefix = {arXiv},
  primaryClass = {cs.RO}
}
