
Algorithm and utilities for deep reinforcement learning


Rainy


Reinforcement learning utilities and algorithm implementations using PyTorch.

API documentation

COMING SOON

Supported python version

Python >= 3.6.1

Run examples

Though this project is still WIP, all examples are verified to work.

First, install pipenv. For example, you can install it via

pip3 install pipenv --user

Then, clone this repository and create a virtual environment in it.

git clone https://github.com/kngwyu/Rainy.git
cd Rainy
pipenv --site-packages --three install

Now you are ready to start!

pipenv run python examples/acktr_cart_pole.py train

After training, you can run learned agents.

Please replace (log-directory) in the command below with your actual log directory. It should have a name like acktr_cart_pole-190219-134651-35a14601.

pipenv run python examples/acktr_cart_pole.py eval (log-directory) --render

You can also plot the training results stored in your log directory. The following command opens an IPython shell for working with your log files.

pipenv run python -m rainy.ipython

Then you can plot training rewards via

log = open_log('log-directory')
log.plot_reward(12 * 20, max_steps=int(4e5), title='ACKTR cart pole')

[Figure: ACKTR cart pole]

Run examples with MPI or NCCL

Distributed training is supported via Horovod. I recommend installing it in your site-packages (i.e., not in the virtualenv). For example, if you want to use it with NCCL, you can install it via

sudo env HOROVOD_GPU_ALLREDUCE=NCCL pip3 install --no-cache-dir horovod

Once it is installed, you can run the training script using the horovodrun command.

For example, if you want to use two hosts (localhost and anotherhost) and run ppo_atari.py, use

horovodrun -np 2 -H localhost:1,anotherhost:1 pipenv run python examples/ppo_atari.py train

Override configuration from CLI

Currently, Rainy provides an easy-to-use CLI built on click. You can view its usage with, for example,

pipenv run python examples/a2c_cart_pole.py --help

This CLI has a simple data-driven interface: once you fill in a config object, all commands (train, eval, retrain, etc.) work. So you can start experiments easily without copying and pasting boilerplate such as argument-parser code.

However, this design has a limitation: you cannot add new options.

So Rainy-CLI provides an option named override, which executes the given string as Python code with the config object bound to the name config.

Example usage:

pipenv run python examples/a2c_cart_pole.py --override='config.grad_clip=0.5; config.nsteps=10' train
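The mechanism behind such an option can be sketched in a few lines of plain Python. This is only an illustration of the idea described above, not Rainy's actual implementation: the function apply_override and the SimpleNamespace stand-in for the config object (including its default values) are assumptions made for this example.

```python
from types import SimpleNamespace


def apply_override(config, override):
    """Run `override` as Python code with `config` bound to the given object.

    Hypothetical helper for illustration; not part of Rainy's API.
    """
    # The executed string sees the name `config` (plus Python builtins).
    exec(override, {"config": config})
    return config


# Hypothetical stand-in for a config object, with made-up default values.
config = SimpleNamespace(grad_clip=0.1, nsteps=5)

# Same override string as in the CLI example above.
apply_override(config, "config.grad_clip=0.5; config.nsteps=10")
print(config.grad_clip, config.nsteps)
```

Because the string is executed as arbitrary Python code, an option like this should only be given input you trust.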

If this feature still doesn't meet your requirements, you can override subcommands using click's ctx.invoke.

Implementation Status

| Algorithm      | Multi Worker (Sync) | Recurrent | Discrete Action | Continuous Action | MPI |
|----------------|---------------------|-----------|-----------------|-------------------|-----|
| DQN/Double DQN | No                  | No        | Yes             | No                | No  |
| PPO            | Yes                 | Yes (1)   | Yes             | Yes               | Yes |
| A2C            | Yes                 | Yes       | Yes             | Yes               | No  |
| ACKTR          | Yes                 | No (2)    | Yes             | Yes               | No  |
| AOC            | Yes                 | No        | Yes             | Yes               | No  |

(1): It's very unstable
(2): Needs https://openreview.net/forum?id=HyMTkQZAb implemented

Sub packages

References

DQN (Deep Q Network)

DDQN (Double DQN)

A2C (Advantage Actor Critic)

ACKTR (Actor Critic using Kronecker-Factored Trust Region)

PPO (Proximal Policy Optimization)

AOC (Advantage Option Critic)

Implementations I referenced

I mainly referenced OpenAI Baselines, but all of these packages were useful.

Thanks!

https://github.com/openai/baselines

https://github.com/ikostrikov/pytorch-a2c-ppo-acktr

https://github.com/ShangtongZhang/DeepRL

https://github.com/chainer/chainerrl

https://github.com/Thrandis/EKFAC-pytorch (for ACKTR)

https://github.com/jeanharb/a2oc_delib (for AOC)

License

This project is licensed under the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0).
