Algorithms and utilities for deep reinforcement learning

Project description

Rainy

Reinforcement learning utilities and algorithm implementations using PyTorch.

API documentation

COMING SOON

Supported Python version

Python >= 3.6.1

Run examples

Though this project is still a work in progress, all examples are verified to work.

First, install pipenv. E.g., you can install it via

pip3 install pipenv --user

Then, clone this repository and create a virtual environment in it.

git clone https://github.com/kngwyu/Rainy.git
cd Rainy
pipenv --site-packages --three install

Now you are ready to start!

pipenv run python examples/acktr_cart_pole.py train

After training, you can run learned agents.

Please replace (log-directory) in the command below with your actual log directory. It should be named something like acktr_cart_pole-190219-134651-35a14601.
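As a hedged illustration (not Rainy's actual implementation), a log-directory name of that shape can be built from the script name, a YYMMDD-HHMMSS timestamp, and a short hex id:

```python
# Hypothetical sketch of the log-directory naming scheme shown above:
# script name + timestamp + short random hex suffix.
from datetime import datetime
import secrets


def log_dir_name(script: str) -> str:
    """Build a name like 'acktr_cart_pole-190219-134651-35a14601'."""
    stamp = datetime.now().strftime("%y%m%d-%H%M%S")
    return f"{script}-{stamp}-{secrets.token_hex(4)}"
```

The random suffix keeps directories from colliding when the same experiment is launched twice in the same second.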

pipenv run python examples/acktr_cart_pole.py eval (log-directory) --render

You can also plot training results from your log directory. The following command opens an IPython shell with your log file loaded.

pipenv run python -m rainy.ipython

Then you can plot training rewards via

log = open_log('log-directory')
log.plot_reward(12 * 20, max_steps=int(4e5), title='ACKTR cart pole')
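The first argument to plot_reward (here 12 * 20, i.e. workers times rollout steps, an assumption about the batch size) groups the reward log into fixed-size batches before plotting. A minimal stand-alone sketch of that kind of batched averaging, independent of Rainy:

```python
# Sketch (not Rainy's code) of averaging a reward sequence over
# consecutive fixed-size batches, as a reward plot typically does.

def batch_average(rewards, batch_size):
    """Average `rewards` over consecutive batches of `batch_size` entries."""
    return [
        sum(chunk) / len(chunk)
        for chunk in (
            rewards[i:i + batch_size]
            for i in range(0, len(rewards), batch_size)
        )
    ]


averages = batch_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2)  # [1.5, 3.5, 5.5]
```

Averaging over one update's worth of steps smooths per-episode noise, which is why the plot reads as a learning curve rather than raw episode returns.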

[Plot: ACKTR cart pole training rewards]

Run examples with MPI or NCCL

Distributed training is supported via Horovod. I recommend installing it in your site-packages (i.e., not in the virtualenv). E.g., if you want to use it with NCCL, you can install it via

sudo env HOROVOD_GPU_ALLREDUCE=NCCL pip3 install --no-cache-dir horovod

Once it is installed, you can run the training script using the horovodrun command.

E.g., if you want to use two hosts (localhost and anotherhost) and run ppo_atari.py, use

horovodrun -np 2 -H localhost:1,anotherhost:1 pipenv run python examples/ppo_atari.py train

Override configuration from CLI

Currently, Rainy provides an easy-to-use CLI via click. You can view its usage with, say,

pipenv run python examples/a2c_cart_pole.py --help

This CLI has a simple data-driven interface: once you fill a config object, all commands (train, eval, retrain, etc.) work, so you can start experiments without copying and pasting argument-parser code.

However, it has the limitation that you cannot add new options.

To work around this, Rainy-CLI provides an option named override, which executes the given string as Python code with the config object bound to the name config.

Example usage:

pipenv run python examples/a2c_cart_pole.py --override='config.grad_clip=0.5; config.nsteps=10' train
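Semantically, an override string like the one above can be applied with Python's built-in exec, with config exposed in the execution namespace. A minimal stand-alone sketch (the Config class here is a stand-in, not Rainy's actual class):

```python
# Sketch of how an --override string can mutate a config object:
# the string is executed as Python with `config` bound in the namespace.

class Config:
    """Hypothetical stand-in for a Rainy config object."""
    def __init__(self):
        self.grad_clip = 1.0
        self.nsteps = 5


config = Config()
override = "config.grad_clip=0.5; config.nsteps=10"
exec(override, {}, {"config": config})
# config.grad_clip is now 0.5 and config.nsteps is 10
```

Because the string is arbitrary Python, any attribute of the config object can be set this way without adding a dedicated CLI flag.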

If this feature still doesn't satisfy your requirements, you can override subcommands via ctx.invoke.

Implementation Status

| Algorithm | Multi Worker (Sync) | Recurrent | Discrete Action | Continuous Action | MPI |
|---|---|---|---|---|---|
| DQN/Double DQN | :x: | :x: | :heavy_check_mark: | :x: | :x: |
| DDPG | :x: | :x: | :x: | :heavy_check_mark: | :x: |
| TD3 | :x: | :x: | :x: | :heavy_check_mark: | :x: |
| PPO | :heavy_check_mark: | :heavy_check_mark: (1) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| A2C | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
| ACKTR | :heavy_check_mark: | :x: (2) | :heavy_check_mark: | :heavy_check_mark: | :x: |
| AOC | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: |

(1): It's very unstable
(2): Needs https://openreview.net/forum?id=HyMTkQZAb implemented

References

DQN (Deep Q Network)

DDQN (Double DQN)

DDPG (Deep Deterministic Policy Gradient)

TD3 (Twin Delayed Deep Deterministic Policy Gradient)

A2C (Advantage Actor Critic)

ACKTR (Actor Critic using Kronecker-Factored Trust Region)

PPO (Proximal Policy Optimization)

AOC (Advantage Option Critic)

Implementations I referenced

I mainly referenced OpenAI Baselines, but all of these packages were useful.

Thanks!

https://github.com/openai/baselines

https://github.com/ikostrikov/pytorch-a2c-ppo-acktr

https://github.com/ShangtongZhang/DeepRL

https://github.com/chainer/chainerrl

https://github.com/Thrandis/EKFAC-pytorch (for ACKTR)

https://github.com/jeanharb/a2oc_delib (for AOC)

https://github.com/sfujim/TD3 (for DDPG and TD3)

License

This project is licensed under Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0).

Project details


Download files

Download the file for your platform.

Source Distribution

rainy-0.4.0.tar.gz (48.6 kB)

Uploaded Source

Built Distribution

rainy-0.4.0-py3-none-any.whl (81.5 kB)

Uploaded Python 3

File details

Details for the file rainy-0.4.0.tar.gz.

File metadata

  • Download URL: rainy-0.4.0.tar.gz
  • Upload date:
  • Size: 48.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.4

File hashes

Hashes for rainy-0.4.0.tar.gz
Algorithm Hash digest
SHA256 2ed888e568dad96712211f0648bd8966923f7527731fc420c766e2f92bcbaec8
MD5 002b4987582c77d9da5d921f063dbe99
BLAKE2b-256 5322c19999575577687fca74000a92b68cf80b66ca7f309db75430a274949cdc


File details

Details for the file rainy-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: rainy-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 81.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.4

File hashes

Hashes for rainy-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 90fe69b284463ac23a8ae611fe1a5ad80c1404daab9ba090d5cdac0465ded962
MD5 90af222cd09864ff18d07bef54ba6467
BLAKE2b-256 655628ca6c930556931be8cbcc99bce576e967dab0e4aa8cc1d3b94052538170
