Algorithms and utilities for deep reinforcement learning
Rainy
Reinforcement learning utilities and algorithm implementations using PyTorch.
API documentation
Coming soon.
Supported Python version
Python >= 3.6.1
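
As a quick way to confirm that an interpreter meets this requirement, a check along the following lines can be used. It is a minimal sketch: `MIN_PYTHON` and `python_is_supported` are hypothetical names introduced here for illustration and are not part of Rainy.

```python
import sys

# Minimum version stated above; the constant name is illustrative only.
MIN_PYTHON = (3, 6, 1)


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter is >= 3.6.1."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro); tuples compare element by element.
    return tuple(version_info[:3]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("Python >= 3.6.1 is required")
```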
Implementation Status
Algorithm | Multi Worker(Sync) | Recurrent | Discrete Action | Continuous Action | MPI support |
---|---|---|---|---|---|
DQN/Double DQN | :x: | :x: | :heavy_check_mark: | :x: | :x: |
BootDQN/RPF | :x: | :x: | :heavy_check_mark: | :x: | :x: |
DDPG | :x: | :x: | :x: | :heavy_check_mark: | :x: |
TD3 | :x: | :x: | :x: | :heavy_check_mark: | :x: |
SAC | :x: | :x: | :x: | :heavy_check_mark: | :x: |
PPO | :heavy_check_mark: | :heavy_check_mark:(1) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
A2C | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
ACKTR | :heavy_check_mark: | :x:(2) | :heavy_check_mark: | :heavy_check_mark: | :x: |
AOC | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: |
(1): Very unstable.
(2): Requires https://openreview.net/forum?id=HyMTkQZAb to be implemented first.
Sub packages
- intrinsic-rewards
  - Contains an implementation of RND (Random Network Distillation)
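
For readers unfamiliar with RND, the idea can be sketched as follows: a fixed, randomly initialized target network embeds each observation, a predictor network is trained to match that embedding, and the prediction error serves as an intrinsic reward that is large for unfamiliar observations. The sketch below is dependency-free and uses linear maps in place of neural networks; the names `LinearNet` and `RND` are assumptions made for illustration and do not reflect the API of the intrinsic-rewards package.

```python
import random


class LinearNet:
    """A tiny linear map y = W x, standing in for a neural network."""

    def __init__(self, in_dim, out_dim, rng):
        self.w = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
                  for _ in range(out_dim)]

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]


class RND:
    """Illustrative RND: the intrinsic reward is the predictor's squared error."""

    def __init__(self, obs_dim, feat_dim=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = LinearNet(obs_dim, feat_dim, rng)     # fixed, never trained
        self.predictor = LinearNet(obs_dim, feat_dim, rng)  # trained to imitate target
        self.lr = lr

    def intrinsic_reward(self, obs):
        t = self.target.forward(obs)
        p = self.predictor.forward(obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, obs):
        """One gradient-descent step on the squared error for a single observation."""
        t = self.target.forward(obs)
        p = self.predictor.forward(obs)
        for row, pi, ti in zip(self.predictor.w, p, t):
            grad_scale = 2.0 * (pi - ti)
            for j, xj in enumerate(obs):
                row[j] -= self.lr * grad_scale * xj


rnd = RND(obs_dim=3)
seen = [1.0, 0.0, 0.5]    # observation the predictor is trained on
novel = [0.0, 1.0, 0.0]   # observation the predictor never sees

before = rnd.intrinsic_reward(seen)
for _ in range(200):
    rnd.update(seen)
after = rnd.intrinsic_reward(seen)
novel_reward = rnd.intrinsic_reward(novel)

print("seen, before training:", before)
print("seen, after training: ", after)
print("novel observation:    ", novel_reward)
```

After training on `seen`, its intrinsic reward shrinks toward zero while the reward for `novel` stays comparatively large, which is the signal an agent would use as an exploration bonus.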
References
DQN (Deep Q Network)
DDQN (Double DQN)
Bootstrapped DQN
RPF (Randomized Prior Functions)
DDPG (Deep Deterministic Policy Gradient)
TD3 (Twin Delayed Deep Deterministic Policy Gradient)
SAC (Soft Actor Critic)
A2C (Advantage Actor Critic)
- http://proceedings.mlr.press/v48/mniha16.pdf , https://arxiv.org/abs/1602.01783 (A3C, original version)
- https://blog.openai.com/baselines-acktr-a2c/ (A2C, synchronized version)
ACKTR (Actor Critic using Kronecker-Factored Trust Region)
PPO (Proximal Policy Optimization)
AOC (Advantage Option Critic)
- https://arxiv.org/abs/1609.05140 (DQN-like option critic)
- https://arxiv.org/abs/1709.04571 (A3C-like option critic called A2OC)
Implementations I referenced
Thank you!
https://github.com/openai/baselines
https://github.com/ikostrikov/pytorch-a2c-ppo-acktr
https://github.com/ShangtongZhang/DeepRL
https://github.com/chainer/chainerrl
https://github.com/Thrandis/EKFAC-pytorch (for ACKTR)
https://github.com/jeanharb/a2oc_delib (for AOC)
https://github.com/sfujim/TD3 (for DDPG and TD3)
https://github.com/vitchyr/rlkit (for SAC)
License
This project is licensed under the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0).