Deep Reinforcement Learning in PyTorch
This repository contains standard model-free RL algorithms (and, soon, model-based ones) implemented in PyTorch, along with some research ideas I am currently working on.
What is it?
pytorch-rl implements some state-of-the-art deep reinforcement learning algorithms in PyTorch, especially those concerned with continuous action spaces. You can train your agents efficiently on either CPU or GPU. Furthermore, pytorch-rl works with OpenAI Gym out of the box, which makes evaluating and experimenting with different algorithms easy, and you can extend pytorch-rl to fit your own needs. TL;DR: pytorch-rl makes it really easy to run state-of-the-art deep reinforcement learning algorithms.
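As a rough illustration, the kind of Gym evaluation loop pytorch-rl builds on looks like the sketch below; the random policy is a stand-in for a trained agent, not the repository's actual API:

```python
import gym

# Classic gym step API (reset -> obs, step -> 4-tuple), as used around
# the time of this release; newer gym/gymnasium versions differ.
env = gym.make("CartPole-v1")
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    # A trained pytorch-rl agent would map obs -> action here;
    # env.action_space.sample() is a random stand-in.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("episode return:", total_reward)
```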
Dependencies
- PyTorch
- Gym (OpenAI)
- mujoco-py (for the physics simulation and the robotics environments in Gym)
- PyBullet (coming soon)
- MPI (only supported with an MPI-backend PyTorch installation)
- tensorboardX (https://github.com/lanpa/tensorboardX)
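The distribution described in the file details below is published on PyPI as pytorch-policy, so (assuming the dependencies above are already set up) installing the package itself should be a one-liner:

```bash
pip install pytorch-policy
```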
RL algorithms
- DQN (with Double Q-learning; see the target-computation sketch after this list)
- DDPG
- DDPG with HER (For the OpenAI Fetch Environments)
- Hierarchical Reinforcement Learning
- Prioritized Experience Replay + DDPG
- DDPG with Prioritized Hindsight Experience Replay (Research)
- Neural Map with A3C (Coming Soon)
- Rainbow DQN (Coming Soon)
- PPO
- HER with self attention for goal substitution (Research)
- A3C (Coming Soon)
- ACER (Coming Soon)
- DARLA
- TDM
- World Models
- Soft Actor-Critic
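To make the Double Q-learning variant concrete, here is a minimal sketch of its target computation. The names `online_net` and `target_net` are assumptions for illustration (torch Q-networks mapping a batch of states to per-action values), not identifiers from this repository:

```python
import torch

def double_q_target(online_net, target_net, reward, next_state, done, gamma=0.99):
    """Double DQN target: select actions with the online net, evaluate with the target net."""
    with torch.no_grad():
        # Action selection via the online network...
        next_action = online_net(next_state).argmax(dim=1, keepdim=True)
        # ...action evaluation via the target network, which curbs the
        # overestimation bias of vanilla DQN.
        next_q = target_net(next_state).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done.float()) * next_q
```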
Environments
- Breakout
- Pong (coming soon)
- Hand Manipulation Robotic Task
- Fetch-Reach Robotic Task (see the goal-based observation sketch after this list)
- Hand-Reach Robotic Task
- Block Manipulation Robotic Task
- Montezuma's Revenge (Current Research)
- Pitfall
- Gravitar
- CarRacing
- OpenSim Prosthetics Nips Challenge (https://www.crowdai.org/challenges/nips-2018-ai-for-prosthetics-challenge)
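The Fetch and Hand tasks are goal-based: observations are dicts with `observation`, `achieved_goal`, and `desired_goal` keys, which is exactly the structure HER relabels. A minimal sketch follows; the `FetchReach-v1` id follows Gym's robotics naming of the time and is an assumption here:

```python
import gym

env = gym.make("FetchReach-v1")  # requires mujoco-py
obs = env.reset()
print(obs["observation"].shape, obs["desired_goal"])

# HER replays failed episodes with obs["achieved_goal"] substituted for the
# desired goal; goal-based envs expose compute_reward() to rescore transitions.
reward = env.compute_reward(obs["achieved_goal"], obs["desired_goal"], {})
```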
Environment Modelling (for exploration and domain adaptation)
Multiple GAN training tricks were used because of instability in training the generators and discriminators; please refer to https://github.com/soumith/ganhacks for more information. Even with these tricks, it was really hard to train a GAN to convergence. However, after applying Spectral Normalization (https://arxiv.org/abs/1802.05957), the InfoGAN was trained to convergence. For image-to-image translation tasks with GANs, and for VAEs in general, training with skip connections also helps considerably.
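As a minimal sketch of the stabilization trick mentioned above, PyTorch's built-in `torch.nn.utils.spectral_norm` wrapper can be applied layer by layer; the architecture below is illustrative, not the repository's actual discriminator:

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Every weight layer of the discriminator is spectrally normalized,
# constraining its Lipschitz constant as in Miyato et al., 2018.
discriminator = nn.Sequential(
    spectral_norm(nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1)),
    nn.LeakyReLU(0.2),
    spectral_norm(nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1)),
    nn.LeakyReLU(0.2),
    nn.Flatten(),
    spectral_norm(nn.Linear(128 * 16 * 16, 1)),  # assumes 64x64 RGB inputs
)
```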
- beta-VAE
- InfoGAN
- CVAE-GAN
- Flow based generative models (Research)
- SAGAN
- Sequential Attend, Infer, Repeat
- Curiosity driven exploration
- Parameter Space Noise for Exploration
- Noisy Network
References
- Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
- Human-level control through deep reinforcement learning, Mnih et al., 2015
- Deep Reinforcement Learning with Double Q-learning, van Hasselt et al., 2015
- Continuous control with deep reinforcement learning, Lillicrap et al., 2015
- CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao et al., 2017
- beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, Higgins et al., 2017
- Hindsight Experience Replay, Andrychowicz et al., 2017
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, Chen et al., 2016
- World Models, Ha et al., 2018
- Spectral Normalization for Generative Adversarial Networks, Miyato et al., 2018
- Self-Attention Generative Adversarial Networks, Zhang et al., 2018
- Curiosity-driven Exploration by Self-supervised Prediction, Pathak et al., 2017
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja et al., 2018
- Parameter Space Noise for Exploration, Plappert et al., 2018
- Noisy Networks for Exploration, Fortunato et al., 2018
- Proximal Policy Optimization Algorithms, Schulman et al., 2017
File details
Details for the file pytorch-policy-0.1.1.tar.gz.
File metadata
- Download URL: pytorch-policy-0.1.1.tar.gz
- Upload date:
- Size: 55.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.11.1 setuptools/40.1.0 requests-toolbelt/0.8.0 tqdm/4.24.0 CPython/3.6.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 27c4d108391499375a2bc515920a373724d4bc4052e5e90949dc6b91cdc2350d
MD5 | fa3e9456b949023681ca14c448fbd669
BLAKE2b-256 | feaf5a1cdd1f2a467b68eaf6e9689042e5aa163ec8992090bc0497059a0f0d81
File details
Details for the file pytorch_policy-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pytorch_policy-0.1.1-py3-none-any.whl
- Upload date:
- Size: 3.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.11.1 setuptools/40.1.0 requests-toolbelt/0.8.0 tqdm/4.24.0 CPython/3.6.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3ae7ddbcb6bf4ebe15e891dd4847e05ee9ac41c4eda4cb0c85cf35922019ce61
MD5 | 32cb647c014ce54a6b1c15662b1ec077
BLAKE2b-256 | d0cd5715872d9ccd0441f59aab7a1723aee6dfabb395454dc2991e5b7cb65953