PyTorch Reinforcement Learning Framework for Researchers
Cherry is a reinforcement learning framework for researchers built on top of PyTorch.
Unlike other reinforcement learning implementations, cherry doesn't implement a single monolithic interface to existing algorithms. Instead, it provides you with low-level, common tools to write your own algorithms. Drawing from the UNIX philosophy, each tool strives to be as independent from the rest of the framework as possible. So if you don't like a specific tool, you don’t need to use it.
Features
- Pythonic and modular interface à la PyTorch.
- Support for tabular (!) and function approximation algorithms.
- Various OpenAI Gym environment wrappers.
- Helper functions for popular algorithms (e.g. A2C, DDPG, TRPO, PPO, SAC); a small sketch of this style follows the list.
- Logging, visualization, and debugging tools.
- Painless and efficient distributed training on CPUs and GPUs.
- Unit, integration, and regression tested, continuously integrated.
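As a flavor of these helpers, below is a minimal sketch of an A2C-style policy loss written as a pure function of tensors. It is plain PyTorch under assumed (N,)-shaped inputs, not code taken from cherry's API.

import torch as th

def a2c_policy_loss(log_probs, advantages):
    # A2C policy loss: maximize advantage-weighted log-probabilities.
    # log_probs:  (N,) tensor of log pi(a_t | s_t), differentiable w.r.t. the policy.
    # advantages: (N,) tensor of advantage estimates, treated as constants.
    return -(log_probs * advantages.detach()).mean()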
Installation
Note: Cherry is considered an alpha release. Some functionality might break in the future.
For the latest release of cherry, run the following command in your favorite shell.
pip install cherry-rl
For the cutting-edge version, you can run the following commands.
- Clone the repo:
git clone https://github.com/seba-1511/cherry
cd cherry
pip install -e .
Requirements
- torch >= 1.0.0
- gym >= 0.15
Development Guidelines
- The master branch is always working and considered stable.
- The dev branch should always work and is ahead of master; it is considered cutting edge.
- To implement a new functionality: branch dev into your_name/functionality_name, implement your functionality, then open a pull request to dev. It will be periodically merged into master.
Usage
The following snippet demonstrates some of the tools offered by cherry.
import cherry as ch
# Wrapping environments
env = ch.envs.Logger(env, interval=1000)  # Prints rollout statistics
env = ch.envs.Normalized(env, normalize_state=True, normalize_reward=False)
env = ch.envs.Torch(env) # Converts actions/states to tensors
# Storing and retrieving experience
replay = ch.ExperienceReplay()
replay.append(old_state, action, reward, state, done, info={
    'log_prob': mass.log_prob(action),  # Can add any variable/tensor to the transitions
    'value': value,
})
replay.actions # Tensor of all stored actions
replay.states # Tensor of all stored states
replay.empty() # Removes all stored experience
# Discounting and normalizing rewards
rewards = ch.rewards.discount(GAMMA, replay.rewards, replay.dones)
rewards = ch.utils.normalize(rewards)
# Sampling rollouts per episode or samples
num_samples, num_episodes = ch.rollouts.collect(env,
get_action,
replay,
num_episodes=10,
# alternatively: num_samples=1000,
)
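To see how these tools combine, here is a minimal REINFORCE-style sketch on CartPole. Only the cherry calls shown above are taken from this README; the two-layer policy, the hyperparameters, and the assumption that the info dictionary of replay.append is optional are illustrative, not guaranteed by cherry's API.

import gym
import torch as th
import cherry as ch

GAMMA = 0.99

env = ch.envs.Torch(gym.make('CartPole-v0'))  # tensors in, tensors out
policy = th.nn.Sequential(  # illustrative policy network, not part of cherry
    th.nn.Linear(4, 128),
    th.nn.ReLU(),
    th.nn.Linear(128, 2),
)
optimizer = th.optim.Adam(policy.parameters(), lr=1e-2)
replay = ch.ExperienceReplay()

for episode in range(500):
    state = env.reset()
    done = False
    log_probs = []  # kept outside the replay for simplicity
    while not done:
        mass = th.distributions.Categorical(logits=policy(state))
        action = mass.sample()
        log_probs.append(mass.log_prob(action))
        old_state = state
        state, reward, done, info = env.step(action)
        replay.append(old_state, action, reward, state, done)
    # Discount and normalize the episode's rewards, then take a policy-gradient step.
    rewards = ch.rewards.discount(GAMMA, replay.rewards, replay.dones)
    rewards = ch.utils.normalize(rewards)
    loss = -th.sum(th.stack(log_probs).view(-1) * rewards.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    replay.empty()

The inner while-loop could equally be replaced by the ch.rollouts.collect call shown above, with the sampling logic moved into a get_action function.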
Concrete examples are available in the examples/ folder.
Documentation
Documentation and tutorials are available on cherry’s website: http://seba-1511.github.io/cherry.
TODO
Some functionalities that we might want to implement:
- parallelize environments and a way to handle them with ExperienceReplay,
- VisdomLogger as a dashboard to debug an implementation,
- an example with a recurrent net,
- minimal but complete documentation,
- GPU implementations.
Acknowledgements
Cherry draws inspiration from many reinforcement learning implementations, including
- OpenAI Baselines,
- John Schulman's implementations,
- Ilya Kostrikov's implementations,
- Shangtong Zhang's implementations,
- Dave Abel's implementations,
- Vitchyr Pong's implementations,
- RLLab.