A simple framework for distributed reinforcement learning in PyTorch.
rltorch (WIP)
rltorch provides a simple framework for reinforcement learning in PyTorch. You can easily implement distributed RL algorithms.
Installation
Install rltorch from source.
git clone https://github.com/ku2482/rltorch.git
cd rltorch
pip install -e .
You can also install it from PyPI.
pip install rltorch
Examples
Ape-X
You can implement an Ape-X [1] agent as in the following example.
python examples/atari/apex.py \
[--env_id str(default MsPacmanNoFrameskip-v4)] \
[--num_actors int(default 4)] [--cuda (optional)] \
[--seed int(default 0)]
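The bracketed flags above are all optional and fall back to the listed defaults. As a minimal sketch of how such flags could be parsed, assuming the standard argparse module (the build_parser helper is hypothetical and is not copied from examples/atari/apex.py, which may handle its arguments differently):

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags listed above; the real
    # example script may define its arguments differently.
    parser = argparse.ArgumentParser()
    parser.add_argument('--env_id', type=str,
                        default='MsPacmanNoFrameskip-v4')
    parser.add_argument('--num_actors', type=int, default=4)
    # --cuda takes no value: it is False unless the flag is present.
    parser.add_argument('--cuda', action='store_true')
    parser.add_argument('--seed', type=int, default=0)
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(['--num_actors', '8', '--cuda'])
print(args.env_id, args.num_actors, args.cuda, args.seed)
```

Omitted flags keep their defaults, so only --num_actors and --cuda differ from the default configuration in this illustration.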
Soft Actor-Critic
You can implement a Soft Actor-Critic [2, 3] agent as in the following example. Note that this example requires a MuJoCo license and an installation of mujoco_py.
python examples/mujoco/sac.py \
[--env_id str(default HalfCheetah-v2)] \
[--num_actors int(default 1)] \
[--cuda (optional)] [--seed int(default 0)]
SAC-Discrete
You can implement a SAC-Discrete [4] agent as in the following example.
python examples/mujoco/sac.py \
[--env_id str(default MsPacmanNoFrameskip-v4)] \
[--num_actors int(default 4)] \
[--cuda (optional)] [--seed int(default 0)]
References
[1] Horgan, Dan, et al. "Distributed prioritized experience replay." arXiv preprint arXiv:1803.00933 (2018).
[2] Haarnoja, Tuomas, et al. "Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor." arXiv preprint arXiv:1801.01290 (2018).
[3] Haarnoja, Tuomas, et al. "Soft actor-critic algorithms and applications." arXiv preprint arXiv:1812.05905 (2018).
[4] Christodoulou, Petros. "Soft Actor-Critic for Discrete Action Settings." arXiv preprint arXiv:1910.07207 (2019).