The RL-Toolkit: A toolkit for developing and comparing reinforcement learning agents in various games (OpenAI Gym or PyBullet).

RL toolkit

Setting up container

# Preview
docker pull markub3327/rl-toolkit:latest

# Stable
docker pull markub3327/rl-toolkit:2.0.2

Run

# Training container (learner)
docker run -it --rm markub3327/rl-toolkit python3 training.py [-h] -env ENV_NAME -s PATH_TO_MODEL_FOLDER [--wandb]

# Simulation container (agent)
docker run -it --rm markub3327/rl-toolkit python3 testing.py [-h] -env ENV_NAME -f PATH_TO_MODEL_FOLDER [--wandb]
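
As a concrete illustration of the two invocations above, the following minimal sketch assembles the argument lists in Python and prints them as shell commands without running them. The helper `build_command` and the model folder `./model` are hypothetical names introduced for this example only. Only the image name, script names, and flags come from the commands shown above. Note that the model path is interpreted inside the container.

```python
import shlex

IMAGE = "markub3327/rl-toolkit"  # image name taken from the pull commands above


def build_command(script, env_name, model_flag, model_dir, use_wandb=False):
    """Return the docker argv list for training.py or testing.py (hypothetical helper)."""
    cmd = [
        "docker", "run", "-it", "--rm", IMAGE,
        "python3", script,
        "-env", env_name,
        model_flag, model_dir,
    ]
    if use_wandb:
        cmd.append("--wandb")  # optional flag listed in the usage lines above
    return cmd


# "./model" is a placeholder path, not a location defined by the toolkit.
train_cmd = build_command("training.py", "BipedalWalkerHardcore-v3", "-s", "./model")
test_cmd = build_command(
    "testing.py", "BipedalWalkerHardcore-v3", "-f", "./model", use_wandb=True
)

print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```

The environment name can be swapped for any entry in the table of tested environments below.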

Tested environments

Environment               Observation space  Observation bounds  Action space  Action bounds
BipedalWalkerHardcore-v3  (24,)              [-inf, inf]         (4,)          [-1.0, 1.0]
Walker2DBulletEnv-v0      (22,)              [-inf, inf]         (6,)          [-1.0, 1.0]
AntBulletEnv-v0           (28,)              [-inf, inf]         (8,)          [-1.0, 1.0]
HalfCheetahBulletEnv-v0   (26,)              [-inf, inf]         (6,)          [-1.0, 1.0]
HopperBulletEnv-v0        (15,)              [-inf, inf]         (3,)          [-1.0, 1.0]
HumanoidBulletEnv-v0      (44,)              [-inf, inf]         (17,)         [-1.0, 1.0]

Results

Summary

Return from game

Environment                  gSDE       gSDE + Huber loss
BipedalWalkerHardcore-v3(2)  13 ± 18    -
Walker2DBulletEnv-v0(1)      2270 ± 28  2732 ± 96
AntBulletEnv-v0(1)           3106 ± 61  3460 ± 119
HalfCheetahBulletEnv-v0(1)   2945 ± 95  3003 ± 226
HopperBulletEnv-v0(1)        2515 ± 50  2555 ± 405
HumanoidBulletEnv-v0         -          ** ± **
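
The source does not say how the "value ± spread" entries in the table were computed. One common convention, assumed here, is the mean and standard deviation of the episode return over several evaluation episodes. The sketch below shows that calculation. The function `summarize` and the sample returns are made up for illustration and are not results from the toolkit.

```python
import statistics


def summarize(returns):
    """Format episode returns as 'mean ± std' (population standard deviation)."""
    mean = statistics.mean(returns)
    std = statistics.pstdev(returns)  # assumption: population, not sample, std
    return f"{mean:.0f} ± {std:.0f}"


# Made-up episode returns, used only to show the calculation.
episode_returns = [10.0, 20.0, 30.0, 40.0]
print(summarize(episode_returns))  # prints "25 ± 11"
```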

Frameworks: TensorFlow, Reverb, OpenAI Gym, PyBullet, WandB, OpenCV
Languages: Python, Shell
Author: Martin Kubovčík

v3.0.7 (June 1, 2021)

Features 🔊

  • Reverb
  • updated kernel_initializer for last layers
  • removed clipping of the mean
  • setup.py (package is available on PyPI)
  • split research process into agent, learner and tester roles

v2.0.2 (May 23, 2021)

Bug fixes 🛠️

  • update Dockerfile
  • update README.md
  • formatted code with Black & Flake8

v2.0.1 (April 27, 2021)

Bug fixes 🛠️

  • fix Critic model

v2.0.0 (April 22, 2021)

Features 🔊

  • Add Huber loss
  • Render to a video file in test mode
  • Normalize observations with the min-max method
  • Remove TD3 algorithm
