RL-Toolkit: a toolkit for developing and comparing reinforcement learning agents in various games (OpenAI Gym or PyBullet).

RL toolkit

Papers

Setting up container

# Preview
docker pull markub3327/rl-toolkit:latest

# Stable
docker pull markub3327/rl-toolkit:2.0.2

Run

# Training container (learner)
docker run -it --rm markub3327/rl-toolkit python3 training.py [-h] -env ENV_NAME -s PATH_TO_MODEL_FOLDER [--wandb]

# Simulation container (agent)
docker run -it --rm markub3327/rl-toolkit python3 testing.py [-h] -env ENV_NAME -f PATH_TO_MODEL_FOLDER [--wandb]
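
The usage lines above list the flags each script accepts. As a rough illustration only, the same interface could be parsed with Python's standard `argparse` module. The function `build_parser` and the destination names `env_name` and `model_path` are hypothetical and are not taken from the project's source; only the flag spellings come from the usage lines.

```python
import argparse


def build_parser(prog: str, model_flag: str) -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the usage lines (not the project's code)."""
    parser = argparse.ArgumentParser(prog=prog)
    # -env ENV_NAME: name of the Gym or PyBullet environment
    parser.add_argument("-env", dest="env_name", required=True, metavar="ENV_NAME")
    # -s (training) or -f (testing): folder holding the model
    parser.add_argument(
        model_flag, dest="model_path", required=True, metavar="PATH_TO_MODEL_FOLDER"
    )
    # --wandb: optional switch, off unless given
    parser.add_argument("--wandb", action="store_true")
    return parser


# Example: the training script uses -s, the testing script uses -f.
args = build_parser("training.py", "-s").parse_args(
    ["-env", "BipedalWalkerHardcore-v3", "-s", "./model", "--wandb"]
)
print(args.env_name, args.model_path, args.wandb)
```

Passing `"-f"` instead of `"-s"` gives the matching parser for the testing usage line.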

Tested environments

| Environment | Observation space | Observation bounds | Action space | Action bounds |
|---|---|---|---|---|
| BipedalWalkerHardcore-v3 | (24, ) | [-inf , inf] | (4, ) | [-1.0 , 1.0] |
| Walker2DBulletEnv-v0 | (22, ) | [-inf , inf] | (6, ) | [-1.0 , 1.0] |
| AntBulletEnv-v0 | (28, ) | [-inf , inf] | (8, ) | [-1.0 , 1.0] |
| HalfCheetahBulletEnv-v0 | (26, ) | [-inf , inf] | (6, ) | [-1.0 , 1.0] |
| HopperBulletEnv-v0 | (15, ) | [-inf , inf] | (3, ) | [-1.0 , 1.0] |
| HumanoidBulletEnv-v0 | (44, ) | [-inf , inf] | (17, ) | [-1.0 , 1.0] |

Results

Summary

[Figure: results]

Score

| Environment | gSDE | gSDE + Huber loss |
|---|---|---|
| BipedalWalkerHardcore-v3(2) | 13 ± 18 | - |
| Walker2DBulletEnv-v0(1) | 2270 ± 28 | 2732 ± 96 |
| AntBulletEnv-v0(1) | 3106 ± 61 | 3460 ± 119 |
| HalfCheetahBulletEnv-v0(1) | 2945 ± 95 | 3003 ± 226 |
| HopperBulletEnv-v0(1) | 2515 ± 50 | 2555 ± 405 |
| HumanoidBulletEnv-v0 | - | ** ± ** |
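
The scores above are written as `value ± spread`. The README does not say how the spread is computed; one common convention, assumed here, is the mean and sample standard deviation of episode returns over several test episodes. The sketch below uses toy numbers that are not results from this project, and `summarize` is a hypothetical helper.

```python
from statistics import mean, stdev


def summarize(returns):
    """Format returns as 'mean ± sample standard deviation' (assumed convention)."""
    return f"{mean(returns):.0f} ± {stdev(returns):.0f}"


# Toy episode returns, for illustration only.
toy_returns = [10.0, 20.0, 30.0]
print(summarize(toy_returns))  # 20 ± 10
```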

Actor

[Figure: actor]

Critic

[Figure: critic]


Frameworks: TensorFlow, Reverb, OpenAI Gym, PyBullet, Weights & Biases (wandb), OpenCV

Changes

v3.2.1 (June 6, 2021)

Features 🔊

  • Reverb (+multi-node learning)
  • setup.py (package is available on PyPI)
  • split research process into agent, learner, tester and random roles

v2.0.2 (May 23, 2021)

Bug fixes 🛠️

  • update Dockerfile
  • update README.md
  • format code with Black and check it with Flake8

v2.0.1 (April 27, 2021)

Bug fixes 🛠️

  • fix Critic model

v2.0.0 (April 22, 2021)

Features 🔊

  • Add Huber loss
  • Render to a video file in test mode
  • Normalize observations with the min-max method
  • Remove TD3 algorithm
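
Two of the v2.0.0 items above name standard techniques: Huber loss and min-max normalization. The sketch below shows the textbook forms in plain Python. It is not the toolkit's implementation: the function names, the `delta` default, the `[0, 1]` target range, and the explicit finite `low`/`high` bounds are all assumptions (the environments listed earlier report unbounded observations, so real bounds would have to come from elsewhere).

```python
def huber_loss(error: float, delta: float = 1.0) -> float:
    """Quadratic for |error| <= delta, linear beyond it."""
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error * error
    return delta * (abs_err - 0.5 * delta)


def min_max_normalize(obs, low, high):
    """Scale each component from [low, high] to [0, 1] (assumes high > low)."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(obs, low, high)]


print(huber_loss(0.5))  # 0.125 (quadratic region)
print(huber_loss(3.0))  # 2.5 (linear region)
print(min_max_normalize([0.0, 5.0], [-1.0, 0.0], [1.0, 10.0]))  # [0.5, 0.5]
```

The linear tail is what makes Huber loss less sensitive to large errors than a squared loss.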
