RL-Toolkit: A Research Framework for Robotics

RL Toolkit

Papers

Installation with PyPI

On PC AMD64 with Ubuntu/Debian

  1. Install dependencies
    apt update -y
    apt install swig -y
    
  2. Install RL-Toolkit
    pip3 install rl-toolkit[all]
    
  3. Run (for Server)
    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 server
    
    Run (for Agent)
    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 agent --db_server localhost
    
    Run (for Learner)
    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 learner --db_server 192.168.1.2
    
    Run (for Tester)
    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 tester -f save/model/actor.h5
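
The four commands above share the same config and environment flags and differ only in the sub-command and its options. A minimal sketch of building them from Python, assuming you want to launch the roles from a script; the helper name `build_command` is hypothetical, while the flags are copied from the commands above:

```python
def build_command(role, env="MinitaurBulletEnv-v0",
                  config="./rl_toolkit/config.yaml",
                  db_server=None, model_file=None):
    """Return the argv list for one rl_toolkit role (hypothetical helper)."""
    if role not in ("server", "agent", "learner", "tester"):
        raise ValueError(f"unknown role: {role}")

    # Part shared by every role: config file, environment, sub-command.
    cmd = ["python3", "-m", "rl_toolkit", "-c", config, "-e", env, role]

    # Agent and Learner need the address of the Server.
    if role in ("agent", "learner"):
        if db_server is None:
            raise ValueError(f"{role} requires db_server")
        cmd += ["--db_server", db_server]

    # Tester needs the path of a saved actor model.
    if role == "tester":
        if model_file is None:
            raise ValueError("tester requires model_file")
        cmd += ["-f", model_file]

    return cmd
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, for example `build_command("learner", db_server="192.168.1.2")`.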
    

On NVIDIA Jetson

  1. Install dependencies
    Install TensorFlow for JetPack first, following NVIDIA's installation instructions.

    apt update -y
    apt install swig -y
    
    pip3 install 'tensorflow-probability==0.14.1'
    
  2. Install Reverb
    Download Bazel 3.7.2 for arm64 from the Bazel releases page on GitHub.

    mv ~/Downloads/bazel-3.7.2-linux-arm64 ~/bin/bazel
    chmod +x ~/bin/bazel
    export PATH=$PATH:~/bin
    

    Clone the Reverb version that corresponds to the TensorFlow version installed on the NVIDIA Jetson!

    git clone https://github.com/deepmind/reverb
    cd reverb/
    git checkout r0.5.0   # for TF 2.6.0
    

    Make the following changes in Reverb before building!
    In .bazelrc

    - build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain
    + # build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain
    
    - build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
    + build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1"
    
    - build --copt=-mavx --copt=-DEIGEN_MAX_ALIGN_BYTES=64
    + build --copt=-DEIGEN_MAX_ALIGN_BYTES=64
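
The three `.bazelrc` edits above can also be applied programmatically. This is only a sketch, assuming the lines in your checkout match the diff exactly; the function name `patch_bazelrc` is hypothetical and the script is not part of Reverb or RL-Toolkit:

```python
def patch_bazelrc(text):
    """Apply the three .bazelrc edits from the diff above to the file contents."""
    patched = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("build:manylinux2010 --crosstool_top="):
            # Comment out the manylinux2010 toolchain line.
            line = "# " + line
        elif stripped == 'build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"':
            # Switch the C++ ABI flag from 0 to 1.
            line = line.replace("ABI=0", "ABI=1")
        elif stripped.startswith("build ") and "--copt=-mavx" in stripped:
            # AVX is an x86 instruction set, so drop the flag on arm64.
            # Note: this also collapses repeated spaces on that line.
            line = " ".join(tok for tok in line.split() if tok != "--copt=-mavx")
        patched.append(line)
    trailing = "\n" if text.endswith("\n") else ""
    return "\n".join(patched) + trailing
```

A possible use is `Path(".bazelrc").write_text(patch_bazelrc(Path(".bazelrc").read_text()))` from inside the `reverb/` directory; running it twice leaves the file unchanged.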
    

    In WORKSPACE

    - PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
    + # PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
    + PROTOC_SHA256 = "7877fee5793c3aafd704e290230de9348d24e8612036f1d784c8863bc790082e"
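
The `PROTOC_SHA256` value is the digest the build expects for the downloaded protoc archive. If you want to check an archive by hand before building, a minimal sketch using only the standard library could look like this; the function names are hypothetical and the archive path is whatever file you downloaded:

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's digest equals the expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_expected(archive_path, "7877fee5793c3aafd704e290230de9348d24e8612036f1d784c8863bc790082e")` should return `True` for the archive the modified `WORKSPACE` expects.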
    

    In oss_build.sh

    -  if [ "$python_version" = "3.7" ]; then
    +  if [ "$python_version" = "3.6" ]; then
    +    export PYTHON_BIN_PATH=/usr/bin/python3.6 && export PYTHON_LIB_PATH=/usr/local/lib/python3.6/dist-packages
    +    ABI=cp36
    +  elif [ "$python_version" = "3.7" ]; then
    
    -  bazel test -c opt --copt=-mavx --config=manylinux2010 --test_output=errors //reverb/cc/...
    +  bazel test -c opt --copt="-march=armv8-a+crypto" --test_output=errors //reverb/cc/...
    
    # Builds Reverb and creates the wheel package.
    -  bazel build -c opt --copt=-mavx $EXTRA_OPT --config=manylinux2010 reverb/pip_package:build_pip_package
    +  bazel build -c opt --copt="-march=armv8-a+crypto" $EXTRA_OPT reverb/pip_package:build_pip_package
    ./bazel-bin/reverb/pip_package/build_pip_package --dst $OUTPUT_DIR $PIP_PKG_EXTRA_ARGS
    

    In reverb/cc/platform/default/repo.bzl

    urls = [
       -        "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-x86_64.zip" % (version, version),
       +        "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-aarch_64.zip" % (version, version),
    ]
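
The change above only swaps the architecture suffix in the protoc archive name. The naming rule can be sketched as follows; the helper `protoc_url` and the mapping table are illustrative assumptions, and only the URL pattern is taken from `repo.bzl`:

```python
import platform

# Machine name as reported by the OS -> suffix used in protoc release archives.
_PROTOC_ARCH = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "aarch_64",
    "arm64": "aarch_64",
}


def protoc_url(version, machine=None):
    """Build the protoc download URL for a Linux host of the given architecture."""
    machine = (machine or platform.machine()).lower()
    if machine not in _PROTOC_ARCH:
        raise ValueError(f"no protoc archive mapping for {machine!r}")
    arch = _PROTOC_ARCH[machine]
    return (
        "https://github.com/protocolbuffers/protobuf/releases/download/"
        "v%s/protoc-%s-linux-%s.zip" % (version, version, arch)
    )
```

On a Jetson, `platform.machine()` normally reports `aarch64`, which selects the `aarch_64` archive used in the diff above.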
    

    In reverb/pip_package/build_pip_package.sh

    -  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG} --plat manylinux2010_x86_64 > /dev/null
    +  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG}  > /dev/null
    

    Build and install

    bash oss_build.sh --clean true --tf_dep_override "tensorflow=2.6.0" --release --python "3.6"
    bash ./bazel-bin/reverb/pip_package/build_pip_package --dst /tmp/reverb/dist/ --release
    pip3 install /tmp/reverb/dist/dm_reverb-*
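
After the wheel is installed, it can be worth confirming that the packages are visible to Python before removing the build directory. A small sketch, assuming the usual import names `tensorflow`, `tensorflow_probability` and `reverb`; the function `missing_modules` is hypothetical:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules the Jetson setup above is expected to provide.
REQUIRED = ["tensorflow", "tensorflow_probability", "reverb"]
```

`missing_modules(REQUIRED)` returns an empty list when everything is installed. It only checks that the modules can be located, not that they import and run correctly.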
    

    Clean up

    cd ../
    rm -R reverb/      
    
  3. Install RL-Toolkit

    pip3 install rl-toolkit
    
  4. Run (for Server)

    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 server
    

    Run (for Agent)

    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 agent --db_server localhost
    

    Run (for Learner)

    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 learner --db_server 192.168.1.2
    

    Run (for Tester)

    python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 tester -f save/model/actor.h5
    

Environments

Environment               Observation space  Observation bounds        Action space  Action bounds
BipedalWalkerHardcore-v3  (24, )             [-inf, inf]               (4, )         [-1.0, 1.0]
Walker2DBulletEnv-v0      (22, )             [-inf, inf]               (6, )         [-1.0, 1.0]
AntBulletEnv-v0           (28, )             [-inf, inf]               (8, )         [-1.0, 1.0]
HalfCheetahBulletEnv-v0   (26, )             [-inf, inf]               (6, )         [-1.0, 1.0]
HopperBulletEnv-v0        (15, )             [-inf, inf]               (3, )         [-1.0, 1.0]
HumanoidBulletEnv-v0      (44, )             [-inf, inf]               (17, )        [-1.0, 1.0]
MinitaurBulletEnv-v0      (28, )             [-167.72488, 167.72488]   (8, )         [-1.0, 1.0]
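
The table above can be kept as data and used to sanity-check actions before they are sent to an environment. The following is a sketch only: `EnvSpec`, `ENV_SPECS` and `clip_action` are hypothetical names, and the numbers are copied from the table rather than queried from the environments:

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSpec:
    obs_dim: int
    obs_low: float
    obs_high: float
    act_dim: int
    act_low: float = -1.0
    act_high: float = 1.0


ENV_SPECS = {
    "BipedalWalkerHardcore-v3": EnvSpec(24, -math.inf, math.inf, 4),
    "Walker2DBulletEnv-v0": EnvSpec(22, -math.inf, math.inf, 6),
    "AntBulletEnv-v0": EnvSpec(28, -math.inf, math.inf, 8),
    "HalfCheetahBulletEnv-v0": EnvSpec(26, -math.inf, math.inf, 6),
    "HopperBulletEnv-v0": EnvSpec(15, -math.inf, math.inf, 3),
    "HumanoidBulletEnv-v0": EnvSpec(44, -math.inf, math.inf, 17),
    "MinitaurBulletEnv-v0": EnvSpec(28, -167.72488, 167.72488, 8),
}


def clip_action(env_name, action):
    """Check the action length and clip each component to the action bounds."""
    spec = ENV_SPECS[env_name]
    if len(action) != spec.act_dim:
        raise ValueError(f"{env_name} expects {spec.act_dim} action values")
    return [min(max(a, spec.act_low), spec.act_high) for a in action]
```

For example, `clip_action("HopperBulletEnv-v0", [2.0, -0.5, -3.0])` returns `[1.0, -0.5, -1.0]`.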

Results

Environment               SAC + gSDE     SAC + gSDE + Huber loss  SAC + TQC + gSDE  SAC + TQC + gSDE + LogCosh + Reverb
BipedalWalkerHardcore-v3  13 ± 18(2)     -                        228 ± 18(2)       -
Walker2DBulletEnv-v0      2270 ± 28(1)   2732 ± 96                2535 ± 94(2)      -
AntBulletEnv-v0           3106 ± 61(1)   3460 ± 119               3700 ± 37(2)      -
HalfCheetahBulletEnv-v0   2945 ± 95(1)   3003 ± 226               3041 ± 157(2)     -
HopperBulletEnv-v0        2515 ± 50(1)   2555 ± 405               2401 ± 62(2)      -
HumanoidBulletEnv-v0      -              -                        -                 -
MinitaurBulletEnv-v0      -              -                        -                 -

Releases

  • SAC + gSDE + Huber loss: stored in branch r2.0
  • SAC + TQC + gSDE + LogCosh + Reverb: stored in branch r4.0

Frameworks: TensorFlow, Reverb, OpenAI Gym, PyBullet, WandB, OpenCV

Changes

v4.1.1 (September 2, 2022)

  • update default config.yaml

v4.1.0 (February 9, 2022)

Features 🔊

  • .fit()
  • AgentCallback

v4.0.0 (February 5, 2022)

Features 🔊

  • Render environments to WandB
  • Grouping of runs in WandB
  • SampleToInsertRatio rate limiter
  • Global Gradient Clipping to avoid exploding gradients
  • Softplus for numerical stability
  • YAML configuration file
  • LogCosh instead of Huber loss
  • Critic network with Add layer applied on state & action branches
  • Custom uniform initializer
  • XLA (Accelerated Linear Algebra) compiler
  • Optimized Replay Buffer (https://github.com/deepmind/reverb/issues/90)
  • split into Agent, Learner, Tester and Server
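
Two of the items above, LogCosh in place of Huber loss and global gradient clipping, are easy to illustrate in isolation. The sketch below uses plain Python floats and is not the toolkit's TensorFlow implementation; `delta` and `clip_norm` are illustrative parameters, not values taken from the project:

```python
import math


def huber(error, delta=1.0):
    """Huber loss: quadratic near zero, linear beyond `delta`."""
    a = abs(error)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def log_cosh(error):
    """log(cosh(x)), written so that large |x| does not overflow.

    Uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2).
    """
    a = abs(error)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def clip_by_global_norm(grads, clip_norm):
    """Rescale all gradient vectors together if their joint L2 norm exceeds clip_norm."""
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if global_norm <= clip_norm:
        return [list(vec) for vec in grads], global_norm
    scale = clip_norm / global_norm
    return [[g * scale for g in vec] for vec in grads], global_norm
```

Like Huber loss, log-cosh is roughly `0.5 * x**2` for small errors and roughly `|x| - log(2)` for large ones, but it is smooth everywhere and has no `delta` to tune. Clipping by the global norm scales every gradient by the same factor, so the update direction is preserved.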

Bug fixes 🛠️

  • Fixed creation of the save path for models
  • Fixed model's summary()

v3.2.4 (July 7, 2021)

Features 🔊

  • Reverb
  • setup.py (package is available on PyPI)
  • split into Agent, Learner and Tester
  • Use custom model and layer for defining Actor-Critic
  • MultiCritic - concatenating multiple critic networks into one network
  • Truncated Quantile Critics
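
The core step of Truncated Quantile Critics is to pool the quantile estimates of all critics, sort them, and discard the largest ones before forming the target, which counteracts overestimation. A minimal sketch of that truncation step only, with hypothetical names and plain lists instead of tensors (the toolkit's own implementation may differ):

```python
def truncate_quantiles(critic_quantiles, drop_per_critic):
    """Pool quantiles from all critics, sort ascending, drop the largest ones.

    critic_quantiles: one list of quantile estimates per critic.
    drop_per_critic:  number of top quantiles to discard for each critic.
    """
    pooled = sorted(q for critic in critic_quantiles for q in critic)
    n_drop = drop_per_critic * len(critic_quantiles)
    if not 0 <= n_drop < len(pooled):
        raise ValueError("drop_per_critic removes all quantiles or is negative")
    return pooled[: len(pooled) - n_drop]
```

With two critics predicting `[1.0, 5.0, 9.0]` and `[2.0, 6.0, 7.0]` and one dropped quantile per critic, the two largest pooled values (7.0 and 9.0) are removed and `[1.0, 2.0, 5.0, 6.0]` remains.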

v2.0.2 (May 23, 2021)

Features 🔊

  • update Dockerfile
  • update README.md
  • formatted code by Black & Flake8

v2.0.1 (April 27, 2021)

Bug fixes 🛠️

  • fixed Critic model

v2.0.0 (April 22, 2021)

Features 🔊

  • Add Huber loss
  • Render to a video file in test mode
  • Normalize observations with the min-max method
  • Remove TD3 algorithm
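
Min-max normalization rescales each observation component using the bounds of the observation space. A sketch under the assumption that the target range is [0, 1] (the changelog does not state the range) and that the bounds are finite; the function name is hypothetical:

```python
import math


def min_max_normalize(obs, low, high):
    """Map each value from [low, high] to [0, 1]; requires finite bounds."""
    if not (math.isfinite(low) and math.isfinite(high)):
        raise ValueError("min-max normalization needs finite bounds")
    if high <= low:
        raise ValueError("high must be greater than low")
    span = high - low
    return [(x - low) / span for x in obs]
```

This only applies to environments with finite observation bounds, such as MinitaurBulletEnv-v0 with [-167.72488, 167.72488]; environments whose bounds are [-inf, inf] cannot be normalized this way.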
