RL-Toolkit: A Research Framework for Robotics

RL Toolkit

Papers

Installation with PyPI

On PC AMD64 with Ubuntu/Debian

  1. Install dependencies
    apt update -y
    apt install swig -y
    
  2. Install RL-Toolkit
    pip3 install rl-toolkit[all]
    
  3. Run (for Server)
    rl_toolkit -c ./config/sac.yaml -a sac -e BipedalWalkerHardcore-v3 server
    
    Run (for Agent)
    rl_toolkit -c ./config/sac.yaml -a sac -e BipedalWalkerHardcore-v3 agent
    
    Run (for Learner)
    rl_toolkit -c ./config/sac.yaml -a sac -e BipedalWalkerHardcore-v3 learner --db_server 192.168.1.2
    
    Run (for Tester)
    rl_toolkit -c ./config/sac.yaml -a sac -e BipedalWalkerHardcore-v3 tester -f save/model/actor.h5
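The four run commands above share the same `-c`, `-a` and `-e` values and differ only in the role and its extra flags. A minimal sketch of a wrapper that keeps the shared part in one place is shown below; the function name `rl_role_cmd` and the variable names are hypothetical, and the function only prints the command so that it can be inspected before being run.

```shell
#!/bin/sh
# Shared settings, copied from the commands above.
CONFIG="./config/sac.yaml"
ALGO="sac"
ENV_ID="BipedalWalkerHardcore-v3"

# Hypothetical helper: print the rl_toolkit command line for one role.
# $1 is the role (server, agent, learner, tester); any further
# arguments are appended unchanged as role-specific flags.
rl_role_cmd() {
  role="$1"
  shift
  echo rl_toolkit -c "$CONFIG" -a "$ALGO" -e "$ENV_ID" "$role" "$@"
}

rl_role_cmd server
rl_role_cmd agent
rl_role_cmd learner --db_server 192.168.1.2
rl_role_cmd tester -f save/model/actor.h5
```

To execute a role instead of printing it, the output can be passed to the shell, for example `$(rl_role_cmd server)`; this assumes none of the values contain spaces.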
    

On NVIDIA Jetson

  1. Install dependencies
    Install TensorFlow for JetPack by following NVIDIA's installation instructions, then install SWIG:

    sudo apt install swig -y
    
  2. Install Reverb
    Download the Bazel 3.7.2 binary for arm64 (bazel-3.7.2-linux-arm64) and put it on your PATH:

    mkdir ~/bin
    mv ~/Downloads/bazel-3.7.2-linux-arm64 ~/bin/bazel
    chmod +x ~/bin/bazel
    export PATH=$PATH:~/bin
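The `export` above only affects the current shell, and repeating it appends `~/bin` again each time. The sketch below shows one way to add the directory only when it is missing and then confirm that a `bazel` binary is visible. It assumes the binary was moved to `~/bin/bazel` as in the commands above.

```shell
#!/bin/sh
# Append $HOME/bin to PATH only if it is not already listed.
case ":$PATH:" in
  *":$HOME/bin:"*) ;;                      # already present, nothing to do
  *) export PATH="$PATH:$HOME/bin" ;;
esac

# Report which bazel (if any) the shell now resolves.
if command -v bazel >/dev/null 2>&1; then
  bazel --version
else
  echo "bazel not found on PATH"
fi
```

To make the change persistent, the same `case` statement can be placed in the shell's startup file (for example `~/.bashrc`, if bash is the login shell).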
    

    Clone Reverb and check out the release that corresponds to the TensorFlow version installed on the NVIDIA Jetson:

    git clone https://github.com/deepmind/reverb
    cd reverb/
    git checkout r0.9.0
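Before settling on a tag, it may help to confirm which TensorFlow version is actually installed. The sketch below compares the installed major.minor version with 2.9, the series that this guide's build step targets (`tensorflow~=2.9.1`). The helper name `major_minor` and the variable names are hypothetical, and the only pairing assumed is the one this guide itself uses (r0.9.0 with TensorFlow 2.9).

```shell
#!/bin/sh
# Reduce a version string such as "2.9.1" to its "major.minor" part.
major_minor() {
  echo "$1" | cut -d. -f1,2
}

# TensorFlow series targeted by the build step in this guide.
EXPECTED_TF="2.9"

# Ask Python for the installed TensorFlow version; fall back to "unknown"
# when python3 or TensorFlow is not available.
INSTALLED_TF="$(python3 -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null || echo unknown)"

if [ "$(major_minor "$INSTALLED_TF")" = "$EXPECTED_TF" ]; then
  echo "TensorFlow $INSTALLED_TF found: r0.9.0 is the tag used in this guide."
else
  echo "TensorFlow is $INSTALLED_TF: pick the Reverb release built for it instead of r0.9.0."
fi
```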
    

    Make the following changes in Reverb before building.
    In .bazelrc

    - build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain
    + # build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain
    
    - build --copt=-mavx --copt=-DEIGEN_MAX_ALIGN_BYTES=64
    + build --copt=-DEIGEN_MAX_ALIGN_BYTES=64
    

    In WORKSPACE

    - PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
    + # PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
    + PROTOC_SHA256 = "7877fee5793c3aafd704e290230de9348d24e8612036f1d784c8863bc790082e"
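The replacement `PROTOC_SHA256` value is presumably the checksum of the aarch_64 protoc archive that the build downloads. If the archive is fetched by hand, a check along the following lines can confirm that it matches the value entered in WORKSPACE. The function name `verify_sha256` and the archive path are hypothetical; the path should be replaced with wherever the file was actually saved.

```shell
#!/bin/sh
# Succeed only if the SHA-256 digest of file $1 equals the hex string $2.
verify_sha256() {
  actual="$(sha256sum "$1" | awk '{print $1}')"
  [ "$actual" = "$2" ]
}

# Hypothetical location of the downloaded aarch_64 protoc archive.
ARCHIVE="${ARCHIVE:-./protoc-linux-aarch_64.zip}"
# Value taken from the WORKSPACE edit above.
EXPECTED="7877fee5793c3aafd704e290230de9348d24e8612036f1d784c8863bc790082e"

if [ -f "$ARCHIVE" ]; then
  if verify_sha256 "$ARCHIVE" "$EXPECTED"; then
    echo "checksum OK"
  else
    echo "checksum mismatch: do not build with this archive"
  fi
else
  echo "archive not found: $ARCHIVE"
fi
```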
    

    In oss_build.sh

    -  bazel test -c opt --copt=-mavx --config=manylinux2010 --test_output=errors //reverb/cc/...
    +  bazel test -c opt --copt="-march=armv8-a+crypto" --test_output=errors //reverb/cc/...
    
    # Builds Reverb and creates the wheel package.
    -  bazel build -c opt --copt=-mavx $EXTRA_OPT --config=manylinux2010 reverb/pip_package:build_pip_package
    +  bazel build -c opt --copt="-march=armv8-a+crypto" $EXTRA_OPT reverb/pip_package:build_pip_package
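The edits in this step swap x86-only settings such as `-mavx` for ARM ones such as `-march=armv8-a+crypto`, so they only make sense on an ARM64 host. A small guard like the one below can be run before patching; the function name `is_arm64` is hypothetical.

```shell
#!/bin/sh
# Succeed if the given machine name denotes a 64-bit ARM host.
is_arm64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

HOST_ARCH="$(uname -m)"
if is_arm64 "$HOST_ARCH"; then
  echo "Host is $HOST_ARCH: the ARM build flags apply."
else
  echo "Host is $HOST_ARCH: keep the original x86_64 build settings."
fi
```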
    

    In reverb/cc/platform/default/repo.bzl

    urls = [
       -        "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-x86_64.zip" % (version, version),
       +        "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-aarch_64.zip" % (version, version),
    ]
    

    In reverb/pip_package/build_pip_package.sh

    -  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG} --plat manylinux2010_x86_64 > /dev/null
    +  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG}  > /dev/null
    

    Build and install

    bash oss_build.sh --clean true --tf_dep_override "tensorflow~=2.9.1" --release --python "3.8"
    bash ./bazel-bin/reverb/pip_package/build_pip_package --dst /tmp/reverb/dist/ --release
    pip3 install /tmp/reverb/dist/dm_reverb-*
    

    Cleaning

    cd ../
    rm -R reverb/
    
  3. Install RL-Toolkit

    pip3 install rl-toolkit
    

Environments

Environment               Observation space  Observation bounds  Action space  Action bounds       Reward bounds
BipedalWalkerHardcore-v3  (24, )             [-inf, inf]         (4, )         [-1.0, 1.0]         [-1.0, 1.0]
FlappyBird-v0             (16, 180)          [0, dmax]           (2, )         {DO NOTHING, FLAP}  [-1.0, 1.0]

Results

Environment               SAC + gSDE  SAC + gSDE + Huber loss  SAC + TQC + gSDE  Q-Learning  RL-Toolkit
BipedalWalkerHardcore-v3  13 ± 18(1)  239 ± 118                228 ± 18(1)       -           205 ± 134
FlappyBird-v0             -           -                        -                 209.298(2)  13 156

Releases


Frameworks: TensorFlow, DeepMind Reverb, Gymnasium, DeepMind Control Suite, WandB, OpenCV
