RL-Toolkit: A Research Framework for Robotics

RL Toolkit

Papers

Installation with PyPI

On PC AMD64 with Ubuntu/Debian

Install dependencies
```
apt update -y
apt install swig -y
```
Install RL-Toolkit
```
pip3 install rl-toolkit[all]
```

Run (for Server)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 server
```

Run (for Agent)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 agent --db_server localhost
```

Run (for Learner)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 learner --db_server 192.168.1.2
```

Run (for Tester)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 tester -f save/model/actor.h5
```
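The four run commands share the same prefix and differ only in the trailing subcommand and its flags. The following is a minimal sketch of a launcher that assembles them. `build_command` and `ROLE_ARGS` are hypothetical helpers written for illustration, not part of rl_toolkit, and the sketch uses only the flags that appear in the commands above.

```python
import shlex

# Role-specific arguments, copied from the run commands in this README.
ROLE_ARGS = {
    "server": [],
    "agent": ["--db_server", "localhost"],
    "learner": ["--db_server", "192.168.1.2"],
    "tester": ["-f", "save/model/actor.h5"],
}


def build_command(role, config="./rl_toolkit/config.yaml", env="MinitaurBulletEnv-v0"):
    """Return the argument list for one role (hypothetical helper)."""
    if role not in ROLE_ARGS:
        raise ValueError(f"unknown role: {role!r}")
    prefix = ["python3", "-m", "rl_toolkit", "-c", config, "-e", env]
    return prefix + [role] + ROLE_ARGS[role]


if __name__ == "__main__":
    # Print one shell-quoted command line per role.
    for role in ROLE_ARGS:
        print(shlex.join(build_command(role)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when the config path contains spaces.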

On NVIDIA Jetson

Install dependencies

TensorFlow for JetPack: follow the official installation instructions.

```
apt update -y
apt install swig -y
pip3 install 'tensorflow-probability==0.14.1'
```

Install Reverb

Download Bazel 3.7.2 for arm64 from GitHub, then make it available on your PATH:

```
mv ~/Downloads/bazel-3.7.2-linux-arm64 ~/bin/bazel
chmod +x ~/bin/bazel
export PATH=$PATH:~/bin
```

Clone Reverb and check out the version that corresponds to the TensorFlow version installed on the NVIDIA Jetson:

```
git clone https://github.com/deepmind/reverb
cd reverb/
git checkout r0.5.0   # for TF 2.6.0
```

Make the following changes in Reverb before building.

In .bazelrc

```
- build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain
+ # build:manylinux2010 --crosstool_top=//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain

- build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
+ build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=1"

- build --copt=-mavx --copt=-DEIGEN_MAX_ALIGN_BYTES=64
+ build --copt=-DEIGEN_MAX_ALIGN_BYTES=64
```

In WORKSPACE

```
- PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
+ # PROTOC_SHA256 = "15e395b648a1a6dda8fd66868824a396e9d3e89bc2c8648e3b9ab9801bea5d55"
+ PROTOC_SHA256 = "7877fee5793c3aafd704e290230de9348d24e8612036f1d784c8863bc790082e"
```

In oss_build.sh

```
-  if [ "$python_version" = "3.7" ]; then
+  if [ "$python_version" = "3.6" ]; then
+    export PYTHON_BIN_PATH=/usr/bin/python3.6 && export PYTHON_LIB_PATH=/usr/local/lib/python3.6/dist-packages
+    ABI=cp36
+  elif [ "$python_version" = "3.7" ]; then

-  bazel test -c opt --copt=-mavx --config=manylinux2010 --test_output=errors //reverb/cc/...
+  bazel test -c opt --copt="-march=armv8-a+crypto" --test_output=errors //reverb/cc/...

   # Builds Reverb and creates the wheel package.
-  bazel build -c opt --copt=-mavx $EXTRA_OPT --config=manylinux2010 reverb/pip_package:build_pip_package
+  bazel build -c opt --copt="-march=armv8-a+crypto" $EXTRA_OPT reverb/pip_package:build_pip_package
   ./bazel-bin/reverb/pip_package/build_pip_package --dst $OUTPUT_DIR $PIP_PKG_EXTRA_ARGS
```

In reverb/cc/platform/default/repo.bzl

```
   urls = [
-      "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-x86_64.zip" % (version, version),
+      "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-aarch_64.zip" % (version, version),
   ]
```

In reverb/pip_package/build_pip_package.sh

```
-  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG} --plat manylinux2010_x86_64 > /dev/null
+  "${PYTHON_BIN_PATH}" setup.py bdist_wheel ${PKG_NAME_FLAG} ${RELEASE_FLAG} ${TF_VERSION_FLAG}  > /dev/null
```

Build and install

```
bash oss_build.sh --clean true --tf_dep_override "tensorflow=2.6.0" --release --python "3.6"
bash ./bazel-bin/reverb/pip_package/build_pip_package --dst /tmp/reverb/dist/ --release
pip3 install /tmp/reverb/dist/dm_reverb-*
```

Cleaning

```
cd ../
rm -R reverb/
```

Install RL-Toolkit
```
pip3 install rl-toolkit
```

Run (for Server)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 server
```

Run (for Agent)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 agent --db_server localhost
```

Run (for Learner)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 learner --db_server 192.168.1.2
```

Run (for Tester)
```
python3 -m rl_toolkit -c ./rl_toolkit/config.yaml -e MinitaurBulletEnv-v0 tester -f save/model/actor.h5
```

Environments

| Environment | Observation space | Observation bounds | Action space | Action bounds |
| --- | --- | --- | --- | --- |
| BipedalWalkerHardcore-v3 | (24, ) | [-inf, inf] | (4, ) | [-1.0, 1.0] |
| Walker2DBulletEnv-v0 | (22, ) | [-inf, inf] | (6, ) | [-1.0, 1.0] |
| AntBulletEnv-v0 | (28, ) | [-inf, inf] | (8, ) | [-1.0, 1.0] |
| HalfCheetahBulletEnv-v0 | (26, ) | [-inf, inf] | (6, ) | [-1.0, 1.0] |
| HopperBulletEnv-v0 | (15, ) | [-inf, inf] | (3, ) | [-1.0, 1.0] |
| HumanoidBulletEnv-v0 | (44, ) | [-inf, inf] | (17, ) | [-1.0, 1.0] |
| MinitaurBulletEnv-v0 | (28, ) | [-167.72488, 167.72488] | (8, ) | [-1.0, 1.0] |
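Every environment in the table uses the same action bounds, so a policy output can be validated against the listed action dimension and clipped into [-1.0, 1.0] before it is sent to the simulator. The sketch below is illustrative only: `ENV_SPECS` and `clip_action` are hypothetical names, the dimensions are copied from the table, and nothing here is taken from rl_toolkit's code.

```python
# (observation dimension, action dimension), copied from the Environments table.
ENV_SPECS = {
    "BipedalWalkerHardcore-v3": (24, 4),
    "Walker2DBulletEnv-v0": (22, 6),
    "AntBulletEnv-v0": (28, 8),
    "HalfCheetahBulletEnv-v0": (26, 6),
    "HopperBulletEnv-v0": (15, 3),
    "HumanoidBulletEnv-v0": (44, 17),
    "MinitaurBulletEnv-v0": (28, 8),
}

# Action bounds shared by all listed environments.
ACTION_LOW, ACTION_HIGH = -1.0, 1.0


def clip_action(env_name, action):
    """Check the action length and clip each component into the bounds."""
    _, action_dim = ENV_SPECS[env_name]
    if len(action) != action_dim:
        raise ValueError(f"{env_name} expects {action_dim} action values, got {len(action)}")
    return [min(max(float(a), ACTION_LOW), ACTION_HIGH) for a in action]
```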

Results

| Environment | SAC + gSDE | SAC + gSDE + Huber loss | SAC + TQC + gSDE | SAC + TQC + gSDE + LogCosh + Reverb |
| --- | --- | --- | --- | --- |
| BipedalWalkerHardcore-v3 | 13 ± 18⁽²⁾ | - | 228 ± 18⁽²⁾ | - |
| Walker2DBulletEnv-v0 | 2270 ± 28⁽¹⁾ | 2732 ± 96 | 2535 ± 94⁽²⁾ | - |
| AntBulletEnv-v0 | 3106 ± 61⁽¹⁾ | 3460 ± 119 | 3700 ± 37⁽²⁾ | - |
| HalfCheetahBulletEnv-v0 | 2945 ± 95⁽¹⁾ | 3003 ± 226 | 3041 ± 157⁽²⁾ | - |
| HopperBulletEnv-v0 | 2515 ± 50⁽¹⁾ | 2555 ± 405 | 2401 ± 62⁽²⁾ | - |
| HumanoidBulletEnv-v0 | - | - | - | - |
| MinitaurBulletEnv-v0 | - | - | - | - |
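The table does not say how each "value ± spread" entry was computed. Assuming the entries are the mean and sample standard deviation of episode returns (an assumption, not something this README confirms), a summary in the same format could be produced as follows. `summarize` is a hypothetical helper and the returns in the usage note are placeholder numbers, not measured results.

```python
import statistics


def summarize(returns):
    """Format episode returns as 'mean ± std' (sample standard deviation)."""
    mean = statistics.mean(returns)
    # A single episode has no spread; report 0 instead of raising.
    std = statistics.stdev(returns) if len(returns) > 1 else 0.0
    return f"{mean:.0f} ± {std:.0f}"
```

For example, `summarize([10.0, 20.0, 30.0])` returns `"20 ± 10"`.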

Releases

SAC + gSDE + Huber loss is stored in branch r2.0.
SAC + TQC + gSDE + LogCosh + Reverb is stored in branch r4.0.

Frameworks: TensorFlow, Reverb, OpenAI Gym, PyBullet, WandB, OpenCV

Changes

v4.1.1 (September 2, 2022)

update default config.yaml

v4.1.0 (February 9, 2022)

Features 🔊

.fit()
AgentCallback

v4.0.0 (February 5, 2022)

Features 🔊

Render environments to WandB
Grouping of runs in WandB
SampleToInsertRatio rate limiter
Global Gradient Clipping to avoid exploding gradients
Softplus for numerical stability
YAML configuration file
LogCosh instead of Huber loss
Critic network with Add layer applied on state & action branches
Custom uniform initializer
XLA (Accelerated Linear Algebra) compiler
Optimized Replay Buffer (https://github.com/deepmind/reverb/issues/90)
split into Agent, Learner, Tester and Server
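The v4.0.0 feature list replaces the Huber loss with LogCosh. Both losses are roughly quadratic for small errors and roughly linear for large ones, but log-cosh is smooth everywhere and has no threshold parameter. The sketch below uses the textbook scalar definitions in plain Python to show the difference. It is not rl_toolkit's implementation, and the function names and the `delta` default are assumptions made for illustration.

```python
import math


def huber(error, delta=1.0):
    """Textbook Huber loss: quadratic within |error| <= delta, linear outside."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error * error
    return delta * (abs_error - 0.5 * delta)


def log_cosh(error):
    """log(cosh(error)) in an overflow-safe form.

    Uses log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2), so large
    errors never reach an overflowing cosh() call.
    """
    abs_error = abs(error)
    return abs_error + math.log1p(math.exp(-2.0 * abs_error)) - math.log(2.0)


if __name__ == "__main__":
    for e in (0.1, 1.0, 10.0):
        print(f"error={e}: huber={huber(e):.4f}, log_cosh={log_cosh(e):.4f}")
```

For large errors log-cosh approaches |error| - log(2), so its gradient saturates at ±1 much like the linear branch of the Huber loss with `delta=1.0`.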

Bug fixes 🛠️

Fixed creation of the save path for models
Fixed model's summary()

v3.2.4 (July 7, 2021)

Features 🔊

Reverb
setup.py (package is available on PyPI)
split into Agent, Learner and Tester
Use custom model and layer for defining Actor-Critic
MultiCritic - concatenating multiple critic networks into one network
Truncated Quantile Critics

v2.0.2 (May 23, 2021)

Features 🔊

update Dockerfile
update README.md
formatted code by Black & Flake8

v2.0.1 (April 27, 2021)

Bug fixes 🛠️

fixed Critic model

v2.0.0 (April 22, 2021)

Features 🔊

Add Huber loss
Rendering to a video file in test mode
Observations normalized with the min-max method
Remove TD3 algorithm
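Min-max normalization, listed among the v2.0.0 features, rescales each observation component using the bounds of the observation space. The sketch below maps values from [low, high] to [0, 1]. The target range and the function name are assumptions, since the changelog does not give them. The approach needs finite bounds, which among the environments listed in this README only MinitaurBulletEnv-v0 has.

```python
def min_max_normalize(observation, low, high):
    """Scale each component from [low, high] to [0, 1]."""
    if not high > low:
        raise ValueError("high must be greater than low")
    span = high - low
    return [(x - low) / span for x in observation]


if __name__ == "__main__":
    # Observation bounds of MinitaurBulletEnv-v0, taken from the Environments table.
    bound = 167.72488
    print(min_max_normalize([-bound, 0.0, bound], -bound, bound))
```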

Release history

| Version | Release date |
| --- | --- |
| 5.0.0 | Jan 11, 2025 |
| 4.1.1 (this version) | Sep 2, 2022 |
| 4.1.0 | Feb 9, 2022 |
| 4.0.0 | Feb 5, 2022 |
| 3.2.5 | Aug 3, 2021 |
| 3.2.4 | Jul 7, 2021 |
| 3.2.3 | Jun 6, 2021 |
| 3.2.2 | Jun 6, 2021 |
| 3.2.1 | Jun 6, 2021 |
| 3.2.0 | Jun 4, 2021 |
| 3.1.9 | Jun 3, 2021 |
| 3.1.8 | Jun 2, 2021 |
| 3.1.7 | Jun 2, 2021 |
| 3.1.6 | Jun 2, 2021 |
| 3.1.5 | Jun 2, 2021 |
| 3.1.4 | Jun 2, 2021 |
| 3.1.3 | Jun 2, 2021 |
| 3.1.2 | Jun 2, 2021 |
| 3.1.1 | Jun 2, 2021 |
| 3.1.0 | Jun 2, 2021 |
| 3.0.9 | Jun 1, 2021 |
| 3.0.8 | Jun 1, 2021 |
| 3.0.7 | Jun 1, 2021 |
| 3.0.6 | Jun 1, 2021 |
| 3.0.5 | Jun 1, 2021 |
| 3.0.4 | Jun 1, 2021 |
| 3.0.3 | Jun 1, 2021 |
| 3.0.2 | Jun 1, 2021 |
| 3.0.1 | Jun 1, 2021 |

Download files

Source Distribution

rl-toolkit-4.1.1.tar.gz (20.0 kB), uploaded Sep 2, 2022 (Source)

Built Distribution

rl_toolkit-4.1.1-py3-none-any.whl (23.6 kB), uploaded Sep 2, 2022 (Python 3)

File details

Details for the file rl-toolkit-4.1.1.tar.gz.

File metadata

Download URL: rl-toolkit-4.1.1.tar.gz
Upload date: Sep 2, 2022
Size: 20.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for rl-toolkit-4.1.1.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `d9cf5ce00718a64b729e46e806cc92d0eee8a2433029cde3ca1834b3cc879659` |
| MD5 | `e3722afcc7414a9d60d182a3e2324a02` |
| BLAKE2b-256 | `8496c3ee90511a7deed55e1fe7d8b9cc0e81a107a649984662bfe05f6d3d7ca0` |

File details

Details for the file rl_toolkit-4.1.1-py3-none-any.whl.

File metadata

Download URL: rl_toolkit-4.1.1-py3-none-any.whl
Upload date: Sep 2, 2022
Size: 23.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for rl_toolkit-4.1.1-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `69d72edd01a2695744a79d2df1eb61d3e1c5da79dabf3a92bac4b23bb8240a79` |
| MD5 | `57a0bea6a4377dd89ed7c8e936064c01` |
| BLAKE2b-256 | `f9e06a09ccf4a6f72c64be95f10986fd9a1610eaa5240a7ee6aae6f31e368a35` |
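A downloaded file can be checked against the SHA256 digests listed above before it is installed. The following is a minimal sketch using only Python's standard `hashlib`. The helper names `sha256_of` and `verify` are hypothetical, and the local path passed to `verify` is whatever location you downloaded the file to.

```python
import hashlib

# SHA256 digests copied from the "File hashes" tables in this document.
EXPECTED_SHA256 = {
    "rl-toolkit-4.1.1.tar.gz": "d9cf5ce00718a64b729e46e806cc92d0eee8a2433029cde3ca1834b3cc879659",
    "rl_toolkit-4.1.1-py3-none-any.whl": "69d72edd01a2695744a79d2df1eb61d3e1c5da79dabf3a92bac4b23bb8240a79",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, filename):
    """Return True if the file at `path` matches the digest listed for `filename`."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```

pip can enforce the same check at install time through its hash-checking mode, where the digests are pinned in a requirements file with `--hash=sha256:...`.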
