Two-level RL for robot navigation

Highrl: Multi-level Reinforcement Learning for Robotics Navigation

Highrl is a library for training robots with reinforcement learning (RL) under a multi-level scheme. Its features include generating random environments, training agents that generate curriculum-learning schemes, and training robots in pre-defined environments.

The robot explores a synthetic gym environment using lidar observations (in either flat or rings format). It is trained to reach a goal while obstacles hinder its movement, simulating a real-world environment. A teacher (itself an RL agent) is trained to synthesize a curriculum for the robot, so that the robot learns to solve maps of a given difficulty in minimal time.

The robot model uses a CNN as the feature extractor and an MLP for both the value and policy networks. The teacher model uses an LSTM network as the feature extractor and an MLP for the value and policy networks. The robot model takes lidar data as input and outputs the robot's velocity (vx, vy). The teacher model takes data from the last session of the robot it is training and outputs the configuration of the next environment to train the robot in. At each teacher step, a new robot is generated with a probability of 10% and trained for a fixed number of steps. The models can be found in src/highrl/policy.
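The teacher step described above can be pictured with a small, library-independent sketch. It is not the highrl implementation: the function name `run_teacher`, its parameters, and the log format are hypothetical, and it assumes that whichever robot is current at a teacher step is the one trained for the fixed number of steps.

```python
import random


def run_teacher(num_teacher_steps, robot_train_steps, new_robot_prob=0.1, seed=0):
    """Toy teacher loop (hypothetical, not highrl's code).

    At each teacher step a fresh robot replaces the current one with
    probability `new_robot_prob`; the current robot is then "trained"
    for a fixed number of steps.
    """
    rng = random.Random(seed)
    robot_id = 0  # assumption: the teacher starts with one robot
    log = []
    for step in range(num_teacher_steps):
        # rng.random() is in [0, 1), so this fires with probability new_robot_prob.
        if rng.random() < new_robot_prob:
            robot_id += 1
        # Placeholder for: teacher outputs an environment configuration,
        # robot trains in that environment for robot_train_steps steps.
        log.append((step, robot_id, robot_train_steps))
    return log


if __name__ == "__main__":
    for step, robot_id, steps in run_teacher(num_teacher_steps=10, robot_train_steps=100):
        print(f"teacher step {step}: robot {robot_id} trains for {steps} steps")
```

With the default probability of 0.1, a new robot appears on average once every ten teacher steps, so most consecutive teacher steps act on the same robot.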

Installation

Please note that the library has only been tested on Linux distributions. If you want to install it on Windows, you can use WSL.

Use the package manager pip to install the highrl library.

pip install highrl
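To confirm that the installation succeeded, one option is to ask Python's standard `importlib.metadata` module for the installed version. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of highrl.

```python
from importlib import metadata


def installed_version(package="highrl"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if highrl is installed, otherwise None.
print(installed_version())
```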

Usage

highrl -h # get available arguments

Configurations

--robot-config: path of configuration file of robot environment (relative path).

--teacher-config: path of configuration file of teacher environment (relative path).

--mode: choose train or test mode.

--env-mode: choose whether to train the robot alone or in the presence of a teacher to generate curriculum learning.

--render-each: the frequency of rendering for robot environment (integer).

--output-path: relative path to output results for robot mode.

--lidar-mode: how lidar data is processed: flat (1D) or rings (2D).
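The options above can be summarized as a small `argparse` parser. This is only an illustrative mirror of the list, not highrl's actual parser: the program name `highrl-sketch` is made up, the literal values `train`/`test` and `flat`/`rings` are assumed from the descriptions, no defaults are known, and the output option is left out. Run `highrl -h` for the authoritative list.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options (not highrl's own)."""
    parser = argparse.ArgumentParser(prog="highrl-sketch")
    parser.add_argument("--robot-config", type=str,
                        help="relative path to the robot environment config file")
    parser.add_argument("--teacher-config", type=str,
                        help="relative path to the teacher environment config file")
    # Assumed literal values, taken from "choose train or test mode".
    parser.add_argument("--mode", choices=["train", "test"])
    # Accepted values are not listed in the README, so any string is allowed here.
    parser.add_argument("--env-mode", type=str)
    parser.add_argument("--render-each", type=int,
                        help="rendering frequency for the robot environment")
    # Assumed literal values, taken from "flat=1D, rings=2D".
    parser.add_argument("--lidar-mode", choices=["flat", "rings"])
    return parser


args = build_parser().parse_args(
    ["--mode", "train", "--lidar-mode", "rings", "--render-each", "50"]
)
print(args.mode, args.lidar_mode, args.render_each)
```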

Example

highrl --render-each=50 --output-dir=~/Desktop
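The same command can be launched from a Python script with the standard `subprocess` module. The sketch below is an assumption-laden wrapper, not part of highrl: the helper names `build_command` and `run_highrl` are hypothetical, and the output flag is spelled `--output-dir` as in the example command, whereas the option list spells it `--output-path`, so check `highrl -h` for the spelling your version accepts.

```python
import os
import shlex
import shutil
import subprocess


def build_command(render_each, output_dir):
    """Build the argument list for the example command (hypothetical helper)."""
    return [
        "highrl",
        f"--render-each={render_each}",
        # No shell is involved, so "~" must be expanded explicitly.
        f"--output-dir={os.path.expanduser(output_dir)}",
    ]


def run_highrl(cmd):
    """Run the command if the highrl executable is on PATH; otherwise return None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False)


cmd = build_command(50, "~/Desktop")
print(shlex.join(cmd))  # the command line that run_highrl(cmd) would execute
```

Passing the arguments as a list avoids shell quoting issues, at the cost of having to expand `~` by hand.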

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT
