Elegant Implementations of Offline Safe Reinforcement Learning Algorithms

OSRL (Offline Safe Reinforcement Learning) offers a collection of elegant and extensible implementations of state-of-the-art offline safe reinforcement learning (RL) algorithms. Aimed at propelling research in offline safe RL, OSRL serves as a solid foundation to implement, benchmark, and iterate on safe RL solutions.

The OSRL package is a crucial component of our larger benchmarking suite for offline safe learning, which also includes DSRL and FSRL, and is built to facilitate the development of robust and reliable offline safe RL solutions.

To learn more, please visit our project website.

Structure

The structure of this repo is as follows:

├── examples
│   ├── configs  # the training configs of each algorithm
│   ├── eval     # the evaluation scripts
│   ├── train    # the training scripts
├── osrl
│   ├── algorithms  # offline safe RL algorithms
│   ├── common      # base networks and utils

The implemented offline safe RL and imitation learning algorithms include:

| Algorithm | Type | Description |
|---|---|---|
| BCQ-Lag | Q-learning | BCQ with PID Lagrangian |
| BEAR-Lag | Q-learning | BEAR with PID Lagrangian |
| CPQ | Q-learning | Constraints Penalized Q-learning (CPQ) |
| COptiDICE | Distribution Correction Estimation | Offline Constrained Policy Optimization via stationary DIstribution Correction Estimation |
| CDT | Sequential Modeling | Constrained Decision Transformer |
| BC-All | Imitation Learning | Behavior Cloning with all datasets |
| BC-Safe | Imitation Learning | Behavior Cloning with safe trajectories |
| BC-Frontier | Imitation Learning | Behavior Cloning with high-reward trajectories |
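
The "PID Lagrangian" in BCQ-Lag and BEAR-Lag refers to adjusting a Lagrange multiplier on the cost constraint with a PID controller. The following is a minimal sketch of that general idea, not OSRL's actual implementation: the class name, gain values, and `cost_limit` default are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIDLagrangian:
    """Toy PID controller for a Lagrange multiplier (illustrative only)."""

    kp: float = 0.1           # proportional gain (hypothetical default)
    ki: float = 0.01          # integral gain (hypothetical default)
    kd: float = 0.0           # derivative gain (hypothetical default)
    cost_limit: float = 10.0  # target upper bound on the episode cost
    integral: float = 0.0     # running sum of constraint violations
    prev_cost: float = 0.0    # cost seen at the previous update

    def update(self, episode_cost: float) -> float:
        """Return the non-negative multiplier for the latest cost estimate."""
        error = episode_cost - self.cost_limit
        # The integral term accumulates violations but never drops below zero.
        self.integral = max(0.0, self.integral + error)
        # The derivative term reacts only when the cost is increasing.
        derivative = max(0.0, episode_cost - self.prev_cost)
        self.prev_cost = episode_cost
        multiplier = (
            self.kp * error + self.ki * self.integral + self.kd * derivative
        )
        # A Lagrange multiplier for an inequality constraint must be >= 0.
        return max(0.0, multiplier)
```

In a Lagrangian-style method, the returned multiplier would typically weight the cost term in the policy objective, so the penalty grows while the cost estimate exceeds the limit and shrinks back to zero once the constraint is satisfied.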

Installation

Clone the repo and install:

git clone https://github.com/liuzuxin/OSRL.git
cd OSRL
pip install -e .
pip install OApackage==2.7.6

How to use OSRL

Example usage lives in the examples folder, which contains the training and evaluation scripts for all the algorithms. The parameters and default configs for each algorithm are in the examples/configs folder. OSRL uses the WandbLogger from FSRL, and the offline datasets and environments are provided by DSRL, so make sure you install both packages first.
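
Since OSRL depends on both DSRL and FSRL, it can help to confirm that all three packages are importable before starting a run. The sketch below uses only the standard library; the import names `osrl`, `dsrl`, and `fsrl` are assumptions based on the package names, so adjust them if your installation differs.

```python
from importlib.util import find_spec


def check_packages(names):
    """Map each top-level module name to True if it can be located."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names for OSRL, DSRL, and FSRL.
    for name, found in check_packages(["osrl", "dsrl", "fsrl"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```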

Training

For example, to train the bcql method, run the training script and override any default parameters on the command line:

python examples/train/train_bcql.py --task OfflineCarCircle-v0 --param1 args1 ...

By default, the config file and the training logs are written to the logs/ folder, and the training plots can be viewed online using Wandb.

You can also launch a batch of experiments, either sequentially or in parallel, via the EasyRunner package; see examples/train_all_tasks.py for details.
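
If you only need to run a few configurations one after another, a plain standard-library loop may be enough. The sketch below is not EasyRunner and not the contents of examples/train_all_tasks.py; it simply builds the command shown above for each task. The `seed` override is a hypothetical parameter used for illustration, so check examples/configs for the names each algorithm actually accepts.

```python
import subprocess
import sys


def build_train_command(script, task, **overrides):
    """Build the argument list for one training run."""
    cmd = [sys.executable, script, "--task", task]
    for key, value in overrides.items():
        cmd += [f"--{key}", str(value)]
    return cmd


def run_sequentially(commands, dry_run=True):
    """Run commands one after another; with dry_run, only print them."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop if a training run fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    tasks = ["OfflineCarCircle-v0"]  # add further DSRL task names here
    commands = [
        # "seed" is a hypothetical override, used only as an example.
        build_train_command("examples/train/train_bcql.py", task, seed=0)
        for task in tasks
    ]
    run_sequentially(commands)  # pass dry_run=False to actually train
```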

Evaluation

To evaluate a trained agent, for example a BCQ-Lag agent, run:

python examples/eval/eval_bcql.py --path path_to_model --eval_episodes 20

The script loads the config file from path_to_model/config.yaml and the model file from path_to_model/checkpoints/model.pt, runs 20 episodes, and prints the average normalized reward and cost.
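
Because the evaluation script expects a fixed layout under path_to_model, a quick pre-flight check can catch a wrong path before any environment is created. This is a small sketch that assumes only the two file locations described above; the function name `resolve_model_files` is hypothetical and not part of OSRL.

```python
from pathlib import Path


def resolve_model_files(path_to_model):
    """Return (config, checkpoint) paths, raising if either is missing."""
    root = Path(path_to_model)
    config = root / "config.yaml"
    checkpoint = root / "checkpoints" / "model.pt"
    missing = [str(p) for p in (config, checkpoint) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing: " + ", ".join(missing))
    return config, checkpoint
```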
