omnisafe

A comprehensive and reliable benchmark for safe reinforcement learning.

Reason this release was yanked:

Some inevitable bugs appear in this v0.2.1.

This library is currently under heavy development - if you have suggestions on the API or use-cases you'd like to be covered, please open a GitHub issue or reach out. We'd love to hear about how you're using the library.

OmniSafe is an infrastructural framework designed to accelerate safe reinforcement learning (RL) research by providing a comprehensive and reliable benchmark for safe RL algorithms. RL has great potential to benefit society, but RL algorithms can cause unintended harm or behave unsafely, which remains a significant concern. Safe RL aims to develop algorithms that minimize these risks, yet the field currently lacks a commonly recognized benchmark for safe RL algorithms.

OmniSafe addresses these issues by providing more than 40 experimentally validated algorithms and a sound and efficient simulation environment. Researchers can use OmniSafe to conduct experiments and verify their ideas, ensuring consistency and enabling more efficient development of safe RL algorithms. By using OmniSafe as a benchmark, researchers can evaluate the performance of their own safe RL algorithms and contribute to the advancement of safe RL research.

- Implemented Algorithms
  - Latest SafeRL Papers
  - List of Algorithms
- Installation
  - Prerequisites
    - Install from source
    - Install from PyPI
  - Examples
    - Try with CLI
- Getting Started
- Changelog
- The OmniSafe Team
- License

Implemented Algorithms

The supported interface algorithms currently include:

Latest SafeRL Papers

[AAAI 2023] Augmented Proximal Policy Optimization for Safe Reinforcement Learning (APPO)
[NeurIPS 2022] Constrained Update Projection Approach to Safe Policy Optimization (CUP)
[NeurIPS 2022] Effects of Safety State Augmentation on Safe Exploration (Simmer)
[NeurIPS 2022] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
[ICML 2022] Sauté RL: Almost Surely Safe Reinforcement Learning Using State Augmentation (SauteRL)
[ICML 2022] Constrained Variational Policy Optimization for Safe Reinforcement Learning (CVPO)
[IJCAI 2022] Penalized Proximal Policy Optimization for Safe Reinforcement Learning
[ICLR 2022] Constrained Policy Optimization via Bayesian World Models (LA-MBDA)
[AAAI 2022] Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (CAP)

List of Algorithms

Installation

Prerequisites

OmniSafe requires Python 3.8+ and PyTorch 1.10+.
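
The version requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the helper names `parse_release`, `meets_minimum` and `check_prerequisites` are made up for this example and are not part of OmniSafe. It assumes PyTorch is installed under the distribution name `torch`.

```python
import sys


def parse_release(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    # Drop a local build tag such as "+cu117" before splitting on dots.
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """True if the numeric part of `version` is at least `minimum`."""
    return parse_release(version) >= tuple(minimum)


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < (3, 8):
        problems.append("Python 3.8+ is required")
        return problems
    # importlib.metadata is available from Python 3.8 onwards.
    from importlib import metadata

    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch_version, (1, 10)):
            problems.append("PyTorch 1.10+ is required, found " + torch_version)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

An empty result only says that the two stated minimums are met; it does not guarantee that a particular PyTorch build works with this release.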

Install from source

# Clone the repo
git clone https://github.com/PKU-MARL/omnisafe
cd omnisafe

# Create a conda environment
conda create -n omnisafe python=3.8
conda activate omnisafe

# Install omnisafe
pip install -e .

Install from PyPI

OmniSafe is hosted on PyPI.

pip install omnisafe

Examples

cd examples
python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 --total-steps 1024000 --device cpu --vector-env-nums 1 --torch-threads 1

algo:

Type	Name
`Base-On-Policy`	`PolicyGradient, PPO` `NaturalPG, TRPO`
`Base-Off-Policy`	`DDPG, TD3, SAC`
`Naive Lagrange`	`RCPO, PPOLag, TRPOLag` `DDPGLag, TD3Lag, SACLag`
`PID Lagrange`	`CPPOPid, TRPOPid`
`First Order`	`FOCOPS, CUP`
`Second Order`	`SDDPG, CPO, PCPO`
`Saute RL`	`PPOSaute, PPOLagSaute`
`Simmer RL`	`PPOSimmerQ, PPOSimmerPid` `PPOLagSimmerQ, PPOLagSimmerPid`
`EarlyTerminated`	`PPOEarlyTerminated` `PPOLagEarlyTerminated`
`Model-Based`	`CAP, MBPPOLag, SafeLOOP`

env-id: Environment id in Safety Gymnasium. Below is a list of the environments that Safety Gymnasium supports.

Category	Task	Agent	Example
Safe Navigation	Goal[012]	Point, Car, Racecar, Ant	SafetyPointGoal1-v0
	Button[012]
	Push[012]
	Circle[012]
Safe Velocity	Velocity	HalfCheetah, Hopper, Swimmer, Walker2d, Ant, Humanoid	SafetyHumanoidVelocity-v4

For more information about environments, please refer to Safety Gymnasium.
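
The two example ids in the table suggest a naming pattern: `Safety` + agent + task + level + `-v0` for Safe Navigation, and `Safety` + agent + `Velocity-v4` for Safe Velocity. The sketch below composes ids under that assumption. The pattern is inferred from the examples only, the helper names are hypothetical, and not every combination is guaranteed to be registered in Safety Gymnasium.

```python
# Agents, tasks and levels copied from the table above.
NAVIGATION_AGENTS = ("Point", "Car", "Racecar", "Ant")
NAVIGATION_TASKS = ("Goal", "Button", "Push", "Circle")
NAVIGATION_LEVELS = (0, 1, 2)
VELOCITY_AGENTS = ("HalfCheetah", "Hopper", "Swimmer", "Walker2d", "Ant", "Humanoid")


def navigation_env_id(agent, task, level):
    """Compose a Safe Navigation id, assuming the pattern Safety{agent}{task}{level}-v0."""
    if agent not in NAVIGATION_AGENTS:
        raise ValueError("unknown navigation agent: " + agent)
    if task not in NAVIGATION_TASKS:
        raise ValueError("unknown navigation task: " + task)
    if level not in NAVIGATION_LEVELS:
        raise ValueError("level must be 0, 1 or 2")
    return f"Safety{agent}{task}{level}-v0"


def velocity_env_id(agent):
    """Compose a Safe Velocity id, assuming the pattern Safety{agent}Velocity-v4."""
    if agent not in VELOCITY_AGENTS:
        raise ValueError("unknown velocity agent: " + agent)
    return f"Safety{agent}Velocity-v4"


# Candidate ids; whether each one exists must be checked against Safety Gymnasium.
candidate_ids = [
    navigation_env_id(agent, task, level)
    for agent in NAVIGATION_AGENTS
    for task in NAVIGATION_TASKS
    for level in NAVIGATION_LEVELS
]

if __name__ == "__main__":
    print(navigation_env_id("Point", "Goal", 1))
    print(velocity_env_id("Humanoid"))
```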

parallel: Number of parallel processes.

Try with CLI


pip install omnisafe

omnisafe --help # Ask for help

omnisafe benchmark --help # 'benchmark' can also be replaced with 'eval', 'train', or 'train-config'

# Quick benchmarking for your research. Just specify: 1. exp_name, 2. num_pool (how many processes run concurrently), 3. the path of the config file (refer to omnisafe/examples/benchmarks for the format)
omnisafe benchmark test_benchmark 2 ./saved_source/benchmark_config.yaml

# Quickly evaluate and render your trained policy. Just specify: 1. the path of the algorithm you trained
omnisafe eval ./saved_source/PPO-{SafetyPointGoal1-v0} --num-episode 1

# Quickly train some algorithms to validate your ideas
# Note: use `key1:key2` to select hyperparameter keys that are nested recursively, and use `--custom-cfgs` to add custom cfgs via the CLI
omnisafe train --algo PPO --total-steps 1024 --vector-env-nums 1 --custom-cfgs algo_cfgs:update_cycle --custom-cfgs 512

# Quickly train some algorithms via a saved config file; the format is the same as the default format
omnisafe train-config ./saved_source/train_config.yaml
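
The `key1:key2` notation in the `omnisafe train` command addresses a nested hyperparameter. As a rough illustration, a colon-separated key path and its value correspond to a nested dictionary of the kind used for custom cfgs in the Python API. The helper `expand_key_path` is hypothetical and is not OmniSafe's actual parser; how the CLI converts the value string `512` to a number is not covered here.

```python
def expand_key_path(key_path, value):
    """Turn 'a:b:c' and a value into {'a': {'b': {'c': value}}}."""
    nested = value
    # Wrap the value from the innermost key outwards.
    for key in reversed(key_path.split(":")):
        nested = {key: nested}
    return nested


if __name__ == "__main__":
    # Mirrors `--custom-cfgs algo_cfgs:update_cycle --custom-cfgs 512` from the command above.
    print(expand_key_path("algo_cfgs:update_cycle", 512))
```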

Getting Started

Important Hints

train_cfgs:torch_threads is especially important for training speed. The best value varies with the user's machine; it should be neither too small nor too large.
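
One possible starting point for this value, offered purely as an assumption and not as a recommendation from the OmniSafe authors, is to split the available CPU cores evenly among the parallel processes and then tune from there. The helper `suggest_torch_threads` is hypothetical.

```python
import os


def suggest_torch_threads(parallel=1, cpu_count=None):
    """Heuristic starting value: available cores divided by parallel processes, at least 1."""
    cores = cpu_count or os.cpu_count() or 1
    workers = max(1, parallel)
    return max(1, cores // workers)


if __name__ == "__main__":
    # The result could be passed as --torch-threads and then adjusted after timing a short run.
    print(suggest_torch_threads(parallel=1))
```

Timing a short training run with a few neighbouring values is the only reliable way to find a good setting on a given machine.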

1. Run Agent from preset yaml file

import omnisafe

env_id = 'SafetyPointGoal1-v0'
agent = omnisafe.Agent('PPOLag', env_id)
agent.learn()

2. Run agent with custom cfg

import omnisafe

env_id = 'SafetyPointGoal1-v0'
custom_cfgs = {
    'train_cfgs': {
        'total_steps': 1024000,
        'vector_env_nums': 1,
        'parallel': 1,
    },
    'algo_cfgs': {
        'update_cycle': 2048,
        'update_iters': 1,
    },
    'logger_cfgs': {
        'use_wandb': False,
    },
}
agent = omnisafe.Agent('PPOLag', env_id, custom_cfgs=custom_cfgs)
agent.learn()
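
Conceptually, a custom cfg dictionary like the one above overrides individual values while the remaining settings keep their defaults. The sketch below shows one way such a recursive override can work. It is an illustration, not OmniSafe's implementation; `merge_cfgs` is a hypothetical helper and the default values are invented for the example.

```python
import copy


def merge_cfgs(defaults, overrides):
    """Return a new dict in which `overrides` replaces matching keys of `defaults`, recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend instead of replacing the whole branch.
            merged[key] = merge_cfgs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, invented for illustration only.
defaults = {
    "train_cfgs": {"total_steps": 10_000_000, "device": "cpu"},
    "algo_cfgs": {"update_cycle": 20_000, "update_iters": 40},
}
custom_cfgs = {
    "train_cfgs": {"total_steps": 1024000},
    "algo_cfgs": {"update_cycle": 2048, "update_iters": 1},
}
merged = merge_cfgs(defaults, custom_cfgs)

if __name__ == "__main__":
    print(merged)
```

Keys absent from `custom_cfgs`, such as `device` here, keep their default values, and the `defaults` dictionary itself is left unchanged.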

3. Run Agent from custom terminal config

You can also run an agent with a configuration passed on the command line. Any configuration value can alternatively be set in the corresponding yaml file.

For example, you can run a PPOLag agent on the SafetyPointGoal1-v0 environment with total_steps=1024000, vector_env_nums=1 and parallel=1 by:

cd examples
python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 --total-steps 1024000 --device cpu --vector-env-nums 1 --torch-threads 1

4. Evaluate Saved Policy

import os

import omnisafe


# Fill in your experiment's log directory here.
# For example: ~/omnisafe/runs/SafetyPointGoal1-v0/CPO/seed-000-2022-12-25_14-45-05
LOG_DIR = ''

evaluator = omnisafe.Evaluator()
for item in os.scandir(os.path.join(LOG_DIR, 'torch_save')):
    if item.is_file() and item.name.split('.')[-1] == 'pt':
        evaluator.load_saved_model(save_dir=LOG_DIR, model_name=item.name)
        evaluator.render(num_episode=10, camera_name='track', width=256, height=256)
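
The loop above picks every `.pt` file inside the `torch_save` subdirectory of the log directory. The same filter can be tried out on its own, without OmniSafe installed, as in the sketch below. `find_checkpoints` is a hypothetical helper and the file names in the demonstration are made up.

```python
import tempfile
from pathlib import Path


def find_checkpoints(log_dir):
    """Return the names of the .pt files directly inside <log_dir>/torch_save, sorted by name."""
    save_dir = Path(log_dir) / "torch_save"
    return sorted(
        entry.name
        for entry in save_dir.iterdir()
        if entry.is_file() and entry.suffix == ".pt"
    )


# Demonstration on a throwaway directory (file names are invented for illustration).
with tempfile.TemporaryDirectory() as tmp:
    save_dir = Path(tmp) / "torch_save"
    save_dir.mkdir()
    (save_dir / "epoch-0.pt").touch()
    (save_dir / "epoch-10.pt").touch()
    (save_dir / "notes.txt").touch()
    (save_dir / "backup.pt.d").mkdir()
    found = find_checkpoints(tmp)

print(found)
```

The names are sorted as plain strings, so numeric suffixes are not ordered numerically; each returned name is what the loop above would pass as `model_name`.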

Changelog

See CHANGELOG.md.

The OmniSafe Team

OmniSafe is mainly developed by the SafeRL research team directed by Prof. Yaodong Yang. Our SafeRL research team members include Borong Zhang, Jiayi Zhou, JTao Dai, Weidong Huang, Ruiyang Sun, Xuehai Pan and Jiaming Ji. If you have any questions while using OmniSafe, don't hesitate to ask on the GitHub issue page; we will reply within 2-3 working days.

License

OmniSafe is released under Apache License 2.0.

Release history

0.5.1b0 pre-release

May 4, 2024

0.5.0

May 4, 2024

0.5.0b0 pre-release

Jun 1, 2023

0.4.0

May 8, 2023

0.3.0

Apr 1, 2023

0.2.2

Mar 27, 2023

This version

0.2.1 yanked

Mar 27, 2023

Reason this release was yanked:

Some inevitable bugs appear in this v0.2.1.

0.2.0 yanked

Mar 26, 2023

Reason this release was yanked:

Some inevitable bugs appear in this v0.2.0.

0.1.0 yanked

Mar 15, 2023

Reason this release was yanked:

Some inevitable bugs appear in this v0.1.0.

0.0.1 yanked

Sep 26, 2022

Reason this release was yanked:

Some inevitable bugs appear in this v0.0.1.

Download files


Source Distribution

omnisafe-0.2.1.tar.gz (99.5 kB)

Uploaded Mar 27, 2023 Source

Built Distribution

omnisafe-0.2.1-py2.py3-none-any.whl (166.0 kB)

Uploaded Mar 27, 2023 Python 2 Python 3

Hashes for omnisafe-0.2.1.tar.gz

Algorithm	Hash digest
SHA256	`73f02f159d04ba660177ab9a168fc9335b2e9b3aaaa6c036e9c2ae563b72f465`
MD5	`04ad730cf681b43fbea07e64c2966add`
BLAKE2b-256	`a6ab6324399e9e0252754b66d0a48e3e62dac98dbb21c2bfea974eaa57960624`

Hashes for omnisafe-0.2.1-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`489002c475cc5215727da564039430e3cd4557d45e725d3f0bf117b68e8a4fed`
MD5	`9fc39cc3246d0bdc10eacb08c24d7433`
BLAKE2b-256	`7b22fd401252b356d4bdffcf36c8ea9826fcfc67766355f29520e03b3b63229c`
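
A downloaded file can be compared against the SHA256 digests listed above. The following is a minimal sketch using Python's `hashlib`; the helper names are made up for this example, and the self-check at the end runs on a throwaway file rather than on a real download.

```python
import hashlib
import os
import tempfile

# SHA256 digests copied from the tables above.
PUBLISHED_SHA256 = {
    "omnisafe-0.2.1.tar.gz": "73f02f159d04ba660177ab9a168fc9335b2e9b3aaaa6c036e9c2ae563b72f465",
    "omnisafe-0.2.1-py2.py3-none-any.whl": "489002c475cc5215727da564039430e3cd4557d45e725d3f0bf117b68e8a4fed",
}


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """True if the file's digest equals the published digest for its file name."""
    expected = PUBLISHED_SHA256.get(os.path.basename(path))
    return expected is not None and sha256_of_file(path) == expected


# Self-check on a throwaway file with known contents.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as handle:
        handle.write(b"abc")
    sample_digest = sha256_of_file(sample)
    sample_is_published = matches_published_hash(sample)
```

Calling `matches_published_hash` with the path of a downloaded archive returns `False` both when the digest differs and when the file name is not one of the two listed above.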
