
A comprehensive and reliable benchmark for safe reinforcement learning.

Reason this release was yanked:

Some unavoidable bugs appeared in this v0.2.0 release.

Project description


Documentation | Implemented Algorithms | Installation | Getting Started | License


This library is currently under heavy development. If you have suggestions on the API or use cases you'd like covered, please open a GitHub issue or reach out. We'd love to hear how you're using the library.

OmniSafe is an infrastructural framework designed to accelerate safe reinforcement learning (RL) research by providing a comprehensive and reliable benchmark for safe RL algorithms. RL has great potential to benefit society, but RL algorithms can exhibit unintended harmful or unsafe behavior, which raises significant safety concerns. Safe RL aims to develop algorithms that minimize this risk, yet the field currently lacks a commonly recognized benchmark for safe RL algorithms.

OmniSafe addresses these issues by providing more than 40 experimentally validated algorithms and a sound and efficient simulation environment. Researchers can use OmniSafe to conduct experiments and verify their ideas, ensuring consistency and enabling more efficient development of safe RL algorithms. By using OmniSafe as a benchmark, researchers can evaluate the performance of their own safe RL algorithms and contribute to the advancement of safe RL research.




Implemented Algorithms

The currently supported algorithms include:

Latest SafeRL Papers

List of Algorithms

On-Policy Safe

Off-Policy Safe

Model-Based Safe

Offline Safe

Others


Installation

Prerequisites

OmniSafe requires Python 3.8+ and PyTorch 1.10+.
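If you are unsure whether your environment meets these requirements, a minimal check (assuming PyTorch is already installed) is:

import sys

import torch

# OmniSafe requires Python 3.8+ and PyTorch 1.10+ (see above).
assert sys.version_info >= (3, 8), 'Python 3.8+ is required'
print(f'Python {sys.version.split()[0]}, PyTorch {torch.__version__}')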

Install from source

# Clone the repo
git clone https://github.com/PKU-MARL/omnisafe
cd omnisafe

# Create a conda environment
conda create -n omnisafe python=3.8
conda activate omnisafe

# Install omnisafe
pip install -e .

Install from PyPI

OmniSafe is hosted on PyPI.

pip install omnisafe
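As a quick smoke test, importing the package should succeed (a minimal sketch; a top-level __version__ attribute is assumed to be exposed):

import omnisafe

# If this import succeeds, the installation is functional.
print(omnisafe.__version__)  # assumed top-level attribute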

Examples

cd examples
python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 --total-steps 1024000 --device cpu --vector-env-nums 1 --torch-threads 1

Try with CLI

A video example


pip install omnisafe

omnisafe --help # Ask for help

omnisafe benchmark --help # 'benchmark' can also be replaced with 'eval', 'train', or 'train-config'

# Quick benchmarking for your research; just specify: 1. exp_name, 2. num_pool (how many processes run concurrently), 3. the path of the config file (refer to omnisafe/examples/benchmarks for the format)
omnisafe benchmark test_benchmark 2 "./saved_source/benchmark_config.yaml"

# Quickly evaluate and render your trained policy; just specify the path of the algorithm you trained
omnisafe eval ./saved_source/PPO-{SafetyPointGoal1-v0} "--num-episode" "1"

# Quickly train some algorithms to validate your ideas
# Note: with `key1:key2` you can select hyperparameter keys that are recursively nested, and with `--custom-cfgs` you can add custom configs via the CLI
omnisafe train --algo PPO --total-steps 1024 --vector-env-nums 1 --custom-cfgs algo_cfgs:update_cycle --custom-cfgs 512

# Quickly train some algorithms from a saved config file; the format is the same as the default config format
omnisafe train-config "./saved_source/train_config.yaml"
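The --custom-cfgs flags pass nested keys and values in alternating order, so `algo_cfgs:update_cycle` followed by `512` overrides that single nested entry. As a rough sketch, the `omnisafe train` command above corresponds to the following Python API call (the environment id here is chosen for illustration; the CLI uses its own default):

import omnisafe

# Mirror of the CLI flags above: --total-steps 1024, --vector-env-nums 1,
# and --custom-cfgs algo_cfgs:update_cycle --custom-cfgs 512.
custom_cfgs = {
    'train_cfgs': {'total_steps': 1024, 'vector_env_nums': 1},
    'algo_cfgs': {'update_cycle': 512},
}
agent = omnisafe.Agent('PPO', 'SafetyPointGoal1-v0', custom_cfgs=custom_cfgs)
agent.learn()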

algo:

Type              Name
Base-On-Policy    PolicyGradient, PPO, NaturalPG, TRPO
Base-Off-Policy   DDPG, TD3, SAC
Naive Lagrange    RCPO, PPOLag, TRPOLag, DDPGLag, TD3Lag, SACLag
PID Lagrange      CPPOPid, TRPOPid
First Order       FOCOPS, CUP
Second Order      SDDPG, CPO, PCPO
Saute RL          PPOSaute, PPOLagSaute
Simmer RL         PPOSimmerQ, PPOSimmerPid, PPOLagSimmerQ, PPOLagSimmerPid
EarlyTerminated   PPOEarlyTerminated, PPOLagEarlyTerminated
Model-Based       CAP, MBPPOLag, SafeLOOP
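Any name from the table is a valid algo value both for the CLI and for the omnisafe.Agent constructor used in Getting Started below. For example, a minimal sketch using the second-order method CPO (API as shown in the Getting Started examples):

import omnisafe

# 'CPO' is listed under 'Second Order' in the table above.
agent = omnisafe.Agent('CPO', 'SafetyPointGoal1-v0')
agent.learn()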

env-id: Environment id in Safety-Gymnasium; below is a list of environments that safety-gymnasium supports.

Category          Task                                             Agent                                                   Example
Safe Navigation   Goal[012], Button[012], Push[012], Circle[012]   Point, Car, Racecar, Ant                                SafetyPointGoal1-v0
Safe Velocity     Velocity                                         HalfCheetah, Hopper, Swimmer, Walker2d, Ant, Humanoid   SafetyHumanoidVelocity-v4

For more information about environments, please refer to Safety-Gymnasium.
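The Safe Navigation ids in the table appear to follow the pattern Safety{Agent}{Task}{Level}-v0 (a pattern inferred from the examples above; verify any id you construct against Safety-Gymnasium):

# Enumerate candidate Safe Navigation ids from the table's apparent pattern.
agents = ['Point', 'Car', 'Racecar', 'Ant']
tasks = ['Goal', 'Button', 'Push', 'Circle']
levels = [0, 1, 2]

env_ids = [f'Safety{agent}{task}{level}-v0'
           for agent in agents for task in tasks for level in levels]
print(env_ids[:3])  # ['SafetyPointGoal0-v0', 'SafetyPointGoal1-v0', 'SafetyPointGoal2-v0']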

parallel: Number of parallel training processes.


Getting Started

Important Hints

  • train_cfgs:torch_threads is especially important for training speed and varies with the user's machine; this value should be neither too small nor too large. A sketch of setting it is shown below.
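Following the custom_cfgs pattern from example 2 below, the thread count could be set like this (a sketch only; 4 is an arbitrary placeholder, so match it to your machine's core count):

import omnisafe

# train_cfgs:torch_threads from the hint above; 4 is a placeholder value.
custom_cfgs = {'train_cfgs': {'torch_threads': 4}}
agent = omnisafe.Agent('PPOLag', 'SafetyPointGoal1-v0', custom_cfgs=custom_cfgs)
agent.learn()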

1. Run Agent from preset yaml file

import omnisafe

env_id = 'SafetyPointGoal1-v0'
agent = omnisafe.Agent('PPOLag', env_id)
agent.learn()

2. Run agent with custom cfg

import omnisafe

env_id = 'SafetyPointGoal1-v0'
custom_cfgs = {
    'train_cfgs': {
        'total_steps': 1024000,
        'vector_env_nums': 1,
        'parallel': 1,
    },
    'algo_cfgs': {
        'update_cycle': 2048,
        'update_iters': 1,
    },
    'logger_cfgs': {
        'use_wandb': False,
    },
}
agent = omnisafe.Agent('PPOLag', env_id, custom_cfgs=custom_cfgs)
agent.learn()

3. Run Agent from custom terminal config

You can also run an agent from a custom terminal config; any config key that appears in the corresponding YAML file can be set from the command line.

For example, you can run a PPOLag agent on the SafetyPointGoal1-v0 environment with total_steps=1024000, vector_env_nums=1 and parallel=1 by running:

cd examples
python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 --total-steps 1024000 --device cpu --vector-env-nums 1 --torch-threads 1

4. Evaluate Saved Policy

import os

import omnisafe


# Fill in your experiment's log directory here.
# Such as: ~/omnisafe/runs/SafetyPointGoal1-v0/CPO/seed-000-2022-12-25_14-45-05
LOG_DIR = ''

evaluator = omnisafe.Evaluator()
for item in os.scandir(os.path.join(LOG_DIR, 'torch_save')):
    if item.is_file() and item.name.split('.')[-1] == 'pt':
        evaluator.load_saved_model(save_dir=LOG_DIR, model_name=item.name)
        evaluator.render(num_episode=10, camera_name='track', width=256, height=256)

Changelog

See CHANGELOG.md.

The OmniSafe Team

OmniSafe is mainly developed by the SafeRL research team directed by Prof. Yaodong Yang. Our SafeRL research team members include: Borong Zhang, Jiayi Zhou, Juntao Dai, Weidong Huang, Ruiyang Sun, Xuehai Pan and Jiaming Ji. If you have any questions while using OmniSafe, don't hesitate to ask on the GitHub issue page; we will reply within 2-3 working days.

License

OmniSafe is released under Apache License 2.0.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

omnisafe-0.2.0.tar.gz (91.1 kB, Source)

Built Distribution

omnisafe-0.2.0-py2.py3-none-any.whl (156.0 kB, Python 2 / Python 3)

File details

Details for the file omnisafe-0.2.0.tar.gz.

File metadata

  • Download URL: omnisafe-0.2.0.tar.gz
  • Upload date:
  • Size: 91.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.16

File hashes

Hashes for omnisafe-0.2.0.tar.gz

Algorithm     Hash digest
SHA256        ffba319bcac52eab5f811938848079b6a4552c6757245d175b3a9f6e0560902b
MD5           598be2a3e3d1f2d79e429e5b43582569
BLAKE2b-256   68962d0b294134c142653a6f56a54ea7f01d0af0ac5085ab39b892dd7a2f353b

See more details on using hashes here.
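To verify a downloaded archive against the SHA256 digest above, a minimal standard-library sketch:

import hashlib

# Published SHA256 digest for omnisafe-0.2.0.tar.gz (from the table above).
EXPECTED = 'ffba319bcac52eab5f811938848079b6a4552c6757245d175b3a9f6e0560902b'

with open('omnisafe-0.2.0.tar.gz', 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print('OK' if digest == EXPECTED else 'MISMATCH')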

File details

Details for the file omnisafe-0.2.0-py2.py3-none-any.whl.

File metadata

  • Download URL: omnisafe-0.2.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 156.0 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.16

File hashes

Hashes for omnisafe-0.2.0-py2.py3-none-any.whl

Algorithm     Hash digest
SHA256        7bfad1d6accb4273dbef7314351c778cfce8b4ad41d930f36ae96abf5f322255
MD5           7608e047db1a513aa9f2f5473cfc2f4e
BLAKE2b-256   b625d867508da38eb1beb8a75d113f427b1ebd16619b67d775f61cfb34c6f98f

See more details on using hashes here.
