
Egocentric 3D Safe Reinforcement Learning Benchmark


HASARD: A Benchmark for Harnessing Safe Reinforcement Learning with Doom

HASARD (Harnessing Safe Reinforcement Learning with Doom) is a benchmark for Safe Reinforcement Learning in complex 3D environments with egocentric perception, derived from the classic DOOM video game. It features 6 diverse scenarios, each spanning 3 levels of difficulty. A short demo of HASARD is available on YouTube.

Scenarios, each available at Level 1, Level 2, and Level 3:

  • Armament Burden
  • Detonator’s Dilemma
  • Volcanic Venture
  • Precipice Plunge
  • Collateral Damage
  • Remedy Rush

Key Features

  • Egocentric Perception: Agents learn solely from first-person pixel observations under partial observability.
  • Beyond Simple Navigation: Whereas prior benchmarks merely require the agent to reach goal locations on flat surfaces while avoiding obstacles, HASARD necessitates comprehending complex environment dynamics, anticipating the movement of entities, and grasping spatial relationships.
  • Dynamic Environments: HASARD features random spawns, unpredictably moving units, and terrain that is constantly moving or periodically changing.
  • Difficulty Levels: Higher levels go beyond parameter adjustments, introducing entirely new elements and mechanics.
  • Reward-Cost Trade-offs: Rewards and costs are closely intertwined, so a tighter cost budget necessitates sacrificing some reward.
  • Safety Constraints: Each scenario features a hard constraint setting, where any error results in immediate in-game penalties.
  • Focus on Safety: Achieving high rewards is straightforward, but doing so while staying within the safety budget demands learning complex and nuanced behaviors.
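
The reward-cost trade-off and safety budget described above can be made concrete with a small sketch. The helper below is hypothetical and not part of the HASARD API: it assumes you have already collected the total cost of several episodes and want to compare them against a cost budget. The budget value and the cost numbers are arbitrary illustrations, not figures from the benchmark.

```python
from statistics import mean


def summarize_safety(episode_costs, cost_budget):
    """Compare a batch of per-episode cost totals against a cost budget.

    Hypothetical helper for illustration; not part of the HASARD API.
    """
    if not episode_costs:
        raise ValueError("episode_costs must not be empty")
    # Flag each episode whose accumulated cost stays within the budget.
    within = [cost <= cost_budget for cost in episode_costs]
    return {
        "mean_cost": mean(episode_costs),
        "fraction_within_budget": sum(within) / len(within),
        "budget_satisfied_on_average": mean(episode_costs) <= cost_budget,
    }


# Made-up episode costs and an arbitrary budget of 5.0.
summary = summarize_safety([3.0, 7.0, 5.0, 1.0], cost_budget=5.0)
print(summary)
```

Reporting both the mean cost and the fraction of episodes within budget is a design choice: an agent can satisfy the budget on average while still exceeding it in individual episodes.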

Policy Visualization

HASARD enables overlaying a heatmap of the agent's most frequently visited locations, providing further insight into its policy and behavior within the environment. The accompanying examples show how an agent navigates Volcanic Venture, Remedy Rush, and Armament Burden over the course of training.
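
As a rough illustration of what a visitation heatmap is built from, the sketch below bins logged (x, y) positions into a grid of visit counts. It is a generic sketch, not HASARD's implementation: the function name, the coordinate ranges, and the assumption that agent positions have been logged as (x, y) pairs are all hypothetical.

```python
def visit_heatmap(positions, x_range, y_range, bins):
    """Count how often positions fall into each cell of a bins x bins grid.

    positions: iterable of (x, y) map coordinates, assumed to be logged by the user.
    Hypothetical helper for illustration; not part of the HASARD API.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = [[0] * bins for _ in range(bins)]
    for x, y in positions:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore positions outside the mapped area
        # Clamp so that points exactly on the upper edge land in the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        row = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        grid[row][col] += 1
    return grid


# Made-up trajectory; the last point lies outside the mapped area and is skipped.
trajectory = [(0.5, 0.5), (0.6, 0.4), (3.9, 3.9), (4.0, 4.0), (9.0, 9.0)]
heatmap = visit_heatmap(trajectory, x_range=(0.0, 4.0), y_range=(0.0, 4.0), bins=4)
for row in heatmap:
    print(row)
```

The resulting counts could then be normalized and blended over a top-down view of the map with any plotting library.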

Augmented Observations

HASARD supports augmented observation modes for further visual analysis. Using privileged game state information, it can generate simplified observation representations, such as a segmentation of the objects in the scene or a rendering that shows only the depth of the surroundings.

Installation

To install HASARD, clone or download the repository and run the following from its root directory:

$ pip install .

Getting Started

Below we provide a short code snippet to run a HASARD task.

import hasard

# Create a Level 1 instance of the Armament Burden scenario.
env = hasard.make('ArmamentBurdenLevel1-v0')
env.reset()
terminated = truncated = False
steps = total_cost = total_reward = 0
# Act randomly until the episode terminates or is truncated.
while not (terminated or truncated):
    action = env.action_space.sample()
    # step() returns the safety cost separately from the reward.
    state, reward, cost, terminated, truncated, info = env.step(action)
    env.render()
    steps += 1
    total_cost += cost
    total_reward += reward
print(f"Episode finished in {steps} steps. Reward: {total_reward:.2f}. Cost: {total_cost:.2f}")
env.close()
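
The single-episode loop above extends naturally to averaging over several episodes. The sketch below assumes only what the snippet shows: `reset()` starts an episode and `step()` returns six values with the cost in third position. `run_episode`, `evaluate`, and `DummySafeEnv` are hypothetical names introduced here; the dummy environment is a stand-in so the example runs without HASARD installed.

```python
class DummySafeEnv:
    """Hypothetical stand-in that mimics the six-value step() shown above."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        reward = 1.0
        cost = 0.5 if action == 1 else 0.0  # action 1 is the "unsafe" one
        truncated = self.t >= self.horizon  # end the episode after `horizon` steps
        return self.t, reward, cost, False, truncated, {}


def run_episode(env, choose_action):
    """Run one episode and return (steps, total_reward, total_cost)."""
    env.reset()
    terminated = truncated = False
    steps, total_reward, total_cost = 0, 0.0, 0.0
    while not (terminated or truncated):
        _, reward, cost, terminated, truncated, _ = env.step(choose_action())
        steps += 1
        total_reward += reward
        total_cost += cost
    return steps, total_reward, total_cost


def evaluate(env, choose_action, episodes):
    """Average steps, reward, and cost over several episodes."""
    results = [run_episode(env, choose_action) for _ in range(episodes)]
    return {
        "mean_steps": sum(r[0] for r in results) / episodes,
        "mean_reward": sum(r[1] for r in results) / episodes,
        "mean_cost": sum(r[2] for r in results) / episodes,
    }


# A fixed policy that always picks the unsafe action in the dummy environment.
stats = evaluate(DummySafeEnv(horizon=5), choose_action=lambda: 1, episodes=3)
print(stats)
```

With a HASARD environment, the same functions could presumably be called as `evaluate(env, env.action_space.sample, episodes=10)`, since `choose_action` only needs to be a zero-argument callable that returns an action.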

Acknowledgements

HASARD environments are built on top of the ViZDoom platform.
Our Safe RL baseline methods are implemented in Sample-Factory.
Our experiments were managed using WandB.

