Egocentric 3D Safe Reinforcement Learning Benchmark

HASARD: A Benchmark for Harnessing Safe Reinforcement Learning with Doom

HASARD (Harnessing Safe Reinforcement Learning with Doom) is a benchmark for Safe Reinforcement Learning in complex 3D environments with egocentric perception, derived from the classic DOOM video game. It features six diverse scenarios, each spanning three levels of difficulty.

Scenarios (each available at Level 1, Level 2, and Level 3):

Armament Burden
Detonator’s Dilemma
Volcanic Venture
Precipice Plunge
Collateral Damage
Remedy Rush

Key Features

Egocentric Perception: Agents learn solely from first-person pixel observations under partial observability.
Beyond Simple Navigation: Whereas prior benchmarks merely require the agent to reach goal locations on flat surfaces while avoiding obstacles, HASARD necessitates comprehending complex environment dynamics, anticipating the movement of entities, and grasping spatial relationships.
Dynamic Environments: HASARD features random spawns, unpredictably moving units, and terrain that is constantly moving or periodically changing.
Difficulty Levels: Higher levels go beyond parameter adjustments, introducing entirely new elements and mechanics.
Reward-Cost Trade-offs: Rewards and costs are closely intertwined, so a tighter cost budget forces the agent to sacrifice reward.
Safety Constraints: Each scenario features a hard constraint setting, where any error results in immediate in-game penalties.
Focus on Safety: Achieving high rewards is straightforward, but doing so while staying within the safety budget demands learning complex and nuanced behaviors.
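
The reward-cost trade-off in the feature list is commonly handled in Safe RL by penalizing cost with a learned Lagrange multiplier. The sketch below illustrates that general idea only; it is not taken from HASARD's baselines, and the function names, learning rate, and budget value are illustrative assumptions.

```python
def update_lagrange_multiplier(lam, mean_episode_cost, cost_budget, lr=0.01):
    """Take one gradient-ascent step on the multiplier, clipped at zero.

    The multiplier grows while the mean cost exceeds the budget and
    shrinks once the agent is back within it.
    """
    return max(0.0, lam + lr * (mean_episode_cost - cost_budget))


def penalized_return(episode_reward, episode_cost, lam):
    """Objective a policy would maximize under the current multiplier."""
    return episode_reward - lam * episode_cost


lam = 0.0
# Hypothetical mean episodic costs from successive training iterations,
# evaluated against an assumed cost budget of 5.
for mean_cost in [12.0, 9.0, 6.0, 4.0]:
    lam = update_lagrange_multiplier(lam, mean_cost, cost_budget=5.0)
```

A tighter budget keeps the multiplier higher for longer, which is one way to see why a stricter safety budget costs reward.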

Policy Visualization

HASARD can overlay a heatmap of the agent's most frequently visited locations, giving further insight into its policy and behavior within the environment. The project's examples show how an agent navigates Volcanic Venture, Remedy Rush, and Armament Burden over the course of training.
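
A visitation heatmap of this kind boils down to counting how often the agent's position falls into each cell of a grid laid over the map. The following is a minimal sketch of that counting step, assuming (x, y) positions have already been logged; the function name, grid size, and sample positions are hypothetical and not part of HASARD's API.

```python
def visit_heatmap(positions, x_range, y_range, bins=4):
    """Count how often (x, y) positions fall into each cell of a bins x bins grid."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    grid = [[0] * bins for _ in range(bins)]
    for x, y in positions:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore positions outside the mapped area
        # Clamp so that points on the upper edge land in the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        row = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        grid[row][col] += 1
    return grid


# Hypothetical agent positions logged during one episode.
positions = [(0.5, 0.5), (0.6, 0.4), (3.9, 3.9), (4.0, 4.0), (9.0, 1.0)]
heatmap = visit_heatmap(positions, x_range=(0.0, 4.0), y_range=(0.0, 4.0))
```

Rendering the counts as a color overlay on the map is a separate step that depends on the plotting library in use.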

Augmented Observations

HASARD supports augmented observation modes for further visual analysis. Using privileged game state information, it can generate simplified observation representations, such as a segmentation of the objects in the scene or a rendering that shows only the depth of the surroundings.
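
Conceptually, a segmentation view replaces each pixel with a color chosen by the object at that pixel, and a depth view rescales per-pixel distances into a displayable range. The toy sketch below shows both transformations on nested lists; the palette, object IDs, and maximum depth are invented for illustration, and HASARD's own rendering may work differently.

```python
# Hypothetical object IDs mapped to RGB colors.
PALETTE = {0: (0, 0, 0), 1: (255, 0, 0), 2: (0, 255, 0)}


def colorize_labels(labels, palette=PALETTE, unknown=(128, 128, 128)):
    """Map a 2D grid of integer object IDs to RGB tuples."""
    return [[palette.get(obj_id, unknown) for obj_id in row] for row in labels]


def normalize_depth(depth, max_depth):
    """Scale raw depth values to [0, 1], clipping anything beyond max_depth."""
    return [[min(d, max_depth) / max_depth for d in row] for row in depth]


labels = [[0, 1], [2, 7]]  # ID 7 is not in the palette
segmented = colorize_labels(labels)
depth_view = normalize_depth([[0, 50], [100, 400]], max_depth=200)
```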

Installation

HASARD supports modular installation, so you can install only the dependencies you need:

# Core dependencies only (environments and basic functionality)
pip install HASARD

# With sample-factory support for training RL agents
# (quotes keep shells such as zsh from expanding the brackets)
pip install "HASARD[sample-factory]"

# With results analysis and plotting tools
pip install "HASARD[results]"

# Full installation with all optional dependencies
pip install "HASARD[sample-factory,results]"

To install from source:

git clone https://github.com/TTomilin/HASARD
cd HASARD
pip install .  # or pip install ".[sample-factory,results]" for extras
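
To confirm which version ended up installed, the standard library can be queried from Python. This is a small sketch using `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed HASARD version, or None if it is not installed.
print(installed_version("HASARD"))
```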

Getting Started

To get started with HASARD, here's a minimal example of running a task environment. This script can also be found in run_env.py:

import hasard

# Create a Level 1 Remedy Rush environment and start an episode.
env = hasard.make('RemedyRushLevel1-v0')
env.reset()
terminated = truncated = False
steps = total_cost = total_reward = 0
while not (terminated or truncated):
    # Sample a random action; replace this with a trained policy.
    action = env.action_space.sample()
    state, reward, terminated, truncated, info = env.step(action)
    env.render()
    steps += 1
    # The per-step safety cost is reported separately from the reward.
    total_cost += info['cost']
    total_reward += reward
print(f"Episode finished in {steps} steps. Reward: {total_reward:.2f}. Cost: {total_cost:.2f}")
env.close()
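
A single random episode says little about a policy, so reward and cost are usually averaged over several episodes. The sketch below extends the loop above in that direction. It uses a stand-in `StubEnv` class (hypothetical, with fixed reward and cost) so that it runs without HASARD installed, and it assumes that `reset()` returns an `(observation, info)` pair in the Gymnasium style.

```python
class StubEnv:
    """Stand-in with a Gymnasium-style API so the sketch runs without HASARD."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return None, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.episode_length
        # Fixed reward of 1.0 and cost of 0.5 per step, for illustration only.
        return None, 1.0, False, truncated, {"cost": 0.5}


def evaluate(env, policy, episodes=3):
    """Return mean episodic reward and mean episodic cost over several episodes."""
    rewards, costs = [], []
    for _ in range(episodes):
        obs, info = env.reset()
        terminated = truncated = False
        ep_reward = ep_cost = 0.0
        while not (terminated or truncated):
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            ep_reward += reward
            ep_cost += info.get("cost", 0.0)
        rewards.append(ep_reward)
        costs.append(ep_cost)
    return sum(rewards) / episodes, sum(costs) / episodes


mean_reward, mean_cost = evaluate(StubEnv(), policy=lambda obs: 0)
```

With a real environment, one would pass the object returned by `hasard.make(...)` in place of `StubEnv()` and a policy such as `lambda obs: env.action_space.sample()`.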

Training

For highly parallelized training of Safe RL agents on HASARD environments, and to reproduce the results from the paper, refer to sample_factory for detailed usage instructions and examples.

Acknowledgements

HASARD environments are built on top of the ViZDoom platform.
Our Safe RL baseline methods are implemented in Sample-Factory.
Our experiments were managed using WandB.

Citation

If you use our work in your research, please cite it as follows:

@inproceedings{tomilin2025hasard,
    title={HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents},
    author={T. Tomilin and M. Fang and M. Pechenizkiy},
    booktitle={The Thirteenth International Conference on Learning Representations},
    year={2025}
}

Release history

0.2.0 (this version): Aug 26, 2025
0.1.0: Feb 21, 2025

Download files

Source Distribution

HASARD-0.2.0.tar.gz (88.4 MB)

Uploaded Aug 26, 2025 Source

Built Distribution

HASARD-0.2.0-py3-none-any.whl (88.8 MB)

Uploaded Aug 26, 2025 Python 3

File details

Details for the file HASARD-0.2.0.tar.gz.

File metadata

Download URL: HASARD-0.2.0.tar.gz
Upload date: Aug 26, 2025
Size: 88.4 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.7.12

File hashes

Hashes for HASARD-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`df698fa83370e72867a82084a481d998dac4007af8a8d4c39b7697e7a1382852`
MD5	`bfe1a038976c49088cc777ab5b397dc3`
BLAKE2b-256	`a70c1507bd1bc482227a0079d6d398de03f186fdea364513de94d0a77413b73f`
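
A downloaded archive can be checked against the published SHA256 digest with the standard library. The helper names below are hypothetical, and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("HASARD-0.2.0.tar.gz", "df698fa83370e72867a82084a481d998dac4007af8a8d4c39b7697e7a1382852")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.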

File details

Details for the file HASARD-0.2.0-py3-none-any.whl.

File metadata

Download URL: HASARD-0.2.0-py3-none-any.whl
Upload date: Aug 26, 2025
Size: 88.8 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.7.12

File hashes

Hashes for HASARD-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5f909ce0af8faa49cdb4fd10f58a7a047fa7889155bf85e3c19133e5c9aa22ea`
MD5	`e9a7a3d05b35d05b19fff56db6912d9e`
BLAKE2b-256	`0021004aef74b8b27beb66803c1e41de7d4a1cab85bb10777922fa3a2f98c569`
