A procedural gridworld maze requiring completion of temporally extended tasks.
Rescue Gridworld
This environment provides a Search and Rescue gridworld in which the objective is to rescue all the people from the maze. Doing so requires completing a temporally extended sequence of tasks:
- Explore the maze, opening locked doors to progress. Unlocking a door requires completing the subtask sequence collect key -> unlock cupboard -> collect keycard -> unlock door.
- Talk to people as they are discovered so that they follow the agent, rescuing them. People are non-stationary within the environment, moving randomly every 3 timesteps.
- The objective is to exit having rescued all the people, but it is also possible to exit without having done so.
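The door-unlocking subtask order described above can be sketched as a small state tracker. This is only an illustration of the ordering constraint: the names `DOOR_CHAIN` and `DoorChainTracker` are hypothetical and are not taken from the package.

```python
# Hypothetical names throughout; this is not the package's internal code.
DOOR_CHAIN = ("collect_key", "unlock_cupboard", "collect_keycard", "unlock_door")


class DoorChainTracker:
    """Tracks progress through the four-step chain for a single locked door."""

    def __init__(self):
        self.progress = 0  # number of chain subtasks completed so far

    def attempt(self, subtask: str) -> bool:
        """Advance and return True only if `subtask` is the next step in the chain."""
        if self.progress < len(DOOR_CHAIN) and subtask == DOOR_CHAIN[self.progress]:
            self.progress += 1
            return True
        return False

    @property
    def door_unlocked(self) -> bool:
        return self.progress == len(DOOR_CHAIN)


tracker = DoorChainTracker()
print(tracker.attempt("unlock_door"))  # False: attempted out of order
for step in DOOR_CHAIN:
    tracker.attempt(step)
print(tracker.door_unlocked)  # True: all four steps done in order
```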
Installation
pip install rescue_gridworld
Quick Start
You can run a random agent to test the environment by copying run_random_agent.py to a local directory and running:
python run_random_agent.py --render --steps 1000 --rooms 5
Manual Usage
import gymnasium as gym
import rescue_gridworld
tile_size = 32  # example value (an assumption); adjust to suit your display
env = gym.make("RescueGridworld-v0", render_mode="human", tile_size=tile_size, num_rooms=5)
obs, info = env.reset()
# ... step the env ...
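One way to fill in the stepping part is a random-action loop. The sketch below assumes only the standard Gymnasium interface (`reset`, `step` returning a 5-tuple, and `action_space.sample()`); the helper name `run_random_episode` is hypothetical and is not part of the package.

```python
def run_random_episode(env, max_steps=1000):
    """Step `env` with random actions; return (total_reward, steps_taken).

    `env` is any object following the Gymnasium Env API.
    """
    obs, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # uniformly random action
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:  # episode ended or was cut off
            break
    return total_reward, steps
```

With the package installed, the `env` created above can be passed straight in, for example `run_random_episode(env, max_steps=1000)`, followed by `env.close()`.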
Action Space
There are 9 actions in the environment:
- 0: Move up.
- 1: Move down.
- 2: Move left.
- 3: Move right.
- 4: Collect key.
- 5: Unlock cupboard.
- 6: Collect keycard.
- 7: Unlock door.
- 8: Talk to person.
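For readability in agent code, the action indices listed above can be wrapped in an enumeration. The integer values come from the list; the enum and its member names are hypothetical conveniences, not something the package is known to export.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the 9 discrete actions listed above."""
    MOVE_UP = 0
    MOVE_DOWN = 1
    MOVE_LEFT = 2
    MOVE_RIGHT = 3
    COLLECT_KEY = 4
    UNLOCK_CUPBOARD = 5
    COLLECT_KEYCARD = 6
    UNLOCK_DOOR = 7
    TALK_TO_PERSON = 8


print(Action(4).name)              # COLLECT_KEY
print(int(Action.TALK_TO_PERSON))  # 8
```

Because `IntEnum` members behave as plain integers, a value such as `Action.MOVE_UP` can be passed wherever an integer action index is expected.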
Observation Space
The observation provides two 7x7 "windows". This size is the default and can be changed by setting obs_window_size to any odd number when creating the environment.
- The first is a line-of-sight observation of the local environment.
- The second is a line-of-sight filtered set of "chain IDs".
The Python definitions are as follows, shaped so that the grid can be viewed as an image:
self.observation_space = spaces.Dict(
    {
        "grid": spaces.Box(low=0, high=255, shape=(1, 7, 7), dtype=np.uint8),
        "chain_grid": spaces.Box(low=-2, high=500, shape=(1, 7, 7), dtype=np.int16),
    }
)
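The relationship between obs_window_size and the observation shapes can be sketched as below. This assumes that both windows scale to `(1, n, n)` for a window size `n`, which the definitions above suggest for the default of 7 but do not confirm for other sizes. The helper `window_shapes` is hypothetical.

```python
def window_shapes(obs_window_size=7):
    """Return the assumed shapes of the two observation windows.

    Assumption: both windows are (1, n, n) for window size n,
    matching the (1, 7, 7) default shown above.
    """
    if obs_window_size < 1 or obs_window_size % 2 == 0:
        raise ValueError("obs_window_size must be a positive odd number")
    shape = (1, obs_window_size, obs_window_size)
    return {"grid": shape, "chain_grid": shape}


shapes = window_shapes(7)
print(shapes["grid"])  # (1, 7, 7)

# An odd size guarantees a single centre cell; for 7 it is index 3.
centre = 7 // 2
print(centre)  # 3
```

An odd size presumably keeps the agent on the single centre cell of each window, although the source does not state where the agent sits.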
Rewards
A reward of 5 is given for completing each of the following subtasks:
- Collect key.
- Unlock cupboard.
- Collect keycard.
- Unlock door.
- Talk to person.
- Exit.
An additional reward of 50 is given if all people have been saved before exiting.
A step penalty of -0.01 is applied to encourage efficiency.
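As a worked example of how these terms combine, the sketch below computes an episode return. It assumes the rewards are simply additive and that the step penalty applies on every step, including steps that earn a subtask reward; the source states neither explicitly. The function name `episode_return` is hypothetical.

```python
SUBTASK_REWARD = 5.0      # per completed subtask (from the list above)
ALL_RESCUED_BONUS = 50.0  # if everyone is rescued before exiting
STEP_PENALTY = -0.01      # assumed to apply on every step


def episode_return(subtasks_completed, steps, all_rescued_before_exit):
    """Total return under the additive-reward assumption described above."""
    total = SUBTASK_REWARD * subtasks_completed + STEP_PENALTY * steps
    if all_rescued_before_exit:
        total += ALL_RESCUED_BONUS
    return total


# One door (4 chain subtasks) + 2 people talked to + exit = 7 subtasks,
# finished in 200 steps with everyone rescued: 35 - 2 + 50.
print(round(episode_return(7, 200, True), 2))  # 83.0
```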
Episode termination conditions
An episode ends when the maximum number of steps is reached or when the agent moves onto the exit square.
Download files
File details
Details for the file rescue_gridworld-1.0.5.tar.gz.
File metadata
- Download URL: rescue_gridworld-1.0.5.tar.gz
- Upload date:
- Size: 18.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c3a7754672dba1fb1d5929cc6a6cf3905772224b489f363848ac6d51e521821 |
| MD5 | 854fe7c0068472c1dc927ecb5c8db5b1 |
| BLAKE2b-256 | 4b90a948af0d184bfc576627b6602c0fffffce7aec0c98f4a8fd9b25521099a0 |
File details
Details for the file rescue_gridworld-1.0.5-py3-none-any.whl.
File metadata
- Download URL: rescue_gridworld-1.0.5-py3-none-any.whl
- Upload date:
- Size: 17.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c12d617c8333528cb70b1de684c00ca60a243e9b7b0e97701fadb80cf85aa97 |
| MD5 | 1d01bbd7c7e1c83030ee83e1c7cc2365 |
| BLAKE2b-256 | 8c39448443d842a83593abb0bae2ec316605b78253b8c476e3e707ce7ca50304 |