A procedural gridworld maze requiring completion of temporally extended tasks.

Project description

Rescue Gridworld

This environment provides a Search and Rescue gridworld in which the objective is to rescue all the people from the maze. Achieving this requires completing a temporally extended sequence of tasks:

  • Explore the maze, opening locked doors to progress. Unlocking a door requires completing the subtask sequence collect key -> unlock cupboard -> collect keycard -> unlock door.
  • Talk to people as they are discovered so that they follow the agent, rescuing them. People are non-stationary within the environment, moving randomly every 3 timesteps.
  • Exit the maze. The objective is to exit having rescued all the people, but it is also possible to exit without having done so.

Installation

pip install rescue_gridworld

Quick Start

You can run a random agent to test the environment by copying run_random_agent.py to a local directory and running:

python run_random_agent.py --render --steps 1000 --rooms 5

Manual Usage

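import gymnasium as gym
import rescue_gridworld  # imported so that RescueGridworld-v0 is registered with Gymnasium

tile_size = 32  # example value only; units assumed to be pixels per tile

env = gym.make("RescueGridworld-v0", render_mode="human", tile_size=tile_size, num_rooms=5)
obs, info = env.reset()
# ... step the env ...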
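
The elided stepping portion could look like the following minimal sketch. It assumes only the standard Gymnasium interface (`reset()` returning `(obs, info)`, `step()` returning a 5-tuple, and `action_space.sample()`); the helper name `run_random_episode` and its `max_steps` cap are illustrative and not part of the package.

```python
def run_random_episode(env, max_steps=1000):
    """Run one episode with uniformly random actions.

    Works with any Gymnasium-style environment. Returns (total_reward, steps).
    """
    obs, info = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # random action
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:  # episode over for either reason
            break
    return total_reward, steps
```

With the `env` created above, this would be called as `total_reward, steps = run_random_episode(env)`, followed by `env.close()`.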

Action Space

There are 9 actions in the environment:

  • 0: Move up.
  • 1: Move down.
  • 2: Move left.
  • 3: Move right.
  • 4: Collect key.
  • 5: Unlock cupboard.
  • 6: Collect keycard.
  • 7: Unlock door.
  • 8: Talk to person.
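
For readability, the integer actions can be given names in user code. The sketch below is a convenience only: the `Action` enum and `DOOR_UNLOCK_SEQUENCE` are hypothetical names introduced here, and nothing in this description indicates that the package exports them. The integer values follow the list above.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the 9 discrete actions listed above."""
    MOVE_UP = 0
    MOVE_DOWN = 1
    MOVE_LEFT = 2
    MOVE_RIGHT = 3
    COLLECT_KEY = 4
    UNLOCK_CUPBOARD = 5
    COLLECT_KEYCARD = 6
    UNLOCK_DOOR = 7
    TALK_TO_PERSON = 8


# Order of subtasks needed to open a locked door, per the task description.
DOOR_UNLOCK_SEQUENCE = (
    Action.COLLECT_KEY,
    Action.UNLOCK_CUPBOARD,
    Action.COLLECT_KEYCARD,
    Action.UNLOCK_DOOR,
)
```

Because `IntEnum` members are plain integers, a member such as `Action.TALK_TO_PERSON` can be passed directly to `env.step()`.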

Observation Space

The observation provides two 7x7 "windows":

  • The first is a line-of-sight observation of the local environment.
  • The second is a line-of-sight filtered set of "chain IDs".

The Python definition is as follows; the leading dimension of size 1 allows each grid to be viewed as a single-channel image:

    self.observation_space = spaces.Dict(
        {
            "grid": spaces.Box(low=0, high=255, shape=(1, 7, 7), dtype=np.uint8),
            "chain_grid": spaces.Box(low=-2, high=500, shape=(1, 7, 7), dtype=np.int16),
        }
    )
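
As an illustration of consuming this dictionary, the sketch below stacks both windows into a single two-channel array, for example as input to a convolutional policy. The helper `stack_observation` and the choice to scale only the `grid` channel are assumptions made for this example; the dummy observation is built purely from the shapes and dtypes declared above, not from real environment output.

```python
import numpy as np


def stack_observation(obs):
    """Combine both 7x7 windows into one float32 array of shape (2, 7, 7)."""
    # Scale the uint8 grid from [0, 255] to [0, 1].
    grid = np.asarray(obs["grid"], dtype=np.float32) / 255.0
    # Chain IDs are left unscaled; the declared range is [-2, 500].
    chain = np.asarray(obs["chain_grid"], dtype=np.float32)
    return np.concatenate([grid, chain], axis=0)


# Dummy observation matching the declared shapes and dtypes.
dummy_obs = {
    "grid": np.zeros((1, 7, 7), dtype=np.uint8),
    "chain_grid": np.full((1, 7, 7), -2, dtype=np.int16),
}
stacked = stack_observation(dummy_obs)
```

Here `stacked[0]` holds the local grid and `stacked[1]` the chain IDs.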

Rewards

A reward of 5 is given for completing each of the following subtasks:

  • Collect key.
  • Unlock cupboard.
  • Collect keycard.
  • Unlock door.
  • Talk to person.
  • Exit.

An additional reward of 50 is given if all people have been saved before exiting.

A step penalty of -0.01 is applied to encourage efficiency.
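
To make the arithmetic concrete, the sketch below computes an episode's undiscounted return from the figures above. It assumes that rewards are simply additive and that the step penalty applies on every step, including steps on which a subtask is completed; neither is confirmed by this description. The function and constant names are illustrative.

```python
SUBTASK_REWARD = 5.0      # per completed subtask (including exiting)
ALL_RESCUED_BONUS = 50.0  # extra reward if everyone was saved before exiting
STEP_PENALTY = -0.01      # assumed to apply on every step


def episode_return(subtasks_completed, steps, all_rescued_at_exit=False):
    """Undiscounted return under the additive-reward assumption."""
    total = subtasks_completed * SUBTASK_REWARD + steps * STEP_PENALTY
    if all_rescued_at_exit:
        total += ALL_RESCUED_BONUS
    return total
```

For example, an episode with one locked door (4 subtasks), two people talked to (2 subtasks), and an exit (1 subtask) completed in 200 steps with everyone rescued would, under these assumptions, yield 7 × 5 + 50 − 200 × 0.01 = 83.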

Episode termination conditions

An episode ends when the maximum number of steps is reached or when the agent moves onto the exit square.
