
A procedural gridworld maze requiring completion of temporally extended tasks.


Rescue Gridworld

This environment provides a Search and Rescue gridworld in which the objective is to rescue all the people from the maze. Doing so requires completing a temporally extended sequence of tasks:

  • Explore the maze, opening locked doors to progress. Unlocking a door requires completing the subtask sequence collect key -> unlock cupboard -> collect keycard -> unlock door.
  • Talk to the people you discover so that they follow the agent, which is how they are rescued. People are non-stationary: they move randomly every 3 timesteps.
  • Exit the maze. The objective is to exit having rescued all the people, but it is also possible to exit without having done so.
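The door-unlocking chain in the first bullet can be pictured as a small ordered state machine. The sketch below is illustrative only: `DoorStage` and `DoorChain` are hypothetical names, not part of the package, and rejecting out-of-order attempts without any other effect is an assumption about how the environment behaves.

```python
from enum import IntEnum


class DoorStage(IntEnum):
    """Stages of the door-unlocking chain, in the order listed above."""
    COLLECT_KEY = 0
    UNLOCK_CUPBOARD = 1
    COLLECT_KEYCARD = 2
    UNLOCK_DOOR = 3


class DoorChain:
    """Hypothetical tracker that accepts a stage only when it is the next one due."""

    def __init__(self) -> None:
        self.next_stage = DoorStage.COLLECT_KEY
        self.door_open = False

    def attempt(self, stage: DoorStage) -> bool:
        """Return True if the stage was completed, False if it was out of order."""
        if self.door_open or stage != self.next_stage:
            return False  # assumption: out-of-order attempts simply have no effect
        if stage == DoorStage.UNLOCK_DOOR:
            self.door_open = True
        else:
            self.next_stage = DoorStage(stage + 1)
        return True
```

Trying to unlock the door on a fresh `DoorChain` returns `False`; attempting the four stages in order returns `True` each time and leaves `door_open` set.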

Installation

pip install rescue_gridworld

Quick Start

You can run a random agent to test the environment by copying run_random_agent.py to a local directory and running:

python run_random_agent.py --render --steps 1000 --rooms 5

Manual Usage

import gymnasium as gym
import rescue_gridworld

tile_size = 32  # example value (an assumption); choose a size that suits your display
env = gym.make("RescueGridworld-v0", render_mode="human", tile_size=tile_size, num_rooms=5)
obs, info = env.reset()
# ... step the env ...
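The stepping part elided above could look roughly like the following. This is a minimal sketch, assuming the environment follows the standard Gymnasium API, where `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`. The helper name `run_random_episode` is hypothetical and is not taken from `run_random_agent.py`.

```python
from typing import Any, Tuple


def run_random_episode(env: Any, max_steps: int = 1000) -> Tuple[float, int]:
    """Step `env` with randomly sampled actions until the episode ends.

    Returns the summed reward and the number of steps taken.
    """
    obs, info = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        # Sample one of the discrete actions at random.
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

With the `env` created above, this would be called as `total_reward, steps = run_random_episode(env)`, followed by `env.close()`.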

Action Space

There are 9 actions in the environment:

  • 0: Move up.
  • 1: Move down.
  • 2: Move left.
  • 3: Move right.
  • 4: Collect key.
  • 5: Unlock cupboard.
  • 6: Collect keycard.
  • 7: Unlock door.
  • 8: Talk to person.
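For readability in agent code, the action indices above can be wrapped in an enum. The sketch below mirrors the list exactly; the class and member names are hypothetical and are not exported by the package.

```python
from enum import IntEnum


class Action(IntEnum):
    """Named aliases for the 9 discrete actions listed above."""
    MOVE_UP = 0
    MOVE_DOWN = 1
    MOVE_LEFT = 2
    MOVE_RIGHT = 3
    COLLECT_KEY = 4
    UNLOCK_CUPBOARD = 5
    COLLECT_KEYCARD = 6
    UNLOCK_DOOR = 7
    TALK_TO_PERSON = 8


def is_movement(action: int) -> bool:
    """True for the four movement actions (indices 0 to 3)."""
    return Action.MOVE_UP <= action <= Action.MOVE_RIGHT
```

Because `IntEnum` members are integers, a member such as `Action.UNLOCK_DOOR` can be passed wherever the environment expects the index 7.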

Observation Space

The observation provides two 7x7 "windows". The default size can be changed by setting obs_window_size to any odd number when creating the environment.

  • The first is a line-of-sight observation of the local environment.
  • The second is a line-of-sight filtered set of "chain IDs".

The Python definitions are as follows; they allow the grid to be viewed as an image:

    self.observation_space = spaces.Dict(
        {
            "grid": spaces.Box(low=0, high=255, shape=(1, 7, 7), dtype=np.uint8),
            "chain_grid": spaces.Box(low=-2, high=500, shape=(1, 7, 7), dtype=np.int16),
        }
    )
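The relationship between obs_window_size and the observation shape can be sketched as below. This assumes the shape scales as (1, n, n) for a window size n, following the (1, 7, 7) default shown above, and that the agent occupies the centre cell of each window. Both are assumptions, and the helper names are hypothetical.

```python
from typing import Tuple


def window_shape(obs_window_size: int = 7) -> Tuple[int, int, int]:
    """Return the assumed (channels, height, width) shape of one window."""
    if obs_window_size < 1 or obs_window_size % 2 == 0:
        raise ValueError("obs_window_size must be a positive odd number")
    return (1, obs_window_size, obs_window_size)


def centre_index(obs_window_size: int = 7) -> int:
    """Row and column index of the centre cell (assumed agent position)."""
    return obs_window_size // 2
```

An odd size is what gives the window a single centre cell: for the default of 7 the centre is at index 3 in both dimensions.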

Rewards

A reward of 5 is given for completing each of the following subtasks:

  • Collect key.
  • Unlock cupboard.
  • Collect keycard.
  • Unlock door.
  • Talk to person.
  • Exit.

An additional reward of 50 is given if all people have been saved before exiting.

A step penalty of -0.01 is applied to encourage efficiency.
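As a worked example of how these terms combine, the sketch below adds up an episode return. It assumes the step penalty applies on every step, including steps that also earn a subtask reward, and that the bonus of 50 is paid on top of the exit reward. The function and constant names are hypothetical.

```python
SUBTASK_REWARD = 5.0       # per completed subtask, including exiting
ALL_RESCUED_BONUS = 50.0   # added when everyone is rescued before exiting
STEP_PENALTY = -0.01       # assumed to apply on every step


def episode_return(subtasks_completed: int, steps: int, all_rescued: bool) -> float:
    """Sum the subtask rewards, step penalties, and optional rescue bonus."""
    total = SUBTASK_REWARD * subtasks_completed + STEP_PENALTY * steps
    if all_rescued:
        total += ALL_RESCUED_BONUS
    return total
```

For instance, an episode with one door opened (4 subtasks), one person talked to, and a successful exit completes 6 subtasks. Over 200 steps with everyone rescued, the return under these assumptions is 6 x 5 + 50 - 0.01 x 200 = 78.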

Episode termination conditions

Episodes end when the maximum number of steps is reached or when the agent moves onto the exit square.
