
A procedural gridworld maze requiring completion of temporally extended tasks.


Rescue Gridworld

This environment provides a Search and Rescue gridworld in which the objective is to rescue all the people from the maze. Doing so requires completing a temporally extended sequence of tasks, as follows.

  • Explore the maze, opening locked doors to progress. Unlocking a door requires completing the subtask sequence collect key -> unlock cupboard -> collect keycard -> unlock door.
  • Talk to each person discovered so that they follow the agent, thereby rescuing them. People are non-stationary within the environment: they move randomly every 3 timesteps.
  • Exit the maze. The objective is to exit having rescued all the people, but it is also possible to exit without having done so.
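
The door-unlocking chain above can be pictured as a small state machine. The following is only an illustrative sketch, not the package's implementation: `DoorStage`, `CHAIN` and `advance` are hypothetical names, and the rule that an out-of-order attempt leaves the stage unchanged is an assumption.

```python
from enum import IntEnum


class DoorStage(IntEnum):
    """Hypothetical progress marker for one locked door's subtask chain."""
    NEED_KEY = 0
    NEED_CUPBOARD = 1
    NEED_KEYCARD = 2
    NEED_DOOR = 3
    OPEN = 4


# Subtask names in the order the description lists them.
CHAIN = ["collect key", "unlock cupboard", "collect keycard", "unlock door"]


def advance(stage: DoorStage, subtask: str) -> DoorStage:
    """Move to the next stage if `subtask` is the one currently required.

    Assumption: an out-of-order attempt simply leaves the stage unchanged.
    """
    if stage is not DoorStage.OPEN and CHAIN[stage] == subtask:
        return DoorStage(stage + 1)
    return stage
```

Calling `advance` with each entry of `CHAIN` in order takes a door from `NEED_KEY` to `OPEN`, whereas attempting "unlock door" first has no effect.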

Installation

pip install rescue_gridworld

Quick Start

You can run a random agent to test the environment by copying run_random_agent.py to a local directory and running:

python run_random_agent.py --render --steps 1000 --rooms 5

Manual Usage

import gymnasium as gym
import rescue_gridworld

tile_size = 32  # example value (an assumption); choose one that suits your display
env = gym.make("RescueGridworld-v0", render_mode="human", tile_size=tile_size, num_rooms=5)
obs, info = env.reset()
# ... step the env ...
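
One way to fill in the stepping part is a random rollout written against the standard Gymnasium `reset`/`step` interface. This is a minimal sketch and not the contents of run_random_agent.py; the function name `run_random_episode` and its `max_steps` parameter are hypothetical.

```python
def run_random_episode(env, max_steps=1000):
    """Roll out uniformly random actions until the episode ends.

    `env` is any environment following the Gymnasium API, for example
    the one returned by gym.make("RescueGridworld-v0", ...).
    """
    obs, info = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # uniformly random action
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

The function returns the accumulated reward and the number of steps taken, which is usually enough to confirm that the environment resets and steps without errors.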

Action Space

There are 9 actions in the environment:

  • 0: Move up.
  • 1: Move down.
  • 2: Move left.
  • 3: Move right.
  • 4: Collect key.
  • 5: Unlock cupboard.
  • 6: Collect keycard.
  • 7: Unlock door.
  • 8: Talk to person.
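
For readability in agent code, the nine indices can be given names. The enum below is a convenience sketch: the member names are hypothetical and are not exported by the package, but the integer values follow the list above.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the nine action indices listed above."""
    MOVE_UP = 0
    MOVE_DOWN = 1
    MOVE_LEFT = 2
    MOVE_RIGHT = 3
    COLLECT_KEY = 4
    UNLOCK_CUPBOARD = 5
    COLLECT_KEYCARD = 6
    UNLOCK_DOOR = 7
    TALK_TO_PERSON = 8


def describe(action_index: int) -> str:
    """Turn a raw action index into a readable label for logging."""
    return Action(action_index).name.replace("_", " ").lower()
```

Because `IntEnum` members are integers, a value such as `Action.UNLOCK_DOOR` can be passed wherever the environment expects the index 7.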

Observation Space

The observation consists of two 7x7 "windows":

  • The first is a line-of-sight observation of the local environment.
  • The second is a line-of-sight filtered set of "chain IDs".

The spaces are defined in Python as follows, with a shape that allows the grid to be viewed as an image:

    self.observation_space = spaces.Dict(
        {
            "grid": spaces.Box(low=0, high=255, shape=(1, 7, 7), dtype=np.uint8),
            "chain_grid": spaces.Box(low=-2, high=500, shape=(1, 7, 7), dtype=np.int16),
        }
    )
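
As a rough illustration of working with this dictionary, the sketch below checks an observation against the shapes, dtypes and bounds given in the definition and extracts the local view as a plain 7x7 array. The helper names `check_observation` and `as_image` are hypothetical, and the meaning of individual cell values is not assumed here.

```python
import numpy as np

# Bounds copied from the space definition above: (low, high, dtype) per key.
BOUNDS = {
    "grid": (0, 255, np.uint8),
    "chain_grid": (-2, 500, np.int16),
}


def check_observation(obs):
    """Raise ValueError if `obs` does not match the documented spaces."""
    for key, (low, high, dtype) in BOUNDS.items():
        arr = obs[key]
        if arr.shape != (1, 7, 7) or arr.dtype != dtype:
            raise ValueError(f"{key}: unexpected shape {arr.shape} or dtype {arr.dtype}")
        if arr.min() < low or arr.max() > high:
            raise ValueError(f"{key}: values outside [{low}, {high}]")


def as_image(obs):
    """Drop the leading axis of size 1 so the local view is a 7x7 array."""
    return obs["grid"][0]
```

A check of this kind can be handy when writing wrappers, since it catches a mismatched dtype or an out-of-range chain value early.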

Rewards

A reward of 5 is given for completing each of the following subtasks:

  • Collect key.
  • Unlock cupboard.
  • Collect keycard.
  • Unlock door.
  • Talk to person.
  • Exit.

An additional reward of 50 is given if all people have been saved before exiting.

A step penalty of -0.01 is applied to encourage efficiency.
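
To make the arithmetic concrete, the sketch below sums these rewards for one episode. It is a worked example and not the package's reward code: the function name and arguments are hypothetical, and it assumes the step penalty is applied on every step, including steps that also earn a subtask reward.

```python
SUBTASK_REWARD = 5.0      # per completed subtask, including the exit
ALL_RESCUED_BONUS = 50.0  # extra reward when everyone is saved before exiting
STEP_PENALTY = -0.01      # assumed to apply on every step


def episode_return(subtasks_completed, steps, exited, all_rescued):
    """Sum the documented rewards for one episode.

    `subtasks_completed` counts key, cupboard, keycard, door and talk
    subtasks; the exit is counted separately through `exited`.
    """
    total = SUBTASK_REWARD * subtasks_completed + STEP_PENALTY * steps
    if exited:
        total += SUBTASK_REWARD
        if all_rescued:
            total += ALL_RESCUED_BONUS
    return total
```

For instance, opening one door (4 subtasks), talking to one person (1 subtask) and exiting with everyone rescued after 200 steps gives 25 + 5 + 50 - 2 = 78 under these assumptions.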

Episode termination conditions

An episode ends when the maximum number of steps is reached or when the agent moves onto the exit square.
