
A procedural gridworld maze requiring completion of temporally extended tasks.


Rescue Gridworld

This environment provides a Search and Rescue gridworld in which the objective is to rescue all the people from the maze. Doing so requires completing a temporally extended sequence of tasks:

  • Explore the maze, opening locked doors to progress. Unlocking a door requires completing the subtask sequence collect key -> unlock cupboard -> collect keycard -> unlock door.
  • Talk to discovered people so that they follow the agent, rescuing them. People are not stationary: they move randomly every 3 timesteps.
  • Exit the maze. The objective is to exit having rescued all the people, but it is also possible to exit without having done so.
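
The door-unlocking chain in the first bullet is strictly ordered. The following is a minimal sketch of that ordering as a small progress tracker, assuming each subtask only counts once the previous one is complete. The class and the subtask names are hypothetical illustrations and are not part of the rescue_gridworld package.

```python
from dataclasses import dataclass

# Ordered subtasks for opening one locked door, as listed above.
# The string names are hypothetical labels chosen for this sketch.
DOOR_CHAIN = ("collect_key", "unlock_cupboard", "collect_keycard", "unlock_door")


@dataclass
class DoorChainTracker:
    """Hypothetical helper: tracks progress through one door's subtask chain."""

    completed: int = 0  # number of subtasks finished so far

    def attempt(self, subtask: str) -> bool:
        """Advance and return True only if `subtask` is the next one in order."""
        if self.completed < len(DOOR_CHAIN) and DOOR_CHAIN[self.completed] == subtask:
            self.completed += 1
            return True
        return False

    @property
    def door_open(self) -> bool:
        """True once all four subtasks have been completed in order."""
        return self.completed == len(DOOR_CHAIN)
```

Attempting "unlock_door" on a fresh tracker returns False, while attempting the four subtasks in order leaves door_open set to True.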

Installation

pip install rescue_gridworld

Quick Start

You can run a random agent to test the environment by copying run_random_agent.py to a local directory and running:

python run_random_agent.py --render --steps 1000 --rooms 5

Manual Usage

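import gymnasium as gym
import rescue_gridworld  # imported for its side effect of registering the environment

tile_size = 32  # example value (an assumption); pick a size that suits your display

env = gym.make("RescueGridworld-v0", render_mode="human", tile_size=tile_size, num_rooms=5)
obs, info = env.reset()
# ... step the env ...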
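
The "step the env" part can be filled in with a standard Gymnasium interaction loop. The sketch below assumes only the generic Gymnasium API (reset returning an observation and info, step returning a 5-tuple, and action_space.sample); the function name run_random_episode is hypothetical and is not taken from run_random_agent.py.

```python
def run_random_episode(env, max_steps=1000):
    """Step `env` with random actions until the episode ends.

    `env` is any object following the Gymnasium API. Stops early when the
    environment reports `terminated` or `truncated`, otherwise after
    `max_steps` steps. Returns (total_reward, steps_taken).
    """
    obs, info = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # uniformly random action
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

With the environment created above, this could be called as total, steps = run_random_episode(env), followed by env.close().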

Action Space

There are 9 actions in the environment:

  • 0: Move up.
  • 1: Move down.
  • 2: Move left.
  • 3: Move right.
  • 4: Collect key.
  • 5: Unlock cupboard.
  • 6: Collect keycard.
  • 7: Unlock door.
  • 8: Talk to person.
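
For readability in agent code, the action indices above can be given names. This is a sketch only: the Action enum and its member names are hypothetical and are not exported by the package; the integer values are those listed above.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the 9 action indices listed above."""

    MOVE_UP = 0
    MOVE_DOWN = 1
    MOVE_LEFT = 2
    MOVE_RIGHT = 3
    COLLECT_KEY = 4
    UNLOCK_CUPBOARD = 5
    COLLECT_KEYCARD = 6
    UNLOCK_DOOR = 7
    TALK_TO_PERSON = 8


# Map a raw action index back to a readable name, e.g. for logging.
name_of_action_four = Action(4).name
```

Because IntEnum members are integers, a member can be converted with int(Action.MOVE_UP) wherever the environment expects a plain action index.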

Observation Space

The observation provides two 7x7 "windows":

  • The first is a line-of-sight observation of the local environment.
  • The second is a line-of-sight filtered set of "chain IDs".

The spaces are defined in Python as follows, which allows each grid to be viewed as an image:

    self.observation_space = spaces.Dict(
        {
            "grid": spaces.Box(low=0, high=255, shape=(1, 7, 7), dtype=np.uint8),
            "chain_grid": spaces.Box(low=-2, high=500, shape=(1, 7, 7), dtype=np.int16),
        }
    )
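
Since both entries have shape (1, 7, 7), agent code will often want the two 7x7 planes without the leading axis. The helper below is a minimal sketch of that, assuming the observation is a dict with the "grid" and "chain_grid" keys defined above; the function name split_observation is hypothetical. It relies only on indexing and len, so it should work with NumPy arrays as well as nested lists.

```python
WINDOW = 7  # side length of each observation window


def split_observation(obs):
    """Return (grid, chain_grid) as 7x7 planes with the leading axis removed."""
    planes = []
    for key in ("grid", "chain_grid"):
        plane = obs[key][0]  # shape (1, 7, 7) -> (7, 7)
        if len(plane) != WINDOW or any(len(row) != WINDOW for row in plane):
            raise ValueError(f"{key!r} is not a {WINDOW}x{WINDOW} window")
        planes.append(plane)
    return tuple(planes)
```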

Rewards

A reward of 5 is given for completing each of the following subtasks:

  • Collect key.
  • Unlock cupboard.
  • Collect keycard.
  • Unlock door.
  • Talk to person.
  • Exit.

An additional reward of 50 is given if all people have been saved before exiting.

A step penalty of -0.01 is applied to encourage efficiency.
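
As a worked example of how these terms combine, the sketch below computes an episode's total undiscounted reward. It assumes that the rewards are simply additive and that the step penalty is applied on every step, including steps that also earn a subtask reward; the source does not confirm either detail. The function and constant names are hypothetical.

```python
SUBTASK_REWARD = 5.0      # per completed subtask (including exiting)
ALL_RESCUED_BONUS = 50.0  # extra reward for exiting with everyone rescued
STEP_PENALTY = -0.01      # assumed to apply on every step


def episode_return(subtasks_completed, steps, exited_with_everyone):
    """Total undiscounted reward for one episode under the assumptions above."""
    total = SUBTASK_REWARD * subtasks_completed + STEP_PENALTY * steps
    if exited_with_everyone:
        total += ALL_RESCUED_BONUS
    return total


# One door opened (4 subtasks), two people talked to (2), then exit (1),
# all within 200 steps: 5 * 7 + 50 - 0.01 * 200, which comes to about 83.
example = episode_return(subtasks_completed=7, steps=200, exited_with_everyone=True)
```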

Episode termination conditions

An episode ends when the maximum number of steps is reached or when the agent moves onto the exit square.
