Memory Maze is an environment to benchmark memory abilities of RL agents

memory-maze

Memory Maze environment for RL based on dm_control.

Task

Memory Maze is a task designed to test the memory abilities of RL agents.

The task is based on a game known as Scavenger Hunt (or Treasure Hunt). The agent starts in a randomly generated maze that contains a number of landmarks of different colors. The agent is prompted to find the target landmark of a specific color, indicated by the border color in the observation image. Once the agent finds and touches the correct landmark, it receives a +1 reward and the next random landmark is chosen as the target. Touching a landmark of the wrong color has no effect. Throughout the episode, the maze layout and the locations of the landmarks do not change. The episode lasts for a fixed number of steps, so the total episode reward equals the number of targets the agent finds in that time.
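The reward rule described above can be sketched as a toy model. This is only an illustration, not the package's implementation: `ToyScavengerHunt` and its methods are hypothetical names, there is no maze or rendering, and the choice to always pick a different color as the next target is an assumption.

```python
import random


class ToyScavengerHunt:
    """Toy model of the reward rule only; no maze, physics, or rendering."""

    def __init__(self, colors, episode_steps, seed=0):
        self.colors = list(colors)
        self.steps_left = episode_steps
        self.rng = random.Random(seed)
        self.target = self.rng.choice(self.colors)
        self.total_reward = 0

    def step(self, touched=None):
        """Advance one step; `touched` is the landmark color touched, if any."""
        if self.steps_left <= 0:
            raise RuntimeError("episode is over")
        self.steps_left -= 1
        reward = 0
        if touched == self.target:
            reward = 1
            # Assumption: the next target differs from the one just found.
            others = [c for c in self.colors if c != self.target]
            self.target = self.rng.choice(others)
        # Touching a wrong color (or nothing) leaves the target unchanged.
        self.total_reward += reward
        done = self.steps_left == 0
        return reward, done
```

The episode ends purely on the step budget, so `total_reward` counts how many targets were found before time ran out.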

Memory Maze tests the memory of the agent in a clean and direct way: an agent with perfect memory only has to explore the maze once (which is possible in a time much shorter than the length of the episode) and can then follow the shortest path to each target, whereas an agent with no memory has to wander randomly through the maze to find each target.
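To make the contrast concrete, here is a minimal sketch on a hand-written grid. The `MAZE` layout and all function names are hypothetical, and the grid is not produced by labmaze. The agent that remembers the layout is modeled by breadth-first search, the memoryless agent by a uniform random walk.

```python
import random
from collections import deque

# Hypothetical toy maze ('#' = wall, '.' = free); the free cells form a ring.
MAZE = [
    "#####",
    "#...#",
    "#.#.#",
    "#...#",
    "#####",
]


def neighbors(cell):
    """Yield the free cells adjacent to `cell` (the wall border keeps indices valid)."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if MAZE[nr][nc] != "#":
            yield (nr, nc)


def shortest_path_steps(start, goal):
    """Steps needed by an agent that remembers the layout (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            return dist[cell]
        for nxt in neighbors(cell):
            if nxt not in dist:
                dist[nxt] = dist[cell] + 1
                queue.append(nxt)
    raise ValueError("goal unreachable")


def random_walk_steps(start, goal, rng):
    """Steps taken by a memoryless agent that moves to a random neighbor each step."""
    cell, steps = start, 0
    while cell != goal:
        cell = rng.choice(list(neighbors(cell)))
        steps += 1
    return steps
```

Averaging `random_walk_steps` over many seeds and comparing it with `shortest_path_steps` for the same start and goal gives a rough sense of the gap; the gap grows with maze size.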

There are 4 size variations of the maze. The largest maze, 15x15, is designed to be challenging but solvable for humans (see benchmark results below), yet out of reach for state-of-the-art RL methods. The smaller sizes are provided as stepping stones, with 9x9 solvable with current RL methods.

Size     Landmarks   Episode steps   env_id
9x9      3           1000            MemoryMaze-9x9-v0
11x11    4           2000            MemoryMaze-11x11-v0
13x13    5           3000            MemoryMaze-13x13-v0
15x15    6           4000            MemoryMaze-15x15-v0
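For scripts that sweep over maze sizes, the table above can be kept as a small lookup. This is a convenience sketch: `MAZE_VARIANTS` and `gym_env_id` are hypothetical names that are not part of the package, and the `memory_maze:` prefix mirrors the `gym.make` calls shown in the Gym interface section.

```python
# Values copied from the table above; keys are the maze side length.
MAZE_VARIANTS = {
    9: {"landmarks": 3, "episode_steps": 1000},
    11: {"landmarks": 4, "episode_steps": 2000},
    13: {"landmarks": 5, "episode_steps": 3000},
    15: {"landmarks": 6, "episode_steps": 4000},
}


def gym_env_id(size):
    """Return the Gym id string for a maze of the given side length."""
    if size not in MAZE_VARIANTS:
        raise ValueError(f"unsupported maze size: {size}")
    return f"memory_maze:MemoryMaze-{size}x{size}-v0"
```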

Note that the mazes are generated with labmaze, the same algorithm as used by DmLab-30. In particular, 9x9 corresponds to the small variant and 15x15 corresponds to the large variant.

[Figure: Examples of generated mazes for the 4 different sizes (9x9, 11x11, 13x13, 15x15).]

Installation

The environment is available as a pip package:

pip install git+https://github.com/jurgisp/memory-maze.git#egg=memory-maze

It will automatically install dm_control and mujoco dependencies.

Gym interface

Once the pip package is installed, the environment can be created using the Gym interface:

# Requires the gym package: pip install gym
import gym

env = gym.make('memory_maze:MemoryMaze-9x9-v0')
env = gym.make('memory_maze:MemoryMaze-11x11-v0')
env = gym.make('memory_maze:MemoryMaze-13x13-v0')
env = gym.make('memory_maze:MemoryMaze-15x15-v0')

The default environment has a dictionary observation space (TODO: map, targets):

>>> env.observation_space
Dict(image: Box(0, 255, (64, 64, 3), uint8))

In order to make an environment with pure image observation, which may be expected by default RL implementations, add the -Img-v0 suffix to the env id:

env = gym.make('memory_maze:MemoryMaze-9x9-Img-v0')

There are other helper variations of the environment, see here.
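A random-action rollout is a quick way to check that an environment created this way steps correctly. The sketch below assumes the classic Gym API, where `reset()` returns an observation and `step()` returns a 4-tuple; newer Gym or Gymnasium releases return different tuples and would need small changes. `run_random_episode` is a hypothetical helper, not part of the package.

```python
def run_random_episode(env, max_steps=None):
    """Roll out one episode with random actions; return (total_reward, steps).

    Assumes the classic Gym API: reset() -> obs, step() -> (obs, reward, done, info).
    """
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and (max_steps is None or steps < max_steps):
        action = env.action_space.sample()
        _obs, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


# Usage sketch (requires gym and memory-maze to be installed):
#   import gym
#   env = gym.make('memory_maze:MemoryMaze-9x9-v0')
#   total_reward, steps = run_random_episode(env)
```

Because the episode reward counts the targets found, the returned total gives a simple baseline for a policy that uses no memory at all.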

dm_env interface

We also provide a dm_env API implementation:

from memory_maze import tasks

env = tasks.memory_maze_9x9()
env = tasks.memory_maze_11x11()
env = tasks.memory_maze_13x13()
env = tasks.memory_maze_15x15()

The observation is a dictionary that includes the image observation (TODO: map, targets):

>>> env.observation_spec()
{
  'image': BoundedArray(shape=(64, 64, 3), ...)
}
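An environment exposing the dm_env interface can be driven with a loop like the one below. This is a minimal sketch: `run_dm_env_episode` and `policy` are hypothetical names, and the usage comment assumes discrete actions whose action spec exposes `num_values`.

```python
def run_dm_env_episode(env, policy):
    """Run one episode through the dm_env interface; return the summed reward.

    `policy` is a hypothetical callable mapping the observation dict to an action.
    """
    timestep = env.reset()
    total_reward = 0.0
    while not timestep.last():
        action = policy(timestep.observation)
        timestep = env.step(action)
        # dm_env uses reward=None on the first timestep of an episode;
        # the guard keeps the sum safe if a None ever shows up here.
        if timestep.reward is not None:
            total_reward += timestep.reward
    return total_reward


# Usage sketch (requires memory-maze; assumes a discrete action spec):
#   import random
#   from memory_maze import tasks
#   env = tasks.memory_maze_9x9()
#   n_actions = env.action_spec().num_values
#   total = run_dm_env_episode(env, lambda obs: random.randrange(n_actions))
```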

The constructor accepts a number of arguments, which can be used to tweak the environment for debugging:

env = tasks.memory_maze_9x9(
    control_freq=4,
    discrete_actions=True,
    target_color_in_image=True,
    image_only_obs=False,
    top_camera=False,
    good_visibility=False,
    camera_resolution=64
)

GUI

There is also a graphical UI provided, which can be launched as:

pip install gym pygame pillow imageio

# The default view, that the agent sees
python gui/run_gui.py --fps=6 --env "memory_maze:MemoryMaze-15x15-v0"

# Higher resolution and higher control frequency, nicer for human control
python gui/run_gui.py --fps=60 --env "memory_maze:MemoryMaze-15x15-HiFreq-HD-v0"

Observation space, Action space

Benchmarks

Oracle scores

Human scores
