Memory Maze is an environment to benchmark memory abilities of RL agents

Project description

memory-maze

Memory Maze environment for RL based on dm_control.

Task

Memory Maze is a task designed to test the memory abilities of RL agents.

The task is based on a game known as Scavenger Hunt (or Treasure Hunt). The agent starts in a randomly generated maze that contains a number of landmarks of different colors. The agent is prompted to find the target landmark of a specific color, indicated by the border color in the observation image. Once the agent finds and touches the correct landmark, it receives a +1 reward and the next random landmark is chosen as the target. If the agent touches a landmark of the wrong color, there is no effect. Throughout the episode, the maze layout and the locations of the landmarks do not change. The episode continues for a fixed amount of time, so the total episode reward equals the number of targets the agent finds in the given time.
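The reward rule described above can be sketched as a toy model. This is not the actual Memory Maze code: the class name `ToyScavengerHunt` and its methods are hypothetical, there is no maze or physics, and it assumes the next target is drawn uniformly from all landmark colors (the text only says the next target is random).

```python
import random


class ToyScavengerHunt:
    """Toy model of the reward rule only (hypothetical, not the real environment)."""

    def __init__(self, colors, seed=0):
        self.colors = list(colors)
        self.rng = random.Random(seed)
        # Assumption: targets are drawn uniformly at random from all colors.
        self.target = self.rng.choice(self.colors)
        self.total_reward = 0

    def touch(self, color):
        """Return the reward for touching a landmark of the given color."""
        if color != self.target:
            return 0  # wrong landmark: no effect
        self.total_reward += 1
        self.target = self.rng.choice(self.colors)  # pick the next target
        return 1
```

The total episode reward in this sketch is simply the count of correct touches, matching the description above.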

Memory Maze tests the memory of the agent in a clean and direct way. An agent with perfect memory only has to explore the maze once (which is possible in a time much shorter than the length of the episode) and can then follow the shortest path to each target, whereas an agent with no memory has to wander randomly through the maze to find each target.
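To illustrate what "follow the shortest path" means for an agent that has memorized the layout, here is a minimal sketch using breadth-first search on a small grid. The grid abstraction, the `MAZE` layout, and the function name `shortest_path_length` are assumptions made for illustration; the real environment is a continuous 3D maze, not a discrete grid.

```python
from collections import deque


def shortest_path_length(grid, start, goal):
    """BFS over a grid of strings where '#' is a wall.

    Returns the number of moves from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


# Hypothetical 3x5 layout: '.' is free space, '#' is a wall.
MAZE = [
    "..#..",
    ".##..",
    ".....",
]

print(shortest_path_length(MAZE, (0, 0), (0, 4)))  # prints 8
```

An agent without memory has no such map to plan on, which is why it falls back to random exploration for every new target.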

There are 4 size variations of the maze. The largest maze, 15x15, is designed to be challenging but solvable for humans (see benchmark results below), yet out of reach for state-of-the-art RL methods. The smaller sizes are provided as stepping stones, with 9x9 solvable by current RL methods.

Size	Landmarks	Episode steps	env_id
9x9	3	1000	`MemoryMaze-9x9-v0`
11x11	4	2000	`MemoryMaze-11x11-v0`
13x13	5	3000	`MemoryMaze-13x13-v0`
15x15	6	4000	`MemoryMaze-15x15-v0`
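For scripts that sweep over maze sizes, the table above can be kept as a small lookup structure. This is only a convenience sketch: the names `MAZE_VARIANTS` and `get_variant` are hypothetical and not part of the package; the values are copied from the table.

```python
# Values copied from the table above; the dict itself is not part of memory-maze.
MAZE_VARIANTS = {
    "9x9": {"landmarks": 3, "episode_steps": 1000, "env_id": "MemoryMaze-9x9-v0"},
    "11x11": {"landmarks": 4, "episode_steps": 2000, "env_id": "MemoryMaze-11x11-v0"},
    "13x13": {"landmarks": 5, "episode_steps": 3000, "env_id": "MemoryMaze-13x13-v0"},
    "15x15": {"landmarks": 6, "episode_steps": 4000, "env_id": "MemoryMaze-15x15-v0"},
}


def get_variant(size):
    """Look up a maze variant by its size string, e.g. '9x9'."""
    try:
        return MAZE_VARIANTS[size]
    except KeyError:
        raise ValueError(f"Unknown maze size {size!r}; expected one of {sorted(MAZE_VARIANTS)}")
```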

Note that the mazes are generated with labmaze, the same algorithm as used by DmLab-30. In particular, 9x9 corresponds to the small variant and 15x15 corresponds to the large variant.

Figure: examples of generated mazes for the 4 different sizes (9x9, 11x11, 13x13, 15x15).

Installation

The environment is available as a pip package:

pip install git+https://github.com/jurgisp/memory-maze.git#egg=memory-maze

It will automatically install dm_control and mujoco dependencies.

Gym interface

Once the pip package is installed, the environment can be created using the Gym interface:

# Requires the gym package (shell command): pip install gym
import gym

# Create one of the four maze sizes
env = gym.make('memory_maze:MemoryMaze-9x9-v0')
env = gym.make('memory_maze:MemoryMaze-11x11-v0')
env = gym.make('memory_maze:MemoryMaze-13x13-v0')
env = gym.make('memory_maze:MemoryMaze-15x15-v0')

The default environment has a dictionary observation space (TODO: map, targets):

>>> env.observation_space
Dict(image: Box(0, 255, (64, 64, 3), uint8))
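As a rough illustration of how such an environment is typically driven, here is a sketch of a random-action rollout. The function name `run_random_episode` is hypothetical. Because the document does not say which Gym version is in use, the sketch accepts both the older `(obs, reward, done, info)` step signature and the newer five-element one; which of the two Memory Maze returns is an assumption to check against your installed version.

```python
def run_random_episode(env, max_steps=1000):
    """Roll out one episode with random actions and return the total reward."""
    reset_out = env.reset()
    # Newer Gym versions return (obs, info) from reset(); older ones return obs.
    obs = reset_out[0] if isinstance(reset_out, tuple) else reset_out
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        step_out = env.step(action)
        if len(step_out) == 5:  # (obs, reward, terminated, truncated, info)
            obs, reward, terminated, truncated, _ = step_out
            done = terminated or truncated
        else:  # (obs, reward, done, info)
            obs, reward, done, _ = step_out
        total_reward += reward
        if done:
            break
    return total_reward


# Usage (requires memory-maze and gym to be installed):
#   import gym
#   env = gym.make('memory_maze:MemoryMaze-9x9-v0')
#   print(run_random_episode(env))
```

With the dictionary observation space shown above, the image would be read as `obs['image']` inside the loop.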

In order to make an environment with pure image observation, which may be expected by default RL implementations, add the -Img-v0 suffix to the env id:

env = gym.make('memory_maze:MemoryMaze-9x9-Img-v0')
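When switching between the dictionary and image-only variants in experiments, the id string can be built programmatically. The helper `make_env_id` below is hypothetical, and it assumes the `-Img-v0` suffix follows the same pattern for all four sizes; the text above only shows it for 9x9.

```python
def make_env_id(size, image_only=False):
    """Build a Gym id such as 'memory_maze:MemoryMaze-9x9-Img-v0'.

    Assumption: the -Img-v0 suffix exists for every size, not just 9x9.
    """
    if size not in (9, 11, 13, 15):
        raise ValueError(f"Unsupported maze size: {size}")
    suffix = "-Img-v0" if image_only else "-v0"
    return f"memory_maze:MemoryMaze-{size}x{size}{suffix}"
```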

There are other helper variations of the environment, see here.

dm_env interface

We also provide a dm_env API implementation:

from memory_maze import tasks

env = tasks.memory_maze_9x9()
env = tasks.memory_maze_11x11()
env = tasks.memory_maze_13x13()
env = tasks.memory_maze_15x15()

The observation is a dictionary that includes the image observation (TODO: map, targets):

>>> env.observation_spec()
{
  'image': BoundedArray(shape=(64, 64, 3), ...)
}
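A dm_env environment is stepped through `TimeStep` objects rather than tuples. The following sketch shows the usual loop using the standard `dm_env` methods `reset()`, `step()`, and `TimeStep.last()`; the function name `run_dm_env_episode` and the `policy` callable are hypothetical, and the action format the policy must return depends on the environment's `action_spec()`, which is not described here.

```python
def run_dm_env_episode(env, policy):
    """Run one episode and return (total_reward, number_of_steps).

    `policy` is any callable mapping an observation dict to an action.
    """
    timestep = env.reset()
    total_reward = 0.0
    steps = 0
    while not timestep.last():
        action = policy(timestep.observation)
        timestep = env.step(action)
        # In dm_env the reward is None only on the first timestep of an episode.
        if timestep.reward is not None:
            total_reward += timestep.reward
        steps += 1
    return total_reward, steps


# Usage (requires memory-maze to be installed):
#   from memory_maze import tasks
#   env = tasks.memory_maze_9x9()
#   total, steps = run_dm_env_episode(env, policy=lambda obs: 0)  # action 0 is a placeholder
```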

The constructor accepts a number of arguments, which can be used to tweak the environment for debugging:

env = tasks.memory_maze_9x9(
    control_freq=4,
    discrete_actions=True,
    target_color_in_image=True,
    image_only_obs=False,
    top_camera=False,
    good_visibility=False,
    camera_resolution=64
)
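One way to manage debugging configurations is to keep the argument values shown above in a dictionary and override only what changes. This is a sketch: `BASE_KWARGS` and `debug_kwargs` are hypothetical helpers, the argument names are taken from the call above, and what each flag does (for example, that `top_camera=True` gives an overhead view) is inferred from its name rather than documented here.

```python
# Argument names and values copied from the constructor call above.
BASE_KWARGS = {
    "control_freq": 4,
    "discrete_actions": True,
    "target_color_in_image": True,
    "image_only_obs": False,
    "top_camera": False,
    "good_visibility": False,
    "camera_resolution": 64,
}


def debug_kwargs(**overrides):
    """Return a copy of BASE_KWARGS with selected values overridden."""
    unknown = set(overrides) - set(BASE_KWARGS)
    if unknown:
        raise TypeError(f"Unknown constructor arguments: {sorted(unknown)}")
    return {**BASE_KWARGS, **overrides}


# Usage (requires memory-maze to be installed):
#   from memory_maze import tasks
#   env = tasks.memory_maze_9x9(**debug_kwargs(top_camera=True, good_visibility=True))
```

Rejecting unknown names early keeps a typo from surfacing later as a less obvious error inside the constructor.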

GUI

There is also a graphical UI provided, which can be launched as:

pip install gym pygame pillow imageio

# The default view that the agent sees
python gui/run_gui.py --fps=6 --env "memory_maze:MemoryMaze-15x15-v0"

# Higher resolution and higher control frequency, nicer for human control
python gui/run_gui.py --fps=60 --env "memory_maze:MemoryMaze-15x15-HiFreq-HD-v0"

Observation space, Action space

Benchmarks

Oracle scores

Human scores

Release history

1.0.3: Jun 20, 2023
1.0.2: Oct 26, 2022
1.0.1: Oct 26, 2022
1.0.0: Oct 19, 2022 (this version)

