Super Mario Bros. for OpenAI Gym

gym-super-mario-bros

An OpenAI Gym environment for Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on the Nintendo Entertainment System (NES), using the nes-py emulator.

Installation

The preferred installation of gym-super-mario-bros is from pip:

pip install gym-super-mario-bros

Usage

Python

You must import gym_super_mario_bros before trying to make an environment, because gym environments are registered at runtime. By default, gym_super_mario_bros environments use the full NES action space of 256 discrete actions. To constrain this, gym_super_mario_bros.actions provides three action lists (RIGHT_ONLY, SIMPLE_MOVEMENT, and COMPLEX_MOVEMENT) for the nes_py.wrappers.BinarySpaceToDiscreteSpaceEnv wrapper. See gym_super_mario_bros/actions.py for a breakdown of the legal actions in each of these three lists.

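from nes_py.wrappers import BinarySpaceToDiscreteSpaceEnv
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# importing gym_super_mario_bros above registers the environments with gym
env = gym_super_mario_bros.make('SuperMarioBros-v0')
# constrain the 256-action space to the SIMPLE_MOVEMENT action list
env = BinarySpaceToDiscreteSpaceEnv(env, SIMPLE_MOVEMENT)

# start with done = True so the environment is reset before the first step
done = True
for step in range(5000):
    if done:
        state = env.reset()
    # take a uniformly random action from the constrained action space
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()

env.close()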

NOTE: gym_super_mario_bros.make is just an alias to gym.make for convenience.

NOTE: remove calls to render in training code for a nontrivial speedup.
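
The figure of 256 actions follows from the NES controller having eight buttons, each either pressed or released. The sketch below illustrates that counting and shows how a list of button names could be packed into a single integer. The bit assignment in BUTTON_BITS, the encode_buttons helper, and the EXAMPLE_ACTIONS list are all hypothetical names chosen for illustration; the order of bits used by nes-py and the contents of the package's own action lists may differ.

```python
# Hypothetical bit position for each of the eight NES controller buttons.
# This ordering is an assumption for illustration, not taken from nes-py.
BUTTON_BITS = {
    'A': 0, 'B': 1, 'select': 2, 'start': 3,
    'up': 4, 'down': 5, 'left': 6, 'right': 7,
}


def encode_buttons(buttons):
    """Pack a collection of button names into one integer in [0, 255]."""
    action = 0
    for name in buttons:
        action |= 1 << BUTTON_BITS[name]
    return action


# Eight independent buttons give 2 ** 8 = 256 possible combinations.
NUM_ACTIONS = 2 ** len(BUTTON_BITS)

# A small hand-picked action list (illustrative only, not one of the
# package's RIGHT_ONLY / SIMPLE_MOVEMENT / COMPLEX_MOVEMENT lists).
EXAMPLE_ACTIONS = [[], ['right'], ['right', 'A']]
EXAMPLE_CODES = [encode_buttons(combo) for combo in EXAMPLE_ACTIONS]
```

Under these assumptions, a wrapper that accepts only the indices 0, 1, and 2 and forwards the matching code to the emulator would reduce the agent's choices from 256 to 3.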

Command Line

gym_super_mario_bros features a command line interface for playing environments using either the keyboard or uniform random movement.

gym_super_mario_bros -e <the environment ID to play> -m <`human` or `random`>

NOTE: by default, -e is set to SuperMarioBros-v0 and -m is set to human.

Environments

These environments allow 3 attempts (lives) to make it through the 32 levels of the game. The environments only send reward-able game-play frames to agents; no cut-scenes, loading screens, etc. are sent from the NES emulator to an agent, nor can an agent perform actions during these occurrences. If a cut-scene cannot be skipped by hacking the NES's RAM, the environment will lock the Python process until the emulator is ready for the next action.

Environment                    Game  Frameskip  ROM
SuperMarioBros-v0              SMB   4          standard
SuperMarioBros-v1              SMB   4          downsample
SuperMarioBros-v2              SMB   4          pixel
SuperMarioBros-v3              SMB   4          rectangle
SuperMarioBrosNoFrameskip-v0   SMB   1          standard
SuperMarioBrosNoFrameskip-v1   SMB   1          downsample
SuperMarioBrosNoFrameskip-v2   SMB   1          pixel
SuperMarioBrosNoFrameskip-v3   SMB   1          rectangle
SuperMarioBros2-v0             SMB2  4          standard
SuperMarioBros2-v1             SMB2  4          downsample
SuperMarioBros2NoFrameskip-v0  SMB2  1          standard
SuperMarioBros2NoFrameskip-v1  SMB2  1          downsample
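
To make the Frameskip column concrete, here is a minimal sketch of what a frame skip of 4 usually means: one agent action is held for up to four emulator frames and the per-frame rewards are accumulated. The CountingEnv class and the step_with_frameskip helper are hypothetical stand-ins written for this example. The package's actual implementation is not shown here, and summing the rewards is an assumption.

```python
class CountingEnv(object):
    """Toy stand-in for an emulator: every frame pays reward 1.0 and the
    episode ends after `length` frames."""

    def __init__(self, length=10):
        self.length = length
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        self.frame += 1
        done = self.frame >= self.length
        return self.frame, 1.0, done, {}


def step_with_frameskip(env, action, skip=4):
    """Repeat `action` for up to `skip` frames, summing the rewards and
    stopping early if the episode ends."""
    total_reward = 0.0
    for _ in range(skip):
        state, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return state, total_reward, done, info


env = CountingEnv(length=10)
env.reset()
state, reward, done, info = step_with_frameskip(env, action=0, skip=4)
```

With skip=1 the helper reduces to a plain call to step, which corresponds to the NoFrameskip rows of the table.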

Individual Levels

These environments allow a single attempt (life) to make it through a single level of the game.

Use the template

SuperMarioBros-<world>-<level>-v<version>

where:

  • <world> is a number in {1, 2, 3, 4, 5, 6, 7, 8} indicating the world
  • <level> is a number in {1, 2, 3, 4} indicating the level within a world
  • <version> is a number in {0, 1, 2, 3} specifying the ROM mode to use
    • 0: standard ROM
    • 1: downsampled ROM
    • 2: pixel ROM
    • 3: rectangle ROM
  • NoFrameskip can be added before the first hyphen to disable frame skip

For example, to play 4-2 on the downsampled ROM, you would use the environment id SuperMarioBros-4-2-v1. To disable frame skip you would use SuperMarioBrosNoFrameskip-4-2-v1.
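
The template above can be sketched as a small helper that validates its arguments and assembles the environment id. The function name make_env_id and its parameter names are hypothetical and not part of the package.

```python
def make_env_id(world, level, version=0, frameskip=True):
    """Build an individual-level environment id from the template
    SuperMarioBros-<world>-<level>-v<version>."""
    if world not in range(1, 9):
        raise ValueError('world must be in 1..8, got {!r}'.format(world))
    if level not in range(1, 5):
        raise ValueError('level must be in 1..4, got {!r}'.format(level))
    if version not in range(0, 4):
        raise ValueError('version must be in 0..3, got {!r}'.format(version))
    # NoFrameskip goes before the first hyphen to disable frame skip
    prefix = 'SuperMarioBros' if frameskip else 'SuperMarioBrosNoFrameskip'
    return '{}-{}-{}-v{}'.format(prefix, world, level, version)


# World 4, level 2, downsampled ROM, with and without frame skip.
with_skip = make_env_id(4, 2, version=1)
without_skip = make_env_id(4, 2, version=1, frameskip=False)
```

The resulting string can then be passed to gym_super_mario_bros.make in the same way as SuperMarioBros-v0 in the usage example.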

Step

Details about the reward and the info dictionary returned by the step method.

Reward Function

The reward function assumes the objective of the game is to move as far right as possible (increase the agent's x value), as fast as possible, without dying. To model this game, three separate variables compose the reward:

  1. v: the difference in agent x values between states
    • in this case this is instantaneous velocity for the given step
    • moving right ⇔ v > 0
    • moving left ⇔ v < 0
    • no movement ⇔ v = 0
  2. c: the difference in the game clock between frames
    • this penalty encourages the agent to move quickly
    • no clock change ⇔ c = 0
    • clock decrease ⇔ c < 0
  3. d: a death penalty that penalizes the agent for dying in a state
    • this penalty encourages the agent to avoid death
    • alive ⇔ d = 0
    • death ⇔ d = -15

r = v + c + d

The reward is clipped into the range (-15, 15).
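
The reward described above can be sketched as a small function. The name compute_reward and its argument names are hypothetical, and the sketch assumes the clipping bounds -15 and 15 are inclusive; it illustrates the formula r = v + c + d rather than reproducing the package's implementation.

```python
def compute_reward(x_prev, x_curr, clock_prev, clock_curr, died):
    """Combine velocity, clock, and death terms into a clipped reward."""
    v = x_curr - x_prev          # > 0 when moving right, < 0 when moving left
    c = clock_curr - clock_prev  # 0 if the clock did not tick, negative if it did
    d = -15 if died else 0       # death penalty
    r = v + c + d
    # clip into [-15, 15] (inclusive bounds are an assumption)
    return max(-15, min(15, r))


# Moving 3 pixels right while the clock ticks down by 1.
example = compute_reward(x_prev=40, x_curr=43, clock_prev=400, clock_curr=399, died=False)
```

Here v = 3, c = -1, and d = 0, so the example reward is 2. Because d alone equals the lower clipping bound, dying while standing still or moving left always yields the minimum reward of -15.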

info dictionary

The info dictionary returned by the step method contains the following keys:

Key       Type  Description
flag_get  bool  True if Mario reached a flag or ax
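
As one possible use of the flag_get key, the sketch below computes how often a level was completed from the final info dictionary of each episode. The helper names reached_flag and completion_rate are hypothetical, and the sample dictionaries are hand-written stand-ins for what step would return.

```python
def reached_flag(info):
    """Return True when an info dictionary reports a completed level."""
    # .get guards against dictionaries that lack the key
    return bool(info.get('flag_get', False))


def completion_rate(final_infos):
    """Fraction of episodes whose final info dictionary has flag_get set."""
    if not final_infos:
        return 0.0
    wins = sum(1 for info in final_infos if reached_flag(info))
    return wins / float(len(final_infos))


# Hand-written dictionaries standing in for the info returned by env.step.
sample_infos = [
    {'flag_get': True},
    {'flag_get': False},
    {'flag_get': False},
    {'flag_get': True},
]
rate = completion_rate(sample_infos)
```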

Citation

Please cite gym-super-mario-bros if you use it in your research.

@misc{gym-super-mario-bros,
  author = {Christian Kauten},
  title = {{S}uper {M}ario {B}ros for {O}pen{AI} {G}ym},
  year = {2018},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/Kautenja/gym-super-mario-bros}},
}
