
Super Mario Bros. for Gymnasium


gym-super-mario-bros


A Gymnasium environment for Super Mario Bros., Super Mario Bros. 2 (Lost Levels), Super Mario Bros. 2 (USA), and Super Mario Bros. 3 on the Nintendo Entertainment System (NES) using the nes-py emulator.

gym-super-mario-bros targets Gymnasium's modern reset, step, render-mode, and truncation semantics. It currently supports CPython 3.13 and 3.14 in CI.

Installation

The preferred installation of gym-super-mario-bros is from pip:

pip install gym-super-mario-bros

Python 3.13 or newer is required. The supported CI targets are CPython 3.13 and 3.14.

Because gym-super-mario-bros depends on the native nes-py emulator, Linux GLIBCXX_* loader errors and Windows compiler-toolchain failures are usually nes-py installation/runtime issues rather than wrapper bugs. See the nes-py installation notes for the current compiler and runtime expectations.

Usage

Python

You must import gym_super_mario_bros before trying to make an environment, because Gymnasium environments are registered at runtime. By default, gym_super_mario_bros environments use the full NES action space of 256 discrete actions. To constrain this, gym_super_mario_bros.actions provides three action lists (RIGHT_ONLY, SIMPLE_MOVEMENT, and COMPLEX_MOVEMENT) for the nes_py.wrappers.JoypadSpace wrapper. See gym_super_mario_bros/actions.py for a breakdown of the legal actions in each of these three lists.

import gymnasium as gym
from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

env = gym.make('SuperMarioBros-v0', render_mode='human')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    if done:
        state, info = env.reset(seed=123)
    state, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated
    env.render()

env.close()

NOTE: gym_super_mario_bros.make is just an alias to gymnasium.make for convenience after gym_super_mario_bros is imported.

NOTE: registered environments use Gymnasium's TimeLimit wrapper with max_episode_steps=9999999 to preserve the historical cap while allowing the game logic to end normal episodes with terminated=True. Passing a shorter max_episode_steps to gymnasium.make() is the supported way to test or train with external time truncation, which returns truncated=True.

NOTE: remove calls to render in training code for a nontrivial speedup.
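The difference between terminated and truncated described in the notes above can be sketched with a rollout helper. This is a minimal illustration, not part of gym_super_mario_bros: StubEnv and run_episode are hypothetical names, and StubEnv only mimics the Gymnasium reset/step signature so the example runs without a ROM. With the real package, the env would instead come from gymnasium.make() with a shorter max_episode_steps.

```python
class StubEnv:
    """Hypothetical stand-in with the Gymnasium reset/step signature."""

    def __init__(self, terminate_at, max_episode_steps):
        self.terminate_at = terminate_at            # step at which the "game" ends
        self.max_episode_steps = max_episode_steps  # external time limit
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.terminate_at
        # Truncation only applies if the game logic has not already ended.
        truncated = (not terminated) and self.t >= self.max_episode_steps
        return self.t, 1.0, terminated, truncated, {}


def run_episode(env, policy, seed=None):
    """Run one episode and report why it ended."""
    obs, info = env.reset(seed=seed)
    steps, total = 0, 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        steps += 1
        total += reward
        if terminated or truncated:
            reason = "terminated" if terminated else "truncated"
            return steps, total, reason


# The game ends first, so the episode is reported as terminated.
print(run_episode(StubEnv(terminate_at=5, max_episode_steps=100), lambda obs: 0))
# The time limit is hit first, so the episode is reported as truncated.
print(run_episode(StubEnv(terminate_at=50, max_episode_steps=10), lambda obs: 0))
```

Keeping the two flags separate matters for value-based training, where a truncated episode is usually bootstrapped from the last state and a terminated one is not.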

Task Metadata

gym_super_mario_bros exposes lightweight task metadata for curriculum, conditioning, and evaluation-matrix code without constructing ROM-backed environments:

import gym_super_mario_bros

tasks = gym_super_mario_bros.all_tasks(single_stage=True)
env_ids = gym_super_mario_bros.task_ids(game_family='smb1')
task = gym_super_mario_bros.task_for_env_id('SuperMarioBros-4-2-v0')
smb3_stages = gym_super_mario_bros.smb3_stage_matrix()

Each MarioTask includes the environment ID, canonical task ID, game family, ROM mode, world/stage, public world label, split flags, and alias metadata for separator-free SMB1 IDs such as SuperMarioBros1-1-v0. The SMB3 stage matrix catalogs the numbered vanilla courses separately from the smaller set of validated single-stage reset entry points.

Command Line

gym_super_mario_bros features a command line interface for playing environments using either the keyboard or uniform random movement.

python -m gym_super_mario_bros --env <environment ID> --mode <human|random>
gym_super_mario_bros --env <environment ID> --mode <human|random>

NOTE: by default, --env/-e is SuperMarioBros-v0, --mode/-m is human, --actionspace/-a is nes, and rendering is enabled.

Human keyboard play opens a graphical window:

gym_super_mario_bros --env SuperMarioBros-v0 --mode human --actionspace simple

Random play can be rendered or run headlessly. Use --seed to seed the first environment reset:

gym_super_mario_bros --mode random --steps 1000 --no-render --seed 123

Use --actionspace/-a to select nes, right, simple, or complex. Human mode requires rendering, so --mode human --no-render is rejected.

Print the CLI help with:

python -m gym_super_mario_bros --help
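When launching the CLI from a script, the documented flags can be assembled and checked before spawning a process. The helper below is a hypothetical sketch (build_cli_args is not part of the package); it uses only the flags named in this section and mirrors the stated rule that human mode cannot be combined with --no-render. The resulting list could be passed to subprocess.run.

```python
import sys

ACTION_SPACES = ("nes", "right", "simple", "complex")
MODES = ("human", "random")


def build_cli_args(env_id="SuperMarioBros-v0", mode="human",
                   actionspace="nes", steps=None, seed=None, render=True):
    """Build an argv list for `python -m gym_super_mario_bros` (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    if actionspace not in ACTION_SPACES:
        raise ValueError(f"actionspace must be one of {ACTION_SPACES}, got {actionspace!r}")
    if mode == "human" and not render:
        # The CLI is documented to reject this combination.
        raise ValueError("human mode requires rendering")
    args = [sys.executable, "-m", "gym_super_mario_bros",
            "--env", env_id, "--mode", mode, "--actionspace", actionspace]
    if steps is not None:
        args += ["--steps", str(steps)]
    if seed is not None:
        args += ["--seed", str(seed)]
    if not render:
        args.append("--no-render")
    return args


# Headless random play, matching the example command above.
print(build_cli_args(mode="random", steps=1000, seed=123, render=False))
```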

Environments

These environments allow 3 attempts (lives) to make it through the 32 stages in the game. The environments only send reward-able game-play frames to agents; no cut-scenes, loading screens, etc. are sent from the NES emulator to an agent, nor can an agent perform actions during these instances. If a cut-scene cannot be skipped by hacking the NES's RAM, the environment will block the Python process until the emulator is ready for the next action.

Environment            Game      ROM
SuperMarioBros-v0      SMB       standard
SuperMarioBros2-v0     SMB2      standard
SuperMarioBros2USA-v0  SMB2 USA  standard
SuperMarioBros3-v0     SMB3      standard

Individual Stages

These environments allow a single attempt (life) to make it through a single stage of the game.

Use the template

SuperMarioBros-<world>-<stage>-v<version>

where:

  • <world> is a number in {1, 2, 3, 4, 5, 6, 7, 8} indicating the world
  • <stage> is a number in {1, 2, 3, 4} indicating the stage within a world
  • <version> is 0 for the standard ROM

For example, to play 4-2 on the standard ROM, you would use the environment id SuperMarioBros-4-2-v0.
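The template above can be filled in programmatically. The following is a minimal sketch; smb1_stage_env_id is a hypothetical helper, not a function exported by the package, and it only validates the ranges listed above.

```python
def smb1_stage_env_id(world: int, stage: int, version: int = 0) -> str:
    """Build a single-stage SMB1 environment ID from the documented template."""
    if world not in range(1, 9):
        raise ValueError(f"world must be in 1..8, got {world}")
    if stage not in range(1, 5):
        raise ValueError(f"stage must be in 1..4, got {stage}")
    if version != 0:
        # Only version 0 (standard ROM) is described in this document.
        raise ValueError(f"version must be 0, got {version}")
    return f"SuperMarioBros-{world}-{stage}-v{version}"


print(smb1_stage_env_id(4, 2))  # SuperMarioBros-4-2-v0
```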

Super Mario Bros. 2 (USA) uses the vanilla ROM only. Use SuperMarioBros2USA-v0 for the full game, or the template

SuperMarioBros2USA-<world>-<stage>-v0

where <world> is a number in {1, 2, 3, 4, 5, 6, 7}. Worlds 1 through 6 have stages {1, 2, 3}; world 7 has stages {1, 2}.
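Because world 7 has one stage fewer than the others, enumerating the SMB2 USA stage IDs needs a small special case. The sketch below builds the list from the rules stated above; smb2usa_stage_env_ids is a hypothetical name, and the package's own task_ids() helper may be the better source when it is available.

```python
def smb2usa_stage_env_ids() -> list[str]:
    """List single-stage SMB2 USA environment IDs per the documented ranges."""
    ids = []
    for world in range(1, 8):
        # Worlds 1 through 6 have three stages; world 7 has two.
        stages = (1, 2) if world == 7 else (1, 2, 3)
        for stage in stages:
            ids.append(f"SuperMarioBros2USA-{world}-{stage}-v0")
    return ids


ids = smb2usa_stage_env_ids()
print(len(ids))  # 20
print(ids[0], ids[-1])
```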

Super Mario Bros. 3 uses the vanilla ROM only. Use SuperMarioBros3-v0 for the full game, or the currently validated single-stage entry points SuperMarioBros3-1-1-v0, SuperMarioBros3-1-2-v0, SuperMarioBros3-1-4-v0, and SuperMarioBros3-1-6-v0.

The helper gym_super_mario_bros.smb3_stage_matrix() returns catalog metadata for SMB3's numbered courses. Pass validated=True to limit the catalog to single-stage entries that are registered and smoke-tested.

Step

This section describes the reward and the info dictionary returned by the step method.

Reward Function

The reward function combines dense progress with objective events. Progress is rewarded only when the agent reaches a new best position for the attempt, so tactical backtracking is not punished by the movement term. Score increases, coins, cherries, powerups, health changes, level completion, time pressure, and death penalties are then added when the underlying game exposes reliable RAM counters for that title.

The reward is clipped into the range (-15, 15). The info dictionary also includes reward_components, reward_total_unclipped, and reward_total_clipped so training code can log or choose alternate reward transforms without recomputing RAM-dependent shaping.
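The relationship between the component terms and the two totals can be illustrated as follows. This is a sketch under stated assumptions, not the package's implementation: it assumes the unclipped total is the plain sum of the components, that the clip bounds are inclusive, and the component names in the example are made up for illustration.

```python
def clip_reward(total: float, low: float = -15.0, high: float = 15.0) -> float:
    """Clamp a shaped reward into the documented range (bounds assumed inclusive)."""
    return max(low, min(high, total))


def summarize_reward(components: dict[str, float]) -> dict[str, float]:
    """Sum the components (assumed additive) and report both totals."""
    unclipped = float(sum(components.values()))
    return {
        "reward_total_unclipped": unclipped,
        "reward_total_clipped": clip_reward(unclipped),
    }


# Hypothetical component names; the real keys come from info["reward_components"].
example = {"progress": 12.0, "coins": 5.0, "time": -1.0}
print(summarize_reward(example))
```

Logging the unclipped total alongside the clipped one makes it easy to see how often clipping is active for a given policy.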

info dictionary

The info dictionary returned by the step method contains the following keys:

Key                     Type         Description
coins                   int          The number of collected coins where available
flag_get                bool         True if Mario reached a flag, ax, or stage-complete state
life                    int          The title-specific displayed life count
score                   int          The cumulative in-game score where available
stage                   int          The current stage
status                  str          Mario's title-specific powerup status
time                    int          The time left on the clock where available
world                   int          The current world
x_pos                   int          Mario's horizontal position where available
y_pos                   int          Mario's vertical position where available
clear                   bool         Cross-game completion flag for the active stage/task
death                   bool         Cross-game death/life-loss flag
game                    str          Normalized game identifier such as smb1 or smb3
game_family             str          Grouping key for task suites and evaluation matrices
progress                int          Cross-game progress metric used by the reward function
progress_max            int          Best progress reached during the current attempt
rom_mode                str          ROM mode, currently vanilla for registered environments
single_stage            bool         True for single-stage registered tasks
task_id                 str          Canonical task ID suitable for task conditioning
target_world            int or None  Configured target world for single-stage tasks
target_stage            int or None  Configured target stage for single-stage tasks
timeout                 bool         Reserved cross-game timeout flag; external Gymnasium TimeLimit still sets truncated=True
world_label             str          Public world label, including Lost Levels bonus worlds
reward_components       dict         Per-step shaped reward terms before clipping
reward_total_unclipped  float        Per-step shaped reward before reward_range clipping
reward_total_clipped    float        Per-step shaped reward after reward_range clipping

Newer SMB2 USA and SMB3 environments include additional game-specific keys such as raw transition state, health, lives, map position, powerup timers, P-meter state, invulnerability timers, and progress maxima where those values are available from the ROM's RAM map.
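One way to use reward_components in training code is to accumulate per-episode totals from the info dictionaries returned by step. The helper below is a hypothetical sketch; the component names in the sample data are invented for illustration, and infos without a reward_components key are simply skipped.

```python
from collections import defaultdict


def accumulate_components(infos):
    """Sum each reward component across a sequence of step info dicts."""
    totals = defaultdict(float)
    for info in infos:
        # Tolerate infos that lack the key, e.g. from other wrappers.
        for name, value in info.get("reward_components", {}).items():
            totals[name] += value
    return dict(totals)


# Sample data with made-up component names.
infos = [
    {"reward_components": {"progress": 2.0, "time": -0.5}},
    {"reward_components": {"progress": 1.0, "coins": 1.0}},
    {},
]
print(accumulate_components(infos))
```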

Publishing

PyPI releases are published by the Publish to PyPI GitHub Actions workflow through PyPI trusted publishing, not by local twine credentials. Configure the PyPI project publisher with owner Kautenja, repository gym-super-mario-bros, workflow filename publish.yml, and environment pypi. Publish a release by pushing a tag that matches pyproject.toml's version, with or without a leading v, and then creating the corresponding GitHub release so the workflow can build and upload the distribution artifacts.
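The tag rule above (the tag equals the version, with or without a leading v) can be checked locally before pushing. This is a minimal sketch; tag_matches_version is a hypothetical helper and does not reproduce whatever check the workflow itself performs.

```python
def tag_matches_version(tag: str, version: str) -> bool:
    """Return True if the tag equals the version, ignoring one leading 'v'."""
    return tag.removeprefix("v") == version


print(tag_matches_version("v9.1.0", "9.1.0"))  # True
print(tag_matches_version("9.1.0", "9.1.0"))   # True
print(tag_matches_version("v9.1.1", "9.1.0"))  # False
```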

Citation

Please cite gym-super-mario-bros if you use it in your research.

@misc{gym-super-mario-bros,
  author = {Christian Kauten},
  howpublished = {GitHub},
  title = {{S}uper {M}ario {B}ros for {O}pen{AI} {G}ym},
  URL = {https://github.com/Kautenja/gym-super-mario-bros},
  year = {2018},
}


