Super Mario Bros. for Gymnasium
gym-super-mario-bros
A Gymnasium environment for Super Mario Bros., Super Mario Bros. 2 (Lost Levels), Super Mario Bros. 2 (USA), and Super Mario Bros. 3 on the Nintendo Entertainment System (NES) using the nes-py emulator.
gym-super-mario-bros targets Gymnasium's modern reset, step, render-mode,
and truncation semantics. It currently supports CPython 3.13 and 3.14 in CI.
Installation
The preferred installation of gym-super-mario-bros is from pip:
pip install gym-super-mario-bros
Python 3.13 or newer is required. The supported CI targets are CPython 3.13 and 3.14.
Because gym-super-mario-bros depends on the native nes-py emulator, Linux
GLIBCXX_* loader errors and Windows compiler-toolchain failures are usually
nes-py installation/runtime issues rather than wrapper bugs. See the
nes-py installation notes
for the current compiler and runtime expectations.
Usage
Python
You must import gym_super_mario_bros before trying to make an environment.
This is because Gymnasium environments are registered at runtime. By default,
gym_super_mario_bros environments use the full NES action space of 256
discrete actions. To constrain this, gym_super_mario_bros.actions provides
three action lists (RIGHT_ONLY, SIMPLE_MOVEMENT, and COMPLEX_MOVEMENT)
for the nes_py.wrappers.JoypadSpace wrapper. See
gym_super_mario_bros/actions.py for a
breakdown of the legal actions in each of these three lists.
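import gymnasium as gym
from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros  # importing registers the environments with Gymnasium
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

env = gym.make('SuperMarioBros-v0', render_mode='human')
# Restrict the 256-action NES space to the SIMPLE_MOVEMENT list.
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    if done:
        # Start a new episode whenever the previous one has ended.
        state, info = env.reset(seed=123)
    state, reward, terminated, truncated, info = env.step(env.action_space.sample())
    # An episode ends on either a game-driven or a time-limit-driven signal.
    done = terminated or truncated
    env.render()

env.close()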
NOTE: gym_super_mario_bros.make is just an alias to gymnasium.make for
convenience after gym_super_mario_bros is imported.
NOTE: registered environments use Gymnasium's TimeLimit wrapper with
max_episode_steps=9999999 to preserve the historical cap while allowing the
game logic to end normal episodes with terminated=True. Passing a shorter
max_episode_steps to gymnasium.make() is the supported way to test or train
with external time truncation, which returns truncated=True.
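The difference between the two end-of-episode signals can be sketched without a ROM. `StubEnv` below is a hypothetical stand-in that only mimics Gymnasium's five-value `step` contract, and `run_with_step_cap` is an illustrative helper that plays the role of an external step cap; neither is part of gym-super-mario-bros or of Gymnasium's TimeLimit wrapper.

```python
class StubEnv:
    """Hypothetical stand-in that follows Gymnasium's reset/step contract."""

    def __init__(self, natural_length):
        # Number of steps after which the "game logic" ends the episode.
        self.natural_length = natural_length
        self.steps = 0

    def reset(self, seed=None):
        self.steps = 0
        return 0, {}

    def step(self, action):
        self.steps += 1
        terminated = self.steps >= self.natural_length
        # observation, reward, terminated, truncated, info
        return self.steps, 0.0, terminated, False, {}


def run_with_step_cap(env, max_episode_steps):
    """Run one episode and report whether it terminated or was truncated."""
    env.reset(seed=123)
    for elapsed in range(1, max_episode_steps + 1):
        _, _, terminated, truncated, _ = env.step(0)
        if terminated:
            return "terminated", elapsed
        if truncated or elapsed == max_episode_steps:
            return "truncated", elapsed
    return "truncated", max_episode_steps


print(run_with_step_cap(StubEnv(natural_length=5), max_episode_steps=100))
# ('terminated', 5)
print(run_with_step_cap(StubEnv(natural_length=500), max_episode_steps=100))
# ('truncated', 100)
```

Training code typically treats the two outcomes differently: a terminated episode has no future return, while a truncated one was only cut short by the step budget.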
NOTE: remove calls to render in training code for a nontrivial
speedup.
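The figure of 256 default actions follows from the NES controller having 8 buttons, each either pressed or released (2 ** 8 = 256). The sketch below shows how a combination of button names could map to a single action byte. The bit layout in `BUTTON_BITS` and the helper `encode_buttons` are assumptions for illustration only; the authoritative mapping lives in nes-py's JoypadSpace and gym_super_mario_bros/actions.py.

```python
# Assumed bit layout, one bit per NES controller button (illustrative only).
BUTTON_BITS = {
    "right": 0b10000000,
    "left": 0b01000000,
    "down": 0b00100000,
    "up": 0b00010000,
    "start": 0b00001000,
    "select": 0b00000100,
    "B": 0b00000010,
    "A": 0b00000001,
    "NOOP": 0b00000000,
}


def encode_buttons(buttons):
    """Combine a list of button names into one action byte (hypothetical helper)."""
    action = 0
    for button in buttons:
        action |= BUTTON_BITS[button]
    return action


print(encode_buttons(["right", "A"]))  # 129
print(encode_buttons(["NOOP"]))        # 0
```

A reduced action list is then just a short list of such button combinations, which is why it shrinks the action space an agent has to explore.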
Task Metadata
gym_super_mario_bros exposes lightweight task metadata for curriculum,
conditioning, and evaluation-matrix code without constructing ROM-backed
environments:
import gym_super_mario_bros
tasks = gym_super_mario_bros.all_tasks(single_stage=True)
env_ids = gym_super_mario_bros.task_ids(game_family='smb1')
task = gym_super_mario_bros.task_for_env_id('SuperMarioBros-4-2-v0')
Each MarioTask includes the environment ID, canonical task ID, game family,
ROM mode, world/stage, public world label, split flags, and alias metadata for
separator-free SMB1 IDs such as SuperMarioBros1-1-v0.
Command Line
gym_super_mario_bros features a command line interface for playing
environments using either the keyboard or uniform random movement.
python -m gym_super_mario_bros --env <environment ID> --mode <human|random>
gym_super_mario_bros --env <environment ID> --mode <human|random>
NOTE: by default, -e is set to SuperMarioBros-v0, -m is set to human,
--actionspace/-a is set to nes, and rendering is enabled.
Human keyboard play opens a graphical window:
gym_super_mario_bros --env SuperMarioBros-v0 --mode human --actionspace simple
Random play can be rendered or run headlessly. Use --seed to seed the first
environment reset:
gym_super_mario_bros --mode random --steps 1000 --no-render --seed 123
Use --actionspace/-a to select nes, right, simple, or complex.
Human mode requires rendering, so --mode human --no-render is rejected.
Print the CLI help with:
python -m gym_super_mario_bros --help
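When launching the CLI from scripts, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of the package: `build_cli_command` is a hypothetical helper that uses only the flags documented above and mirrors the rule that human mode requires rendering.

```python
import sys


def build_cli_command(env_id="SuperMarioBros-v0", mode="human",
                      actionspace="nes", steps=None, render=True, seed=None):
    """Build an argument list for `python -m gym_super_mario_bros` (hypothetical helper)."""
    if mode not in ("human", "random"):
        raise ValueError(f"unknown mode: {mode!r}")
    if actionspace not in ("nes", "right", "simple", "complex"):
        raise ValueError(f"unknown action space: {actionspace!r}")
    if mode == "human" and not render:
        # The CLI rejects this combination, so fail early here as well.
        raise ValueError("human mode requires rendering")
    command = [
        sys.executable, "-m", "gym_super_mario_bros",
        "--env", env_id,
        "--mode", mode,
        "--actionspace", actionspace,
    ]
    if steps is not None:
        command += ["--steps", str(steps)]
    if not render:
        command.append("--no-render")
    if seed is not None:
        command += ["--seed", str(seed)]
    return command


command = build_cli_command(mode="random", steps=1000, render=False, seed=123)
print(command[1:])
# To actually run it: subprocess.run(command, check=True)
```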
Environments
These environments allow 3 attempts (lives) to make it through the 32 stages in the game. The environments only send reward-able game-play frames to agents; no cut-scenes, loading screens, etc. are sent from the NES emulator to an agent, nor can an agent perform actions during these instances. If a cut-scene cannot be skipped by hacking the NES's RAM, the environment will lock the Python process until the emulator is ready for the next action.
| Environment | Game | ROM |
|---|---|---|
| SuperMarioBros-v0 | SMB | standard |
| SuperMarioBros2-v0 | SMB2 | standard |
| SuperMarioBros2USA-v0 | SMB2 USA | standard |
| SuperMarioBros3-v0 | SMB3 | standard |
Individual Stages
These environments allow a single attempt (life) to make it through a single stage of the game.
Use the template
SuperMarioBros-<world>-<stage>-v<version>
where:
- <world> is a number in {1, 2, 3, 4, 5, 6, 7, 8} indicating the world
- <stage> is a number in {1, 2, 3, 4} indicating the stage within a world
- <version> is 0 for the standard ROM
For example, to play 4-2 on the standard ROM, you would use the environment
id SuperMarioBros-4-2-v0.
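The template can be filled in programmatically. The helper below is a hypothetical convenience, not part of the package; it only formats the ID string and validates the ranges listed above.

```python
def smb1_stage_env_id(world, stage, version=0):
    """Format a single-stage Super Mario Bros. environment ID (hypothetical helper)."""
    if world not in range(1, 9):
        raise ValueError("world must be in 1..8")
    if stage not in range(1, 5):
        raise ValueError("stage must be in 1..4")
    if version != 0:
        raise ValueError("only version 0 (standard ROM) is documented")
    return f"SuperMarioBros-{world}-{stage}-v{version}"


print(smb1_stage_env_id(4, 2))  # SuperMarioBros-4-2-v0

# All 32 single-stage IDs, e.g. for a curriculum or an evaluation sweep.
all_ids = [smb1_stage_env_id(w, s) for w in range(1, 9) for s in range(1, 5)]
print(len(all_ids))  # 32
```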
Super Mario Bros. 2 (USA) uses the vanilla ROM only. Use
SuperMarioBros2USA-v0 for the full game, or the template
SuperMarioBros2USA-<world>-<stage>-v0
where <world> is a number in {1, 2, 3, 4, 5, 6, 7}. Worlds 1 through 6 have
stages {1, 2, 3}; world 7 has stages {1, 2}.
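Because world 7 is shorter than the others, enumerating the valid Super Mario Bros. 2 (USA) stage IDs needs a small special case. This is a sketch with a hypothetical helper name; it derives the IDs from the template and stage counts given above rather than from the package's registry.

```python
def smb2usa_stage_env_ids():
    """List single-stage SMB2 USA environment IDs (hypothetical helper)."""
    env_ids = []
    for world in range(1, 8):
        # Worlds 1 through 6 have three stages; world 7 has two.
        last_stage = 2 if world == 7 else 3
        for stage in range(1, last_stage + 1):
            env_ids.append(f"SuperMarioBros2USA-{world}-{stage}-v0")
    return env_ids


ids = smb2usa_stage_env_ids()
print(len(ids))  # 20
print(ids[-1])   # SuperMarioBros2USA-7-2-v0
```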
Super Mario Bros. 3 uses the vanilla ROM only. Use SuperMarioBros3-v0 for
the full game, or SuperMarioBros3-1-1-v0 for the validated World 1-1
single-stage entry point.
Step
This section describes the reward and the info dictionary returned by the step method.
Reward Function
The reward function combines dense progress with objective events. Progress is rewarded only when the agent reaches a new best position for the attempt, so tactical backtracking is not punished by the movement term. Score increases, coins, cherries, powerups, health changes, level completion, time pressure, and death penalties are then added when the underlying game exposes reliable RAM counters for that title.
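The "new best position" rule can be illustrated in isolation. The class below is a sketch under assumptions: `BestProgressTracker` is a hypothetical name, and the real environment may scale or weight the progress term differently. It only shows why moving backwards and then recovering lost ground earns nothing, while passing the previous best does.

```python
class BestProgressTracker:
    """Reward progress only when it exceeds the best value so far (illustrative)."""

    def __init__(self):
        self.progress_max = 0

    def reset(self, start_progress):
        # Called at the start of each attempt.
        self.progress_max = start_progress

    def step(self, progress):
        # Backtracking yields 0 rather than a negative movement reward.
        gain = max(0, progress - self.progress_max)
        self.progress_max = max(self.progress_max, progress)
        return gain


tracker = BestProgressTracker()
tracker.reset(40)
print([tracker.step(p) for p in (45, 42, 44, 50)])  # [5, 0, 0, 5]
```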
The reward is clipped into the range (-15, 15).
The info dictionary also includes reward_components,
reward_total_unclipped, and reward_total_clipped so training code can log or
choose alternate reward transforms without recomputing RAM-dependent shaping.
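The relationship between the three reward fields can be sketched as follows. This assumes the unclipped total is the plain sum of the per-step components and that the bounds -15 and 15 are inclusive; the component names in the example are hypothetical, and `total_reward` is not a function of the package.

```python
REWARD_MIN, REWARD_MAX = -15.0, 15.0


def total_reward(reward_components):
    """Return (unclipped, clipped) totals for one step (illustrative sketch)."""
    unclipped = float(sum(reward_components.values()))
    clipped = max(REWARD_MIN, min(REWARD_MAX, unclipped))
    return unclipped, clipped


# Hypothetical component names and values for a single step.
components = {"progress": 12.0, "coins": 5.0, "time": -1.0}
print(total_reward(components))  # (16.0, 15.0)
```

Logging the unclipped total alongside the clipped one makes it possible to see how often clipping is active before deciding on an alternate transform.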
info dictionary
The info dictionary returned by the step method contains the following
keys:
| Key | Type | Description |
|---|---|---|
| coins | int | The number of collected coins where available |
| flag_get | bool | True if Mario reached a flag, ax, or stage-complete state |
| life | int | The title-specific displayed life count |
| score | int | The cumulative in-game score where available |
| stage | int | The current stage |
| status | str | Mario's title-specific powerup status |
| time | int | The time left on the clock where available |
| world | int | The current world |
| x_pos | int | Mario's horizontal position where available |
| y_pos | int | Mario's vertical position where available |
| clear | bool | Cross-game completion flag for the active stage/task |
| death | bool | Cross-game death/life-loss flag |
| game | str | Normalized game identifier such as smb1 or smb3 |
| game_family | str | Grouping key for task suites and evaluation matrices |
| progress | int | Cross-game progress metric used by the reward function |
| progress_max | int | Best progress reached during the current attempt |
| rom_mode | str | ROM mode, currently vanilla for registered environments |
| single_stage | bool | True for single-stage registered tasks |
| task_id | str | Canonical task ID suitable for task conditioning |
| target_world | int or None | Configured target world for single-stage tasks |
| target_stage | int or None | Configured target stage for single-stage tasks |
| timeout | bool | Reserved cross-game timeout flag; external Gymnasium TimeLimit still sets truncated=True |
| world_label | str | Public world label, including Lost Levels bonus worlds |
| reward_components | dict | Per-step shaped reward terms before clipping |
| reward_total_unclipped | float | Per-step shaped reward before reward_range clipping |
| reward_total_clipped | float | Per-step shaped reward after reward_range clipping |
Newer SMB2 USA and SMB3 environments include additional game-specific keys such as raw transition state, health, lives, map position, powerup timers, P-meter state, invulnerability timers, and progress maxima where those values are available from the ROM's RAM map.
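Since several keys are only populated "where available", logging code is safer when it reads them defensively. The sketch below uses a hypothetical helper, `summarize_info`, and a hand-written sample dictionary with illustrative values (including a placeholder task ID); it does not reflect actual emulator output.

```python
CORE_KEYS = ("task_id", "game", "world", "stage", "progress", "clear", "death")
OPTIONAL_KEYS = ("coins", "score", "time", "x_pos", "y_pos")


def summarize_info(info):
    """Pick a compact, log-friendly subset of the info dictionary (illustrative)."""
    summary = {key: info.get(key) for key in CORE_KEYS}
    for key in OPTIONAL_KEYS:
        # Skip title-specific values that are missing or unset.
        if info.get(key) is not None:
            summary[key] = info[key]
    return summary


# Hand-written sample with illustrative values, not real emulator output.
sample_info = {
    "task_id": "example-task", "game": "smb1", "world": 4, "stage": 2,
    "progress": 312, "clear": False, "death": False,
    "coins": 3, "score": 1200, "time": None,
}
print(summarize_info(sample_info))
```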
Publishing
PyPI releases are published by the Publish to PyPI GitHub Actions workflow
through PyPI trusted publishing, not by local twine credentials. Configure
the PyPI project publisher with owner Kautenja, repository
gym-super-mario-bros, workflow filename publish.yml, and environment
pypi. Publish a release by pushing a tag that matches pyproject.toml's
version, with or without a leading v, and then creating the corresponding
GitHub release so the workflow can build and upload the distribution artifacts.
Citation
Please cite gym-super-mario-bros if you use it in your research.
@misc{gym-super-mario-bros,
  author = {Christian Kauten},
  howpublished = {GitHub},
  title = {{S}uper {M}ario {B}ros for {O}pen{AI} {G}ym},
  URL = {https://github.com/Kautenja/gym-super-mario-bros},
  year = {2018},
}