Blazing fast SuperMarioBros-Nes environment for RL research.
SuperMarioBros-Nes-turbo is a blazing-fast vectorized Super Mario Bros NES environment for reinforcement-learning research. It uses a custom Rust NES emulator specialized for SuperMarioBros-Nes mapper 0/NROM, with vectorized stepping on the Rust side so Python crosses into Rust once per batched step. Game-specific preprocessing, including frame skip, grayscale or RGB rendering, cropping, resizing, frame stacking, reward extraction, termination checks, and observation-buffer writes, happens before data returns to Python. It follows the same throughput-first direction as stable-retro-turbo, but drops broad stable-retro compatibility so the emulator and batch API can specialize on Super Mario Bros NES.
Install
git clone https://github.com/tsilva/SuperMarioBros-Nes-turbo.git
cd SuperMarioBros-Nes-turbo
uv sync --extra dev
uv run maturin develop --release
ROM files are not included in this repository. Pass --rom-path to scripts, set SMB_ROM_PATH, or provide rom_path= when constructing environments. Expected SHA-256 for the supported Super Mario Bros NES ROM:
f61548fdf1670cffefcc4f0b7bdcdd9eaba0c226e3b74f8666071496988248de
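One way to check a local ROM against that digest before building environments is sketched below. It uses only the Python standard library; the helper names `sha256_of_file` and `rom_matches` are hypothetical and are not part of this package.

```python
import hashlib
from pathlib import Path

# Digest quoted above for the supported Super Mario Bros NES ROM.
EXPECTED_SHA256 = "f61548fdf1670cffefcc4f0b7bdcdd9eaba0c226e3b74f8666071496988248de"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so it is never loaded into memory all at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rom_matches(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA-256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `rom_matches("/path/to/SuperMarioBros.nes")` before constructing an environment turns a wrong ROM dump into an early, explicit failure.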
Import the package as supermariobrosnes_turbo:
import numpy as np
from supermariobrosnes_turbo import Actions, SuperMarioBrosNesTurboVecEnv
env = SuperMarioBrosNesTurboVecEnv(
    "SuperMarioBros-Nes-v0",
    rom_path="/path/to/SuperMarioBros.nes",
    num_envs=64,
    use_restricted_actions=Actions.ALL,
    frame_skip=4,
    obs_grayscale=True,
    frame_stack=4,
    obs_crop=(32, 0, 0, 0),
    obs_resize=(84, 84),
    obs_layout="chw",
)
obs = env.reset()
actions = np.zeros((env.num_envs, env.num_buttons), dtype=np.uint8)
env.step_async(actions)
obs, rewards, dones, infos = env.step_wait()
step_wait() follows the Stable Baselines3 VecEnv contract: it calls the Rust FastMarioVecEnv once for the whole batch and returns (obs, rewards, dones, infos) from reusable NumPy arrays. Use step_fast() when you do not need per-env info dictionaries, or step_wait_gymnasium() when you need separate terminated and truncated arrays.
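The difference between the single `dones` array and the separate `terminated` and `truncated` arrays can be sketched as follows. This assumes `dones` is the element-wise OR of the two flags, which is how Stable Baselines3's own vector envs treat Gymnasium-style environments; whether this package computes it the same way is not confirmed here, and `merge_done_flags` is a hypothetical helper.

```python
def merge_done_flags(terminated, truncated):
    """Combine per-env terminated/truncated flags into one dones list.

    Assumption: an env is "done" when it either terminated or was truncated.
    """
    if len(terminated) != len(truncated):
        raise ValueError("terminated and truncated need one entry per env")
    return [bool(t) or bool(u) for t, u in zip(terminated, truncated)]
```

Code that bootstraps value targets usually needs the two flags kept apart, since a truncated episode has not reached a true terminal state.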
Initial states can be a single stable-retro state, one state per env slot, or a weighted mapping sampled independently for each lane on reset:
env = SuperMarioBrosNesTurboVecEnv(
    "SuperMarioBros-Nes-v0",
    rom_path="/path/to/SuperMarioBros.nes",
    num_envs=16,
    state={"Level1-1": 0.5, "Level1-4": 0.5},
    done_on={
        "life_loss": ("lives", "decrease"),
        "level_change": (("levelHi", "levelLo"), "change"),
    },
)
env.seed(123)
obs = env.reset()
sampled_states = env.active_states()
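The weighted mapping can be pictured as one independent draw per lane. The sketch below illustrates that idea with the standard library only; it is not the package's Rust implementation, `sample_lane_states` is a hypothetical helper, and a given seed will not reproduce the states that `env.seed(123)` selects.

```python
import random


def sample_lane_states(weights, num_envs, seed):
    """Draw one state name per lane, independently, according to the weights."""
    rng = random.Random(seed)
    names = list(weights)
    probs = [weights[name] for name in names]
    # random.choices samples with replacement, so each lane is an independent draw.
    return rng.choices(names, weights=probs, k=num_envs)


lanes = sample_lane_states({"Level1-1": 0.5, "Level1-4": 0.5}, num_envs=16, seed=123)
```

Because each lane is sampled on its own, a batch of 16 lanes will rarely split exactly 8 and 8 between the two states.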
Commands
uv sync --extra dev # install Python dev dependencies
uv run maturin develop --release # build and install the Rust extension
make test # Rust tests + HF policy completion/parity oracle
uv run python scripts/smoke_smb.py --rom-path /path/to/SuperMarioBros.nes # quick ROM/emulator smoke check
uv run python scripts/benchmark_vec_env.py --rom-path /path/to/SuperMarioBros.nes --num-envs 8 --frame-skip 4 --frame-stack 4
uv run python scripts/benchmark_sps.py --rom-path /path/to/SuperMarioBros.nes --num-envs 16 --steps 500 --repeats 3
uv run python scripts/play.py --rom-path /path/to/SuperMarioBros.nes --mode external # raw SDL2 play view
uv run python scripts/play.py --rom-path /path/to/SuperMarioBros.nes --mode external --view preprocessed --scale 4
uv run python scripts/play_policy.py https://huggingface.co/tsilva/SuperMarioBros-NES_Level1 --rom-path /path/to/SuperMarioBros.nes
Fixed-host benchmark target
Use stable-retro-turbo==1.0.1.post1 as the Stable Retro PyPI oracle for new benchmarks and comparisons. Rerun the PyPI oracle baseline before quoting a current speedup, so the comparison uses the same SuperMarioBros-Nes-v0 ROM, saved-state set, frame skip, frame stack, grayscale/crop/resize preprocessing, and 16 vector envs on the fixed beast-3 CPU host.
Historical fixed-host results:
| Environment | Version / Ref | Official median env steps/sec | Mean invocation-median env steps/sec | Run-median CV | Notes |
|---|---|---|---|---|---|
| SuperMarioBros-Nes-turbo | main | 47,611.14 | 47,605.89 | 0.28% | Full official fixed-host run; all validity gates passed. |
| stable-retro-turbo PyPI oracle | 1.0.0.post23 | 7,437.65 | 7,440.04 | 0.44% | Historical only; superseded by 1.0.1.post1 for new comparisons. Statistical gates passed, but the post-run host-load gate failed because the 1-minute load was sampled immediately after the benchmark's own CPU-heavy timing. |
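The column names suggest that each run contributes a median throughput and that the table summarizes those per-run medians. A minimal sketch of such a summary follows, assuming "Run-median CV" means the sample standard deviation of the per-run medians divided by their mean. The benchmark scripts may define it differently, and `summarize_runs` is a hypothetical helper.

```python
import statistics


def summarize_runs(run_medians):
    """Return (median, mean, cv_percent) for per-run median steps/sec values."""
    if len(run_medians) < 2:
        raise ValueError("need at least two runs to estimate variation")
    median = statistics.median(run_medians)
    mean = statistics.fmean(run_medians)
    # Assumption: CV = sample standard deviation / mean, expressed in percent.
    cv_percent = 100.0 * statistics.stdev(run_medians) / mean
    return median, mean, cv_percent
```

A low CV, like the sub-1% values in the table, indicates that repeated runs on the fixed host agree closely with one another.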
Local benchmark artifact paths:
- artifacts/benchmarks/host-results/host-single-2026-07-02-123806-R17c60e1eb88e/aggregate.json
- artifacts/benchmarks/host-results/pypi-stable-retro-turbo/1.0.0.post23/0bcebd32669e8e46/aggregate.json
Notes
- Python >=3.9 and a Rust toolchain are required to build the Maturin extension.
- The current emulator scope is SuperMarioBros-Nes mapper 0 NROM.
- The Python package exposes SuperMarioBrosNesTurboVecEnv, ACTION_MEANINGS, CORE_ACTION_MEANINGS, and ACTION_SETS.
- SuperMarioBrosNesTurboVecEnv subclasses Stable Baselines3 VecEnv when SB3 is installed and follows the stable-retro-turbo RetroVecEnv constructor shape.
- use_restricted_actions=Actions.ALL and Actions.FILTERED consume per-button MultiBinary masks; Actions.DISCRETE consumes Stable Retro's 36-way discrete action encoding.
- scripts/play_policy.py loads Stable Baselines3 PPO checkpoints from a local .zip, a Hugging Face repo id, or a https://huggingface.co/... URL and displays raw RGB gameplay in the SDL2 GUI while feeding the model its preprocessed observation stack. It defaults to a Stable Retro playback backend so public SB3/Hugging Face checkpoints use the preprocessing they were trained with; pass --view preprocessed to inspect the model input or --backend native when checking this repo's fast-env parity. The SB3, PyTorch, and Hugging Face Hub dependencies are included in the repo's uv dev environment.
- By default, scripts/benchmark_sps.py starts lanes from Level1-1, Level1-2, Level1-3, and Level1-4 repeated round-robin. Use --state Level1-1 or another stable-retro state to start every lane from one saved level state. Use --states ... to choose a different round-robin state list. In Python, state= accepts a single state name/path/bytes value, a sequence with exactly one state per env, or a weighted mapping such as {"Level1-1": 0.5, "Level1-4": 0.5}. After reset, active_state_indices() and active_states() report the sampled state for each lane. If needed, pass --state-dir or set SUPERMARIOBROSNES_FASTENV_STATE_DIR.
- For SuperMarioBrosNesTurboVecEnv, done_on_info accepts named terminal rules like {"life_loss": ("lives", "decrease")}. Supported ops are change, increase, and decrease; keys are drawn from INFO_KEYS. Fired rules are reported in info["done_on_info"] with op, keys, prev, and next.
- Stable Retro oracle/playback tooling targets stable-retro-turbo==1.0.1.post1 for new benchmarks and comparisons, and constructs the upstream vector env with the current flat keyword names: maxpool_last_two, noop_reset_max, sticky_action_prob, info_filter, obs_copy, and done_on. Runtime fired terminal rules are still read from info["done_on_info"].
- Benchmark JSON can be written with scripts/benchmark_sps.py --output-json ....
- Play mode uses the native SDL2 library. If SDL2 is not installed or discoverable, scripts/play.py exits with an SDL backend error.
- ROM files are not included in the repository; use the SHA-256 digest above to confirm test inputs when needed.
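The named terminal rules described in the notes can be read as comparisons between two consecutive info snapshots. The sketch below illustrates those semantics in pure Python. It is not the Rust implementation; `evaluate_done_rules` is a hypothetical helper, and comparing multi-key rules as tuples (lexicographically for increase and decrease) is an assumption.

```python
def evaluate_done_rules(rules, prev_info, next_info):
    """Return the rules that fired between two consecutive info snapshots."""
    fired = {}
    for name, (keys, op) in rules.items():
        # A rule may watch one key or a tuple of keys compared as a group.
        key_tuple = (keys,) if isinstance(keys, str) else tuple(keys)
        prev = tuple(prev_info[k] for k in key_tuple)
        nxt = tuple(next_info[k] for k in key_tuple)
        if op == "change":
            hit = nxt != prev
        elif op == "increase":
            hit = nxt > prev  # assumption: lexicographic for multi-key rules
        elif op == "decrease":
            hit = nxt < prev  # assumption: lexicographic for multi-key rules
        else:
            raise ValueError(f"unsupported op: {op!r}")
        if hit:
            fired[name] = {"op": op, "keys": key_tuple, "prev": prev, "next": nxt}
    return fired


rules = {
    "life_loss": ("lives", "decrease"),
    "level_change": (("levelHi", "levelLo"), "change"),
}
before = {"lives": 2, "levelHi": 0, "levelLo": 0}
after = {"lives": 1, "levelHi": 0, "levelLo": 0}
fired = evaluate_done_rules(rules, before, after)
```

Here only `life_loss` fires, because `lives` dropped while the level keys stayed the same.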
License
MIT, as declared in pyproject.toml and Cargo.toml.