Port of David Ha's Slime volleyball for gymnasium.

Project description

Slime Volleyball

A Gymnasium-compatible multi-agent environment for the classic Slime Volleyball game, with dual numpy/JAX backends. Originally developed by David Ha and ported from hardmaru/slimevolleygym.

Two agents (agent_left, agent_right) play volleyball. The game ends when one agent loses all lives (default: 1) or after max_steps (default: 3000) timesteps.

Installation

pip install -e .

Dependencies: numpy, gymnasium

Optional: jax / jaxlib (JAX backend), pyglet (rendering), opencv-python (pixel observations)

Quick Start

Multi-Agent (PettingZoo-style)

from slime_volleyball.slimevolley_env import SlimeVolleyEnv

env = SlimeVolleyEnv()
obs, info = env.reset(seed=42)

for _ in range(1000):
    actions = {"agent_left": env.action_space["agent_left"].sample(),
               "agent_right": env.action_space["agent_right"].sample()}
    obs, rewards, terminateds, truncateds, info = env.step(actions)
    if terminateds["__all__"] or truncateds["__all__"]:
        obs, info = env.reset()

Single-Agent (Right Agent vs Baseline)

When you pass a single integer action, the right agent is controlled by that action and the left agent uses the built-in baseline RNN policy:

env = SlimeVolleyEnv()
obs, info = env.reset(seed=42)

obs, rewards, terminateds, truncateds, info = env.step(3)  # right agent jumps

Boost Mode

from slime_volleyball.slimevolley_boost_env import SlimeVolleyBoostEnv

env = SlimeVolleyBoostEnv()
obs, info = env.reset(seed=42)
# 16-dim observations, Discrete(13) actions (adds boost/powerup variants)

Dual Backend

The environment supports two backends selected at construction time. When backend="numpy" (default), no JAX code runs and JAX does not need to be installed. When backend="jax", the environment uses jax.numpy for all array operations and exposes JIT-compiled functions for high-performance training.

# NumPy backend (default) — no JAX required
env = SlimeVolleyEnv()

# JAX backend
env = SlimeVolleyEnv(config={"backend": "jax"})

Internally, array operations go through a backend proxy (from slime_volleyball.backend import xp) that resolves to numpy or jax.numpy depending on the active backend. The entire game physics and observation pipeline is implemented as pure functions operating on an immutable EnvState dataclass.

JAX Functional API

With backend="jax", the environment exposes jax_step and jax_reset — JIT-compiled pure functions compatible with jax.vmap and jax.lax.scan for vectorized training (JaxMARL-style).

import jax
import jax.numpy as jnp
from slime_volleyball.slimevolley_env import SlimeVolleyEnv

env = SlimeVolleyEnv(config={"backend": "jax"})
env.reset(seed=0)

step_fn = env.jax_step    # (EnvState, actions) -> (obs, state, rewards, terms, truncs, infos)
reset_fn = env.jax_reset   # (rng_key) -> (obs, state, infos)

Vectorized Environments (vmap)

num_envs = 64
keys = jax.random.split(jax.random.PRNGKey(0), num_envs)

# Parallel reset
obs, states, _ = jax.vmap(reset_fn)(keys)
# obs.shape == (64, 2, 12), one obs per agent per env

# Parallel step
actions = jnp.zeros((num_envs, 2, 3), dtype=jnp.float32)
obs, states, rewards, terms, truncs, _ = jax.vmap(step_fn)(states, actions)

Rollouts with lax.scan

def rollout_step(state, _):
    actions = jnp.zeros((2, 3), dtype=jnp.float32)
    obs, new_state, rewards, _, _, _ = step_fn(state, actions)
    return new_state, rewards

single_state = jax.tree_util.tree_map(lambda x: x[0], states)
final_state, all_rewards = jax.lax.scan(rollout_step, single_state, None, length=1000)
# all_rewards.shape == (1000, 2)

Observations

Observations are perspective-normalized: each agent sees the world as if it's on a consistent side (x-values and velocities are reflected by the agent's direction).

Base (12-dim): [x, y, vx, vy, ball_x, ball_y, ball_vx, ball_vy, opp_x, opp_y, opp_vx, opp_vy] — all divided by a scale factor of 10.

Boost (16-dim): Appends [powerup_avail, powerup_timer, opp_powerup_avail, opp_powerup_timer].

Actions

Base — Discrete(6):

Index	Action	`[fwd, bwd, jump]`
0	Noop	`[0, 0, 0]`
1	Left	`[1, 0, 0]`
2	UpLeft	`[1, 0, 1]`
3	Up	`[0, 0, 1]`
4	UpRight	`[0, 1, 1]`
5	Right	`[0, 1, 0]`

Boost — Discrete(13): Indices 0–5 as above, indices 6–12 repeat with boost activated ([fwd, bwd, jump, 1]).

The functional step pipeline takes continuous [fwd, bwd, jump] arrays directly (shape (2, 3) or (2, 4) with boost). The discrete-to-continuous mapping is handled by the environment wrapper.

Configuration

Pass a config dict to the environment constructor:

env = SlimeVolleyEnv(config={
    "backend": "numpy",      # "numpy" or "jax"
    "max_steps": 3000,       # max timesteps per episode
    "survival_reward": False, # add +0.01 reward per step survived
    "from_pixels": False,     # pixel observations (numpy only)
    "human_inputs": False,    # invert left agent controls for human play
    "seed": None,             # initial RNG seed
})

EnvState

All mutable game state is captured in a single frozen dataclass (EnvState). State transitions use dataclasses.replace() to produce new states. On the JAX backend, EnvState is registered as a pytree for jit/vmap/scan compatibility.

Field	Shape	Description
`ball_pos`	`(2,)`	Ball position `[x, y]`
`ball_vel`	`(2,)`	Ball velocity `[vx, vy]`
`ball_prev_pos`	`(2,)`	Previous ball position
`agent_pos`	`(2, 2)`	Agent positions `[[x,y], [x,y]]`
`agent_vel`	`(2, 2)`	Agent velocities
`agent_desired_vel`	`(2, 2)`	Desired velocities from actions
`agent_life`	`(2,)`	Lives remaining per agent
`agent_powerup_avail`	`(2,)`	Powerups available
`agent_powerup_timer`	`(2,)`	Powerup countdown timer
`delay_life`	`()`	DelayScreen frames remaining
`time`	`()`	Timestep counter
`done`	`()`	Episode done flag
`rng_key`	—	JAX PRNGKey or numpy Generator

Project Structure

slime_volleyball/
├── slimevolley_env.py         # SlimeVolleyEnv (Gymnasium wrapper)
├── slimevolley_boost_env.py   # SlimeVolleyBoostEnv (boost/powerup variant)
├── baseline_policy.py         # 120-parameter RNN baseline opponent
├── rendering.py               # Pyglet-based 2D rendering
├── backend/
│   ├── __init__.py            # xp proxy (numpy or jax.numpy)
│   ├── _dispatch.py           # set_backend(), get_backend(), BackendProxy
│   ├── array_ops.py           # set_at() for JAX .at[].set() vs numpy indexing
│   └── env_state.py           # EnvState frozen dataclass + pytree registration
├── core/
│   ├── constants.py           # Physics constants, rendering dimensions
│   ├── physics.py             # Pure functional physics (agents, ball, collisions)
│   ├── observations.py        # Perspective-normalized observation computation
│   ├── step_pipeline.py       # Functional reset/step + factory builders
│   ├── game.py                # Legacy OOP game engine (rendering only)
│   ├── agent.py               # Legacy Agent class (rendering only)
│   ├── objects.py             # Particle, Wall, RelativeState (rendering only)
│   └── utils.py               # Coordinate conversion, DelayScreen
tests/
├── test_backend.py            # Backend dispatch tests
├── test_env_state.py          # EnvState + JAX pytree tests
├── test_physics.py            # Physics function tests
├── test_observations.py       # Observation computation tests
├── test_step_pipeline.py      # Functional reset/step tests (numpy + JAX)
├── test_env_wrapper.py        # Environment wrapper tests (numpy, JAX, boost)
├── test_rendering.py          # Rendering state sync tests
└── test_cross_backend.py      # Cross-backend parity tests
examples/
├── train_slimevolley_jax.py   # IPPO training script (JAX)
├── episode.gif                # Trained policy visualization
├── policy.onnx                # ONNX-exported trained policy
└── slimevolley_training.png   # Training reward curve

Training

An IPPO (Independent PPO) training script is provided in examples/train_slimevolley_jax.py. It uses the JAX functional API with jax.vmap for parallel environments and jax.lax.scan for the training loop.

python examples/train_slimevolley_jax.py

This produces:

slimevolley_training.png — reward curve over training
episode.gif — GIF of the trained policy playing
policy.onnx — ONNX export of the trained actor network

Interactive Visualization

python visualize_env.py

Uses the legacy OOP game engine with pyglet rendering. Keyboard controls:

Key	Agent	Action
Left Arrow	Right agent	Left
Right Arrow	Right agent	Right
Up Arrow	Right agent	Jump
A	Left agent	Left
D	Left agent	Right
W	Left agent	Jump

Running Tests

pytest tests/

Credits

Original game: David Ha (blog post)
Original gym environment: hardmaru/slimevolleygym

Project details

Release history Release notifications | RSS feed

This version

0.1.1

Mar 8, 2026

0.1.0

Jan 3, 2025

0.0.4

Dec 25, 2025

0.0.3

Dec 23, 2024

0.0.2

Dec 23, 2024

0.0.1

Dec 22, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

slimevb-0.1.1.tar.gz (57.8 kB view details)

Uploaded Mar 8, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

slimevb-0.1.1-py3-none-any.whl (42.7 kB view details)

Uploaded Mar 8, 2026 Python 3

File details

Details for the file slimevb-0.1.1.tar.gz.

File metadata

Download URL: slimevb-0.1.1.tar.gz
Upload date: Mar 8, 2026
Size: 57.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for slimevb-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`510fbd2a7d7cd9336d990f0a2bf09c0e005fe5282d04efb4a848fa77488d49e4`
MD5	`ac80cdcec792656775b7d6bb13bfcc72`
BLAKE2b-256	`3cf948ac81d49584b910dc295dc035e3b595884cb8d1ee59249e5e42a244e291`

See more details on using hashes here.

File details

Details for the file slimevb-0.1.1-py3-none-any.whl.

File metadata

Download URL: slimevb-0.1.1-py3-none-any.whl
Upload date: Mar 8, 2026
Size: 42.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for slimevb-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ba8ba977fa753c05278b77f979302b9f17a15db2f87df2ec40e7d408309b822e`
MD5	`3ce8ea786755116aba9fff507ca91096`
BLAKE2b-256	`6b0b976eaf10e1778883e66b02c9d3727e0cd315a0de933010962c35112af05e`

See more details on using hashes here.

slimevb 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

Slime Volleyball

Installation

Quick Start

Multi-Agent (PettingZoo-style)

Single-Agent (Right Agent vs Baseline)

Boost Mode

Dual Backend

JAX Functional API

Vectorized Environments (vmap)

Rollouts with lax.scan

Observations

Actions

Configuration

EnvState

Project Structure

Training

Interactive Visualization

Running Tests

Credits

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes