Multi-Agent Partially Observable gridworlds in JAX
MAPOX
MAPOX is a small collection of JAX-native, multi-agent, partially-observable gridworld environments with a shared observation/action format and a simple pygame renderer.
The environments are functional (state in / state out), designed to work well with jax.jit/jax.vmap, and expose an action_mask for per-environment action subsets.
These environments were originally developed as part of https://github.com/gabe00122/jaxrl/tree/mapox, which may serve as a good example of trained policies.
Installation
MAPOX requires Python 3.11+.
uv add mapox
Notes:
- Video export uses python-ffmpeg and requires the ffmpeg binary to be available on your system PATH.
Quick start
Environments implement a small interface:
- reset(rng_key) -> (state, timestep)
- step(state, actions, rng_key) -> (state, timestep)
- timestep.obs: (num_agents, view_w, view_h, 4) int8
- timestep.action_mask: (num_agents, num_actions) bool
import jax
import jax.numpy as jnp
from mapox import EnvironmentFactory, FindReturnConfig
factory = EnvironmentFactory()
env, _ = factory.create_env(FindReturnConfig(num_agents=2), length=512)
rng = jax.random.PRNGKey(0)
state, ts = env.reset(rng)
# Sample a random *legal* action per agent using the action mask
rng, akey, skey = jax.random.split(rng, 3)
logits = jax.random.uniform(akey, (env.num_agents, env.action_spec.n))
actions = jnp.argmax(jnp.where(ts.action_mask, logits, -1e9), axis=-1)
state, ts = env.step(state, actions, skey)
print(ts.reward, ts.terminated)
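The masking step above can also be isolated into a small jitted helper. The following is a minimal sketch that does not import mapox: `sample_legal_actions` is a hypothetical name, and the mask is a hand-written stand-in for timestep.action_mask. It assumes every agent has at least one legal action.

```python
import jax
import jax.numpy as jnp


@jax.jit
def sample_legal_actions(key, action_mask):
    """Sample one action per agent, uniformly over the actions its mask allows."""
    # Illegal actions get -inf logits, so they are never selected.
    # Assumes each row of the mask contains at least one True entry.
    logits = jnp.where(action_mask, 0.0, -jnp.inf)
    return jax.random.categorical(key, logits, axis=-1)


# Hand-written stand-in for timestep.action_mask: 2 agents, 7 actions.
mask = jnp.array(
    [
        [True, True, True, True, True, False, False],
        [False, False, False, False, True, True, False],
    ]
)
actions = sample_legal_actions(jax.random.PRNGKey(0), mask)

# Look up each agent's chosen action in its own mask row.
is_legal = jnp.take_along_axis(mask, actions[:, None], axis=-1)[:, 0]
print(actions, is_legal)
```

A learned policy could reuse the same pattern by replacing the zero logits with its own output before masking.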
One-hot encoding observations
Observations are compact categorical channels. You can expand them into one-hot features:
from mapox import concat_one_hot
# sizes is typically (NUM_TILE_TYPES, 5, 3, 3)
sizes = env.observation_spec.max_value
x = concat_one_hot(ts.obs, sizes) # (..., sum(sizes))
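To make the resulting layout concrete, here is a rough sketch of what such an expansion might look like in plain JAX. `concat_one_hot_sketch` is a hypothetical stand-in and not the library's implementation; it assumes each channel is one-hot encoded to its entry in sizes and the results are concatenated on the last axis. The tile count of 6 is made up for illustration.

```python
import jax
import jax.numpy as jnp


def concat_one_hot_sketch(obs, sizes):
    """One-hot encode each channel of `obs` and concatenate along the last axis."""
    parts = [
        jax.nn.one_hot(obs[..., i], size, dtype=jnp.float32)
        for i, size in enumerate(sizes)
    ]
    return jnp.concatenate(parts, axis=-1)


# Hypothetical sizes: 6 tile types, then direction (5), team (3), health (3).
sizes = (6, 5, 3, 3)

# Stand-in for timestep.obs with shape (num_agents, view_w, view_h, 4).
obs = jnp.zeros((2, 5, 5, 4), dtype=jnp.int8)
obs = obs.at[0, 2, 2].set(jnp.array([3, 1, 2, 1], dtype=jnp.int8))

features = concat_one_hot_sketch(obs, sizes)
print(features.shape)  # (2, 5, 5, 17)
```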
Interactive play / rendering
A simple interactive runner is provided:
uv run -m mapox.play --env king_hill
Controls (depending on the environment):
- Movement: WASD or arrow keys
- n: cycle focused agent
The renderer can show the full map or the focused agent’s POV (a local crop). See mapox/play.py for an example of using GridworldClient and recording video.
Observation & action format
All environments share a unified discrete encoding defined in mapox/envs/constance.py.
Actions
The action space is always DiscreteActionSpec(n=7) with IDs:
| id | action |
|---|---|
| 0 | move up |
| 1 | move right |
| 2 | move down |
| 3 | move left |
| 4 | stay |
| 5 | primary action |
| 6 | dig action |
Not every environment uses every action. Always consult timestep.action_mask before sampling.
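For readability in user code, the IDs in the table can be mirrored in an enum. This is only a sketch: the `Action` class and `legal_action_names` helper are hypothetical names and are not part of mapox, and the mask row is a hand-written stand-in for one row of timestep.action_mask.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the action IDs listed in the table."""

    MOVE_UP = 0
    MOVE_RIGHT = 1
    MOVE_DOWN = 2
    MOVE_LEFT = 3
    STAY = 4
    PRIMARY = 5
    DIG = 6


def legal_action_names(mask_row):
    """Return the names of the actions whose mask entry is true for one agent."""
    return [action.name for action in Action if mask_row[action]]


# Stand-in mask row for an environment that does not use the dig action.
mask_row = [True, True, True, True, True, True, False]
print(legal_action_names(mask_row))
```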
Observation
Each agent receives a local crop centered on itself:
(view_width, view_height, 4) with channels:
- tile_id (terrain + agent types)
- direction (0 = none, 1..4 = cardinal direction)
- team_id (0 = none/neutral, 1 = red, 2 = blue)
- health (0..2)
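As an illustration, the four channels can be split out by indexing the last axis. This sketch assumes the channels are stored in the order listed above, and it uses a hand-built array in place of timestep.obs; the tile ID 7 is an arbitrary placeholder.

```python
import jax.numpy as jnp

# Stand-in for timestep.obs: 2 agents, each with a 5x5 local view.
obs = jnp.zeros((2, 5, 5, 4), dtype=jnp.int8)

# Place a red unit (team_id 1) with direction 1 and health 2 at the
# centre of agent 0's view. The tile ID 7 is an arbitrary placeholder.
obs = obs.at[0, 2, 2].set(jnp.array([7, 1, 1, 2], dtype=jnp.int8))

# Assumed channel order: tile_id, direction, team_id, health.
tile_id, direction, team_id, health = (obs[..., c] for c in range(4))

# Boolean map of cells occupied by the red team in each agent's view.
red_cells = team_id == 1
print(tile_id.shape, int(red_cells.sum()))
```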
Environments
All environment configs are Pydantic models and can be created through EnvironmentFactory.
- Find & Return (FindReturnEnv, FindReturnConfig)
Search for goal tiles in a procedurally-generated map. When an agent finds a flag it is rewarded and respawned elsewhere.
https://github.com/user-attachments/assets/98cc3318-67ca-44c0-ac6e-e4537bd30ed1
- Scouts (ScoutsEnv, ScoutsConfig)
Two roles: Harvesters “unlock” treasure tiles; Scouts collect unlocked treasures.
https://github.com/user-attachments/assets/d566840e-1837-4fc1-8c78-439677f358a8
- Traveling Salesman (TravelingSalesmanEnv, TravelingSalesmanConfig)
Multiple flags are scattered. Each agent is rewarded the first time it reaches each flag; flags reset after all are collected.
https://github.com/user-attachments/assets/af009d24-c65e-4195-99af-0a4e703652cd
- King of the Hill (KingHillEnv, KingHillConfig)
Two-team competitive environment with knights/archers, destructible walls, control points, and team-shared rewards.
https://github.com/user-attachments/assets/3483745f-7c53-46e9-b838-3cc76b9e3ee4
Wrappers
- VectorWrapper(env, vec_count)
Runs vec_count independent copies of an environment via jax.vmap and flattens the (vec, agent) dimensions into a single agent dimension.
- MultiTaskWrapper((env1, env2, ...), (name1, name2, ...))
Combines multiple environments into one by concatenating their agents. Adds a per-agent task_ids field to the TimeStep via TaskIdWrapper.
The EnvironmentFactory also supports a MultiTaskConfig that can build a multitask environment (and optionally vectorize each task).
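The shape handling described for the two wrappers can be sketched in plain JAX. This is not the wrappers' actual code: `toy_reset` is a hypothetical stand-in for a single environment's reset, and the integer task IDs (0 and 1) are an assumed encoding.

```python
import jax
import jax.numpy as jnp

NUM_AGENTS, VIEW, CHANNELS = 3, 5, 4


def toy_reset(key):
    """Toy stand-in for one environment's reset: returns per-agent observations."""
    obs = jax.random.randint(key, (NUM_AGENTS, VIEW, VIEW, CHANNELS), 0, 3)
    return obs.astype(jnp.int8)


# Vectorize: one PRNG key per copy, then merge (vec, agent) into one agent axis.
vec_count = 4
keys = jax.random.split(jax.random.PRNGKey(0), vec_count)
batched = jax.vmap(toy_reset)(keys)                # (vec, agent, w, h, c)
flat = batched.reshape((-1,) + batched.shape[2:])  # (vec * agent, w, h, c)

# Multitask: concatenate the agents of two tasks and tag each agent with its task.
task_a = flat
task_b = toy_reset(jax.random.PRNGKey(1))
obs = jnp.concatenate([task_a, task_b], axis=0)
task_ids = jnp.concatenate(
    [
        jnp.zeros(task_a.shape[0], dtype=jnp.int32),
        jnp.ones(task_b.shape[0], dtype=jnp.int32),
    ]
)
print(flat.shape, obs.shape, task_ids.shape)
```

Flattening keeps the downstream policy code unaware of how many environment copies are running: it only ever sees a single agent axis.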
Acknowledgements
Rendering uses the Urizen Onebit Tileset by Vurmux.