Reinforcement learning benchmark that tests all forms of partial observability in JAX.
POBAX: Partially Observable Benchmarks in JAX
POBAX is a reinforcement learning benchmark that tests all forms of partial observability.
POBAX has been accepted to RLC 2025. Check out our paper!
The benchmark is entirely written in JAX, allowing for fast, GPU-scalable experimentation.
Environments
POBAX includes environments (as well as recommended hyperparameter settings) across diverse forms of partial observability. We list our environments from smallest to largest (in terms of neural network size requirements for PPO RNN):
| Environment | IDs | Description |
|---|---|---|
| Simple Chain | simple_chain | Diagnostic POMDP for testing algorithms. |
| T-Maze | tmaze_10 | Bakker's classic memory testing environment (hallway disambiguation). |
| RockSample | rocksample_11_11 and rocksample_15_15 | The classic rock collecting POMDP, where an agent needs to uncover and collect rocks. |
| Battleship | battleship_10 | Single-player battleship (10x10). |
| Masked Mujoco | {env_name}-{F/P/V}-v0 | Mujoco with state features masked out. env_name can be Walker, Ant, Hopper, or HalfCheetah. F/P/V stands for fully observable, position only, or velocity only versions of environments, respectively. |
| DMLab Minigrid | Navix-DMLab-Maze-{01/02/03}-v0 | MiniGrid versions of the DeepMind Lab mazes. 01/02/03 refer to the DeepMind Lab Minigrid mazes in ascending difficulty. |
| Visual Continuous Control | {env_name}-pixels | Pixel-based versions of Mujoco control. Requires the Madrona_MJX package. env_name can be ant, halfcheetah, hopper, or walker2d. |
| No-Inventory Crafter | craftax-pixels | Crafter without the inventory. Requires the Craftax package. |
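The templated IDs in the table can be expanded mechanically. The sketch below only formats strings from the templates and names listed above; the variable names are illustrative, and whether every resulting combination is actually accepted by get_env is an assumption that should be checked against the installed package.

```python
from itertools import product

# Names and suffixes copied from the environment table above.
MASKED_MUJOCO_ENVS = ["Walker", "Ant", "Hopper", "HalfCheetah"]
MASK_MODES = {
    "F": "fully observable",
    "P": "position only",
    "V": "velocity only",
}
PIXEL_ENVS = ["ant", "halfcheetah", "hopper", "walker2d"]

# {env_name}-{F/P/V}-v0
masked_mujoco_ids = [
    f"{name}-{mode}-v0" for name, mode in product(MASKED_MUJOCO_ENVS, MASK_MODES)
]

# Navix-DMLab-Maze-{01/02/03}-v0, in ascending difficulty
dmlab_ids = [f"Navix-DMLab-Maze-{level:02d}-v0" for level in (1, 2, 3)]

# {env_name}-pixels
pixel_ids = [f"{name}-pixels" for name in PIXEL_ENVS]

if __name__ == "__main__":
    for env_id in masked_mujoco_ids + dmlab_ids + pixel_ids:
        print(env_id)
```

Each string is intended to be passed as the first argument to get_env, in the same way as "rocksample_11_11" in the usage example.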
Basic Usage
import jax
from pobax.envs import get_env
rand_key = jax.random.PRNGKey(2025)
env_key, rand_key = jax.random.split(rand_key)
# Creates a vectorized environment
env, env_params = get_env("rocksample_11_11", env_key)
# Reset 10 environments
reset_key, rand_key = jax.random.split(rand_key)
reset_keys = jax.random.split(reset_key, 10)
obs, env_state = env.reset(reset_keys, env_params)
# Take steps in all environments
step_key, action_key, rand_key = jax.random.split(rand_key, 3)
step_keys = jax.random.split(step_key, 10)
action_keys = jax.random.split(action_key, 10)
actions = jax.vmap(env.action_space(env_params).sample)(action_keys)
obs, env_state, reward, done, info = env.step(step_keys, env_state, actions, env_params)
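The per-environment key pattern used above can be tried without POBAX installed. The following is a minimal sketch with a hypothetical toy environment (toy_reset and toy_step are made up for illustration and are not part of POBAX); it only shows how one key per environment is split off and consumed under jax.vmap.

```python
import jax
import jax.numpy as jnp

NUM_ENVS = 10


def toy_reset(key):
    # Hypothetical stand-in environment: the hidden state is a scalar position.
    state = jax.random.uniform(key, (), minval=-1.0, maxval=1.0)
    obs = jnp.sign(state)  # partial observation: only the sign is visible
    return obs, state


def toy_step(key, state, action):
    # Noisy move along the line, clipped to [-1, 1].
    noise = 0.1 * jax.random.normal(key, ())
    next_state = jnp.clip(state + 0.1 * action + noise, -1.0, 1.0)
    obs = jnp.sign(next_state)
    reward = -jnp.abs(next_state)
    done = jnp.abs(next_state) >= 1.0
    return obs, next_state, reward, done


# Vectorize over the leading (environment) axis of every argument.
batched_reset = jax.vmap(toy_reset)
batched_step = jax.vmap(toy_step)

rand_key = jax.random.PRNGKey(2025)
reset_key, step_key, rand_key = jax.random.split(rand_key, 3)

# One independent key per environment, derived from a key that is not reused.
reset_keys = jax.random.split(reset_key, NUM_ENVS)
step_keys = jax.random.split(step_key, NUM_ENVS)

obs, state = batched_reset(reset_keys)
actions = jnp.ones(NUM_ENVS)
obs, state, reward, done = batched_step(step_keys, state, actions)
```

Splitting a dedicated key for each purpose (reset, step, action sampling) and never reusing a key after it has been split keeps the random streams of the environments independent.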
Installation
Agents
POBAX includes algorithms based on proximal policy optimization (PPO), loosely following the PureJAXRL framework. These include:
- Recurrent PPO,
- λ-discrepancy,
- GTrXL.
A memoryless version of the recurrent PPO algorithm is also included via the --memoryless flag.
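To see why the memoryless baseline is a useful comparison, consider a toy T-Maze written in plain Python. This is an illustrative sketch only: it is not POBAX's tmaze_10 implementation, and run_episode, memoryless_policy, memory_policy, and success_rate are hypothetical names. The goal direction is shown only in the first cell, so a policy that conditions on the current observation alone cannot know which way to turn at the junction.

```python
import random


def run_episode(policy, corridor_length, rng):
    """Walk down a corridor; the goal cue is visible only in the first cell."""
    goal = rng.choice(["up", "down"])
    memory = None
    action = None
    for position in range(corridor_length + 1):
        if position == 0:
            observation = goal  # the cue, seen once
        elif position < corridor_length:
            observation = "corridor"  # all corridor cells look identical
        else:
            observation = "junction"
        action, memory = policy(observation, memory)
    # Only the action chosen at the junction determines success.
    return 1.0 if action == goal else 0.0


def memoryless_policy(observation, memory):
    # Conditions on the current observation only; the junction carries no cue,
    # so a fixed turn is as good as any other choice.
    if observation == "junction":
        return "up", None
    return "forward", None


def memory_policy(observation, memory):
    # Stores the cue when it is seen and recalls it at the junction.
    if observation in ("up", "down"):
        memory = observation
    if observation == "junction":
        return memory, memory
    return "forward", memory


def success_rate(policy, episodes=1000, corridor_length=10, seed=0):
    rng = random.Random(seed)
    total = sum(run_episode(policy, corridor_length, rng) for _ in range(episodes))
    return total / episodes


if __name__ == "__main__":
    print("memoryless:", success_rate(memoryless_policy))
    print("with memory:", success_rate(memory_policy))
```

The memory policy always succeeds, while the memoryless one succeeds only when the hidden goal happens to match its fixed turn, roughly half the time. Comparing a recurrent agent against its memoryless counterpart measures this kind of gap on the real benchmarks.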
Citation