Reinforcement learning benchmark that tests all forms of partial observability in JAX.

Project description

POBAX: Partially Observable Benchmarks in JAX


POBAX is a reinforcement learning benchmark that tests all forms of partial observability.

POBAX has been accepted to RLC 2025. Check out our paper!

The benchmark is entirely written in JAX, allowing for fast, GPU-scalable experimentation.

Environments


POBAX includes environments (as well as recommended hyperparameter settings) across diverse forms of partial observability. We list our environments from smallest to largest (in terms of neural network size requirements for PPO RNN):

| Environment Category | IDs | Description |
| --- | --- | --- |
| Simple Chain | simple_chain | Diagnostic POMDP for testing algorithms. |
| T-Maze | tmaze_10 | Bakker's classic memory testing environment (hallway disambiguation). |
| RockSample | rocksample_11_11 and rocksample_15_15 | The classic rock collecting POMDP, where an agent needs to uncover and collect rocks. |
| Battleship | battleship_10 | Single-player battleship (10x10). |
| Masked Mujoco | {env_name}-{F/P/V}-v0 | Mujoco with state features masked out. env_name can be Walker, Ant, Hopper, or HalfCheetah. F/P/V stands for fully observable, position only, or velocity only versions of environments, respectively. |
| DMLab Minigrid | Navix-DMLab-Maze-{01/02/03}-v0 | MiniGrid versions of the DeepMind Lab mazes. 01/02/03 refer to the DeepMind Lab Minigrid mazes in ascending difficulty. |
| Visual Continuous Control | {env_name}-pixels | Pixel-based versions of Mujoco control. Requires the Madrona_MJX package. env_name can be ant, halfcheetah, hopper, or walker2d. |
| No-Inventory Crafter | craftax-pixels | Crafter without the inventory. Requires the Craftax package. |
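
The templated IDs in the table expand to a fixed set of names. The sketch below enumerates them in plain Python. The helper names are hypothetical and not part of POBAX, and the exact casing that get_env accepts is assumed to match the templates as written above.

```python
# Hypothetical helpers that expand the ID templates from the environment table.
# They only build strings; they do not check that POBAX accepts each ID.

MASKED_MUJOCO_NAMES = ["Walker", "Ant", "Hopper", "HalfCheetah"]
MASK_MODES = ["F", "P", "V"]  # fully observable, position only, velocity only
PIXEL_NAMES = ["ant", "halfcheetah", "hopper", "walker2d"]


def masked_mujoco_ids():
    """Expand {env_name}-{F/P/V}-v0 for every name and mask mode."""
    return [f"{name}-{mode}-v0" for name in MASKED_MUJOCO_NAMES for mode in MASK_MODES]


def dmlab_minigrid_ids():
    """Expand Navix-DMLab-Maze-{01/02/03}-v0 in ascending difficulty."""
    return [f"Navix-DMLab-Maze-{level:02d}-v0" for level in (1, 2, 3)]


def visual_control_ids():
    """Expand {env_name}-pixels for the pixel-based control tasks."""
    return [f"{name}-pixels" for name in PIXEL_NAMES]
```

Any of these strings would then be passed as the first argument to get_env, in the same way as "rocksample_11_11" in the usage example.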

Basic Usage

import jax
from pobax.envs import get_env

rand_key = jax.random.PRNGKey(2025)
env_key, rand_key = jax.random.split(rand_key)

# Creates a vectorized environment
env, env_params = get_env("rocksample_11_11", env_key)

# Reset 10 environments
reset_key, rand_key = jax.random.split(rand_key)
reset_keys = jax.random.split(reset_key, 10)

obs, env_state = env.reset(reset_keys, env_params)

# Take steps in all environments
step_key, action_key, rand_key = jax.random.split(rand_key, 3)
step_keys = jax.random.split(step_key, 10)
action_keys = jax.random.split(action_key, 10)

actions = jax.vmap(env.action_space(env_params).sample)(action_keys)

obs, env_state, reward, done, info = env.step(step_keys, env_state, actions, env_params)
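
The reset and step calls above can be wrapped into a short random-action rollout. This is a minimal sketch, not part of POBAX: random_rollout is a hypothetical helper, and it assumes that env.step returns one reward per environment (shape (n_envs,)), as the vectorized calls above suggest.

```python
import jax
import jax.numpy as jnp


def random_rollout(env, env_params, rand_key, n_envs=10, n_steps=5):
    """Sum rewards from uniformly random actions over n_steps in n_envs environments.

    Hypothetical helper: assumes the vectorized reset/step interface shown above
    and a per-environment reward of shape (n_envs,).
    """
    # Use a fresh subkey for the reset and keep rand_key for later splits.
    reset_key, rand_key = jax.random.split(rand_key)
    obs, env_state = env.reset(jax.random.split(reset_key, n_envs), env_params)

    total_reward = jnp.zeros(n_envs)
    for _ in range(n_steps):
        step_key, action_key, rand_key = jax.random.split(rand_key, 3)
        # Sample one action per environment.
        action_keys = jax.random.split(action_key, n_envs)
        actions = jax.vmap(env.action_space(env_params).sample)(action_keys)
        # Step every environment with its own key.
        step_keys = jax.random.split(step_key, n_envs)
        obs, env_state, reward, done, info = env.step(step_keys, env_state, actions, env_params)
        total_reward = total_reward + reward
    return total_reward
```

With the objects from the example above, a call would look like random_rollout(env, env_params, rand_key). The Python loop keeps the sketch readable; a jax.lax.scan over steps would be the usual choice for long rollouts.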

Installation

Agents

POBAX includes agents loosely based on the PureJAXRL framework, all built on proximal policy optimization (PPO). These include:

Memoryless versions of the recurrent PPO algorithm are also included via the --memoryless flag.
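
To see why a memoryless baseline is a useful comparison, consider a toy T-maze in plain Python. This is an illustrative sketch only. It is not POBAX's tmaze_10 implementation, and the environment, policies, and function names are all hypothetical. The goal side is visible only on the first step, so a policy that conditions on the current observation alone cannot do better than chance at the junction.

```python
import random


def run_episode(policy, corridor_length, rng):
    """Toy T-maze: the goal side is observed at step 0, the choice happens at the junction."""
    goal = rng.choice(["up", "down"])
    memory = None
    action = None
    for t in range(corridor_length + 1):
        if t == 0:
            obs = goal  # the only step that reveals the goal side
        elif t == corridor_length:
            obs = "junction"
        else:
            obs = "corridor"
        action, memory = policy(obs, memory, rng)
    # The final action is the turn taken at the junction.
    return 1.0 if action == goal else 0.0


def memoryless_policy(obs, memory, rng):
    """Acts on the current observation only, so it must guess at the junction."""
    if obs == "junction":
        return rng.choice(["up", "down"]), None
    return "forward", None


def memory_policy(obs, memory, rng):
    """Stores the goal cue from the first step and recalls it at the junction."""
    if obs in ("up", "down"):
        memory = obs
    if obs == "junction":
        return memory, memory
    return "forward", memory


def average_return(policy, episodes=2000, corridor_length=10, seed=0):
    rng = random.Random(seed)
    return sum(run_episode(policy, corridor_length, rng) for _ in range(episodes)) / episodes
```

In this toy, average_return(memory_policy) is exactly 1.0, while average_return(memoryless_policy) stays near 0.5. Comparing a recurrent agent against its memoryless counterpart is meant to expose the same kind of gap.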

Citation


Project details


Download files

Source Distribution

pobax-0.0.1.tar.gz (687.2 kB view details)

Uploaded Source

Built Distributions

pobax-0.0.1-py3-none-any.whl (112.6 kB view details)

Uploaded Python 3

pobax-0.0.1-1-py3-non-any.whl (81.0 kB view details)

Uploaded Python 3

File details

Details for the file pobax-0.0.1.tar.gz.

File metadata

  • Download URL: pobax-0.0.1.tar.gz
  • Upload date:
  • Size: 687.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for pobax-0.0.1.tar.gz
Algorithm Hash digest
SHA256 c00b1982509b99fd8040ca1e78786079a5bd28fb78db7ccd740c67755b2fa2b5
MD5 dc77f7b5dfb391ca91ff698df14ffe20
BLAKE2b-256 de546a4ba0c9e0975597ae9f7df1de1751b387f2182a379d5cbadd292690f87b

See more details on using hashes here.
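
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch. The function names are hypothetical, and the local path is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, this would be called as matches_published_digest("pobax-0.0.1.tar.gz", "c00b1982509b99fd8040ca1e78786079a5bd28fb78db7ccd740c67755b2fa2b5"), assuming the archive sits in the current directory.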

File details

Details for the file pobax-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: pobax-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 112.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for pobax-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a50f8a3cf6121cbf9ae47ae3e8e83ba39604c3cdb0fa54a2fb84cdda0a84bf02
MD5 e8e55f5d1df4e1476734661f924ddbe0
BLAKE2b-256 b01afbb1d25afbf8d3aaa5b492f9e5820e780daa22f2956716027dd9cfb42b67

File details

Details for the file pobax-0.0.1-1-py3-non-any.whl.

File metadata

  • Download URL: pobax-0.0.1-1-py3-non-any.whl
  • Upload date:
  • Size: 81.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for pobax-0.0.1-1-py3-non-any.whl
Algorithm Hash digest
SHA256 6b4751ee31c523bb55321e25544cee041679f36f24909e5ac27f72b0d9522a4f
MD5 a98cd2140b183b1973fa5ac168fbe64d
BLAKE2b-256 a76bb5d478e8dd43bf9f8e5bdc637e0d2249d67aaa0e1b7b5a54a28e89807b40
