A collection of GPU/TPU-accelerated parallel game simulators for reinforcement learning (RL)
Why Pgx?
Brax, a JAX-native physics engine, provides extremely high-speed parallel simulation for RL in continuous state spaces. But what about RL in discrete state spaces such as Chess, Shogi, and Go? Pgx provides a wide variety of JAX-native game simulators. Highlighted features include:
- ⚡ Super fast parallel execution on accelerators
- 🎲 Support for various games, including Backgammon, Chess, Shogi, and Go
- 🖼️ Beautiful visualization in SVG format
Installation
pip install pgx
Usage
Note that all step functions in Pgx environments are JAX-native, i.e., they are all JIT-able.
import jax
import pgx

env = pgx.make("go_19x19")
init = jax.jit(jax.vmap(env.init))  # vectorize and JIT-compile
step = jax.jit(jax.vmap(env.step))

batch_size = 1024
keys = jax.random.split(jax.random.PRNGKey(42), batch_size)
state = init(keys)  # vectorized states
while not (state.terminated | state.truncated).all():
    # model is a placeholder for your own policy; it must return one action per environment
    action = model(state.current_player, state.observation, state.legal_action_mask)
    state = step(state, action)  # state.reward (2,)
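The loop above leaves model undefined. As a stand-in, a uniformly random policy over legal moves can be sketched with plain JAX. This is a minimal sketch, not part of Pgx: the helper name sample_legal_actions is hypothetical, and the small hand-written mask is an assumption standing in for a batched boolean state.legal_action_mask.

```python
import jax
import jax.numpy as jnp


def sample_legal_actions(key, legal_action_mask):
    """Sample one uniformly random legal action per environment.

    legal_action_mask: boolean array of shape (batch, num_actions).
    (Hypothetical helper; not a Pgx function.)
    """
    # Illegal actions get a logit of -inf, so they are never sampled;
    # all legal actions share the same logit and are equally likely.
    logits = jnp.where(legal_action_mask, 0.0, -jnp.inf)
    return jax.random.categorical(key, logits, axis=-1)


# Synthetic mask standing in for state.legal_action_mask (assumption).
mask = jnp.array([
    [True, False, True, False],
    [False, False, False, True],
])
actions = sample_legal_actions(jax.random.PRNGKey(0), mask)
```

Inside the loop, such a helper would be called with a fresh PRNG key at every step (for example, one obtained from jax.random.split) in place of model.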
Supported games
Backgammon | Chess | Shogi | Go |
---|---|---|---|
Use pgx.available_envs() -> Tuple[EnvId] to see the list of currently available games. Given an <EnvId>, you can create the environment via
>>> env = pgx.make(<EnvId>)
You can check the current version of each environment by
>>> env.version
Game/EnvId | Visualization | Version | Five-word description |
---|---|---|---|
2048 "2048" | | beta | Merge tiles to create 2048. |
Animal Shogi "animal_shogi" | | v0 | Animal-themed child-friendly shogi. |
Backgammon "backgammon" | | beta | Luck aids bearing off checkers. |
Chess "chess" | | v0 | Checkmate opponent's king to win. |
Connect Four "connect_four" | | v0 | Connect discs, win with four. |
Gardner Chess "gardner_chess" | | v0 | 5x5 chess variant, excluding castling. |
Go "go_9x9" "go_19x19" | | v0 | Strategically place stones, claim territory. |
Hex "hex" | | v0 | Connect opposite sides, block opponent. |
Kuhn Poker "kuhn_poker" | | beta | Three-card betting and bluffing game. |
Leduc hold'em "leduc_holdem" | | beta | Two-suit, limited deck poker. |
MinAtar/Asterix "minatar-asterix" | | beta | Avoid enemies, collect treasure, survive. |
MinAtar/Breakout "minatar-breakout" | | beta | Paddle, ball, bricks, bounce, clear. |
MinAtar/Freeway "minatar-freeway" | | beta | Dodging cars, climbing up freeway. |
MinAtar/Seaquest "minatar-seaquest" | | beta | Underwater submarine rescue and combat. |
MinAtar/SpaceInvaders "minatar-space_invaders" | | beta | Alien shooter game, dodge bullets. |
Othello "othello" | | v0 | Flip and conquer opponent's pieces. |
Shogi "shogi" | | v0 | Japanese chess with captured pieces. |
Sparrow Mahjong "sparrow_mahjong" | | beta | A simplified, children-friendly Mahjong. |
Tic-tac-toe "tic_tac_toe" | | v0 | Three in a row wins. |
- Bridge Bidding and Mahjong environments are under development 🚧
- Five-word descriptions were generated by ChatGPT 🤖
See also
Pgx is intended to complement these JAX-native environments with (classic) board game suites:
- RobertTLange/gymnax: JAX implementation of popular RL environments (classic control, bsuite, MinAtar, etc) and meta RL tasks
- google/brax: Rigid-body physics simulation in JAX and continuous-space RL tasks (ant, fetch, humanoid, etc)
- instadeepai/jumanji: A suite of diverse and challenging RL environments in JAX (bin-packing, routing problems, etc)
Combining Pgx with these JAX-native algorithms/implementations might be an interesting direction:
- Anakin framework: Highly efficient RL framework that works with JAX-native environments on TPUs
- deepmind/mctx: JAX-native MCTS implementations, including AlphaZero and MuZero
- deepmind/rlax: JAX-native RL components
- google/evojax: Hardware-Accelerated neuroevolution
- RobertTLange/evosax: JAX-native evolution strategy (ES) implementations
- adaptive-intelligent-robotics/QDax: JAX-native Quality-Diversity (QD) algorithms
Citation
@article{koyamada2023pgx,
title={Pgx: Hardware-accelerated parallel game simulation for reinforcement learning},
author={Koyamada, Sotetsu and Okano, Shinri and Nishimori, Soichiro and Murata, Yu and Habara, Keigo and Kita, Haruka and Ishii, Shin},
journal={arXiv preprint arXiv:2303.17503},
year={2023}
}
LICENSE
Apache-2.0