An open-ended space of 2D physics-based RL environments

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Update: We've released a large offline dataset of expert trajectories! See the Offline Data & Behavioural Cloning section below.

Kinetix

Kinetix is a framework for reinforcement learning in a 2D rigid-body physics world, written entirely in JAX. Kinetix can represent a huge array of physics-based tasks within a unified framework. We use Kinetix to investigate the training of large, general reinforcement learning agents by procedurally generating millions of tasks for training. You can play with Kinetix in our online editor, or have a look at the JAX physics engine (Jax2D) and graphics library (JaxGL) we made for Kinetix. Finally, see our docs for more information and more in-depth examples.

The above shows specialist agents trained on their respective levels.

📊 Paper TL;DR

We train a general agent on millions of procedurally generated physics tasks. Every task has the same goal: make the green and blue shapes touch, without green touching red. The agent acts by applying torque via motors and force via thrusters.

The above shows a general agent zero-shotting unseen randomly generated levels.

We then investigate the transfer capabilities of this agent to unseen handmade levels. We find that the agent can zero-shot simple physics problems, but still struggles with harder tasks.

The above shows a general agent zero-shotting unseen handmade levels.

📜 Basic Usage

Kinetix follows the interface established in gymnax:

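import jax
import jax.numpy as jnp
import matplotlib.pyplot as plt

# EnvParams, StaticEnvParams, make_kinetix_env, ObservationType, ActionType,
# make_reset_fn_sample_kinetix_level and make_render_pixels are provided by the
# kinetix package; their import paths are not shown in this snippet (see the docs).

# Use default parameters
env_params = EnvParams()
static_env_params = StaticEnvParams()

# Create the environment
env = make_kinetix_env(
  observation_type=ObservationType.PIXELS,
  action_type=ActionType.CONTINUOUS,
  reset_fn=make_reset_fn_sample_kinetix_level(env_params, static_env_params),
  env_params=env_params,
  static_env_params=static_env_params,
)

# Split one seed into three independent keys: reset, action sampling, step
_rngs = jax.random.split(jax.random.PRNGKey(0), 3)

# Reset the environment state (this resets to a random level)
obs, env_state = env.reset(_rngs[0], env_params)

# Take a step in the environment with a randomly sampled action
action = env.action_space(env_params).sample(_rngs[1])
obs, env_state, reward, done, info = env.step(_rngs[2], env_state, action, env_params)

# Render environment
renderer = make_render_pixels(env_params, env.static_env_params)

pixels = renderer(env_state)

# Swap the first two axes and flip vertically so the image is displayed upright
plt.imshow(pixels.astype(jnp.uint8).transpose(1, 0, 2)[::-1])
plt.show()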

⬇️ Installation

To install Kinetix (tested with Python 3.10):

git clone https://github.com/FlairOx/Kinetix.git
cd Kinetix
pip install -e ".[dev]"
pre-commit install

Please see the JAX installation instructions to install JAX for your accelerator.

[!TIP] Adding export JAX_COMPILATION_CACHE_DIR="$HOME/.jax_cache" to your ~/.bashrc caches JAX compilations between runs, so repeated runs start faster.
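If you launch runs from a Python script or notebook rather than a shell, the same variable can be set from Python. The following is a minimal sketch, not part of Kinetix; it assumes the variable is set before JAX is imported, which is the safe ordering for environment-based configuration.

```python
import os

# Same location as the shell example above; any writable directory works.
cache_dir = os.path.join(os.path.expanduser("~"), ".jax_cache")

# setdefault keeps a value that was already exported in the shell.
os.environ.setdefault("JAX_COMPILATION_CACHE_DIR", cache_dir)

# Import JAX (and Kinetix) only after this point so the setting is picked up.
```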

Kinetix is also available on PyPI, and can be installed using pip install kinetix-env.

🎯 Editor

We recommend using the KinetixJS editor, but also provide a native (less polished) Kinetix editor.

To open this editor, run the following command:

python3 kinetix/editor.py

The controls in the editor are:

- Move between edit and play modes using the spacebar.
- In edit mode, the type of edit is shown by the icon at the top and is changed by scrolling the mouse wheel. For instance, by navigating to the rectangle editing function you can click to place a rectangle.
  - You can also press the number keys to cycle between modes.
- To open handmade levels, press ctrl-O and navigate to the ones in the L folder.
- When playing a level, use the arrow keys to control motors and the numeric keys (1, 2) to control thrusters.

📈 Experiments

We have three primary experiment files:

- SFL: Training on levels with high learnability; this is how we trained our best general agents.
- PLR: PLR/DR/ACCEL in the JAXUED style.
- PPO: Normal PPO in the PureJaxRL style.

To run experiments with default parameters run any of the following:

python3 experiments/sfl.py
python3 experiments/plr.py
python3 experiments/ppo.py

python3 experiments/plr.py ued.replay_prob=0 # for DR

We use hydra for managing our configs. See the configs/ folder for all the hydra configs that will be used by default, or the docs. If you want to run experiments with different configurations, you can either edit these configs or pass command line arguments as follows:

python3 experiments/sfl.py model.transformer_depth=8
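When several configurations need to be run, the command-line overrides can be generated rather than typed by hand. The sketch below is a hypothetical helper, not part of Kinetix: it expands a grid of Hydra-style key=value overrides into one argument list per run. The depth values 2 and 4 are made up for illustration; only model.transformer_depth=8 appears in this README.

```python
import itertools


def sweep_commands(script, grid):
    """Return one argv list per combination of override values in `grid`."""
    keys = sorted(grid)  # fixed key order keeps the output deterministic
    commands = []
    for values in itertools.product(*(grid[k] for k in keys)):
        overrides = [f"{k}={v}" for k, v in zip(keys, values)]
        commands.append(["python3", script, *overrides])
    return commands


commands = sweep_commands(
    "experiments/sfl.py",
    {"model.transformer_depth": [2, 4, 8]},  # illustrative values
)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed directly to subprocess.run, which avoids shell quoting issues.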

These experiments use wandb for logging by default.

[!NOTE] Experiments tend to run faster with JAX's persistent compilation cache enabled. You can enable it with, for instance, export JAX_COMPILATION_CACHE_DIR=.jax_cache

🏋️ Training RL Agents

We provide several different ways to train RL agents. The three most common options are (a) training an agent on random levels, (b) training an agent on a single hand-designed level, and (c) training an agent on a set of hand-designed levels.

[!WARNING] Kinetix has three different environment sizes: s, m and l. When running any of the scripts, you have to set the env_size option accordingly. For instance, python3 experiments/ppo.py train_levels=random env_size=m would train on random m levels. Loading larger levels into a smaller env size gives an error; for instance, python3 experiments/ppo.py train_levels=m env_size=s would error.
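The size rule can be checked before launching a long run. The sketch below is a hypothetical pre-flight helper, not a Kinetix function. It assumes that a level's size is given by the first folder in its path (as in s/h4_thrust_aim.json), and that smaller levels can be loaded into a larger env size, which the custom-set example in this README suggests by mixing s/ and l/ levels with env_size=l.

```python
# Ordering of the three environment sizes, smallest first.
SIZE_ORDER = {"s": 0, "m": 1, "l": 2}


def level_fits(level_path, env_size):
    """Return True if a level from folder s/, m/ or l/ fits into env_size."""
    level_size = level_path.split("/", 1)[0]
    if level_size not in SIZE_ORDER or env_size not in SIZE_ORDER:
        raise ValueError(f"unknown size in {level_path!r} or {env_size!r}")
    return SIZE_ORDER[level_size] <= SIZE_ORDER[env_size]


print(level_fits("s/h2_one_wheel_car", "l"))
print(level_fits("l/h11_obstacle_avoidance", "s"))
```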

Training on random levels

This is the default option, but we give the explicit command for completeness:

python3 experiments/ppo.py train_levels=random

Training on a single hand-designed level

[!NOTE] Check the kinetix/levels/ folder for handmade levels for each size category. By default, the loading functions expect paths relative to the kinetix/levels/ directory.

python3 experiments/ppo.py train_levels=s train_levels.train_levels_list='["s/h4_thrust_aim.json"]'

Training on a set of hand-designed levels

python3 experiments/ppo.py train_levels=s env_size=s eval=eval_auto
# python3 experiments/ppo.py train_levels=m env_size=m eval=eval_auto
# python3 experiments/ppo.py train_levels=l env_size=l eval=eval_auto

Or, on a custom set:

python3 experiments/ppo.py eval=eval_auto train_levels=l env_size=l train_levels.train_levels_list='["s/h2_one_wheel_car","l/h11_obstacle_avoidance"]'
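The train_levels_list override is a JSON-style list, so it has to be quoted for the shell. The following sketch, a hypothetical helper rather than part of Kinetix, builds such a command from a Python list using only the standard library. It quotes the whole override rather than just the value, which the shell treats identically.

```python
import json
import shlex


def ppo_command(levels, env_size):
    """Build a shell-safe ppo.py command for a custom list of levels."""
    # Compact JSON without spaces, matching the form used in this README.
    levels_json = json.dumps(levels, separators=(",", ":"))
    overrides = [
        "eval=eval_auto",
        f"train_levels={env_size}",
        f"env_size={env_size}",
        f"train_levels.train_levels_list={levels_json}",
    ]
    quoted = " ".join(shlex.quote(o) for o in overrides)
    return "python3 experiments/ppo.py " + quoted


cmd = ppo_command(["s/h2_one_wheel_car", "l/h11_obstacle_avoidance"], "l")
print(cmd)
```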

🗃️ Offline Data & Behavioural Cloning

Kinetix now includes data-loading utilities for training from pre-collected datasets of transitions or trajectories.

Data format

Datasets are stored as zarr archives. Each shard is a zarr array of structured numpy records. Trajectory batches have shape (batch_size, T, *dims) (T=256) and are returned as ActionEnvStateMask objects with the following fields:

Field	Shape	Description
`action`	`(B, T, A)`	Expert action at each timestep
`env_state`	`(B, T, ...)`	Full environment state
`mask`	`(B, T)`	Indicator of successful trajectories. Since this dataset consists only of successful trajectories, this is always true.
`action_mask`	`(B, T, A)`	Boolean — which action dimensions are active in this level (motors and thrusters that actually exist)
`done`	`(B, T)`	Episode termination flags

Dataset statistics

Each dataset was generated by training a specialist PPO agent on each level independently, then rolling out the trained policy to collect trajectories. Datasets are named {policy_steps}/{size}, where policy_steps is the number of RL training steps used for each specialist agent (1M or 10M) and size is the environment size (s, m or l, for small, medium and large).

Unique Levels is the number of distinct levels for which trajectories were collected; Transitions is the total number of individual environment steps across all trajectories.

Expert Training Steps	Size	Unique Levels	Transitions	Size on Disk
`1M`	`s`	5.98M	1.53B	123G
`1M`	`m`	3.45M	883.76M	98G
`1M`	`l`	1.05M	268.01M	82G
`10M`	`s`	637.4k	163.18M	12G
`10M`	`m`	422.1k	108.07M	11G
Total		11.54M	2.95B	326G
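As a quick sanity check of the table, the rows can be summed and the number of transitions per level computed. The sketch below copies the rounded figures from the table; the observation that each split comes out at roughly 256 transitions per level (matching T=256) is derived from these rounded numbers and is not a statement from the dataset authors.

```python
# (policy_steps, size): (unique_levels, transitions), copied from the table.
splits = {
    ("1M", "s"): (5.98e6, 1.53e9),
    ("1M", "m"): (3.45e6, 883.76e6),
    ("1M", "l"): (1.05e6, 268.01e6),
    ("10M", "s"): (637.4e3, 163.18e6),
    ("10M", "m"): (422.1e3, 108.07e6),
}

total_levels = sum(levels for levels, _ in splits.values())
total_transitions = sum(trans for _, trans in splits.values())

# Transitions per unique level for each split.
steps_per_level = {
    name: trans / levels for name, (levels, trans) in splits.items()
}

print(f"total levels: {total_levels / 1e6:.2f}M")
print(f"total transitions: {total_transitions / 1e9:.2f}B")
for name, value in steps_per_level.items():
    print(name, round(value, 1))
```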

Downloading the dataset

The dataset is hosted on Hugging Face. You can install the Hugging Face CLI using pip install huggingface_hub.

Download the entire dataset (~326 GB):

hf download mbeukman/Kinetix-Offline --repo-type dataset --local-dir ./data

Download a single folder (e.g. 1M/m, the medium-size 1M-step split):

hf download mbeukman/Kinetix-Offline --repo-type dataset --local-dir ./data --include "1M/m/*"

Replace 1M/m with any other {policy_steps}/{size} combination from the table above.
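The same download can be scripted from Python. The sketch below assumes the huggingface_hub package is installed and uses its snapshot_download function with allow_patterns as the counterpart of --include. The helper names and the list of valid splits (taken from the table above) are illustrative, not part of Kinetix.

```python
# Splits listed in the dataset statistics table.
VALID_SPLITS = {
    ("1M", "s"), ("1M", "m"), ("1M", "l"),
    ("10M", "s"), ("10M", "m"),
}


def include_pattern(policy_steps, size):
    """Return the glob pattern for one {policy_steps}/{size} folder."""
    if (policy_steps, size) not in VALID_SPLITS:
        raise ValueError(f"no such split: {policy_steps}/{size}")
    return f"{policy_steps}/{size}/*"


def download_split(policy_steps, size, local_dir="./data"):
    """Download a single split; needs network access and huggingface_hub."""
    # Imported lazily so include_pattern works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="mbeukman/Kinetix-Offline",
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=[include_pattern(policy_steps, size)],
    )


print(include_pattern("1M", "m"))
```

Calling download_split("10M", "s") would then fetch only the smallest split (about 12G according to the table).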

Loading data

from kinetix.data import TrajectoryDatasetManager

traj_manager = TrajectoryDatasetManager(
    dataset_dir="/path/to/traj_data",
    batch_size=64,          # number of trajectories per batch
)
traj_batch = traj_manager.load_next_batch() # shape (64, T, *dims)

See examples/example_data_loading.py for a self-contained runnable example.

Offline BC training

experiments/offline_bc.py trains a policy via behavioural cloning on a zarr dataset:

python3 experiments/offline_bc.py dataset_dir=/path/to/data.zarr

Configuration lives in configs/offline_bc.yaml.

🗲 Multi-Device Parallelism

The PPO and SFL scripts now both support transparent multi-GPU training via JAX's shard_map. These scripts automatically parallelise over all available devices, which allows you to do large-scale training.

num_train_envs corresponds to the total number of environments, and these are evenly divided across devices.
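To make the split concrete, the sketch below computes the per-device share for a given total. It is a hypothetical helper, not Kinetix code. Rejecting totals that do not divide evenly is an assumption made here for safety; this README does not say how the scripts handle that case. In a real run, the device count can be obtained from jax.device_count().

```python
def envs_per_device(num_train_envs, num_devices):
    """Return how many environments each device runs for an even split."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    per_device, remainder = divmod(num_train_envs, num_devices)
    if remainder != 0:
        raise ValueError(
            f"{num_train_envs} environments cannot be split evenly "
            f"across {num_devices} devices"
        )
    return per_device


# Illustrative numbers: 2048 environments on 4 devices.
print(envs_per_device(2048, 4))
```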

💨 Compilation Speed

Kinetix is complex, so it generally takes a long time to compile. In particular, plr.py or sfl.py may take a long time before they start executing code. This can be a burden when you are implementing new features and just want to debug quickly. To make this easier, we provide two options: train_levels=dummy and env.dummy_env=True (e.g. python experiments/sfl.py train_levels=dummy env.dummy_env=True). These options replace the actual environment step and reset logic with no-ops, so compilation is much faster. However, no environment logic is executed, so this is only useful for checking syntax, shape and JAX errors, not for debugging learning issues.

❌ Errata

The left wall was misplaced 5cm to the left in all levels and all experiments in the paper (each level is a square with side lengths of 5 metres). This error has been fixed in the latest version of Jax2D, but we have pinned Kinetix to the old version for consistency and reproducibility with the original paper. Further improvements have been made since, so if you wish to reproduce the paper's results, please use Kinetix version 0.1.0, which is tagged on GitHub.

🔎 See Also

🌐 Kinetix.js Kinetix reimplemented in Javascript, with a live demo here.
🍎 Jax2D The physics engine we made for Kinetix.
👨‍💻 JaxGL The graphics library we made for Kinetix.
📋 Our Paper for more details and empirical results.

🙏 Acknowledgements

The permutation-invariant MLP model (enabled by setting model.permutation_invariant_mlp=True) was added by Anya Sims. Thanks to Thomas Foster for fixing some macOS-specific issues. We'd also like to thank Thomas Foster, Alex Goldie, Matthew Jackson, Sebastian Towers and Andrei Lupu for useful feedback.

📚 Citation

If you use Kinetix in your work, please cite it as follows:

@inproceedings{matthews2024kinetix,
      title={Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks},
      author={Michael Matthews and Michael Beukman and Chris Lu and Jakob Foerster},
      booktitle={The Thirteenth International Conference on Learning Representations},
      year={2025},
      url={https://arxiv.org/abs/2410.23208}
}

Release history

3.0.0

May 26, 2026

2.0.4

Jan 24, 2026

2.0.3

Oct 11, 2025

2.0.2

Oct 4, 2025

1.0.7

Jul 24, 2025

1.0.6

Jul 8, 2025

1.0.5

Jul 5, 2025

1.0.4 yanked

Jul 5, 2025

1.0.2

Jul 5, 2025

1.0.0

Mar 20, 2025

Download files

Source Distribution

kinetix_env-3.0.0.tar.gz (220.8 kB view details)

Uploaded May 26, 2026 Source

Built Distribution

kinetix_env-3.0.0-py3-none-any.whl (282.0 kB view details)

Uploaded May 26, 2026 Python 3

File details

Details for the file kinetix_env-3.0.0.tar.gz.

File metadata

Download URL: kinetix_env-3.0.0.tar.gz
Upload date: May 26, 2026
Size: 220.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for kinetix_env-3.0.0.tar.gz
Algorithm	Hash digest
SHA256	`2e234fb3d38dcef6cc551bbee9ac0489802934081811e3d31a66c1392e43ec9d`
MD5	`980c9db47ce1253af2640ea7d6c0c632`
BLAKE2b-256	`6d1efa326b5a389feebfc2a2d6ae63ae031ba82c3fb2fdfd713024dc5851f0a7`

File details

Details for the file kinetix_env-3.0.0-py3-none-any.whl.

File metadata

Download URL: kinetix_env-3.0.0-py3-none-any.whl
Upload date: May 26, 2026
Size: 282.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for kinetix_env-3.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6b7564c6830ee6c2ce9fd7cc9cc1520b2bd047a05cdba9ce6a6e1e78d82abadf`
MD5	`9f3776a88a2d8b48bcc438669b43e6ef`
BLAKE2b-256	`ab53ef7a2134f68731832b2e63917ec1651b0e2c0a4c0c2687ccc48665290cc5`
