A robotics benchmark for physical reasoning.
KinDER
A physical reasoning benchmark for robotics.
There's growing excitement around large language models and their ability to "reason"—but reasoning isn't just about tokens and text. Robots must reason too: over long horizons, under uncertainty, and with sparse feedback. And unlike purely symbolic systems, robotic reasoning is physical: it's grounded in low-level, continuous state and action spaces. It requires understanding kinematics, geometry, dynamics, contact, force, tool use, and more.
KinDER is designed to test this kind of physical reasoning with robots. We invite researchers to bring their best task and motion planning, reinforcement learning, imitation learning, and foundation-model approaches. We hope that KinDER will bridge perspectives and foster shared progress toward physically intelligent robots.
Environments
| Environment | Category | Example Environment ID |
|---|---|---|
| ClutteredRetrieval2D | Kinematic2D | kinder/ClutteredRetrieval2D-o10-v0 |
| Motion2D | Kinematic2D | kinder/Motion2D-p5-v0 |
| Obstruction2D | Kinematic2D | kinder/Obstruction2D-o4-v0 |
| PushPullHook2D | Kinematic2D | kinder/PushPullHook2D-v0 |
| ClutteredStorage2D | Kinematic2D | kinder/ClutteredStorage2D-b15-v0 |
| StickButton2D | Kinematic2D | kinder/StickButton2D-b10-v0 |
| DynObstruction2D | Dynamic2D | kinder/DynObstruction2D-o3-v0 |
| DynPushPullHook2D | Dynamic2D | kinder/DynPushPullHook2D-o5-v0 |
| DynPushT2D | Dynamic2D | kinder/DynPushT2D-t1-v0 |
| DynScoopPour2D | Dynamic2D | kinder/DynScoopPour2D-o50-v0 |
| Obstruction3D | Kinematic3D | kinder/Obstruction3D-o4-v0 |
| Packing3D | Kinematic3D | kinder/Packing3D-p3-v0 |
| Table3D | Kinematic3D | kinder/Table3D-o3-v0 |
| Transport3D | Kinematic3D | kinder/Transport3D-o2-v0 |
| BaseMotion3D | Kinematic3D | kinder/BaseMotion3D-v0 |
| Shelf3D | Kinematic3D | kinder/Shelf3D-o10-v0 |
| ConstrainedCupboard3D | Dynamic3D | kinder/ConstrainedCupboard3D-o6-v0 |
| SortClutteredBlocks3D | Dynamic3D | kinder/SortClutteredBlocks3D-o20-sort_the_cluttered_blocks_into_bowls-v0 |
| Rearrange3D | Dynamic3D | kinder/Rearrange3D-o2-put_the_boxed_drink_and_the_can_next_to_the_bowl-v0 |
| SweepSimple3D | Dynamic3D | kinder/SweepSimple3D-o50-sweep_the_blocks_to_the_left_side_of_the_kitchen_island-v0 |
| Dynamo3D | Dynamic3D | kinder/Dynamo3D-o1-v0 |
| Tossing3D | Dynamic3D | kinder/Tossing3D-o1-v0 |
| ScoopPour3D | Dynamic3D | kinder/ScoopPour3D-o10-v0 |
| BalanceBeam3D | Dynamic3D | kinder/BalanceBeam3D-o3-v0 |
| SweepIntoDrawer3D | Dynamic3D | kinder/SweepIntoDrawer3D-o5-v0 |
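The example IDs in the table appear to share a pattern: a kinder/ namespace, the environment name, optional variant tokens (such as o4, or a task description written with underscores), and a version suffix. The sketch below splits an ID into those parts, assuming that pattern holds for every environment. `parse_env_id` and `EnvIdParts` are hypothetical helpers for illustration, not part of the KinDER API.

```python
import re
from typing import NamedTuple, Tuple


class EnvIdParts(NamedTuple):
    namespace: str
    name: str
    variant: Tuple[str, ...]
    version: int


# Pattern inferred from the example IDs in the table; this is an assumption,
# not a documented format.
_ENV_ID_RE = re.compile(
    r"^(?P<ns>[^/]+)/(?P<name>[^-]+)(?P<variant>(?:-[^-]+)*?)-v(?P<version>\d+)$"
)


def parse_env_id(env_id: str) -> EnvIdParts:
    """Split an ID such as 'kinder/Obstruction2D-o4-v0' into its parts."""
    match = _ENV_ID_RE.match(env_id)
    if match is None:
        raise ValueError(f"Unrecognized environment ID: {env_id!r}")
    # The variant group looks like '-o4' or '-o20-some_task'; drop empty tokens.
    variant = tuple(tok for tok in match["variant"].split("-") if tok)
    return EnvIdParts(match["ns"], match["name"], variant, int(match["version"]))


print(parse_env_id("kinder/Obstruction2D-o4-v0"))
print(parse_env_id("kinder/PushPullHook2D-v0"))
```

What each variant token means is environment-specific, so the helper returns the tokens as plain strings rather than interpreting them.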
:zap: Usage Example
Basic Usage (Gym API)
import kinder
kinder.register_all_environments()
env = kinder.make("kinder/Obstruction2D-o3-v0") # 3 obstructions
obs, info = env.reset() # procedural generation
action = env.action_space.sample()
next_obs, reward, terminated, truncated, info = env.step(action)
img = env.render()
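Because the environments follow the Gym API, a full episode is a loop over step() that stops on terminated or truncated. The sketch below shows such a loop. `rollout`, `max_steps`, and the `CountdownEnv` stand-in are hypothetical names introduced here so the snippet runs without KinDER installed; they are not part of the library.

```python
from typing import Any, Callable, Optional, Tuple


def rollout(
    env: Any,
    policy: Callable[[Any], Any],
    max_steps: int = 100,
    seed: Optional[int] = None,
) -> Tuple[float, int]:
    """Run one episode and return (total reward, number of steps taken)."""
    obs, _info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


class CountdownEnv:
    """Hypothetical stand-in env: terminates after n steps, reward -1 per step."""

    def __init__(self, n: int = 3) -> None:
        self._n = n
        self._remaining = n

    def reset(self, seed: Optional[int] = None):
        self._remaining = self._n
        return self._remaining, {}

    def step(self, action: Any):
        self._remaining -= 1
        terminated = self._remaining <= 0
        return self._remaining, -1.0, terminated, False, {}


total, steps = rollout(CountdownEnv(n=3), policy=lambda obs: 0)
print(total, steps)  # -3.0 3
```

With KinDER installed, the same function should accept an environment returned by kinder.make(...) together with a policy such as lambda obs: env.action_space.sample().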
Object-Centric States
All environments in KinDER use object-centric states. For example:
from kinder.envs.kinematic2d.obstruction2d import ObjectCentricObstruction2DEnv
env = ObjectCentricObstruction2DEnv(num_obstructions=3)
obs, _ = env.reset(seed=123)
print(obs.pretty_str())
Here, obs is an ObjectCentricState, and the printout is:
############################################################### STATE ###############################################################
type: crv_robot           x         y    theta    base_radius    arm_joint    arm_length    vacuum    gripper_height    gripper_width
-----------------  --------  --------  -------  -------------  -----------  ------------  --------  ----------------  ---------------
robot              0.885039  0.803795  -1.5708            0.1          0.1           0.2         0              0.07             0.01
type: rectangle           x         y    theta    static    color_r    color_g    color_b    z_order      width     height
-----------------  --------  --------  -------  --------  ---------  ---------  ---------  ---------  ---------  ---------
obstruction0       0.422462  0.100001        0         0       0.75        0.1        0.1        100   0.132224  0.0766399
obstruction1       0.804663  0.100001        0         0       0.75        0.1        0.1        100  0.0805652  0.0955062
obstruction2       0.559246  0.100001        0         0       0.75        0.1        0.1        100    0.12608   0.180172
type: target_block          x         y    theta    static    color_r    color_g    color_b    z_order     width    height
--------------------  -------  --------  -------  --------  ---------  ---------  ---------  ---------  --------  --------
target_block          1.20082  0.100001        0         0   0.501961          0   0.501961        100  0.138302  0.155183
type: target_surface           x    y    theta    static    color_r    color_g    color_b    z_order     width    height
----------------------  --------  ---  -------  --------  ---------  ---------  ---------  ---------  --------  --------
target_surface          0.499675    0        0         1   0.501961          0   0.501961        101  0.180286       0.1
#####################################################################################################################################
For compatibility with baselines, the observations provided by the main environments are vectors. It is easy to convert between vectors and object-centric states. For example:
import kinder
kinder.register_all_environments()
env = kinder.make("kinder/Obstruction2D-o3-v0")
vec_obs, _ = env.reset(seed=123)
object_centric_obs = env.observation_space.devectorize(vec_obs)
recovered_vec_obs = env.observation_space.vectorize(object_centric_obs)
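To make the vector/object-centric correspondence concrete, here is a toy illustration of how a fixed object and feature ordering allows a lossless round trip between per-object features and a flat vector. This is not KinDER's implementation: `ToyObjectCentricSpace` is a hypothetical class whose method names merely mirror vectorize and devectorize, and the values are rounded from the printout above.

```python
from typing import Dict, List, Sequence

# object name -> feature name -> value
State = Dict[str, Dict[str, float]]


class ToyObjectCentricSpace:
    """Toy space with a fixed object order and per-object feature order."""

    def __init__(self, layout: Dict[str, Sequence[str]]) -> None:
        # Dicts preserve insertion order, which fixes the vector layout.
        self._layout = {name: list(feats) for name, feats in layout.items()}

    def vectorize(self, state: State) -> List[float]:
        """Flatten features object by object, in layout order."""
        return [
            state[name][feat]
            for name, feats in self._layout.items()
            for feat in feats
        ]

    def devectorize(self, vec: Sequence[float]) -> State:
        """Rebuild the per-object dictionaries by consuming the vector in order."""
        values = iter(vec)
        return {
            name: {feat: next(values) for feat in feats}
            for name, feats in self._layout.items()
        }


space = ToyObjectCentricSpace(
    {"robot": ["x", "y", "theta"], "target_block": ["x", "y"]}
)
state = {
    "robot": {"x": 0.885, "y": 0.804, "theta": -1.5708},
    "target_block": {"x": 1.2, "y": 0.1},
}
vec = space.vectorize(state)
print(vec)  # [0.885, 0.804, -1.5708, 1.2, 0.1]
assert space.devectorize(vec) == state  # the round trip is lossless
```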
:muscle: Challenges for Existing Approaches
What makes KinDER challenging?
For Reinforcement Learning
Environments have long horizons and sparse rewards. Users are welcome to engineer dense rewards, but doing so may be nontrivial. Each environment also has a diverse task distribution (every call to reset() procedurally generates a new task), so learned policies must generalize.
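One standard way to densify a sparse reward is potential-based shaping, which adds gamma * phi(s') - phi(s) to each step reward. The sketch below wraps any Gym-style environment this way. `ShapedRewardEnv`, `potential`, and the `LineEnv` stand-in are hypothetical names for illustration, and the stand-in's reward of -1 per step is an assumption, not a statement about KinDER's rewards. Choosing a useful potential for a given KinDER task is the nontrivial part.

```python
from typing import Any, Callable


class ShapedRewardEnv:
    """Adds a potential-based shaping term to the wrapped env's reward."""

    def __init__(self, env: Any, potential: Callable[[Any], float], gamma: float = 0.99) -> None:
        self._env = env
        self._potential = potential
        self._gamma = gamma
        self._last_potential = 0.0

    def reset(self, **kwargs: Any):
        obs, info = self._env.reset(**kwargs)
        self._last_potential = self._potential(obs)
        return obs, info

    def step(self, action: Any):
        obs, reward, terminated, truncated, info = self._env.step(action)
        new_potential = self._potential(obs)
        # Shaped reward: r + gamma * phi(s') - phi(s)
        shaped = reward + self._gamma * new_potential - self._last_potential
        self._last_potential = new_potential
        return obs, shaped, terminated, truncated, info


class LineEnv:
    """Hypothetical stand-in: move along a line from 0 to goal, reward -1 per step."""

    def __init__(self, goal: float = 3.0) -> None:
        self.goal = goal
        self._pos = 0.0

    def reset(self, seed=None):
        self._pos = 0.0
        return self._pos, {}

    def step(self, action: float):
        self._pos += action
        terminated = self._pos >= self.goal
        return self._pos, -1.0, terminated, False, {}


# Potential: negative distance to the goal, so progress is rewarded.
env = ShapedRewardEnv(LineEnv(goal=3.0), potential=lambda pos: -abs(3.0 - pos), gamma=1.0)
obs, _ = env.reset()
_, toward, *_ = env.step(1.0)   # moving toward the goal
_, away, *_ = env.step(-1.0)    # moving away from the goal
print(toward, away)  # 0.0 -2.0
```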
For Imitation Learning
As with RL, generalization across tasks is a major challenge for imitation learning. Furthermore, we supply some demonstrations, but they are typically suboptimal, multimodal, and limited in quantity. Users are welcome to collect their own demonstrations.
For Language Models
The physical reasoning required in KinDER is not easy to represent in natural language alone. Vision-language and vision-language-action models may fare better, but the tasks in KinDER are beyond the capabilities of current VLMs and VLAs.* (*This is an empirical claim that we will test!)
For Hierarchical Approaches
Approaches that first decide "what to do" and then decide "how to do it" will run into difficulties in KinDER when there are couplings between these high-level and low-level decisions. For example, the exact grasp of an object may determine whether the object can later be placed into a tight space.
For Task and Motion Planning
KinDER does not provide any models for TAMP. Users are welcome to engineer their own, but doing so may be nontrivial. Furthermore, some environments in KinDER are meant to strain the assumptions that are sometimes made in TAMP. Finally, some environments contain many objects, which may make planning slow even when models are available.
:octocat: Contributing
:ballot_box_with_check: Requirements
- Python >=3.10, <3.13
- Tested on macOS Monterey and Ubuntu 22.04 (but we aim to support most platforms)
:wrench: Installation
We strongly recommend uv. The steps below assume that you have uv installed. If you do not, just remove uv from the commands and the installation should still work.
Choose one of the following based on your needs:
- uv pip install -r optional_prpl_requirements/core.txt && uv pip install -e . - Installs only core dependencies (matplotlib, numpy, relational_structs, prpl_utils)
- uv pip install -r prpl_requirements.txt && uv pip install -e ".[all]" - Installs everything (excluding develop)
- uv pip install -r optional_prpl_requirements/kinematic2d.txt && uv pip install -e ".[kinematic2d]" - Installs only core + kinematic2d dependencies (no pybullet)
- uv pip install -r optional_prpl_requirements/dynamic2d.txt && uv pip install -e ".[dynamic2d]" - Installs only core + dynamic2d dependencies
- uv pip install -e ".[tidybot]" - Installs only core + tidybot dependencies
- uv pip install -r optional_prpl_requirements/kinematic3d.txt && uv pip install -e ".[kinematic3d]" - Installs only core + kinematic3d dependencies
- uv pip install -r prpl_requirements.txt && uv pip install -e ".[develop]" - Installs all + development tools
- Extras can be combined compositionally, like [kinematic2d,kinematic3d]
:microscope: Check Installation
Run ./run_ci_checks.sh. It should complete with all green successes.
:mag: General Guidelines
- All checks must pass before code is merged (see ./run_ci_checks.sh)
- All code goes through the pull request review process
:new: Adding New Environments
Some new environment requests are in Issues. To add a new environment, please see the examples in src/kinder/envs. Also consider:
- Environments are registered in src/kinder/__init__.py
- Each environment should have at least one demonstration (see scripts/collect_demos.py)
- After collecting a demonstration, create a video with scripts/generate_demo_video.py, which will be used in the autogenerated documentation