A toolbox for generating offline datasets from Gymnasium environments for world-model research
gym_wm - World Model Dataset Generator
A Python toolbox for generating offline datasets from Gymnasium environments for world-model research using Minari.
Features
- Multi-environment support: Gymnasium MuJoCo, Gymnasium-Robotics (Fetch, Maze, Hand), and classic control
- RGB observations: Automatic capture of visual observations alongside proprioceptive state
- Minari integration: Store datasets in Minari format for easy sharing and loading
- Multiple policies: Random sampling or PD controller for goal-conditioned environments
- Visualization tools: Episode frames, trajectory plots, and dataset summaries
- Video export: Create MP4 videos from collected episodes
- CLI & Python API: Use from the command line or import as a package
Installation
# Clone the repository
git clone https://github.com/unknown/gym_wm_dataset_generator.git
cd gym_wm_dataset_generator
# Create environment (using micromamba)
micromamba create -n gym_wm python=3.11
micromamba activate gym_wm
# Install with uv (recommended)
uv pip install -e ".[dev]"
# Or install without the development extras
uv pip install -e .
Dependencies
Core dependencies:
- gymnasium>=1.0.0
- gymnasium-robotics>=1.3.0
- minari>=0.5.0
- numpy>=1.24.0
- typer>=0.12.0
- rich>=13.0.0
- loguru>=0.7.0
- matplotlib>=3.7.0
- imageio>=2.31.0
Quick Start
Command Line Interface
# List supported environments
gym-wm list-envs
# Collect a dataset
gym-wm collect PointMaze_UMaze-v3 --episodes 100
# Collect with custom options
gym-wm collect FetchReach-v4 -n 50 --policy pd --img-size 128 --seed 42
# List collected datasets
gym-wm list-datasets
# Inspect a dataset
gym-wm inspect pointmaze-umaze-v3/random-v0
# Visualize a dataset
gym-wm visualize pointmaze-umaze-v3/random-v0 --episode 0
# Generate dataset summary
gym-wm visualize pointmaze-umaze-v3/random-v0 --summary
# Create video from episode
gym-wm visualize pointmaze-umaze-v3/random-v0 --save-video
# Compare multiple datasets
gym-wm compare dataset1 dataset2
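In the commands above, collecting from `PointMaze_UMaze-v3` with the default policy is followed by a dataset called `pointmaze-umaze-v3/random-v0`. The sketch below reproduces that pattern. It is inferred from these examples only: `default_dataset_id` is a hypothetical helper, not part of the gym_wm API, and the real naming logic may differ.

```python
def default_dataset_id(env_id: str, policy: str = "random", version: int = 0) -> str:
    """Guess a dataset ID from an environment ID (hypothetical helper).

    Assumed convention, inferred from the CLI examples: lowercase the
    environment ID, turn underscores into hyphens, then append
    "/<policy>-v<version>".
    """
    namespace = env_id.lower().replace("_", "-")
    return f"{namespace}/{policy}-v{version}"


print(default_dataset_id("PointMaze_UMaze-v3"))  # pointmaze-umaze-v3/random-v0
```

If in doubt about the name of a collected dataset, `gym-wm list-datasets` shows the IDs that actually exist locally.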
Python API
import gym_wm
# List available environments
envs = gym_wm.list_environments()
print(list(envs.keys())) # ['mujoco', 'fetch', 'maze', 'hand', 'classic_control']
# Collect a dataset with random policy
dataset = gym_wm.collect_dataset(
env_id="PointMaze_UMaze-v3",
num_episodes=100,
dataset_name="pointmaze/my-dataset-v0",
img_height=84,
img_width=84,
)
print(f"Collected {dataset.total_episodes} episodes, {dataset.total_steps} steps")
# Collect with PD controller policy (for goal-conditioned envs)
policy = gym_wm.create_pd_policy(kp=2.0, kd=0.2)
dataset = gym_wm.collect_dataset(
env_id="FetchReach-v4",
num_episodes=50,
policy=policy,
)
# Load an existing dataset
dataset = gym_wm.load_dataset("pointmaze/my-dataset-v0")
# Visualize dataset
gym_wm.visualize_dataset(dataset, episode_idx=0)
# Plot specific visualizations
fig = gym_wm.plot_episode_frames(dataset, episode_idx=0)
fig = gym_wm.plot_episode_trajectory(dataset, episode_idx=0)
fig = gym_wm.plot_dataset_summary(dataset)
# Create video
video_path = gym_wm.create_episode_video(dataset, episode_idx=0, fps=30)
# Inspect dataset
info = gym_wm.inspect_dataset("pointmaze/my-dataset-v0")
print(f"Has images: {info['has_images']}")
print(f"Image shape: {info.get('image_shape')}")
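For readers unfamiliar with PD control, the following is a minimal sketch of the control law that the `kp` and `kd` gains in `create_pd_policy(kp=2.0, kd=0.2)` presumably refer to. The function `pd_action`, its arguments, and the clipping limit are assumptions made for illustration. This is not the gym_wm implementation, which may extract positions, goals, and velocities from observations differently.

```python
from typing import List, Sequence


def pd_action(
    position: Sequence[float],
    goal: Sequence[float],
    velocity: Sequence[float],
    kp: float = 2.0,
    kd: float = 0.2,
    limit: float = 1.0,
) -> List[float]:
    """Proportional-derivative action toward a goal (illustrative only).

    Each component is kp * (goal - position) - kd * velocity, clipped to
    [-limit, limit]. The clipping range is an assumption; many
    goal-conditioned environments use actions in [-1, 1].
    """
    action = []
    for p, g, v in zip(position, goal, velocity):
        u = kp * (g - p) - kd * v  # pull toward the goal, damp the motion
        action.append(max(-limit, min(limit, u)))
    return action


# A small error gives a proportional action; a large one saturates at the limit.
print(pd_action(position=[0.0, 0.0], goal=[0.1, -1.0], velocity=[0.5, 0.0]))
```

Raising `kp` makes the agent approach the goal more aggressively, while raising `kd` damps overshoot.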
Using DatasetConfig
from gym_wm import DatasetConfig, collect_dataset_from_config
config = DatasetConfig(
env_id="Ant-v5",
num_episodes=100,
dataset_name="ant/random-v0",
img_height=128,
img_width=128,
author="Your Name",
author_email="your@email.com",
seed=42,
)
dataset = collect_dataset_from_config(config)
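To show how a configuration object like this can catch mistakes before a long collection run starts, here is a minimal stand-in dataclass. `ExampleDatasetConfig` is hypothetical: its field names follow the example above, but the defaults (taken from the CLI defaults of 100 episodes and 84-pixel images) and the validation rules are assumptions, not the actual `DatasetConfig` definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExampleDatasetConfig:
    """Hypothetical stand-in for a dataset configuration (not gym_wm's class)."""

    env_id: str
    num_episodes: int = 100             # assumed to match the CLI default
    dataset_name: Optional[str] = None
    img_height: int = 84                # assumed to match the CLI default
    img_width: int = 84
    author: Optional[str] = None
    author_email: Optional[str] = None
    seed: Optional[int] = None

    def __post_init__(self) -> None:
        # Fail early instead of partway through data collection.
        if self.num_episodes <= 0:
            raise ValueError("num_episodes must be positive")
        if self.img_height <= 0 or self.img_width <= 0:
            raise ValueError("image dimensions must be positive")


config = ExampleDatasetConfig(env_id="Ant-v5", img_height=128, img_width=128, seed=42)
print(config)
```

Freezing the dataclass keeps a configuration from changing after it has been used to generate a dataset, which helps when recording how a dataset was produced.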
Supported Environments
MuJoCo
- Reacher-v5, Ant-v5, HalfCheetah-v5, Hopper-v5
- Walker2d-v5, Humanoid-v5, Swimmer-v5, Pusher-v5
- InvertedPendulum-v5, InvertedDoublePendulum-v5
Fetch (Gymnasium-Robotics)
- FetchReach-v4, FetchPush-v3, FetchSlide-v3, FetchPickAndPlace-v3
Maze (Gymnasium-Robotics)
- PointMaze_UMaze-v3, PointMaze_Medium-v3, PointMaze_Large-v3
- AntMaze_UMaze-v5, AntMaze_Medium-v5, AntMaze_Large-v5
Hand (Gymnasium-Robotics)
- AdroitHandDoor-v1, AdroitHandHammer-v1
- AdroitHandPen-v1, AdroitHandRelocate-v1
Classic Control
- CartPole-v1, Pendulum-v1, Acrobot-v1
- MountainCar-v0, MountainCarContinuous-v0
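The lists above can be written as a plain dictionary keyed by the category names that `gym_wm.list_environments()` is shown returning in the Quick Start. This is only a sketch for looking up which category an environment ID belongs to. The actual return type and contents of `list_environments()` may differ, and `category_of` is a hypothetical helper.

```python
from typing import Dict, List, Optional

# Environment IDs copied from the lists in this README.
SUPPORTED_ENVIRONMENTS: Dict[str, List[str]] = {
    "mujoco": [
        "Reacher-v5", "Ant-v5", "HalfCheetah-v5", "Hopper-v5",
        "Walker2d-v5", "Humanoid-v5", "Swimmer-v5", "Pusher-v5",
        "InvertedPendulum-v5", "InvertedDoublePendulum-v5",
    ],
    "fetch": ["FetchReach-v4", "FetchPush-v3", "FetchSlide-v3", "FetchPickAndPlace-v3"],
    "maze": [
        "PointMaze_UMaze-v3", "PointMaze_Medium-v3", "PointMaze_Large-v3",
        "AntMaze_UMaze-v5", "AntMaze_Medium-v5", "AntMaze_Large-v5",
    ],
    "hand": [
        "AdroitHandDoor-v1", "AdroitHandHammer-v1",
        "AdroitHandPen-v1", "AdroitHandRelocate-v1",
    ],
    "classic_control": [
        "CartPole-v1", "Pendulum-v1", "Acrobot-v1",
        "MountainCar-v0", "MountainCarContinuous-v0",
    ],
}


def category_of(env_id: str) -> Optional[str]:
    """Return the category containing env_id, or None if it is not listed."""
    for category, env_ids in SUPPORTED_ENVIRONMENTS.items():
        if env_id in env_ids:
            return category
    return None


print(category_of("FetchReach-v4"))  # fetch
```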
CLI Reference
| Command | Description |
|---|---|
| `gym-wm collect <env_id>` | Collect a new dataset |
| `gym-wm visualize <dataset>` | Visualize a dataset |
| `gym-wm list-envs` | List supported environments |
| `gym-wm list-datasets` | List local datasets |
| `gym-wm inspect <dataset>` | Show dataset details |
| `gym-wm compare <ds1> <ds2>` | Compare datasets |
| `gym-wm version` | Show version info |
Common Options
--episodes, -n Number of episodes to collect (default: 100)
--name Custom dataset name
--img-size Image size in pixels (default: 84)
--output, -o Output directory
--seed Random seed for reproducibility
--policy Policy type: random, pd (default: random)
--verbose, -v Enable verbose logging
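To make the defaults concrete, the sketch below mirrors the options listed above with the standard-library `argparse` module. The real CLI is built with Typer, so this is a dependency-free approximation rather than the project's code. The function name `build_collect_parser` is hypothetical, and attaching every option to the `collect` command is an assumption.

```python
import argparse


def build_collect_parser() -> argparse.ArgumentParser:
    """Approximate the documented `gym-wm collect` options with argparse."""
    parser = argparse.ArgumentParser(prog="gym-wm collect")
    parser.add_argument("env_id", help="Gymnasium environment ID")
    parser.add_argument("--episodes", "-n", type=int, default=100,
                        help="Number of episodes to collect")
    parser.add_argument("--name", default=None, help="Custom dataset name")
    parser.add_argument("--img-size", type=int, default=84,
                        help="Image size in pixels")
    parser.add_argument("--output", "-o", default=None, help="Output directory")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--policy", choices=["random", "pd"], default="random",
                        help="Policy type")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable verbose logging")
    return parser


# Same arguments as the FetchReach example in the Quick Start.
args = build_collect_parser().parse_args(
    ["FetchReach-v4", "-n", "50", "--policy", "pd", "--img-size", "128", "--seed", "42"]
)
print(args.env_id, args.episodes, args.policy, args.img_size, args.seed)
```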
Development
# Install development dependencies
uv pip install -e ".[dev,test]"
# Run tests
pytest
# Run linter
ruff check src/
# Format code
ruff format src/
# Type checking
mypy src/gym_wm
Project Structure
gym_wm_dataset_generator/
├── src/
│   └── gym_wm/
│       ├── __init__.py          # Public API exports
│       ├── cli.py               # Typer CLI application
│       ├── core/
│       │   ├── config.py        # DatasetConfig dataclass
│       │   ├── environments.py  # Environment utilities
│       │   ├── generator.py     # Dataset collection
│       │   ├── policies.py      # Policy implementations
│       │   └── visualize.py     # Visualization tools
│       └── utils/
│           └── inspect.py       # Dataset inspection
├── tests/
├── pyproject.toml
└── README.md
License
MIT License - see LICENSE for details.