WAI: World AI format unifying various 3D/4D datasets
Reason this release was yanked:
deprecated
Project description
Core library for the WorldAI format (WAI)
Core functionality to access and process data in the WorldAI format (WAI). Lightweight and easy to use, with a focus on modularity and extensibility.
Features
- IO functions for reading and writing WAI data
- Intersection checks between 2d/3d geometric primitives
- Batched camera and geometric processing functions (e.g. (un)projection, transformations, etc.)
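To illustrate what batched (un)projection means in practice, here is a minimal NumPy sketch for the pinhole model. It is not WAI's API: `unproject_pinhole` and `project_pinhole` are hypothetical names, the intrinsics follow the `fl_x`, `fl_y`, `cx`, `cy` keys used in scene_meta, and pixel coordinates are assumed to be integer indices without a half-pixel offset.

```python
import numpy as np


def unproject_pinhole(depth, fl_x, fl_y, cx, cy):
    """Lift a batch of depth maps (B, H, W) to camera-space points (B, H, W, 3)."""
    _, h, w = depth.shape
    # Pixel grid: u runs along the image width, v along the image height.
    u, v = np.meshgrid(np.arange(w), np.arange(h), indexing="xy")
    x = (u - cx) / fl_x * depth
    y = (v - cy) / fl_y * depth
    return np.stack([x, y, depth], axis=-1)


def project_pinhole(points, fl_x, fl_y, cx, cy):
    """Project camera-space points (..., 3) to pixel coordinates (..., 2)."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    return np.stack([fl_x * x / z + cx, fl_y * y / z + cy], axis=-1)


# Two constant-depth maps of size 4x6; projecting the unprojected points
# should give back the original pixel grid.
depth = np.full((2, 4, 6), 2.0)
points = unproject_pinhole(depth, fl_x=100.0, fl_y=100.0, cx=3.0, cy=2.0)
pixels = project_pinhole(points, fl_x=100.0, fl_y=100.0, cx=3.0, cy=2.0)
```

Both functions operate on whole arrays at once, so the same code handles a single frame or a batch without Python loops.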
Install
We recommend Python 3.12.9. WAI will likely work with other Python versions, but has only been tested on 3.12.9.
Pip install (recommended):
pip install wai-core
Clone and install from source (for development):
git clone https://ghe.oculus-rep.com/mb-research/wai-core.git
cd wai-core
pip install -e ".[dev]"
WAI dataset format
Folder structure
We follow Nerfstudio’s folder structure and extend it with additional (optional) modalities. The general folder and file structure is as follows:
<dataset name>
├── <first scene name>
│ ├── scene_meta.json OR scene_meta_distorted.json
│ │
│ ├── images OR images_distorted
│ │ ├── <frame_id_1>.[png|jpg]
│ │ :
│ │ └── <frame_id_n>.[png|jpg]
│ │
│ ├── [Optional] <depth> (GT depth, as specified in scene_meta.json)
│ │ ├── <frame_id_1>.exr
│ │ :
│ │ └── <frame_id_n>.exr
│ │
│ ├── [Optional] masks
│ │ ├── <frame_id_1>.png
│ │ :
│ │ └── <frame_id_n>.png
│ │
│ └── [Optional] Any extra modalities, as specified in scene_meta.json
:
└── <last scene name>
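Given this layout, the scenes of a dataset can be discovered by looking for sub-folders that contain a scene_meta file. The following is a small sketch using only the standard library; `list_scenes` is a hypothetical helper and not a function provided by WAI.

```python
from pathlib import Path

# Either file name may be present, as shown in the folder structure above.
META_NAMES = ("scene_meta.json", "scene_meta_distorted.json")


def list_scenes(dataset_root):
    """Return the sorted names of sub-folders that contain a scene_meta file."""
    root = Path(dataset_root)
    return sorted(
        scene.name
        for scene in root.iterdir()
        if scene.is_dir() and any((scene / name).is_file() for name in META_NAMES)
    )
```

Calling `list_scenes("<dataset name>")` on a dataset root would return one entry per scene folder and skip folders without a scene_meta file.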
scene_meta format
The scene_meta.json format is an extension of Nerfstudio's transforms.json.
The general structure is:
{
"scene_name": "00dd871005", // Unique scene name
"dataset_name": "scannetppv2", // Unique dataset name
"version": "0.1", // WAI format version
"last_modified": "2025-02-10T09:43:48.232022", // ISO datetime format
"shared_intrinsics": true/false, // Same intrinsics for all cameras?
// Camera model type [PINHOLE, OPENCV, OPENCV_FISHEYE]
"camera_model": "PINHOLE",
// Convention for cam2world extrinsics (must be "opencv")
"camera_convention": "opencv",
// <camera_coeff_name>: // camera coefficients like fl_x, cx, h,...
// Per-frame intrinsics and extrinsics parameters
"frames": <see below>,
// Scene-level modalities with default mapping like gt_points3D -> pts3d.npy
"scene_modalities": <see below>,
// Frame-level modalities with default mapping like pred_depth -> metric3dv2
"frame_modalities": <see below>,
// Transform applied on original poses to get poses stored in `frames`
"_applied_transform": <see below>,
// All transforms to convert the original poses to poses stored in `frames`
"_applied_transforms": <see below>,
}
Per-frame intrinsics can also be defined in the frames field, which is a list of dictionaries with the following structure:
{
"frames": [
{
// Unique name to identify a frame
"frame_name": "<frame_name>",
// Relative path to frame, required for Nerfstudio compatibility
"file_path": "<images>/<frame_name>.<ext>",
// 4x4 cam2world extrinsics as a nested list of rows, in OpenCV convention
"transform_matrix": [[1, 0, 0, 0], ... [0, 0, 0, 1]],
// Relative path to frame modality (optional)
"<modality>_path": "<modality_path>/<frame_name>.<ext>",
// Additional intrinsics for this frame (optional)
"camera_model": "PINHOLE",
// <camera_coeff_name>: // camera coefficients like fl_x, cx, h,...
},
...
]
}
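One way to read this is that camera coefficients stored on a frame take precedence over the scene-level ones. The sketch below merges the two under that assumption; `frame_intrinsics` and `INTRINSIC_KEYS` are hypothetical names, and the key list only covers the pinhole coefficients named in this document (distortion coefficients for the OPENCV models are left out).

```python
INTRINSIC_KEYS = ("camera_model", "fl_x", "fl_y", "cx", "cy", "w", "h")


def frame_intrinsics(scene_meta, frame):
    """Merge scene-level intrinsics with the overrides stored on one frame."""
    merged = {}
    for key in INTRINSIC_KEYS:
        if key in frame:
            merged[key] = frame[key]  # per-frame value wins (assumption)
        elif key in scene_meta:
            merged[key] = scene_meta[key]  # fall back to the scene level
    return merged


scene_meta = {
    "camera_model": "PINHOLE",
    "fl_x": 1072.0, "fl_y": 1068.0, "cx": 1504.0, "cy": 1000.0,
    "w": 3008, "h": 2000,
    "frames": [{"frame_name": "000000", "fl_x": 1234, "w": 1000}],
}
intrinsics = frame_intrinsics(scene_meta, scene_meta["frames"][0])
```

Here the frame overrides `fl_x` and `w`, while the remaining values come from the scene level.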
Example:
{
"scene_name": "00dd871005", // unique scene name
"dataset_name": "scannetppv2", // unique dataset name
"version": "0.1",
"last_modified": "2025-02-10T09:43:48.232022",
"camera_model": "PINHOLE", // camera model type [PINHOLE, OPENCV, OPENCV_FISHEYE]
"camera_convention": "opencv", // camera convention used for cam2world extrinsics (different to Nerfstudio!)
"fl_x": 1072.0, // focal length x
"fl_y": 1068.0, // focal length y
"cx": 1504.0, // principal point x
"cy": 1000.0, // principal point y
"w": 3008, // image width
"h": 2000, // image height
"frames": [
{
"frame_name": "000000",
"file_path": "images/000000.png", // required by Nerfstudio
"transform_matrix": [
[1.0, 0.0, 0.0, 0.0],
[0.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 1.0, 0.0],
[0.0, 0.0, 0.0, 1.0]
], // required by Nerfstudio
"image": "images/000000.png", // same as file_path
"metric3dv2_depth": "metric3dv2/v0/depth/000000.png",
"metric3dv2_depth_conf": "metric3dv2/v0/depth_confidence/000000.exr",
"fl_x": 1234, // specific focal length for this frame
"w": 1000 // specific width for this frame
},
...
],
"scene_modalities": {
"gt_pts3d": {
"path": "global_pts3d.npy", //path to a scene_level point cloud
"format": "numpy"
}
},
"frame_modalities": {
"pred_depth": {
"frame_key": "metric3dv2_depth", //default mapping of pred_depth to frame modality
"format": "depth"
},
"image": {
"frame_key": "image", // default mapping of image to frame modality
"format": "image"
},
"depth_confidence": {
"frame_key": "metric3dv2_depth_conf",
"format": "scalar"
}
},
"_applied_transformation": [ // e.g. the transformation from opengl to opencv
[1.0, 0.0, 0.0, 0.0],
[0.0, -1.0, 0.0, 0.0],
[0.0, 0.0, -1.0, 0.0],
[0.0, 0.0, 0.0, 1.0]
],
"_applied_transformations": {
"opengl2opencv": [ // e.g. applied from OpenGL to OpenCV before
[1.0, 0.0, 0.0, 0.0],
[0.0, -1.0, 0.0, 0.0],
[0.0, 0.0, -1.0, 0.0],
[0.0, 0.0, 0.0, 1.0]
]}
}
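Going by the example above, a name under frame_modalities (such as pred_depth) points via frame_key to the per-frame field that holds the relative path. A minimal sketch of that lookup, assuming this reading is correct; `resolve_frame_modality` is a hypothetical helper and not part of WAI:

```python
def resolve_frame_modality(scene_meta, frame, modality):
    """Return (relative_path, format) for a frame-level modality name."""
    entry = scene_meta["frame_modalities"][modality]
    # frame_key names the per-frame field that stores the relative path.
    return frame[entry["frame_key"]], entry["format"]


# Trimmed copy of the example scene_meta above.
scene_meta = {
    "frames": [
        {
            "frame_name": "000000",
            "image": "images/000000.png",
            "metric3dv2_depth": "metric3dv2/v0/depth/000000.png",
        }
    ],
    "frame_modalities": {
        "pred_depth": {"frame_key": "metric3dv2_depth", "format": "depth"},
        "image": {"frame_key": "image", "format": "image"},
    },
}
frame = scene_meta["frames"][0]
path, fmt = resolve_frame_modality(scene_meta, frame, "pred_depth")
```

The indirection means that swapping the default depth predictor only requires changing the frame_key in frame_modalities, not every frame entry.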
Standard formats (WIP!)
We provide a set of default read-/write functions for the following data types:
- image: using pillow for reading/writing
- depth: depth images stored as exr-files using opencv
- normals: normals images stored as exr-files using opencv
- scalar: scalar images stored as exr-files using opencv
- readable: json/yaml using orjson/yaml for reading/writing
- numpy: load stored (compressed) numpy arrays
- binary: binary mask stored as 1-bit or 8-bit image
- ptz: load/store compressed PyTorch files
- mmap: memory-mapped numpy array for fast random-access
- scene_meta: load and store our default scene-level info file (see scene_meta format)
All these formats can be conveniently loaded and stored using load_data, store_data:
from wai import load_data, store_data
# Replace the "<...>" placeholder strings with real paths.
# --- load modalities ---
scene_meta = load_data("<path_to_scene_meta.json>", "scene_meta")
info = load_data("<path_to_readable.json>")  # default for json: readable
image = load_data("<path_to_image.png>")  # suffix resolves format to "image"
depth = load_data("<path_to_depth_image.exr>", "depth")
confidences = load_data("<path_to_image.exr>", "scalar")
normals = load_data("<path_to_image.exr>", "normals")
mask = load_data("<path_to_mask.png>", "binary")
array = load_data("<path_to_image.npy>", "numpy")
data = load_data("<path_to_ptz.ptz>")  # suffix resolves format to "ptz"
# ...
# --- store modalities ---
store_data("<target_path.json>", scene_meta, "scene_meta")
store_data("<target_path.json>", info)  # default for json: readable
store_data("<target_path.png>", image)  # default for png: "image"
store_data("<target_path.exr>", depth, "depth")
store_data("<target_path.exr>", confidences, "scalar")
store_data("<target_path.exr>", normals, "normals")
store_data("<target_path.png>", mask, "binary")
store_data("<target_path.npy>", array, "numpy")
store_data("<target_path.ptz>", data)  # suffix resolves format to "ptz"
# ...
Implementation details can be found in utils.io.
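The comments in the snippet above mention that a file suffix can resolve the format when none is passed. The following sketch shows how such a suffix-based default could work. It is not taken from utils.io: `resolve_format` and `DEFAULT_FORMATS` are hypothetical, and the table only lists the three defaults stated in the snippet.

```python
from pathlib import Path

# Defaults stated in the snippet above. Suffixes such as .exr or .npy are
# always given an explicit format there, so they are not listed here.
DEFAULT_FORMATS = {".json": "readable", ".png": "image", ".ptz": "ptz"}


def resolve_format(path, format_type=None):
    """Pick the explicit format if given, else a default based on the suffix."""
    if format_type is not None:
        return format_type
    suffix = Path(path).suffix.lower()
    if suffix not in DEFAULT_FORMATS:
        raise ValueError(f"No default format for suffix '{suffix}'; pass one explicitly.")
    return DEFAULT_FORMATS[suffix]
```

An explicit format always wins, which matches how the same .png suffix is used for both "image" and "binary" data above.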
Create a simple PyTorch dataset using WAI
We provide a basic dataset to get you started with training your models on WAI datasets. This is a minimal example of how to construct a WAI dataset:
from wai import BasicSceneframeDataset
from box import Box

cfg = Box(
    {
        "root": "/fsx/xrtech/data/scannetppv2",
        "frame_modalities": ["image", "pred_depth"],
    }
)
dataset = BasicSceneframeDataset(cfg)
item = dataset[0]
print(f"Fetched an item from the WAI dataset with the following keys: {item.keys()}")
File details
Details for the file wai_core-1.0.4.tar.gz.
File metadata
- Download URL: wai_core-1.0.4.tar.gz
- Upload date:
- Size: 63.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 943cbde9ea0530ffcfd8e4ddfa110fefdcbb67bbff55b0b564e3b97c5a0e1fd4 |
| MD5 | 25923399d954a3e5457a3e22e26defe0 |
| BLAKE2b-256 | 83fe57b42193d4dad725dbdcf3cbd2a65af9b780959cb0aaf708d94b55f04be9 |
File details
Details for the file wai_core-1.0.4-py3-none-any.whl.
File metadata
- Download URL: wai_core-1.0.4-py3-none-any.whl
- Upload date:
- Size: 59.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dd1a04d203bc65d6ffd1df9c65ea4a12d3cf39c49724dc2f115c78fdb1db8afc |
| MD5 | 4af9437ef9da8c233fb92140ba901d5b |
| BLAKE2b-256 | 899c6d9f8b1e1fa10e24efeb6075470a3b58f77b3c5ff3032af62fa808cb5a1c |