Photoreal Filament PBR rendering for GPU-resident MuJoCo (MJWarp), zero-copy to PyTorch
mujofil-warp
Photoreal PBR rendering for GPU-resident MuJoCo (MJWarp), zero-copy to PyTorch.
MJWarp simulates thousands of parallel MuJoCo worlds entirely on the GPU, but its built-in batch renderer is a deliberately low-fidelity single-hit raycaster (flat Lambertian, no PBR / IBL / reflections, and it cannot load GLB environments).
mujofil-warp pairs MJWarp's GPU-resident physics with
Google Filament's physically-based
renderer (PBR materials, image-based lighting, soft shadows, SSAO) and delivers
each rendered frame straight to PyTorch as a CUDA tensor — no CPU round-trip.
Highlights
- Zero-copy to torch.cuda. Filament renders into GPU memory that CUDA imports directly; observations arrive as torch.cuda tensors with no GPU→CPU→GPU bounce.
- GPU-resident pipeline. MJWarp steps physics on the GPU; only a tiny transform array crosses to the host. Pixels never leave the GPU.
- Photoreal. Full PBR metalness/roughness, IBL, soft shadows, SSAO, MSAA, filmic tone mapping — renders complete GLB environments MJWarp/MuJoCo can't.
- Two backends. An OpenGL single-sync path and a Vulkan shared-device path, selectable at runtime.
Performance (RTX 4060 Laptop, 8 GiB)
All numbers are env-steps/s (= cameras/s), MJWarp GPU physics → torch.cuda.
vs vanilla MuJoCo, same scene, same workload (ours adds PBR + zero-copy):
| | 128px N=512 | 256px N=512 | 256px N=1024 |
|---|---|---|---|
| mujofil-warp (GL) | 10,675 | 9,949 | 10,628 |
| vanilla mujoco.Renderer | 8,394 | 4,808 | 5,021 |
| speedup | 1.27× | 2.07× | 2.12× |
We beat vanilla MuJoCo by 1.27–2.12× on equal work; the gap widens at higher resolution because zero-copy avoids the CPU readback that scales with pixels.
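The speedup row can be re-derived from the two throughput rows. The short sketch below does only that arithmetic; the numbers are copied from the table, and the names `throughput` and `speedup` are illustrative.

```python
# Throughput in env-steps/s, copied from the table: (mujofil-warp GL, vanilla mujoco.Renderer).
throughput = {
    "128px N=512": (10_675, 8_394),
    "256px N=512": (9_949, 4_808),
    "256px N=1024": (10_628, 5_021),
}


def speedup(ours: int, baseline: int) -> float:
    """Ratio of mujofil-warp throughput to the vanilla baseline."""
    return ours / baseline


# Round to two decimals, matching the precision used in the table.
speedups = {cfg: round(speedup(*pair), 2) for cfg, pair in throughput.items()}

for cfg, ratio in speedups.items():
    print(f"{cfg}: {ratio:.2f}x")
```

The smallest ratio comes from the 128px column and the largest from 256px at N=1024, which is the resolution trend the paragraph describes.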
Full photoreal warehouse (3 GLB meshes + IBL + 16 spotlights + SSAO — geometry vanilla MuJoCo and MJWarp cannot even load): ~3,200 cam/s at 128px, holding flat from N=64 to N=2048.
GL vs Vulkan backend (full warehouse): the GL single-sync path is 1.3×
faster and, critically, its sync cost is constant across N (one flushAndWait),
where the Vulkan path's grows linearly with batch size.
vs MJWarp's own raycaster: MJWarp scales to ~42,000 cam/s at N=2048 — but that
is flat Lambertian on bare objects (no PBR/IBL, no GLB environments). At small
N (≤32) mujofil-warp is faster and photoreal; at large N MJWarp wins raw
throughput by trading away all visual fidelity. Different categories: MJWarp is a
parallel raycaster, this is a photoreal rasterizer.
Quickstart
import mujoco
import mujoco_warp as mjw
import torch  # observations come back as torch.cuda tensors
import warp as wp
from mujofil_warp import WarpRenderer

# Load the model on the host and create 32 parallel worlds on the GPU.
mjm = mujoco.MjModel.from_xml_path("scene.xml")
M = mjw.put_model(mjm)
d = mjw.make_data(mjm, nworld=32)
host = [mujoco.MjData(mjm) for _ in range(32)]  # one host-side pose holder per world

r = WarpRenderer(width=256, height=256, batch_size=32, preset="high")
r.load_model(mjm)

# Step physics on the GPU, then copy only the geom transforms to the host.
mjw.step(M, d)
wp.synchronize()
gx = d.geom_xpos.numpy()
gm = d.geom_xmat.numpy().reshape(32, mjm.ngeom, 9)
for i, h in enumerate(host):
    h.geom_xpos[:] = gx[i]
    h.geom_xmat[:] = gm[i]

obs = r.render_batch(mjm, host, cam_id=0)  # (32, 256, 256, 4) uint8 torch.cuda
See examples/minimal_render.py for a runnable demo.
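Because observations stay on the GPU, it can help to budget the memory one batch occupies. The sketch below only multiplies out the (N, H, W, 4) uint8 shape shown in the Quickstart comment; it assumes square RGBA frames and ignores whatever the renderer allocates internally, which this page does not specify. `batch_bytes` is a hypothetical helper, not part of the package.

```python
def batch_bytes(n_worlds: int, height: int, width: int, channels: int = 4) -> int:
    """Size in bytes of one uint8 observation batch shaped (N, H, W, C)."""
    return n_worlds * height * width * channels


MIB = 2**20

quickstart_mib = batch_bytes(32, 256, 256) / MIB   # the Quickstart batch
large_mib = batch_bytes(2048, 128, 128) / MIB      # N=2048 at 128px

print(f"Quickstart batch: {quickstart_mib:.0f} MiB")
print(f"N=2048 at 128px:  {large_mib:.0f} MiB")
```

Under these assumptions the output tensor itself is a small fraction of an 8 GiB card even at the largest batch size mentioned above.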
Quality toggles
Every fidelity feature is an independent toggle so you can reproduce the
throughput/fidelity trade-offs in benchmarks/ on your own hardware:
from mujofil_warp import WarpRenderer, make_config
# keyword toggles
r = WarpRenderer(width=256, batch_size=32, ssao=False, shadows=True, msaa=True)
# or a named preset, optionally overriding individual toggles
r = WarpRenderer(width=256, batch_size=32, preset="fast") # SSAO off, ~2x
r = WarpRenderer(width=256, batch_size=32, preset="high", bloom=True)
# or an explicit config
cfg = make_config(width=256, height=256, batch_size=32, exposure=1.6)
r = WarpRenderer(config=cfg)
| Toggle | Effect | Notes |
|---|---|---|
| ssao | screen-space ambient occlusion | biggest cost; ~2× faster when off |
| ssao_quality | SSAO quality low/medium/high/ultra | affects look more than speed |
| ssao_ssct | SSAO cone tracing (contact shadows) | small extra cost on top of SSAO |
| shadows | soft shadow maps | |
| msaa / msaa_samples | multi-sample AA | 2 / 4 / 8 |
| bloom | HDR bloom | off by default |
| fxaa | fast approximate AA | alternative to MSAA |
| exposure | linear exposure | before tone mapping |
| tone_mapping | FILMIC vs LINEAR | |
| dithering | temporal dithering | reduces banding |
Presets: high (photoreal, default), medium (high-quality SSAO, no cone
tracing), fast (SSAO off, ~2×), ultra (8× MSAA + bloom), raw (no AO/shadows/AA,
~3×).
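One way to picture a named preset combined with keyword overrides is as a dictionary merge. The sketch below is not the package's implementation: `BASE`, `PRESETS`, and `resolve` are hypothetical names, and the base values are guesses except where the text states them (bloom is off by default, medium drops cone tracing, fast turns SSAO off, ultra uses 8× MSAA plus bloom, raw disables AO, shadows, and AA).

```python
# Hypothetical defaults standing in for the "high" preset; msaa_samples=4 is a guess.
BASE = {
    "ssao": True,
    "ssao_ssct": True,
    "shadows": True,
    "msaa": True,
    "msaa_samples": 4,
    "fxaa": False,
    "bloom": False,
}

# Each preset lists only what it changes relative to BASE.
PRESETS = {
    "high": {},
    "medium": {"ssao_ssct": False},
    "fast": {"ssao": False},
    "ultra": {"msaa_samples": 8, "bloom": True},
    "raw": {"ssao": False, "shadows": False, "msaa": False, "fxaa": False},
}


def resolve(preset: str = "high", **overrides) -> dict:
    """Merge BASE, then the preset, then explicit overrides (later wins)."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    unknown = set(overrides) - set(BASE)
    if unknown:
        raise TypeError(f"unknown toggles: {sorted(unknown)}")
    return {**BASE, **PRESETS[preset], **overrides}


print(resolve("high", bloom=True))
print(resolve("raw"))
```

Ordering the merge so that explicit keywords win mirrors the `preset="high", bloom=True` call shown earlier, where an individual toggle overrides the preset.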
Backends
Select at runtime with MUJOFIL_WARP_BACKEND:
- gl (default): OpenGL single-sync. Renders N worlds into N imported GL textures bracketed by one flushAndWait, then exports via GL↔CUDA interop. Sync cost is constant in N; fastest in the warehouse. Runs headless through surfaceless EGL; if the GL module fails to initialize, it automatically falls back to Vulkan.
- vulkan: shared Vulkan device + exportable swapchain + CUDA external-memory import. Works fully headless (no X), but the 2-frame in-flight cap makes its sync cost grow with batch size.
# default is gl; force a backend explicitly with the env var:
MUJOFIL_WARP_BACKEND=gl python examples/minimal_render.py --preset high
MUJOFIL_WARP_BACKEND=vulkan python examples/minimal_render.py --preset high
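The same variable can be set from Python instead of the shell. This is a minimal sketch that assumes the package reads MUJOFIL_WARP_BACKEND when it is imported or when the renderer is constructed; this page does not say which, so the variable is set before the import to be safe. `select_backend` is a hypothetical helper, not part of mujofil-warp.

```python
import os

VALID_BACKENDS = ("gl", "vulkan")  # the two values documented above


def select_backend(name: str) -> str:
    """Validate a backend name and export it through the environment."""
    name = name.strip().lower()
    if name not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}, got {name!r}")
    os.environ["MUJOFIL_WARP_BACKEND"] = name
    return name


chosen = select_backend("vulkan")
print(f"MUJOFIL_WARP_BACKEND={chosen}")

# Import mujofil_warp only after this point so it sees the setting
# (assumption: the variable is read at import or construction time).
```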
Installation
pip install mujofil-warp
The wheel is self-contained: Filament and the CUDA runtime are statically
baked in, the compiled materials ship inside it, and libc++ is bundled. There
is no CUDA toolkit, no Filament, and no mujofil to install — the only hard
requirement at runtime is an NVIDIA GPU + driver.
Supported environments
Because the package contains no CUDA device code (only host-side runtime calls), a single wheel is portable across GPUs and driver versions:
| Dimension | Support |
|---|---|
| GPU | Any NVIDIA GPU (Turing / Ampere / Ada / Hopper / …) — no compute-capability lock-in |
| Driver / CUDA | NVIDIA driver ≥ R525 (CUDA 12.0+). One wheel, all newer drivers |
| OS | Linux x86_64, glibc ≥ 2.34 (Ubuntu 22.04+, Debian 12+, RHEL/Alma/Rocky 9+, Fedora 35+) |
| Python | CPython 3.10 – 3.13 |
Not yet supported: aarch64 (Jetson/Grace), glibc < 2.34 (Ubuntu 20.04 / RHEL 8), non-NVIDIA GPUs. These need a from-source Filament build (planned).
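The OS, architecture, glibc, and Python rows of the table can be checked from the standard library before installing. The sketch below is a hedged pre-flight check, not something the package ships; `unsupported_reasons` is a hypothetical name, and the GPU and driver rows are out of its scope because they cannot be queried with the standard library alone.

```python
import platform
import sys


def _major_minor(version: str) -> tuple:
    """Parse '2.35' into (2, 35); empty or non-numeric parts are dropped."""
    parts = version.split(".")[:2]
    return tuple(int(p) for p in parts if p.isdigit())


def unsupported_reasons(system, machine, libc, python) -> list:
    """Return why an environment falls outside the table; empty list if it fits."""
    reasons = []
    if system != "Linux":
        reasons.append(f"OS is {system or 'unknown'}; wheels target Linux")
    if machine != "x86_64":
        reasons.append(f"architecture is {machine or 'unknown'}; wheels target x86_64")
    libc_name, libc_version = libc
    if libc_name != "glibc" or _major_minor(libc_version) < (2, 34):
        reasons.append(f"libc is {libc_name or 'unknown'} {libc_version}; need glibc >= 2.34")
    if not (3, 10) <= tuple(python[:2]) <= (3, 13):
        reasons.append(f"Python {python[0]}.{python[1]}; need CPython 3.10 to 3.13")
    return reasons


reasons = unsupported_reasons(
    platform.system(), platform.machine(), platform.libc_ver(), sys.version_info
)
print("looks supported" if not reasons else "\n".join(reasons))
```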
Headless / display
Both backends are fully headless — no X server, no display, nothing extra to install beyond the NVIDIA driver:
- GL (default) uses surfaceless EGL, so it renders headless at full speed on a bare GPU server (cloud, cluster, container). This is the recommended path for vision-RL training.
- Vulkan is also headless (shared device + exportable swapchain).
GL auto-falls back to Vulkan only if the GL module fails to initialize.
Building from source
Most users never need this — pip install mujofil-warp ships prebuilt wheels.
Build from source only to hack on the C++ or target an unsupported environment.
Prerequisites (the native modules and Filament are built with Clang + libc++):
| Tool | Debian/Ubuntu | RHEL/Fedora/Alma |
|---|---|---|
| Clang + libc++ dev | clang libc++-dev libc++abi-dev | clang + libc++ (LLVM release) |
| CUDA toolkit (headers + static cudart) | nvidia-cuda-toolkit | cuda-cudart-devel-12-x cuda-driver-devel-12-x |
| EGL / GL dev headers | libegl1-mesa-dev libgl1-mesa-dev | mesa-libEGL-devel mesa-libGL-devel |
| Build tools (source-built Filament only) | git cmake ninja-build | git cmake ninja-build |
Then:
git clone https://github.com/tau-intelligence/mujofil-warp
cd mujofil-warp
CC=clang CXX=clang++ pip install .
How Filament is resolved (the GL backend's headless EGL rendering needs a
custom EGL-enabled Filament — Google's prebuilt Linux Filament is GLX-only).
CMakeLists.txt tries, in order:
1. FILAMENT_DIR=/path/to/egl-filament if you set it: used as-is (fastest).
2. Download a prebuilt EGL Filament artifact (seconds). The default path.
3. Build from source via packaging/build_filament_egl.sh (~20–30 min) if the download is unavailable; this is the step that needs git/cmake/ninja.
So a plain pip install . is one command; supply FILAMENT_DIR to skip the
download/build entirely:
CC=clang CXX=clang++ FILAMENT_DIR=/path/to/egl-filament pip install .
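The resolution order described above is a simple first-match chain. The real logic lives in CMakeLists.txt; the Python sketch below only illustrates the ordering, and `resolve_filament`, `download`, and `build` are hypothetical stand-ins rather than anything in the repository.

```python
from typing import Callable, Mapping, Optional


def resolve_filament(
    env: Mapping[str, str],
    download: Callable[[], Optional[str]],
    build: Callable[[], str],
) -> tuple[str, str]:
    """Return (source, path) for the first strategy that yields a Filament tree."""
    explicit = env.get("FILAMENT_DIR")
    if explicit:
        return ("env", explicit)          # 1. user-supplied directory, used as-is
    downloaded = download()
    if downloaded is not None:
        return ("download", downloaded)   # 2. prebuilt EGL artifact
    return ("build", build())             # 3. slow from-source fallback


# Example: FILAMENT_DIR is unset and the download is pretended to be unavailable.
choice = resolve_filament(
    env={},
    download=lambda: None,
    build=lambda: "./_filament_egl",
)
print(choice)
```

Passing the strategies in as callables keeps the slow build step from running unless the two cheaper options have already failed.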
The EGL Filament artifact is reproducible from source:
packaging/build_filament_egl.sh ./_filament_egl # clone + patch + build
Dev rebuilds (no full reinstall)
For iterating on the C++ without a full pip install, the two helper scripts
build the modules in place (point FILAMENT_DIR at the EGL Filament build):
bash native/build_gl.sh # OpenGL single-sync, headless EGL -> _mujofil_warp_gl
bash native/build.sh # Vulkan zero-copy -> _mujofil_warp
Architecture & porting
mujofil-warp is one core with pluggable rendering backends, so new platforms
are added as a backend — not a fork.
mujofil_warp/__init__.py    Python API, presets, backend selection     (shared)
native/render_module.cpp    pybind bindings, batching                  (shared)
native/vendor/core/         scene / material / light bridge            (shared)
native/renderer_gl.cpp      Linux: surfaceless EGL + CUDA interop      (backend)
native/renderer_warp.cpp    Linux: Vulkan device + CUDA interop        (backend)
Everything platform-specific lives behind the vf_mujoco::Renderer interface
(context creation, GPU→tensor interop). Adding macOS or Windows means
adding one renderer_*.{cpp,mm} implementing that interface — the scene,
material, lighting, Python API, and batching layers are reused unchanged.
- Windows would use a WGL/EGL context + OPAQUE_WIN32 external-memory handles for the CUDA interop.
- macOS is a different target: there is no CUDA on Apple platforms, so a Mac backend would use Filament's Metal backend and export to PyTorch via MPS (MTLBuffer → torch-MPS) rather than torch.cuda.
These are not yet implemented (they need the respective hardware to develop and validate on), but the codebase is structured so they slot in without a fork.
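The split between a shared core and pluggable backends can be pictured with a small Python analogy. The actual interface is the C++ vf_mujoco::Renderer, whose methods are not listed on this page; every class and method name below is invented for illustration, and the backends return string tags instead of real tensors.

```python
from abc import ABC, abstractmethod


class Renderer(ABC):
    """Hypothetical stand-in for the platform-specific interface."""

    @abstractmethod
    def create_context(self) -> str:
        """Set up the platform graphics context and return a label for it."""

    @abstractmethod
    def export_frame(self, world: int) -> str:
        """Hand one rendered frame to the tensor library (a tag in this sketch)."""


class SharedCore:
    """Stand-in for the shared batching layer; it never names a platform."""

    def __init__(self, backend: Renderer, batch_size: int) -> None:
        self.backend = backend
        self.batch_size = batch_size
        self.context = backend.create_context()

    def render_batch(self) -> list:
        return [self.backend.export_frame(i) for i in range(self.batch_size)]


class EglCudaBackend(Renderer):
    def create_context(self) -> str:
        return "surfaceless EGL"

    def export_frame(self, world: int) -> str:
        return f"cuda:{world}"


class MetalMpsBackend(Renderer):
    def create_context(self) -> str:
        return "Metal"

    def export_frame(self, world: int) -> str:
        return f"mps:{world}"


for backend in (EglCudaBackend(), MetalMpsBackend()):
    core = SharedCore(backend, batch_size=2)
    print(core.context, core.render_batch())
```

Swapping the backend object changes the context and the export target while `SharedCore` stays untouched, which is the property the porting notes rely on.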
Layout
mujofil_warp/            Python package (WarpRenderer, make_config, presets)
native/                  C++ renderer + pybind module + build scripts
  renderer_gl.cpp        OpenGL single-sync zero-copy backend
  renderer_warp.cpp      Vulkan shared-device zero-copy backend
  render_module.cpp      pybind bindings (shared by both backends)
examples/                runnable demos
benchmarks/              the benchmark suite behind the numbers above
spikes/                  isolated feasibility proofs (GL↔CUDA, Vulkan↔CUDA, DLPack)
docs/ARCHITECTURE.md     design + phased integration plan
Relationship to mujofil
mujofil-warp reuses the CPU-MuJoCo mujofil renderer's scene/material/light
source but is a separate build — the published mujofil package is untouched.
Use mujofil for high-fidelity CPU-MuJoCo vector-env rendering; use
mujofil-warp when you want MJWarp's GPU-resident physics with photoreal,
zero-copy observations.
License
Apache-2.0.
Download files
File details
Release 0.1.0 consists of one source distribution and four wheels. All files were uploaded via twine/6.1.0 CPython/3.13.12 using Trusted Publishing.
| File | Size | Tags |
|---|---|---|
| mujofil_warp-0.1.0.tar.gz | 6.4 MB | Source |
| mujofil_warp-0.1.0-cp313-cp313-manylinux_2_34_x86_64.whl | 10.6 MB | CPython 3.13, manylinux: glibc 2.34+ x86-64 |
| mujofil_warp-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl | 10.6 MB | CPython 3.12, manylinux: glibc 2.34+ x86-64 |
| mujofil_warp-0.1.0-cp311-cp311-manylinux_2_34_x86_64.whl | 10.6 MB | CPython 3.11, manylinux: glibc 2.34+ x86-64 |
| mujofil_warp-0.1.0-cp310-cp310-manylinux_2_34_x86_64.whl | 10.6 MB | CPython 3.10, manylinux: glibc 2.34+ x86-64 |
File hashes
| File | SHA256 | MD5 | BLAKE2b-256 |
|---|---|---|---|
| mujofil_warp-0.1.0.tar.gz | d7a2bc7b6c991177c14f2316425785aea64dac490d565e887996c54163291970 | 6f024f1e158651dc970cde9f3501515a | a13d0cfc9b04dd3ff5e6c3b7eda66045cb558fe9b550e72a6a7b91658858c790 |
| mujofil_warp-0.1.0-cp313-cp313-manylinux_2_34_x86_64.whl | d32f54e6412171fd7ffd47949c44cdf25a082455375c51a9a1d2955fb53e9bb6 | 791407e6db548c34df010d0e8bc5765f | d70a0fca4ad2df1e728c5a0cd17a025a74ec3636fdd4ccb24db38e491a08caaf |
| mujofil_warp-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl | fc5b1a164bf7fd9135653ee55e92b3427769092c2620a8f130b7a9101c549faf | acbdb5697c996361601d42f5381940eb | 4ab24a430cf2a3e37ae7d95345b9467cc43705128f24738cce83376a30ecff66 |
| mujofil_warp-0.1.0-cp311-cp311-manylinux_2_34_x86_64.whl | a16c20a56a0e9bb4452f4a7d1e83eb40fab72e294b3519815b0f8d8f62c4e0f9 | b31fb2e6bd179a023a1ea1440795953b | 741a08ec901b4069fa8d6bd2237654456c59b4bbfb4f1bdc587dcbf0c58eb637 |
| mujofil_warp-0.1.0-cp310-cp310-manylinux_2_34_x86_64.whl | 70c34317f415d4376a3b94e552f3b666f0b38da29befcbb99b5eb3468cdb64f9 | 6dd8c6a1383414214389920e1048829d | f50de564cfc3854c11469c2afe775630aab55149e18996af4be8fef071b4ad90 |
Provenance
An attestation bundle was made for each file. In every bundle the subject name is the file name and the subject digest is its SHA256 hash listed above. The remaining fields are identical across all five files:
- Publisher: wheels.yml on tau-intelligence/mujofil-warp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Permalink: tau-intelligence/mujofil-warp@249d9d6fc24be444b12cc3a9213fbda274ca4833
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tau-intelligence
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: wheels.yml@249d9d6fc24be444b12cc3a9213fbda274ca4833
- Trigger Event: push
| File | Sigstore transparency entry |
|---|---|
| mujofil_warp-0.1.0.tar.gz | 1823409371 |
| mujofil_warp-0.1.0-cp313-cp313-manylinux_2_34_x86_64.whl | 1823409478 |
| mujofil_warp-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl | 1823409565 |
| mujofil_warp-0.1.0-cp311-cp311-manylinux_2_34_x86_64.whl | 1823409539 |
| mujofil_warp-0.1.0-cp310-cp310-manylinux_2_34_x86_64.whl | 1823409424 |