3D human pose estimation from video using DiffPose + MixSTE
DiffPose-Video
3D human pose estimation from arbitrary video using MixSTE (2D→3D lifting) and DiffPose (diffusion-based refinement).
This package wraps the original DiffPose research code with a clean inference pipeline, an interactive visualisation dashboard, and a video renderer — all accessible as CLI commands after a single pip install.
Paper: DiffPose: Toward More Reliable 3D Pose Estimation, CVPR 2023.
Install
# 1. Install PyTorch for your CUDA version first (example: CUDA 12.8)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
# 2. Install this package
pip install diffpose-video
Note: onnxruntime-gpu==1.20.1 is pinned because later versions have a broken CUDA provider on some systems.
Download pretrained checkpoints
diffpose-download
Downloads all pretrained weights to ~/.cache/diffpose_video/checkpoints/. Safe to re-run — skips files that already exist.
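To check which weights are already cached before re-running the download, a small helper along these lines may be useful. It is only a sketch of the "skip files that already exist" idea, not the package's own downloader; the function name `missing_checkpoints` and the file names in the usage note are hypothetical.

```python
from pathlib import Path

# Default cache location stated in this README.
CHECKPOINT_DIR = Path.home() / ".cache" / "diffpose_video" / "checkpoints"


def missing_checkpoints(expected_names, checkpoint_dir=CHECKPOINT_DIR):
    """Return the expected file names that are not yet present on disk.

    `expected_names` is supplied by the caller; this sketch does not know
    which files diffpose-download actually fetches.
    """
    checkpoint_dir = Path(checkpoint_dir)
    return [
        name for name in expected_names
        if not (checkpoint_dir / name).is_file()
    ]
```

For example, `missing_checkpoints(["pose.bin", "diff.bin"])` (placeholder names) returns whichever of the two is absent from the cache directory.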
Usage
1. Run inference on a video
diffpose-infer --input video.mp4 --output_dir results/
Config and checkpoint paths default to the bundled config and ~/.cache/diffpose_video/checkpoints/ (populated by diffpose-download). Override with --config, --model_pose, --model_diff if needed.
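When calling the CLI from a script, the optional overrides can be appended only when they are set. The sketch below uses just the flags named above (--input, --output_dir, --config, --model_pose, --model_diff); the helper names `build_infer_command` and `run_infer` are hypothetical and not part of the package.

```python
import subprocess


def build_infer_command(input_path, output_dir,
                        config=None, model_pose=None, model_diff=None):
    """Build the diffpose-infer argument list, adding overrides only if given."""
    cmd = ["diffpose-infer",
           "--input", str(input_path),
           "--output_dir", str(output_dir)]
    overrides = (
        ("--config", config),
        ("--model_pose", model_pose),
        ("--model_diff", model_diff),
    )
    for flag, value in overrides:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_infer(input_path, output_dir, **overrides):
    """Run the CLI and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_infer_command(input_path, output_dir, **overrides),
                   check=True)
```

Leaving every override as `None` reproduces the default command shown earlier, so the bundled config and cached checkpoints are used.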
Output: results/<video_name>.npz containing:
- poses_3d: (T, 17, 3) root-relative 3D joint positions
- keypoints_2d: (T, 17, 3) pixel-space 2D detections + confidence
Process a whole folder of videos by passing a directory to --input.
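The output archive can be read back with NumPy. The following is a minimal sketch that loads the two arrays listed above and checks their shapes; the function name `load_poses` is hypothetical, and the assumption that the last axis of keypoints_2d is ordered (x, y, confidence) is inferred from the description rather than confirmed by the package.

```python
import numpy as np


def load_poses(npz_path):
    """Load poses_3d and keypoints_2d from an inference result file."""
    with np.load(npz_path) as data:
        poses_3d = data["poses_3d"]          # (T, 17, 3)
        keypoints_2d = data["keypoints_2d"]  # (T, 17, 3)

    if poses_3d.ndim != 3 or poses_3d.shape[1:] != (17, 3):
        raise ValueError(f"unexpected poses_3d shape: {poses_3d.shape}")
    if keypoints_2d.shape != poses_3d.shape:
        raise ValueError(f"unexpected keypoints_2d shape: {keypoints_2d.shape}")

    # Assumed layout of the last axis: x, y in pixels, then confidence.
    xy = keypoints_2d[..., :2]
    confidence = keypoints_2d[..., 2]
    return poses_3d, xy, confidence
```

A call such as `poses_3d, xy, confidence = load_poses("results/video.npz")` then gives one 3D pose, one set of 2D pixel coordinates, and one confidence vector per frame.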
2. Interactive dashboard
diffpose-explore \
--npz results/video.npz \
--video video.mp4 \
--fps 30
Opens a Plotly Dash app at http://localhost:8050 with:
- Synchronized video playback with 2D skeleton overlay
- Animated 3D skeleton
- X / Y / Z trajectory graphs per joint
- Play/pause + frame scrubber, all linked
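The per-joint trajectory graphs correspond to slicing poses_3d along the joint axis. As a rough illustration of the data behind such a plot (not the dashboard's own code), the hypothetical helper below returns a time axis and the X, Y, and Z series for one joint, using the same frame rate that is passed to --fps.

```python
import numpy as np


def joint_trajectory(poses_3d, joint, fps=30.0):
    """Return (t, x, y, z) for one joint of a (T, 17, 3) pose array.

    t is in seconds, derived from the frame index and the frame rate.
    """
    poses_3d = np.asarray(poses_3d)
    t = np.arange(poses_3d.shape[0]) / fps
    xyz = poses_3d[:, joint, :]
    return t, xyz[:, 0], xyz[:, 1], xyz[:, 2]
```

The returned arrays can be passed to any plotting library to reproduce a single trajectory graph offline.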
3. Render a side-by-side MP4
diffpose-visualise \
--npz results/video.npz \
--video video.mp4 \
--output results/video_vis.mp4
Produces a video with the original footage (+ 2D overlay) on the left and the animated 3D skeleton on the right.
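The side-by-side layout amounts to horizontally concatenating two frames of equal height. The sketch below shows that step with plain NumPy arrays, padding the shorter frame with black rows at the bottom; it illustrates the layout only and is not the renderer that diffpose-visualise uses. The function name `side_by_side` is hypothetical.

```python
import numpy as np


def side_by_side(left, right):
    """Join two (H, W, 3) frames horizontally, padding the shorter one."""
    height = max(left.shape[0], right.shape[0])

    def pad_to_height(img):
        # Add zero-valued (black) rows below the image if it is too short.
        extra = height - img.shape[0]
        return np.pad(img, ((0, extra), (0, 0), (0, 0)))

    return np.concatenate([pad_to_height(left), pad_to_height(right)], axis=1)
```

The output width is the sum of the two input widths, so a 1280-pixel-wide video frame next to a 720-pixel-wide 3D plot gives a 2000-pixel-wide combined frame.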
Docker
The image is self-contained — checkpoints and config are baked in at build time.
# Build once
docker compose build
# Inference (outputs land in ./results/)
export VIDEOS_DIR=/path/to/your/videos
docker compose run infer --input /videos/clip.mp4
# Render side-by-side MP4
docker compose run visualise \
--npz /results/clip.npz --video /videos/clip.mp4 --output /results/clip_vis.mp4
# Interactive dashboard — open http://localhost:8050
docker compose run --service-ports explore \
--npz /results/clip.npz --video /videos/clip.mp4
Citation
@InProceedings{gong2023diffpose,
author = {Gong, Jia and Foo, Lin Geng and Fan, Zhipeng and Ke, Qiuhong and Rahmani, Hossein and Liu, Jun},
title = {DiffPose: Toward More Reliable 3D Pose Estimation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
}
File details
Details for the file diffpose_video-0.2.0.tar.gz.
File metadata
- Download URL: diffpose_video-0.2.0.tar.gz
- Upload date:
- Size: 61.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7c62b569c77a6ed9c2c1c9dd29abb1ceee071e715c3e677a88dd8a67936caff1 |
| MD5 | 48592c4130856f0584e31a0b8e0a1b9e |
| BLAKE2b-256 | 590d2ca5a54e861ecd543fdab85adefa8cc32213a4961822e8373628e3d8541b |
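A downloaded archive can be checked against the SHA256 digest listed for it. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed for diffpose_video-0.2.0.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "7c62b569c77a6ed9c2c1c9dd29abb1ceee071e715c3e677a88dd8a67936caff1"
)


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches_digest("diffpose_video-0.2.0.tar.gz", EXPECTED_SDIST_SHA256)` returns False if the file was corrupted or altered in transit.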
File details
Details for the file diffpose_video-0.2.0-py3-none-any.whl.
File metadata
- Download URL: diffpose_video-0.2.0-py3-none-any.whl
- Upload date:
- Size: 76.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e17232f8df3d884ec335de70541509f045e10141957167b28926b5dbdb57046d |
| MD5 | cf32441387ee59d64ced5df9b73491b7 |
| BLAKE2b-256 | 1dbf6ed3da0bd6cf33e33aa5ca825cee6f389727956aaaf7b0e5b129a38a2757 |