[ICLR 2026] StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams
Overview
StreamSplat is a fully feed-forward framework that instantly transforms uncalibrated video streams of arbitrary length into dynamic 3D Gaussian Splatting representations in an online manner.
- Feed-forward inference: No per-scene optimization required
- Camera-free: Works directly with uncalibrated monocular videos
- Dynamic scene modeling: Handles both static and dynamic scene elements through polynomial motion modeling
- Probabilistic Gaussian prediction: Models Gaussian positions with truncated Gaussian distributions for robustness
- Two-stage training: Stage 1 trains the static encoder, Stage 2 trains the dynamic decoder
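As a toy illustration of what polynomial motion modeling means here, each Gaussian's center can be written as mu(t) = mu0 + sum_k c_k * t^k and evaluated per timestamp. The degree, coefficient layout, and function names below are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

def gaussian_center(mu0: np.ndarray, coeffs: np.ndarray, t: float) -> np.ndarray:
    """Evaluate polynomial motion mu(t) = mu0 + sum_k coeffs[k] * t**(k+1).

    mu0:    (3,) canonical Gaussian center
    coeffs: (K, 3) per-Gaussian motion coefficients (K = polynomial degree)
    t:      normalized timestamp in [0, 1]
    """
    powers = t ** np.arange(1, coeffs.shape[0] + 1)  # t, t^2, ..., t^K
    return mu0 + powers @ coeffs

# A static Gaussian is simply one with all-zero motion coefficients.
mu0 = np.zeros(3)
coeffs = np.array([[1.0, 0.0, 0.0],   # linear term: drift along x
                   [0.0, 1.0, 0.0]])  # quadratic term: drift along y
print(gaussian_center(mu0, coeffs, 0.5))  # -> [0.5, 0.25, 0.0]
```

One decoder pass can thus predict a small, fixed number of coefficients per Gaussian and cover an arbitrary time window at render time.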
Videos
https://github.com/user-attachments/assets/d72b75de-e07a-4d81-a23f-f85e99c9bd05
https://github.com/user-attachments/assets/d975b425-1e4c-4dc9-8398-0a1a43d50e67
https://github.com/user-attachments/assets/d2a064f5-f4d7-46c4-8a3a-28fcb8679df5
https://github.com/user-attachments/assets/466a222d-f3b5-447a-84c0-a47538071c05
Environment Setup
- Create conda environment:
conda env create -f environment.yml
conda activate StreamSplat
- Build the differentiable Gaussian rasterizer:
cd submodules/diff-gaussian-rasterization-orth
pip install .
- Download pretrained depth model:
Download the Depth Anything V2 checkpoint and place it in the checkpoints/ directory:
mkdir -p checkpoints
# Download depth_anything_v2_vitl.pth from https://github.com/DepthAnything/Depth-Anything-V2
# Place it in checkpoints/depth_anything_v2_vitl.pth
Dataset Preparation
StreamSplat supports training on multiple datasets. All datasets require depth maps pre-computed with Depth Anything V2.
Supported Datasets
| Dataset | Type | Description |
|---|---|---|
| RealEstate10K | Static | Real estate videos |
| CO3Dv2 | Static | Object-centric multi-view |
| DAVIS | Dynamic | High-quality video segmentation benchmark |
| YouTube-VOS | Dynamic | Large-scale video object segmentation dataset |
Preprocessing Depth Maps
Use the provided script to preprocess depth maps for DAVIS (similar scripts can be adapted for other datasets):
python preprocess_depth_davis.py --root_path /path/to/davis
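For adapting the preprocessing to another dataset, the core loop is just running Depth Anything V2 over every frame and storing the result. The sketch below assumes the standard DAVIS directory layout (`JPEGImages/480p/<sequence>/*.jpg`), the model API from the Depth Anything V2 repository, and a uint16-PNG output format; none of these are guaranteed to match what `preprocess_depth_davis.py` actually does:

```python
from pathlib import Path

import numpy as np


def depth_to_uint16(depth: np.ndarray) -> np.ndarray:
    """Normalize a float depth map to the full uint16 range for lossless PNG storage."""
    d = depth.astype(np.float64)
    d = (d - d.min()) / max(float(d.max() - d.min()), 1e-8)
    return np.round(d * 65535.0).astype(np.uint16)


def preprocess(root: Path) -> None:
    # Heavy imports kept local so depth_to_uint16 stays usable without torch installed.
    import cv2
    import torch
    from depth_anything_v2.dpt import DepthAnythingV2

    model = DepthAnythingV2(encoder="vitl", features=256,
                            out_channels=[256, 512, 1024, 1024])
    model.load_state_dict(torch.load("checkpoints/depth_anything_v2_vitl.pth",
                                     map_location="cpu"))
    model = model.eval().cuda()

    for frame in sorted((root / "JPEGImages" / "480p").rglob("*.jpg")):
        depth = model.infer_image(cv2.imread(str(frame)))  # (H, W) float32
        out = root / "Depths" / frame.parent.name / f"{frame.stem}.png"
        out.parent.mkdir(parents=True, exist_ok=True)
        cv2.imwrite(str(out), depth_to_uint16(depth))

# preprocess(Path("/path/to/davis"))
```

Storing normalized uint16 PNGs keeps files small while preserving relative depth; if the training code expects raw float depths, save `.npy` arrays instead.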
Configure Dataset Paths
Edit configs/options.py and configs/options_decoder.py to set dataset paths:
root_path_re10k: str = "/path/to/re10k"
root_path_co3d: str = "/path/to/co3d"
root_path_davis: str = "/path/to/davis"
root_path_vos: str = "/path/to/youtube-vos"
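The field syntax above suggests these options are plain dataclass attributes. As an alternative to editing the file in place, paths could be overridden programmatically; the `Options` class below is a hypothetical stand-in for the real config classes, which live in configs/options.py and configs/options_decoder.py:

```python
from dataclasses import dataclass, replace


# Hypothetical mirror of the fields shown above.
@dataclass
class Options:
    root_path_re10k: str = "/path/to/re10k"
    root_path_co3d: str = "/path/to/co3d"
    root_path_davis: str = "/path/to/davis"
    root_path_vos: str = "/path/to/youtube-vos"


# replace() returns a copy with only the named fields changed.
opt = replace(Options(), root_path_davis="/data/DAVIS")
print(opt.root_path_davis)  # -> /data/DAVIS
```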
Training
Configure Accelerate
Create an accelerate config file (or use the provided acc_configs/gpu8.yaml):
accelerate config
Stage 1: Train Static Encoder
Train the static encoder on combined datasets:
accelerate launch --config_file acc_configs/gpu8.yaml train.py combined \
--workspace /path/to/workspace/encoder_exp
Stage 2: Train Dynamic Decoder
After Stage 1 completes, train the dynamic decoder with the frozen encoder:
accelerate launch --config_file acc_configs/gpu8.yaml train_decoder.py combined \
--workspace /path/to/workspace/decoder_exp \
--encoder_path /path/to/workspace/encoder_exp/model.safetensors
Monitoring Training
Training progress is logged to Weights & Biases. Set up wandb before training:
wandb login
Checkpoints are saved every 10 epochs and every 30 minutes to checkpoint_latest/.
Inference
Download our pretrained checkpoint from Google Drive and place it in the checkpoints/ directory:
python splat_inference.py \
--resume checkpoints/streamsplat.safetensors \
--input_frames_path=/path/to/rgb_frames \
--input_depths_path=/path/to/depth_maps
Citation
If you find this work useful, please cite:
@article{wu2025streamsplat,
  title={StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams},
  author={Zike Wu and Qi Yan and Xuanyu Yi and Lele Wang and Renjie Liao},
  journal={arXiv preprint arXiv:2506.08862},
  year={2025}
}
Acknowledgments
This project builds upon several excellent works:
- 3D Gaussian Splatting for the differentiable rasterization
- diff-gaussian-rasterization for the depth & alpha rendering
- DINOv2 for vision features
- Depth Anything V2 for monocular depth estimation
- Gamba and MVGamba for the codebase and training framework
- Nutworld for orthographic rasterization
- edm for data augmentation