Feed-forward framework for online dynamic 3D reconstruction from uncalibrated video streams

[ICLR 2026] StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams

Overview

StreamSplat is a fully feed-forward framework that instantly transforms uncalibrated video streams of arbitrary length into dynamic 3D Gaussian Splatting representations in an online manner.

  • Feed-forward inference: No per-scene optimization required
  • Camera-free: Works directly with uncalibrated monocular videos
  • Dynamic scene modeling: Handles both static and dynamic scene elements through polynomial motion modeling
  • Probabilistic Gaussian prediction: Uses truncated Gaussian models for robust Gaussian position modeling
  • Two-stage training: Stage 1 trains the static encoder, Stage 2 trains the dynamic decoder
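As a rough illustration of the polynomial motion modeling mentioned above, the sketch below evaluates a Gaussian's center as a polynomial in time. The function name, the coefficient layout, and the time normalization are assumptions made for this example; the source does not state the polynomial degree or parameterization StreamSplat actually uses.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def polynomial_position(
    base: Sequence[float],
    coeffs: Sequence[Sequence[float]],
    t: float,
) -> Vec3:
    """Evaluate p(t) = base + sum_k coeffs[k-1] * t**k for one Gaussian.

    base:   (x, y, z) center at t = 0
    coeffs: one (x, y, z) coefficient per degree 1..K (hypothetical layout)
    t:      time, assumed here to be normalized (e.g. to [0, 1])
    """
    x, y, z = base
    power = 1.0
    for cx, cy, cz in coeffs:
        power *= t  # t**1, t**2, ... in turn
        x += cx * power
        y += cy * power
        z += cz * power
    return (x, y, z)


# A static Gaussian has all-zero coefficients and never moves;
# a dynamic one follows the curve defined by its coefficients.
static = polynomial_position((1.0, 1.0, 1.0), [(0.0, 0.0, 0.0)], t=0.5)
moving = polynomial_position((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)], t=0.5)
```

Under this reading, static and dynamic elements share one representation: a static element is simply the special case where every motion coefficient is zero.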

Videos

https://github.com/user-attachments/assets/d72b75de-e07a-4d81-a23f-f85e99c9bd05

https://github.com/user-attachments/assets/d975b425-1e4c-4dc9-8398-0a1a43d50e67

https://github.com/user-attachments/assets/d2a064f5-f4d7-46c4-8a3a-28fcb8679df5

https://github.com/user-attachments/assets/466a222d-f3b5-447a-84c0-a47538071c05

Environment Setup

  1. Create the conda environment:
conda env create -f environment.yml
conda activate StreamSplat
  2. Build the differentiable Gaussian rasterizer:
cd submodules/diff-gaussian-rasterization-orth
pip install .
  3. Download the pretrained depth model:

Download the Depth Anything V2 checkpoint and place it in the checkpoints/ directory:

mkdir -p checkpoints
# Download depth_anything_v2_vitl.pth from https://github.com/DepthAnything/Depth-Anything-V2
# Place it in checkpoints/depth_anything_v2_vitl.pth

Dataset Preparation

StreamSplat supports training on multiple datasets. Every dataset requires depth maps pre-computed with Depth Anything V2.

Supported Datasets

| Dataset       | Type    | Description               |
|---------------|---------|---------------------------|
| RealEstate10K | Static  | Real estate videos        |
| CO3Dv2        | Static  | Object-centric multi-view |
| DAVIS         | Dynamic | High-quality videos       |
| YouTube-VOS   | Dynamic | Large-scale videos        |

Preprocessing Depth Maps

Use the provided script to preprocess depth maps for DAVIS (similar scripts can be adapted for other datasets):

python preprocess_depth_davis.py --root_path /path/to/davis

Configure Dataset Paths

Edit configs/options.py and configs/options_decoder.py to set dataset paths:

root_path_re10k: str = "/path/to/re10k"
root_path_co3d: str = "/path/to/co3d"
root_path_davis: str = "/path/to/davis"
root_path_vos: str = "/path/to/youtube-vos"
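Before launching a long training run, it can help to confirm that every configured root actually exists. The sketch below mirrors the four fields above in a standalone dataclass; the class name `DatasetPaths` and the `missing()` helper are hypothetical and are not part of configs/options.py.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import List


@dataclass
class DatasetPaths:
    # Field names copied from the README; the class itself is illustrative.
    root_path_re10k: str = "/path/to/re10k"
    root_path_co3d: str = "/path/to/co3d"
    root_path_davis: str = "/path/to/davis"
    root_path_vos: str = "/path/to/youtube-vos"

    def missing(self) -> List[str]:
        """Return the names of fields whose path is not an existing directory."""
        return [
            f.name
            for f in fields(self)
            if not Path(getattr(self, f.name)).is_dir()
        ]


if __name__ == "__main__":
    unresolved = DatasetPaths().missing()
    if unresolved:
        print("Dataset roots not found:", ", ".join(unresolved))
```

Running this with the placeholder defaults should list all four fields, which is a quick reminder that the paths still need editing.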

Training

Configure Accelerate

Create an accelerate config file (or use the provided acc_configs/gpu8.yaml):

accelerate config

Stage 1: Train Static Encoder

Train the static encoder on combined datasets:

accelerate launch --config_file acc_configs/gpu8.yaml train.py combined \
    --workspace /path/to/workspace/encoder_exp

Stage 2: Train Dynamic Decoder

After Stage 1 completes, train the dynamic decoder with the frozen encoder:

accelerate launch --config_file acc_configs/gpu8.yaml train_decoder.py combined \
    --workspace /path/to/workspace/decoder_exp \
    --encoder_path /path/to/workspace/encoder_exp/model.safetensors
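The two stages are coupled only through the encoder checkpoint path, so the commands can be derived from a single workspace root. The helper below is a minimal sketch that builds both argument lists using only the flags shown above; the function name `stage_commands` and the `encoder_exp`/`decoder_exp` subdirectory layout follow the example paths in this README and are otherwise assumptions.

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def stage_commands(
    workspace_root: str,
    config_file: str = "acc_configs/gpu8.yaml",
) -> Tuple[List[str], List[str]]:
    """Build the Stage 1 and Stage 2 launch commands as argument lists."""
    root = PurePosixPath(workspace_root)
    encoder_ws = root / "encoder_exp"
    decoder_ws = root / "decoder_exp"
    launch = ["accelerate", "launch", "--config_file", config_file]

    stage1 = launch + ["train.py", "combined", "--workspace", str(encoder_ws)]
    stage2 = launch + [
        "train_decoder.py", "combined",
        "--workspace", str(decoder_ws),
        # Stage 2 loads the encoder weights written by Stage 1.
        "--encoder_path", str(encoder_ws / "model.safetensors"),
    ]
    return stage1, stage2


if __name__ == "__main__":
    for cmd in stage_commands("/path/to/workspace"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that Stage 2 starts only if Stage 1 exits successfully.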

Monitoring Training

Training progress is logged to Weights & Biases. Set up wandb before training:

wandb login

Checkpoints are saved every 10 epochs and every 30 minutes to checkpoint_latest/.
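One plausible reading of this schedule is that the two triggers are independent: a save happens when an epoch multiple of 10 completes, or when 30 minutes have passed since the last save. The sketch below encodes that reading; the function name and arguments are hypothetical, and the actual training scripts may implement the schedule differently.

```python
def should_save(
    completed_epochs: int,
    seconds_since_last_save: float,
    every_epochs: int = 10,
    every_seconds: float = 30 * 60,
) -> bool:
    """Return True if either the epoch trigger or the time trigger fires."""
    epoch_due = completed_epochs > 0 and completed_epochs % every_epochs == 0
    time_due = seconds_since_last_save >= every_seconds
    return epoch_due or time_due


if __name__ == "__main__":
    print(should_save(completed_epochs=10, seconds_since_last_save=0))      # epoch trigger
    print(should_save(completed_epochs=3, seconds_since_last_save=1800))    # time trigger
    print(should_save(completed_epochs=3, seconds_since_last_save=100))     # neither
```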

Inference

Download our pretrained checkpoint from Google Drive and place it in the checkpoints/ directory, then run inference on a folder of RGB frames and their matching depth maps:

python splat_inference.py \
    --resume checkpoints/streamsplat.safetensors \
    --input_frames_path=/path/to/rgb_frames \
    --input_depths_path=/path/to/depth_maps 
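Since inference takes separate frame and depth directories, a quick pre-flight check can catch frames that have no depth map. The sketch below assumes each depth map shares its file stem with the corresponding frame (for example 0001.jpg and 0001.png); that naming convention, the extension list, and the function name are assumptions, so adjust them to match the output of your preprocessing.

```python
from pathlib import Path
from typing import List

# Assumed frame extensions; not specified by the README.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def unmatched_frames(frames_dir: str, depths_dir: str) -> List[str]:
    """Return frame file names that have no depth file with the same stem."""
    depth_stems = {p.stem for p in Path(depths_dir).iterdir() if p.is_file()}
    frames = sorted(
        p for p in Path(frames_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )
    return [p.name for p in frames if p.stem not in depth_stems]


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        missing = unmatched_frames(sys.argv[1], sys.argv[2])
        print("Frames without depth maps:", missing or "none")
```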

Citation

If you find this work useful, please cite:

@article{wu2025streamsplat,
    title={StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams}, 
    author={Zike Wu and Qi Yan and Xuanyu Yi and Lele Wang and Renjie Liao},
    journal={arXiv preprint arXiv:2506.08862},
    year={2025},
}

Acknowledgments

This project builds upon several excellent works:
