Feed-forward framework for online dynamic 3D reconstruction from uncalibrated video streams

[ICLR 2026] StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams

Overview

StreamSplat is a fully feed-forward framework that instantly transforms uncalibrated video streams of arbitrary length into dynamic 3D Gaussian Splatting representations in an online manner.

  • Feed-forward inference: No per-scene optimization required
  • Camera-free: Works directly with uncalibrated monocular videos
  • Dynamic scene modeling: Handles both static and dynamic scene elements through polynomial motion modeling
  • Probabilistic Gaussian prediction: Models 3D Gaussian positions with truncated Gaussian distributions for robustness
  • Two-stage training: Stage 1 trains the static encoder, Stage 2 trains the dynamic decoder
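
To make the polynomial motion modeling above more concrete, here is a minimal sketch of how a Gaussian's center could be displaced over time by a per-Gaussian polynomial. The function name `position_at`, the coefficient layout, and the polynomial degree are assumptions made for illustration; they are not taken from StreamSplat's code and may differ from its actual parameterization.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def position_at(base: Vec3, coeffs: Sequence[Vec3], t: float) -> Vec3:
    """Evaluate base + sum_k coeffs[k-1] * t**k for k = 1..len(coeffs).

    Hypothetical layout: coeffs[0] is the linear (velocity) term,
    coeffs[1] the quadratic term, and so on. An empty `coeffs`
    describes a static Gaussian.
    """
    x, y, z = base
    power = 1.0
    for cx, cy, cz in coeffs:
        power *= t  # power now holds t**k for the current term
        x += cx * power
        y += cy * power
        z += cz * power
    return (x, y, z)


# A static element keeps its base position at every time step.
static = position_at((0.0, 1.0, 2.0), [], t=0.5)
# A dynamic element with a linear term in x and a quadratic term in y.
moving = position_at((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)], t=0.5)
print(static)  # (0.0, 1.0, 2.0)
print(moving)  # (0.5, 0.5, 0.0)
```

In this sketch, static and dynamic elements share one representation: a static Gaussian is simply one whose motion coefficients are empty (or all zero).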

Videos

https://github.com/user-attachments/assets/d72b75de-e07a-4d81-a23f-f85e99c9bd05

https://github.com/user-attachments/assets/d975b425-1e4c-4dc9-8398-0a1a43d50e67

https://github.com/user-attachments/assets/d2a064f5-f4d7-46c4-8a3a-28fcb8679df5

https://github.com/user-attachments/assets/466a222d-f3b5-447a-84c0-a47538071c05

Environment Setup

  1. Create the conda environment:
conda env create -f environment.yml
conda activate StreamSplat
  2. Build the differentiable Gaussian rasterizer:
cd submodules/diff-gaussian-rasterization-orth
pip install .
cd ../..  # return to the repository root
  3. Download the pretrained depth model:

Download the Depth Anything V2 checkpoint and place it in the checkpoints/ directory:

mkdir -p checkpoints
# Download depth_anything_v2_vitl.pth from https://github.com/DepthAnything/Depth-Anything-V2
# Place it in checkpoints/depth_anything_v2_vitl.pth
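
Before moving on, it can help to confirm that the checkpoint landed where the later steps expect it. The following is a small sketch, run from the repository root, that only checks for the file named in the instructions above; the helper `missing_files` is a hypothetical name, not part of StreamSplat.

```python
from pathlib import Path

# Path taken from the setup instructions above.
DEPTH_CKPT = Path("checkpoints/depth_anything_v2_vitl.pth")


def missing_files(paths):
    """Return the entries of `paths` that are not existing regular files."""
    return [p for p in paths if not Path(p).is_file()]


missing = missing_files([DEPTH_CKPT])
if missing:
    print("Missing:", ", ".join(str(p) for p in missing))
else:
    print("All checkpoints found.")
```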

Dataset Preparation

StreamSplat supports training on multiple datasets. Every dataset requires depth maps pre-computed with Depth Anything V2.

Supported Datasets

Dataset        Type     Description
RealEstate10K  Static   Real estate videos
CO3Dv2         Static   Object-centric multi-view
DAVIS          Dynamic  High-quality videos
YouTube-VOS    Dynamic  Large-scale videos

Preprocessing Depth Maps

Use the provided script to pre-compute depth maps for DAVIS (it can be adapted for the other datasets):

python preprocess_depth_davis.py --root_path /path/to/davis

Configure Dataset Paths

Edit configs/options.py and configs/options_decoder.py to set dataset paths:

root_path_re10k: str = "/path/to/re10k"
root_path_co3d: str = "/path/to/co3d"
root_path_davis: str = "/path/to/davis"
root_path_vos: str = "/path/to/youtube-vos"
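
A common mistake is to start training with one of these paths still set to its placeholder. The sketch below mirrors the four fields in a stand-alone dataclass and reports which ones do not point at an existing directory. The class name `DatasetPaths` and the helper `unset_or_missing` are hypothetical; the real options classes in configs/options.py and configs/options_decoder.py contain more fields and may be structured differently.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import List


@dataclass
class DatasetPaths:
    """Hypothetical stand-in for the dataset path fields shown above."""

    root_path_re10k: str = "/path/to/re10k"
    root_path_co3d: str = "/path/to/co3d"
    root_path_davis: str = "/path/to/davis"
    root_path_vos: str = "/path/to/youtube-vos"


def unset_or_missing(opt: DatasetPaths) -> List[str]:
    """Return the names of fields whose value is not an existing directory."""
    return [f.name for f in fields(opt) if not Path(getattr(opt, f.name)).is_dir()]


# With the placeholder defaults, every field is reported.
print(unset_or_missing(DatasetPaths()))
```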

Training

Configure Accelerate

Create an accelerate config file (or use the provided acc_configs/gpu8.yaml):

accelerate config

Stage 1: Train Static Encoder

Train the static encoder on combined datasets:

accelerate launch --config_file acc_configs/gpu8.yaml train.py combined \
    --workspace /path/to/workspace/encoder_exp

Stage 2: Train Dynamic Decoder

After Stage 1 completes, train the dynamic decoder with the frozen encoder:

accelerate launch --config_file acc_configs/gpu8.yaml train_decoder.py combined \
    --workspace /path/to/workspace/decoder_exp \
    --encoder_path /path/to/workspace/encoder_exp/model.safetensors
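
The hand-off between the two stages is the encoder checkpoint: Stage 2 reads `model.safetensors` from the Stage 1 workspace. The sketch below builds both command lines from the workspace paths so that the two stay consistent. It uses only the scripts and flags shown above; the helper names `stage1_cmd` and `stage2_cmd` are hypothetical.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

ACC_CONFIG = "acc_configs/gpu8.yaml"  # provided config referenced above


def stage1_cmd(workspace: str) -> List[str]:
    """Command for Stage 1 (static encoder)."""
    return ["accelerate", "launch", "--config_file", ACC_CONFIG,
            "train.py", "combined", "--workspace", workspace]


def stage2_cmd(workspace: str, encoder_workspace: str) -> List[str]:
    """Command for Stage 2 (dynamic decoder), reusing the Stage 1 output."""
    encoder_path = str(PurePosixPath(encoder_workspace) / "model.safetensors")
    return ["accelerate", "launch", "--config_file", ACC_CONFIG,
            "train_decoder.py", "combined", "--workspace", workspace,
            "--encoder_path", encoder_path]


encoder_ws = "/path/to/workspace/encoder_exp"
decoder_ws = "/path/to/workspace/decoder_exp"
print(shlex.join(stage1_cmd(encoder_ws)))
print(shlex.join(stage2_cmd(decoder_ws, encoder_ws)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to run the stages in sequence, so that Stage 2 only starts if Stage 1 exits successfully.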

Monitoring Training

Training progress is logged to Weights & Biases. Set up wandb before training:

wandb login

Checkpoints are saved every 10 epochs and every 30 minutes to checkpoint_latest/.
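
As an illustration of the two save triggers described above, here is a minimal sketch of a schedule that fires every 10 epochs and every 30 minutes. It is an assumption about how such a schedule could be written, not StreamSplat's training loop; the class `CheckpointSchedule` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckpointSchedule:
    every_epochs: int = 10            # epoch-based trigger
    every_seconds: float = 30 * 60    # time-based trigger (30 minutes)
    last_timed_save: float = 0.0      # timestamp of the last timed save

    def due_by_epoch(self, epoch: int) -> bool:
        """True after every `every_epochs` completed epochs (1-based count)."""
        return epoch > 0 and epoch % self.every_epochs == 0

    def due_by_time(self, now: float) -> bool:
        """True once `every_seconds` have elapsed since the last timed save."""
        if now - self.last_timed_save >= self.every_seconds:
            self.last_timed_save = now
            return True
        return False


schedule = CheckpointSchedule()
print([e for e in range(1, 31) if schedule.due_by_epoch(e)])  # [10, 20, 30]
print(schedule.due_by_time(now=600.0))   # False
print(schedule.due_by_time(now=1800.0))  # True
```

In a real loop, `now` would come from a monotonic clock such as `time.monotonic()`, with `last_timed_save` initialized to the training start time.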

Inference

Download our pretrained checkpoint from Google Drive, place it in the checkpoints/ directory, and run inference:

python splat_inference.py \
    --resume checkpoints/streamsplat.safetensors \
    --input_frames_path=/path/to/rgb_frames \
    --input_depths_path=/path/to/depth_maps 
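
Since inference takes a directory of RGB frames and a directory of depth maps, a quick pairing check before running can catch missing files early. The sketch below assumes that each depth map shares its file stem with its RGB frame (for example `00000.jpg` and `00000.png`); that naming convention and the helper `unmatched_stems` are assumptions, so adjust the check to the layout your preprocessing produces.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def unmatched_stems(frames_dir: str, depths_dir: str) -> Tuple[List[str], List[str]]:
    """Compare file stems in the two directories.

    Returns (frames without a depth map, depth maps without a frame).
    Assumes a depth map and its frame share the same file stem.
    """
    frames = {p.stem for p in Path(frames_dir).iterdir() if p.is_file()}
    depths = {p.stem for p in Path(depths_dir).iterdir() if p.is_file()}
    return sorted(frames - depths), sorted(depths - frames)


# Self-contained demo on a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    frames, depths = Path(root, "rgb"), Path(root, "depth")
    frames.mkdir()
    depths.mkdir()
    for name in ("00000.jpg", "00001.jpg"):
        (frames / name).touch()
    (depths / "00000.png").touch()
    print(unmatched_stems(str(frames), str(depths)))  # (['00001'], [])
```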

Citation

If you find this work useful, please cite:

@article{wu2025streamsplat,
    title={StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams}, 
    author={Zike Wu and Qi Yan and Xuanyu Yi and Lele Wang and Renjie Liao},
    journal={arXiv preprint arXiv:2506.08862},
    year={2025},
}

Acknowledgments

This project builds upon several excellent works:
