
A Transformers-style Python library for monocular depth estimation — inference, evaluation, and fine-tuning


depth_estimation


A unified Python library for monocular depth estimation

Inference · Video & Streaming · Visualization · Fine-Tuning · Evaluation · Dataset Loading


depth_estimation is a model-definition framework for monocular depth estimation. It provides a single, consistent API across 12 model families and 28 variants — so you can swap models, compare them, and fine-tune them without rewriting your pipeline.

It covers the full workflow end-to-end: run inference with one line, stream depth from video, visualize results, evaluate on standard benchmarks, and fine-tune on custom depth data — all with the same library.

Installation

pip install depth-estimation

See docs/dependencies.md for optional extras (CUDA, MPS, etc.).


Quickstart

The pipeline API is the fastest way to get a depth map from any image:

from depth_estimation import pipeline

pipe = pipeline("depth-estimation", model="depth-anything-v2-vitb")
result = pipe("image.jpg")

depth_map = result.depth            # np.ndarray, float32, (H, W)
colored   = result.colored_depth    # np.ndarray, uint8,   (H, W, 3)

For full control over each step — preprocessing, forward pass, postprocessing — use Auto Classes:

from depth_estimation import AutoDepthModel, AutoProcessor
import torch

model     = AutoDepthModel.from_pretrained("zoedepth")
processor = AutoProcessor.from_pretrained("zoedepth")

inputs = processor("image.jpg")
with torch.no_grad():
    depth = model(inputs["pixel_values"])
result = processor.postprocess(depth, inputs["original_sizes"])

Or from the command line:

depth-estimate predict image.jpg --model depth-anything-v2-vitb

Why use depth_estimation?

1. One API, every model. Switch from Depth Anything to DepthPro to MoGe by changing a single string (see the first sketch after this list). Preprocessing, postprocessing, and output format are identical across all models.

2. The full depth workflow in one place. Most libraries stop at inference. This one covers training, evaluation on standard benchmarks, and dataset loading — so you don't have to stitch together separate tools.

3. Modular, single-file model design. Each model lives in one self-contained file. No hidden abstractions. If you need to understand or modify a model, there's exactly one place to look. New models self-register — AutoDepthModel and pipeline() resolve them automatically.

4. Designed for research. Trainable models with backbone freeze schedules, proper batch-level metric accumulation (no mean-of-means; see the second sketch after this list), and a compare() function that prints a formatted comparison table across models.
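
To make point 1 concrete, here is the model swap in action, using the two model identifiers that appear elsewhere in this README (any other supported identifier works the same way):

from depth_estimation import pipeline

# Same call site for every model; only the identifier string changes.
for name in ["depth-anything-v2-vitb", "zoedepth"]:
    pipe = pipeline("depth-estimation", model=name)
    result = pipe("image.jpg")
    print(name, result.depth.shape, result.depth.dtype)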
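
And to make point 4 concrete, a toy illustration (made-up numbers, not library code) of why per-sample accumulation beats averaging per-batch means:

import numpy as np

# abs_rel values for two batches of different sizes
batch_a = np.array([0.10, 0.10])   # 2 samples
batch_b = np.array([0.40])         # 1 sample

mean_of_means = np.mean([batch_a.mean(), batch_b.mean()])   # 0.25: over-weights the small batch
per_sample    = np.concatenate([batch_a, batch_b]).mean()   # 0.20: true per-sample mean

Accumulating at the sample level, as the Evaluator shown below does, yields the second number. That is what "no mean-of-means" refers to.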


Supported Models

12 model families · 28 variants — see docs/models.md for the full list.

All models support inference and the CLI. Fine-tuning support via DepthTrainer varies by model; see docs/models.md for the per-model details.

| Family | Variants | Depth type |
| --- | --- | --- |
| Depth Anything v1 | vits / vitb / vitl | Relative |
| Depth Anything v2 | vits / vitb / vitl | Relative |
| Depth Anything v3 | small / base / large / giant / mono / metric | Relative + Metric |
| Depth Anything v3 Nested | nested-giant-large | Relative |
| ZoeDepth | nyu / kitti | Metric |
| MiDaS | dpt-large / dpt-hybrid / beit-large | Relative |
| Apple DepthPro | — | Metric |
| Pixel-Perfect Depth | — | Relative |
| Marigold-DC | — | Relative (depth completion) |
| MoGe | v1 vitl / v2 vitl / v2 vitb / v2 vits (+ normal variants) | Metric |
| OmniVGGT | vitl | Metric |
| VGGT | standard / commercial | Metric |

What can you do?

Inference — single image, batch, or video
# Single image
result = pipe("image.jpg")

# Batch
results = pipe(["img1.jpg", "img2.jpg"], batch_size=2)
# CLI — batch predict
depth-estimate predict "images/*.jpg" --model depth-anything-v2-vitb --output-dir results/
Video & Streaming — frame-by-frame depth from video, webcam, or image sequences
from depth_estimation import pipeline

pipe = pipeline("depth-estimation", model="depth-anything-v2-vitb")

# Stream a video file — yields DepthOutput per frame
for result in pipe.stream("video.mp4", temporal_smoothing=0.5):
    depth = result.depth                  # (H, W) float32
    colored = result.colored_depth        # (H, W, 3) uint8
    print(result.metadata["frame_index"])

# Webcam stream
for result in pipe.stream(0):            # device index
    ...

# Frame glob (sorted alphabetically)
for result in pipe.stream("frames/*.png"):
    ...

# Write output video to disk
pipe.process_video(
    "input.mp4",
    "output_depth.mp4",
    colormap="inferno",
    side_by_side=True,       # RGB | depth composite
    temporal_smoothing=0.5,
)
# CLI — video prediction
depth-estimate predict video.mp4 --model depth-anything-v2-vitb --output depth_video.mp4

See docs/video.md.

Visualization — depth maps, comparisons, overlays, 3D animations, error maps
from depth_estimation.viz import (
    show_depth, compare_depths, overlay_depth,
    create_anaglyph, animate_3d, plot_error_map,
)

# Display a depth result
show_depth(result, colormap="Spectral_r", title="Depth Anything V2")

# Side-by-side comparison of multiple models
compare_depths([result_v2, result_pro], labels=["DA V2", "DepthPro"], save="compare.png")

# Blend depth over RGB image
overlay = overlay_depth(image, result.depth, alpha=0.5, colormap="inferno")

# Red-cyan anaglyph stereo image
anaglyph = create_anaglyph(image, result.depth, baseline=0.065)

# Rotating 3D surface animation
animate_3d(image, result.depth, "rotation.gif", frames=60)

# Per-pixel error heatmap (requires ground truth)
plot_error_map(pred_depth, gt_depth, metric="abs_rel", save="errors.png")

See docs/viz.md.

Evaluation — standard benchmarks, custom predictions
from depth_estimation.evaluation import evaluate, compare, Evaluator

# Single model on NYU Depth V2
results = evaluate("depth-anything-v2-vitb", "nyu_depth_v2", split="test")

# Compare multiple models — prints table with best values marked (*)
compare(["depth-anything-v2-vits", "depth-anything-v2-vitb"], dataset="nyu_depth_v2")

# Accumulate metrics over your own dataloader
ev = Evaluator()
for pred, gt, mask in dataloader:
    ev.update(pred, gt, mask)
final = ev.compute()    # abs_rel, sq_rel, rmse, rmse_log, delta1/2/3

See docs/evaluation.md.

Fine-Tuning — any trainable model, any depth dataset
from depth_estimation import DepthAnythingV2Model, DepthTrainer, DepthTrainingArguments, load_dataset
from depth_estimation.data.transforms import get_train_transforms, get_val_transforms

model    = DepthAnythingV2Model.from_pretrained("depth-anything-v2-vits", for_training=True)
train_ds = load_dataset("nyu_depth_v2", split="train", transform=get_train_transforms(518))
val_ds   = load_dataset("nyu_depth_v2", split="test",  transform=get_val_transforms(518))

args = DepthTrainingArguments(output_dir="./checkpoints", num_epochs=25, batch_size=8,
                               freeze_backbone_epochs=5, mixed_precision=True)
DepthTrainer(model=model, args=args, train_dataset=train_ds, eval_dataset=val_ds).train()

Any torch.utils.data.Dataset returning pixel_values / depth_map / valid_mask works directly — no subclassing needed. See docs/training.md.
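
For reference, a minimal sketch of such a dataset, using random tensors in place of real data (the key names come from the sentence above; the shapes and dtypes here are assumptions):

import torch
from torch.utils.data import Dataset

class ToyDepthDataset(Dataset):
    """Toy dataset emitting the three keys DepthTrainer consumes."""

    def __init__(self, n=8, size=518):
        self.n, self.size = n, size

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        rgb   = torch.rand(3, self.size, self.size)       # stand-in for a preprocessed RGB image
        depth = torch.rand(self.size, self.size) * 10.0   # stand-in for ground-truth depth
        return {
            "pixel_values": rgb,
            "depth_map":    depth,
            "valid_mask":   depth > 0,                    # flag pixels with valid depth
        }

Swap the random tensors for your own image and depth loading, then pass the dataset straight to DepthTrainer as train_dataset or eval_dataset.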

Dataset Loading — standard benchmarks, custom folders
from depth_estimation import load_dataset

ds = load_dataset("nyu_depth_v2",  split="test")                                    # auto-downloads ~2.8 GB
ds = load_dataset("diode",         split="val", scene_type="indoors")               # auto-downloads ~2.6 GB
ds = load_dataset("kitti_eigen",   split="test", root="/data/kitti")               # local path
ds = load_dataset("folder",        image_dir="rgb/", depth_dir="depth/")           # any folder

See docs/data.md.


Adding a New Model

  1. Create src/depth_estimation/models/your_model/
  2. Add configuration_your_model.py (inherit BaseDepthConfig)
  3. Add modeling_your_model.py (inherit BaseDepthModel, single file)
  4. Add __init__.py with MODEL_REGISTRY.register(...)

AutoDepthModel, AutoProcessor, and pipeline() resolve the new model automatically. See docs/adding_a_model.md for a step-by-step guide.
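
Schematically, the three files look something like the sketch below. Only BaseDepthConfig, BaseDepthModel, and MODEL_REGISTRY.register come from the steps above; the attribute and method names shown here are assumptions, so follow docs/adding_a_model.md for the real signatures.

# configuration_your_model.py
class YourModelConfig(BaseDepthConfig):
    model_type = "your-model"                 # identifier the Auto classes resolve

# modeling_your_model.py
class YourModel(BaseDepthModel):
    config_class = YourModelConfig            # ties the model to its config

    def forward(self, pixel_values):
        ...                                   # the entire model lives in this one file

# __init__.py
MODEL_REGISTRY.register(...)                  # arguments per docs/adding_a_model.md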


Acknowledgments

This library builds upon the work of 12 research teams — see docs/models.md#citations for the full list.

License

MIT
