depth_estimation
A unified, Transformers-style Python library for monocular depth estimation
Inference · Video & Streaming · Visualization · Fine-Tuning · Evaluation · Dataset Loading
depth_estimation is a model-definition framework for monocular depth estimation. It provides a single, consistent API across 12 model families and 28 variants — so you can swap models, compare them, and fine-tune them without rewriting your pipeline.
It covers the full workflow end-to-end: run inference with one line, stream depth from video, visualize results, evaluate on standard benchmarks, and fine-tune on custom depth data — all with the same library.
Installation
pip install depth-estimation
See docs/dependencies.md for optional extras (CUDA, MPS, etc.).
Quickstart
The pipeline API is the fastest way to get a depth map from any image:
from depth_estimation import pipeline
pipe = pipeline("depth-estimation", model="depth-anything-v2-vitb")
result = pipe("image.jpg")
depth_map = result.depth # np.ndarray, float32, (H, W)
colored = result.colored_depth # np.ndarray, uint8, (H, W, 3)
For full control over each step — preprocessing, forward pass, postprocessing — use Auto Classes:
from depth_estimation import AutoDepthModel, AutoProcessor
import torch
model = AutoDepthModel.from_pretrained("zoedepth")
processor = AutoProcessor.from_pretrained("zoedepth")
inputs = processor("image.jpg")
with torch.no_grad():
    depth = model(inputs["pixel_values"])
result = processor.postprocess(depth, inputs["original_sizes"])
Or from the command line:
depth-estimate predict image.jpg --model depth-anything-v2-vitb
Why use depth_estimation?
1. One API, every model. Switch from Depth Anything to DepthPro to MoGe by changing a single string (see the sketch after this list). Preprocessing, postprocessing, and output format are identical across all models.
2. The full depth workflow in one place. Most libraries stop at inference. This one covers training, evaluation on standard benchmarks, and dataset loading — so you don't have to stitch together separate tools.
3. Modular, single-file model design. Each model lives in one self-contained file. No hidden abstractions. If you need to understand or modify a model, there's exactly one place to look. New models self-register — AutoDepthModel and pipeline() resolve them automatically.
4. Designed for research. Trainable models with backbone freeze schedules, proper batch-level metric accumulation (no mean-of-means), and a compare() function that prints a formatted comparison table across models.
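As a quick illustration of point 1, the loop below drives three models through an identical call site; the identifier strings are the ones used elsewhere in this README:

from depth_estimation import pipeline

# Same call site for every model; only the identifier string changes.
for name in ["depth-anything-v2-vits", "depth-anything-v2-vitb", "zoedepth"]:
    pipe = pipeline("depth-estimation", model=name)
    result = pipe("image.jpg")
    print(name, result.depth.shape, result.depth.dtype)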
Supported Models
12 model families · 28 variants — see docs/models.md for the full list.
All models support inference and CLI. The Trainable column indicates fine-tuning support via DepthTrainer.
| Family | Variants | Depth type | Trainable |
|---|---|---|---|
| Depth Anything v1 | vits / vitb / vitl | Relative | ✅ |
| Depth Anything v2 | vits / vitb / vitl | Relative | ✅ |
| Depth Anything v3 | small / base / large / giant / mono / metric | Relative + Metric | ✅ |
| Depth Anything v3 Nested | nested-giant-large | Relative | ✅ |
| ZoeDepth | nyu / kitti | Metric | ❌ |
| MiDaS | dpt-large / dpt-hybrid / beit-large | Relative | ✅ |
| Apple DepthPro | — | Metric | ✅ |
| Pixel-Perfect Depth | — | Relative | ❌ |
| Marigold-DC | — | Relative (depth completion) | ❌ |
| MoGe | v1 vitl / v2 vitl / v2 vitb / v2 vits (+ normal variants) | Metric | ❌ |
| OmniVGGT | vitl | Metric | ✅ |
| VGGT | standard / commercial | Metric | ✅ |
What can you do?
Inference — single image, batch, or video
# Single image
result = pipe("image.jpg")
# Batch
results = pipe(["img1.jpg", "img2.jpg"], batch_size=2)
# CLI — batch predict
depth-estimate predict "images/*.jpg" --model depth-anything-v2-vitb --output-dir results/
Video & Streaming — frame-by-frame depth from video, webcam, or image sequences
from depth_estimation import pipeline
pipe = pipeline("depth-estimation", model="depth-anything-v2-vitb")
# Stream a video file — yields DepthOutput per frame
for result in pipe.stream("video.mp4", temporal_smoothing=0.5):
depth = result.depth # (H, W) float32
colored = result.colored_depth # (H, W, 3) uint8
print(result.metadata["frame_index"])
# Webcam stream
for result in pipe.stream(0):  # device index
    ...
# Frame glob (sorted alphabetically)
for result in pipe.stream("frames/*.png"):
...
# Write output video to disk
pipe.process_video(
    "input.mp4",
    "output_depth.mp4",
    colormap="inferno",
    side_by_side=True,        # RGB | depth composite
    temporal_smoothing=0.5,
)
# CLI — video prediction
depth-estimate predict video.mp4 --model depth-anything-v2-vitb --output depth_video.mp4
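A temporal_smoothing value between 0 and 1 suggests exponential blending across consecutive frames. The snippet below is a standalone sketch of that general idea, not the library's internal implementation:

# Sketch only: exponential moving average over consecutive depth frames.
def ema_smooth(prev_depth, new_depth, alpha=0.5):
    if prev_depth is None:
        return new_depth
    return alpha * prev_depth + (1.0 - alpha) * new_depth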
See docs/video.md.
Visualization — depth maps, comparisons, overlays, 3D animations, error maps
from depth_estimation.viz import (
    show_depth, compare_depths, overlay_depth,
    create_anaglyph, animate_3d, plot_error_map,
)
# Display a depth result
show_depth(result, colormap="Spectral_r", title="Depth Anything V2")
# Side-by-side comparison of multiple models
compare_depths([result_v2, result_pro], labels=["DA V2", "DepthPro"], save="compare.png")
# Blend depth over RGB image
overlay = overlay_depth(image, result.depth, alpha=0.5, colormap="inferno")
# Red-cyan anaglyph stereo image
anaglyph = create_anaglyph(image, result.depth, baseline=0.065)
# Rotating 3D surface animation
animate_3d(image, result.depth, "rotation.gif", frames=60)
# Per-pixel error heatmap (requires ground truth)
plot_error_map(pred_depth, gt_depth, metric="abs_rel", save="errors.png")
See docs/viz.md.
Evaluation — standard benchmarks, custom predictions
from depth_estimation.evaluation import evaluate, compare, Evaluator
# Single model on NYU Depth V2
results = evaluate("depth-anything-v2-vitb", "nyu_depth_v2", split="test")
# Compare multiple models — prints table with best values marked (*)
compare(["depth-anything-v2-vits", "depth-anything-v2-vitb"], dataset="nyu_depth_v2")
# Accumulate metrics over your own dataloader
ev = Evaluator()
for pred, gt, mask in dataloader:
    ev.update(pred, gt, mask)
final = ev.compute() # abs_rel, sq_rel, rmse, rmse_log, delta1/2/3
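Accumulating in the Evaluator and computing once at the end matters because averaging per-batch means skews the result when batch sizes differ. A small standalone illustration:

import numpy as np

# Per-image abs_rel values from two unequally sized batches
batch1 = np.array([0.10, 0.20, 0.30])  # 3 images
batch2 = np.array([0.50])              # 1 image

mean_of_means = np.mean([batch1.mean(), batch2.mean()])                     # 0.35, over-weights batch2
accumulated = (batch1.sum() + batch2.sum()) / (len(batch1) + len(batch2))  # 0.275, each image counts once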
See docs/evaluation.md.
Fine-Tuning — any trainable model, any depth dataset
from depth_estimation import DepthAnythingV2Model, DepthTrainer, DepthTrainingArguments, load_dataset
from depth_estimation.data.transforms import get_train_transforms, get_val_transforms
model = DepthAnythingV2Model.from_pretrained("depth-anything-v2-vits", for_training=True)
train_ds = load_dataset("nyu_depth_v2", split="train", transform=get_train_transforms(518))
val_ds = load_dataset("nyu_depth_v2", split="test", transform=get_val_transforms(518))
args = DepthTrainingArguments(output_dir="./checkpoints", num_epochs=25, batch_size=8,
                              freeze_backbone_epochs=5, mixed_precision=True)
DepthTrainer(model=model, args=args, train_dataset=train_ds, eval_dataset=val_ds).train()
Any torch.utils.data.Dataset returning pixel_values / depth_map / valid_mask works directly — no subclassing needed. See docs/training.md.
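For instance, a minimal compatible dataset might look like the sketch below; the .npy file layout and value conventions are assumptions for illustration, not a required format:

import numpy as np
import torch
from torch.utils.data import Dataset

class NpyDepthDataset(Dataset):
    """Hypothetical example: paired RGB and depth arrays stored as .npy files."""

    def __init__(self, image_paths, depth_paths):
        self.image_paths = list(image_paths)
        self.depth_paths = list(depth_paths)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = np.load(self.image_paths[idx]).astype(np.float32)  # (H, W, 3), values in [0, 1]
        depth = np.load(self.depth_paths[idx]).astype(np.float32)  # (H, W), meters
        return {
            "pixel_values": torch.from_numpy(image).permute(2, 0, 1),  # (3, H, W)
            "depth_map": torch.from_numpy(depth),                      # (H, W)
            "valid_mask": torch.from_numpy(depth > 0),                 # mask out missing pixels
        }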
Dataset Loading — standard benchmarks, custom folders
from depth_estimation import load_dataset
ds = load_dataset("nyu_depth_v2", split="test") # auto-downloads ~2.8 GB
ds = load_dataset("diode", split="val", scene_type="indoors") # auto-downloads ~2.6 GB
ds = load_dataset("kitti_eigen", split="test", root="/data/kitti") # local path
ds = load_dataset("folder", image_dir="rgb/", depth_dir="depth/") # any folder
See docs/data.md.
Adding a New Model
- Create src/depth_estimation/models/your_model/
- Add configuration_your_model.py (inherit BaseDepthConfig)
- Add modeling_your_model.py (inherit BaseDepthModel, single file)
- Add __init__.py with MODEL_REGISTRY.register(...)
AutoDepthModel, AutoProcessor, and pipeline() resolve the new model automatically. See docs/adding_a_model.md for a step-by-step guide.
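As a rough sketch of that last step (the import path and registration arguments here are assumptions; docs/adding_a_model.md has the real signature):

# src/depth_estimation/models/your_model/__init__.py  (illustrative sketch)
from depth_estimation.models import MODEL_REGISTRY  # assumed import path
from .configuration_your_model import YourModelConfig
from .modeling_your_model import YourModel

# Registering makes "your-model" resolvable by AutoDepthModel and pipeline().
MODEL_REGISTRY.register("your-model", model_cls=YourModel, config_cls=YourModelConfig)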
Acknowledgments
This library builds upon the work of 12 research teams — see docs/models.md#citations for the full list.
License
MIT