
Command-line tool for extracting DINO features for images and videos


🦕 DINOtool

DINOtool is a simple Python package for extracting and visualizing features from images and videos using DINOv2 models. It generates frame-level and patch-level embeddings with a single command.

✨ Features

  • 📷 Extract DINO features from:
    • Single images
    • Video files (.mp4, .avi, etc.)
    • Folders containing image sequences
  • 🌈 Automatically generates PCA visualizations of the features
  • 🧠 Visuals include a side-by-side view of the original frame and the feature map
  • Saves features for downstream tasks
  • ⚡ Command-line interface for easy, no-code operation

📦 Installation

Basic install (Linux/WSL2)

Install via pip:

pip install dinotool

You'll also need to have ffmpeg installed:

sudo apt install ffmpeg

You can check that dinotool is properly installed by testing it on an image:

dinotool test.jpg -o out.jpg

🐍 Conda Environment (Recommended)

For an isolated setup, which is especially useful for managing ffmpeg and other dependencies, install Miniforge and then run:

conda create -n dinotool python=3.12
conda activate dinotool
conda install -c conda-forge ffmpeg
pip install dinotool

Windows notes:

  • Windows is supported only for CPU usage. If you want GPU support on Windows, we recommend using WSL2 + Ubuntu.
  • The conda method above is recommended for Windows CPU setups.

🚀 Quickstart

📸 Image:

dinotool input.jpg -o output.jpg

🎞️ Video

dinotool input.mp4 -o output.mp4

📁 Folder of Images (treated as video frames)

dinotool path/to/folder/ -o output.mp4

The output is a side-by-side visualization with PCA of the patch-level features.
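The PCA step can be illustrated with a minimal NumPy sketch. This is not DINOtool's own code: the function name `pca_to_rgb`, the min-max scaling, and the synthetic input are assumptions made for illustration. It projects a (height, width, feature) grid of patch features onto its first three principal components and treats them as RGB channels.

```python
import numpy as np


def pca_to_rgb(patch_features: np.ndarray) -> np.ndarray:
    """Map (height, width, feature) patch features to an RGB image in [0, 1].

    Hypothetical helper for illustration; not part of DINOtool.
    """
    h, w, d = patch_features.shape
    flat = patch_features.reshape(h * w, d)

    # Center the data so the SVD yields principal directions.
    centered = flat - flat.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)

    # Keep the first three principal components, one per color channel.
    projected = centered @ vt[:3].T

    # Assumption: min-max scale each component independently to [0, 1].
    lo = projected.min(axis=0)
    hi = projected.max(axis=0)
    scaled = (projected - lo) / np.maximum(hi - lo, 1e-12)
    return scaled.reshape(h, w, 3)


# Synthetic stand-in for one frame: a 16x16 patch grid of 384-dim vectors
# (384 is the embedding size of the ViT-S/14 DINOv2 models).
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 16, 384))
rgb = pca_to_rgb(features)
print(rgb.shape)
```

Each pixel of the result corresponds to one image patch, so the map is much coarser than the input frame and would need upscaling before being placed next to it.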

🧪 Advanced Options

| Flag | Description |
|---|---|
| --model-name | Use a different DINO model (default: dinov2_vits14_reg) |
| --input-size W H | Resize input before inference |
| --batch-size | Batch size for processing (default: 1) |
| --only-pca | Output only the PCA map, without side-by-side |
| --save-features | Save extracted features: full, flat, or frame |
| -o, --output | Output path (required) |

Tips:

Increase --batch-size to the largest value your memory supports for faster processing.

dinotool input.mp4 -o output.mp4 --batch-size 16

For large videos, reduce the input size with --input-size:

# Processing an HD video faster:
dinotool input.mp4 -o output.mp4 --input-size 920 540 --batch-size 16
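To see why a smaller input size helps, it can be useful to count the patches the model has to process per frame. The sketch below is plain arithmetic, not DINOtool code. It uses the 14-pixel patch size of the default dinov2_vits14_reg model and assumes that dimensions not divisible by 14 are rounded down; how DINOtool actually handles such remainders is not stated here.

```python
PATCH_SIZE = 14  # patch size of the ViT-S/14 DINOv2 models


def patch_grid(width: int, height: int, patch_size: int = PATCH_SIZE):
    """Return (rows, cols, total patches) for a frame of the given size.

    Assumption: partial patches at the border are dropped (floor division).
    """
    cols = width // patch_size
    rows = height // patch_size
    return rows, cols, rows * cols


# Full HD frame versus the reduced size used in the command above.
for width, height in [(1920, 1080), (920, 540)]:
    rows, cols, total = patch_grid(width, height)
    print(f"{width}x{height}: {rows} x {cols} grid, {total} patches per frame")
```

Under these assumptions the reduced size yields roughly a quarter of the patches per frame, which lowers both memory use and compute for each batch.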

💾 Feature extraction options

Use --save-features to export DINO features for downstream tasks.

| Mode | Format | Output shape | Best for |
|---|---|---|---|
| full | .nc (image) / .zarr (video) | (frames, height, width, feature) | Keeps spatial structure of patches. |
| flat | partitioned .parquet | (frames * height * width, feature) | Reliable long video processing. Faster patch-level analysis. |
| frame | .parquet | (frames, feature) | One feature vector per frame (global content representation). |

full - Spatial patch features

  • Saves full patch feature maps from the ViT (one vector per image patch).
  • Useful for reconstructing spatial attention maps or for downstream tasks like segmentation.
  • Stored as netCDF for single images, .zarr for video sequences.
  • Saving to .zarr can be memory-intensive and may fail for large videos.
dinotool input.mp4 -o output.mp4 --save-features full

flat - Flattened patch features

  • Saves the same vectors as full, but discards the 2D spatial layout and writes the output in partitioned parquet format.
  • More reliable for longer videos.
  • Useful for faster computation of statistics, patch-level similarity, and clustering.
dinotool input.mp4 -o output.mp4 --save-features flat
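The relationship between the full and flat shapes can be sketched with NumPy on a small synthetic array. This only illustrates the shapes listed in the table; it does not read DINOtool output, and the row-major ordering of the flattened rows is an assumption, since the actual row order and column names of the parquet files are not described here.

```python
import numpy as np

# Tiny synthetic "full" array: (frames, height, width, feature).
frames, height, width, feature = 2, 3, 4, 5
full = np.arange(frames * height * width * feature, dtype=np.float32)
full = full.reshape(frames, height, width, feature)

# "flat" layout: one row per patch, spatial structure discarded.
flat = full.reshape(-1, feature)
print(full.shape, flat.shape)

# Assumption: rows are stored in row-major (frame, y, x) order, so the
# row index of a patch can be computed from its coordinates.
f, y, x = 1, 2, 3
row = (f * height + y) * width + x
assert np.array_equal(flat[row], full[f, y, x])

# With the grid size known, the spatial layout can be restored.
restored = flat.reshape(frames, height, width, feature)
assert np.array_equal(restored, full)
```

The flat form is convenient for per-patch statistics or clustering, while the full form is needed whenever patch positions matter.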

frame - Frame-level features

  • Saves one vector per frame using the [CLS] token from DINO.
  • Useful for temporal tasks, video summarization and classification.
  • For image input, saves a .txt file with a single vector.
  • For video input saves a .parquet file with one row per frame.
# For a video
dinotool input.mp4 -o output.mp4 --save-features frame

# For an image
dinotool input.jpg -o output.jpg --save-features frame
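As one example of a temporal task, frame-level vectors can be compared between consecutive frames to find abrupt content changes. The sketch below runs on synthetic vectors rather than on a DINOtool parquet file, because the file's column layout is not documented here. The helper name `consecutive_cosine` and the "lowest similarity marks a cut" rule are assumptions for illustration.

```python
import numpy as np


def consecutive_cosine(frame_features: np.ndarray) -> np.ndarray:
    """Cosine similarity between each frame vector and the next one.

    Expects an array of shape (frames, feature); returns (frames - 1,).
    Hypothetical helper for illustration; not part of DINOtool.
    """
    norms = np.linalg.norm(frame_features, axis=1, keepdims=True)
    normed = frame_features / np.maximum(norms, 1e-12)
    return np.sum(normed[:-1] * normed[1:], axis=1)


# Synthetic stand-in for (frames, feature) data: two "scenes" of three
# frames each, where frames within a scene differ only by small noise.
rng = np.random.default_rng(0)
dim = 384  # embedding size of the ViT-S/14 DINOv2 models
scene_a = rng.normal(size=dim)
scene_b = rng.normal(size=dim)
frame_vectors = np.stack(
    [scene_a + 0.05 * rng.normal(size=dim) for _ in range(3)]
    + [scene_b + 0.05 * rng.normal(size=dim) for _ in range(3)]
)

sims = consecutive_cosine(frame_vectors)

# Assumption: the largest drop in similarity marks the scene change.
cut = int(np.argmin(sims)) + 1
print("similarities:", np.round(sims, 3))
print("scene change at frame index:", cut)
```

On real features the similarity drop at a cut is usually less clean than in this synthetic case, so a threshold tuned on the data may work better than a single argmin.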

🧑‍💻 Usage reference

🦕 DINOtool: Extract and visualize DINO features from images and videos.

Usage:
  dinotool input_path -o output_path [options]

Arguments:
  input                   Path to image, video file, or folder of frames.
  -o, --output            Path for the output (required).

Options:
  --model-name MODEL      DINO model to use (default: dinov2_vits14_reg)
  --input-size W H        Resize input before processing
  --batch-size N          Batch size for inference
  --only-pca              Only visualize PCA features
  --save-features MODE    Save extracted features: full, flat, or frame
