AutoAnnotate-Vision 🎯

State-of-the-art unsupervised auto-annotation SDK for image classification with GUI

Python 3.10+ · License: MIT · Code style: black

AutoAnnotate-Vision automatically clusters and organizes unlabeled image datasets using recent vision models (CLIP, DINOv2, SigLIP2). It provides a GUI, an interactive HTML preview built with Plotly for visual cluster inspection, and a CLI tool.

✨ Features

  • 🎨 Graphical User Interface: Easy folder browsers and visual controls
  • 🖼️ HTML Image Preview: View cluster samples in your browser before labeling
  • 🤖 SOTA Vision Models: CLIP, DINOv2, DINOv2-Large, SigLIP2
  • 🔬 Multiple Clustering: K-means, Spectral, DBSCAN, HDBSCAN (optional)
  • 📁 Smart Organization: Preserves original filenames
  • ✂️ Auto Splits: Train/val/test dataset splitting
  • 💾 Export: CSV, JSON formats
  • 🔌 Python API: Full programmatic control

🚀 Installation

pip install autoannotate-vision

🎨 Quick Start - GUI

The simplest way to use AutoAnnotate-Vision is to launch the GUI:

autoannotate-images

Note: Windows users need the latest Visual C++ Redistributable installed.

Workflow:

  1. 📁 Select input folder with images
  2. 📂 Select output folder
  3. 🔢 Set number of classes
  4. 🤖 Choose model (SigLIP2 or DINOv2 recommended)
  5. ▶️ Click "Start Auto-Annotation"

The app will cluster images and open HTML previews in your browser showing sample images from each cluster for easy labeling!

💻 CLI Usage

The CLI provides extra commands and utilities beyond the GUI.

autoannotate-images-cli annotate /path/to/images /path/to/output \
    --n-clusters 10 \
    --method kmeans \
    --model siglip2 \
    --create-splits

Available models: clip, dinov2, dinov2-large, siglip2

Command Arguments

The autoannotate-images-cli annotate command accepts the following arguments:

Required Arguments:

  • INPUT_DIR - Path to the directory containing images to annotate
  • OUTPUT_DIR - Path where annotated images and metadata will be saved

Optional Arguments:

  • -n, --n-clusters INTEGER - Number of clusters to create (required for kmeans/spectral methods)
  • -m, --method [kmeans|hdbscan|spectral|dbscan] - Clustering algorithm to use (default: kmeans)
  • --model [clip|dinov2|dinov2-large|siglip2] - Vision model for embeddings (default: siglip2)
  • -b, --batch-size INTEGER - Batch size for embedding extraction (default: 32)
  • -r, --recursive - Search for images in subdirectories recursively
  • --reduce-dims / --no-reduce-dims - Apply dimensionality reduction before clustering
  • --n-samples INTEGER - Number of representative samples per cluster for preview
  • --copy / --symlink - Copy image files or create symbolic links (default: copy)
  • --create-splits - Automatically create train/val/test dataset splits
  • --export-format [csv|json] - Format for exporting labels (default: csv)

Examples:

# Basic usage with 5 clusters
autoannotate-images-cli annotate ./my_images ./output --n-clusters 5

# Use DBSCAN
autoannotate-images-cli annotate ./my_images ./output --method dbscan

# Use larger batch size with dimensionality reduction
autoannotate-images-cli annotate ./my_images ./output \
    --n-clusters 10 \
    --batch-size 64 \
    --reduce-dims
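
The --create-splits option produces train/val/test splits, but this page does not say how the images are divided. The following is a minimal sketch of one common approach (a per-class random split), not the tool's actual implementation. The function name `split_by_class`, the 70/15/15 percentages, and the fixed seed are all assumptions made for illustration.

```python
import random

def split_by_class(files_by_class, percents=(70, 15, 15), seed=0):
    """Shuffle each class independently and cut it into train/val/test lists.

    percents are hypothetical integer percentages; the real tool may differ.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, files in sorted(files_by_class.items()):
        files = sorted(files)
        rng.shuffle(files)
        n_train = len(files) * percents[0] // 100
        n_val = len(files) * percents[1] // 100
        # Whatever is left after train and val goes to test.
        splits["train"] += [(label, f) for f in files[:n_train]]
        splits["val"] += [(label, f) for f in files[n_train:n_train + n_val]]
        splits["test"] += [(label, f) for f in files[n_train + n_val:]]
    return splits

# Made-up filenames standing in for two labeled clusters.
demo = {
    "cats": [f"cat_{i:03d}.jpg" for i in range(20)],
    "dogs": [f"dog_{i:03d}.jpg" for i in range(20)],
}
splits = split_by_class(demo)
for name, items in splits.items():
    print(name, len(items))
```

Splitting each class separately keeps the class proportions roughly equal across the three subsets, which matters when clusters differ a lot in size.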

🐍 Python API

from autoannotate import AutoAnnotator

annotator = AutoAnnotator(
    input_dir="./images",
    output_dir="./output",
    model="siglip2",  # or "dinov2", "dinov2-large", "clip"
    clustering_method="kmeans",
    n_clusters=5,
    batch_size=32
)

result = annotator.run_full_pipeline(create_splits=True)
print(f"Processed {result['n_images']} images into {result['n_clusters']} classes")

Available models: clip, dinov2, dinov2-large, siglip2

Available clustering methods: kmeans, hdbscan, spectral, dbscan

📁 Output Structure

output/
├── metadata.json
├── labels.csv
├── cats/              # Your class names
│   ├── IMG_001.jpg   # Original filenames preserved!
│   └── ...
├── dogs/
└── splits/            # train/val/test. Available only through CLI --create-splits
    ├── train/
    ├── val/
    └── test/
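
As a quick sanity check after a run, the layout above can be inspected with a few lines of standard-library Python. This sketch assumes that every direct subfolder of the output directory except splits/ is a class folder, and that images use one of the listed file extensions. Both are assumptions based on the tree shown here, and `count_images_per_class` is a hypothetical helper, not part of the SDK.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}

def count_images_per_class(output_dir, skip=("splits",)):
    """Return {class_folder_name: image_count} for direct subfolders of output_dir."""
    counts = {}
    for folder in sorted(Path(output_dir).iterdir()):
        # Skip plain files (metadata.json, labels.csv) and the splits/ folder.
        if not folder.is_dir() or folder.name in skip:
            continue
        counts[folder.name] = sum(
            1
            for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts

output = Path("./output")
if output.is_dir():
    for name, n in count_images_per_class(output).items():
        print(f"{name}: {n}")
```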

🧠 Model Comparison

Model | Speed | Quality | Notes
CLIP | ⚡⚡ | ⭐⭐⭐ | General-purpose, good for diverse datasets
DINOv2 | ⚡⚡⚡ | ⭐⭐⭐⭐ | Fast, self-supervised, excellent for objects
DINOv2-Large |  | ⭐⭐⭐⭐⭐ | Best quality, slower, great for fine details
SigLIP2 | ⚡⚡ | ⭐⭐⭐⭐⭐ | Latest Google model - Recommended 🌟

Recommendation: Start with SigLIP2 for best results, or DINOv2 for faster processing.

🔧 Features

  • Fast Image Processing: All models use optimized processors (use_fast=True) for better performance
  • Normalized Embeddings: All embeddings are L2-normalized for consistent similarity measurements
  • Batch Processing: Efficient batch processing with configurable batch sizes
  • GPU Support: Automatic GPU detection and usage when available
  • Progress Tracking: Real-time progress bars for all operations
  • HTML Previews: Interactive HTML preview for visual cluster inspection before labeling
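
To illustrate why L2-normalized embeddings give consistent similarity measurements: once vectors have unit length, their dot product equals their cosine similarity, so vector magnitude no longer affects the comparison. The sketch below uses plain Python lists and made-up 2-D vectors in place of real model embeddings. The helper names are illustrative and not part of the SDK.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length; zero vectors are returned unchanged."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [30.0, 40.0]  # same direction as a, ten times longer
c = [4.0, -3.0]   # orthogonal to a

a_n, b_n, c_n = (l2_normalize(v) for v in (a, b, c))

# After normalization, a plain dot product matches cosine similarity.
print(dot(a_n, b_n), cosine_similarity(a, b))
print(dot(a_n, c_n), cosine_similarity(a, c))
```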

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make sure all actions in tests.yml pass
  4. Push and create a PR

📄 License

MIT License - see LICENSE file.

🙏 Acknowledgments

Built with PyTorch, Transformers, scikit-learn and more. Vision models: CLIP, DINOv2, SigLIP2.

Made for the RAIDO Project, from MetaMind Innovations


Sister Project: AutoAnnotate-Timeseries - For time series auto-annotation
