ImageAtlas: A toolkit for organizing, cleaning and analysing your image datasets.

ImageAtlas

Overview

ImageAtlas is a comprehensive toolkit designed to organize, clean, and analyze image datasets.

⚠️ Note: ImageAtlas is currently in active development. The current version focuses on clustering and visualization functionality, with additional features coming soon.

It is intended for dataset curation, duplicate detection, quality control, and exploratory data analysis.

📦 Installation

Basic Installation

pip install imageatlas

Full Installation

pip install "imageatlas[full]"

Note on CLIP: If you wish to use the CLIP model, you must install it manually from GitHub using:

pip install git+https://github.com/openai/CLIP.git

From Source

git clone https://github.com/ahmadjaved97/ImageAtlas.git
cd ImageAtlas
pip install -e .

🚀 Quick Start

Minimal Working Example

import os
from imageatlas import ImageClusterer

# Initialize clusterer
clusterer = ImageClusterer(
    model='dinov2',           # State-of-the-art features
    clustering_method='kmeans',
    n_clusters=10,
    device='cuda'             # or 'cpu'
)

# Run clustering on your images
results = clusterer.fit("./path/to/images")

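# Save results to JSON (create the output directory first in case it does not exist)
os.makedirs("./output", exist_ok=True)
results.to_json("./output/clustering_results.json")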

# Create visual grids for each cluster
results.create_grids(
    image_dir="./path/to/images",
    output_dir="./output/grids"
)

# Organize images into cluster folders
results.create_cluster_folders(
    image_dir="./path/to/images",
    output_dir="./output/clusters"
)

That's it! Your images are now clustered, visualized, and organized.
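The example above hard-codes device='cuda', which is only useful on a machine with a working GPU setup. A minimal sketch of choosing the device at runtime follows. It assumes the 'cuda' and 'cpu' strings follow the PyTorch convention and that PyTorch is the backend in use; `pick_device` is a hypothetical helper, not part of ImageAtlas.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # assumption: PyTorch is the backend behind the device string
    except ImportError:
        # Without PyTorch there is nothing to query, so fall back to the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The resulting string can then be passed as the device argument shown in the Quick Start, for example ImageClusterer(..., device=device).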

Available Models & Algorithms

Feature Extraction Models

Model          Variants
DINOv2         vits14, vitb14, vitl14, vitg14
ViT            b_16, b_32, l_16, l_32, h_14
ResNet         18, 34, 50, 101, 152
EfficientNet   s, m, l
CLIP           RN50, RN101, ViT-B/32, ViT-B/16, ViT-L/14
ConvNeXt       tiny, small, base, large
Swin           t, s, b, v2_t, v2_s, v2_b
MobileNetV3    small, large
VGG16          -

Clustering Algorithms

Algorithm   Parameters
K-Means     n_clusters
HDBSCAN     min_cluster_size, min_samples
GMM         n_components, covariance_type

Dimensionality Reduction

Method                   Parameters
PCA                      n_components, whiten
UMAP                     n_components, n_neighbors, min_dist
t-SNE (in development)   n_components, perplexity
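The following sketch illustrates what the PCA and t-SNE parameters above control, using scikit-learn on random vectors. It is not ImageAtlas's own code path, and the random data is only a placeholder for extracted features. UMAP is left out because it lives in the separate umap-learn package.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Placeholder for extracted features: 200 vectors of dimension 64.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))

# PCA: n_components sets the output dimensionality; whiten=True rescales
# every retained component to unit variance.
pca = PCA(n_components=10, whiten=True, random_state=0)
reduced = pca.fit_transform(features)

# t-SNE: n_components is usually 2 for plotting; perplexity roughly sets the
# neighbourhood size and must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
embedded = tsne.fit_transform(reduced)

print("After PCA:", reduced.shape)
print("After t-SNE:", embedded.shape)
```

Running PCA before t-SNE, as done here, is a common way to cut noise and computation time; whether it suits a given dataset is a judgment call.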

📝 Citation

If you use ImageAtlas in your research, please cite:

@software{imageatlas2024,
  author = {Javed, Ahmad},
  title = {ImageAtlas: A Toolkit for Organizing and Analyzing Image Datasets},
  year = {2024},
  url = {https://github.com/ahmadjaved97/ImageAtlas}
}
