ImageAtlas: A toolkit for organizing, cleaning, and analyzing your image datasets.

ImageAtlas

Overview

ImageAtlas is a comprehensive toolkit designed to organize, clean, and analyze image datasets.

⚠️ Note: ImageAtlas is currently in active development. The current version focuses on clustering and visualization functionality, with additional features coming soon.

It is intended for dataset curation, duplicate detection, quality control, and exploratory data analysis.

📦 Installation

Basic Installation

pip install imageatlas

Full Installation

pip install "imageatlas[full]"

From Source

git clone https://github.com/ahmadjaved97/ImageAtlas.git
cd ImageAtlas
pip install -e .
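
To confirm that the install succeeded, a small check like the following can help. It is only a sketch that uses the standard library's `importlib.metadata`; the helper name `installed_version` is made up for illustration and is not part of ImageAtlas.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="imageatlas"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version if ImageAtlas is installed, otherwise a short notice
print(installed_version() or "imageatlas is not installed")
```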

🚀 Quick Start

Minimal Working Example

import os
from imageatlas import ImageClusterer

# Initialize clusterer
clusterer = ImageClusterer(
    model='dinov2',           # feature extraction model
    clustering_method='kmeans',
    n_clusters=10,
    device='cuda'             # or 'cpu'
)

# Run clustering on your images
results = clusterer.fit("./path/to/images")

# Save results to JSON (make sure the output directory exists first)
os.makedirs("./output", exist_ok=True)
results.to_json("./output/clustering_results.json")

# Create visual grids for each cluster
results.create_grids(
    image_dir="./path/to/images",
    output_dir="./output/grids"
)

# Organize images into cluster folders
results.create_cluster_folders(
    image_dir="./path/to/images",
    output_dir="./output/clusters"
)

That's it! Your images are now clustered, visualized, and organized.
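
To make the last step concrete, here is a rough sketch of what sorting images into per-cluster folders involves. It is not ImageAtlas's implementation: the `copy_into_cluster_folders` helper, the `cluster_<label>` folder naming, and the filename-to-label dictionary are all assumptions made for illustration, and the demo runs on throwaway placeholder files.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_cluster_folders(assignments, image_dir, output_dir):
    """Copy each image into output_dir/cluster_<label>/ (hypothetical layout)."""
    image_dir, output_dir = Path(image_dir), Path(output_dir)
    for filename, label in assignments.items():
        target_dir = output_dir / f"cluster_{label}"
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 keeps file timestamps; the originals stay where they are
        shutil.copy2(image_dir / filename, target_dir / filename)


# Demo on throwaway files so the sketch runs without real images
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    clusters = Path(tmp) / "clusters"
    images.mkdir()
    for name in ("a.jpg", "b.jpg", "c.jpg"):
        (images / name).write_bytes(b"placeholder")

    assignments = {"a.jpg": 0, "b.jpg": 1, "c.jpg": 0}  # made-up labels
    copy_into_cluster_folders(assignments, images, clusters)

    created = sorted(p.relative_to(clusters).as_posix() for p in clusters.rglob("*.jpg"))

print(created)
```

Copying rather than moving keeps the source directory intact, which is usually the safer default when experimenting with different cluster counts.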

Available Models & Algorithms

Feature Extraction Models

| Model | Variants |
| --- | --- |
| DINOv2 | vits14, vitb14, vitl14, vitg14 |
| ViT | b_16, b_32, l_16, l_32, h_14 |
| ResNet | 18, 34, 50, 101, 152 |
| EfficientNet | s, m, l |
| CLIP | RN50, RN101, ViT-B/32, ViT-B/16, ViT-L/14 |
| ConvNeXt | tiny, small, base, large |
| Swin | t, s, b, v2_t, v2_s, v2_b |
| MobileNetV3 | small, large |
| VGG16 | - |

Clustering Algorithms

| Algorithm | Parameters |
| --- | --- |
| K-Means | n_clusters |
| HDBSCAN | min_cluster_size, min_samples |
| GMM | n_components, covariance_type |
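
For readers unfamiliar with what `n_clusters` controls, the following is a minimal pure-Python sketch of K-Means (Lloyd's algorithm) on toy 2-D points. It is not how ImageAtlas implements clustering: the `kmeans` function is hypothetical, the points stand in for image feature vectors, and the centroids are naively initialized from the first `n_clusters` points, whereas production implementations typically use a smarter scheme such as k-means++.

```python
import math


def kmeans(points, n_clusters, max_iters=50):
    """Cluster points into n_clusters groups; returns (labels, centroids)."""
    # Naive initialization: use the first n_clusters points as centroids
    centroids = [tuple(p) for p in points[:n_clusters]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(n_clusters), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Two well-separated toy blobs standing in for image feature vectors
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(points, n_clusters=2)
print(labels)
```

Unlike K-Means and GMM, HDBSCAN does not take a target number of clusters; its parameters bound how small a cluster may be instead.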

Dimensionality Reduction

| Method | Parameters |
| --- | --- |
| PCA | n_components, whiten |
| UMAP | n_components, n_neighbors, min_dist |
| t-SNE (in development) | n_components, perplexity |
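
As an illustration of what the PCA parameters mean, the sketch below reduces random feature vectors with plain NumPy. It assumes NumPy is available and is not ImageAtlas's own code; the `pca_reduce` helper is hypothetical. Here `n_components` is the number of output dimensions, and `whiten` rescales each retained component to unit variance.

```python
import numpy as np


def pca_reduce(features, n_components, whiten=False):
    """Project row-vector features onto their top principal components."""
    X = np.asarray(features, dtype=float)
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    projected = X_centered @ Vt[:n_components].T
    if whiten:
        # Divide by each component's standard deviation to get unit variance
        projected = projected / (S[:n_components] / np.sqrt(len(X) - 1))
    return projected


rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))  # stand-in for 100 image embeddings
reduced = pca_reduce(features, n_components=2, whiten=True)
print(reduced.shape)
```

Reducing dimensionality before clustering can speed up distance computations on high-dimensional embeddings, at the cost of discarding some variance.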

📝 Citation

If you use ImageAtlas in your research, please cite:

@software{imageatlas2024,
  author = {Javed, Ahmad},
  title = {ImageAtlas: A Toolkit for Organizing and Analyzing Image Datasets},
  year = {2024},
  url = {https://github.com/ahmadjaved97/ImageAtlas}
}

Acknowledgments

Sample Output
