
ImageAtlas: A toolkit for organizing, cleaning, and analyzing your image datasets.

Overview

ImageAtlas is a comprehensive toolkit designed to organize, clean, and analyze image datasets.

⚠️ Note: ImageAtlas is currently in active development. The current version focuses on clustering and visualization functionality, with additional features coming soon.

ImageAtlas is intended for dataset curation, duplicate detection, quality control, and exploratory data analysis.

📦 Installation

Basic Installation

pip install imageatlas

Full Installation

pip install imageatlas[full]

Note on CLIP: If you wish to use the CLIP model, you must install it manually from GitHub using:

pip install git+https://github.com/openai/CLIP.git
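
Since CLIP needs this separate install, it can help to check for it before requesting a CLIP model. The following is a minimal sketch, assuming the GitHub package installs under the top-level module name `clip`; the helper `clip_available` is hypothetical and not part of ImageAtlas.

```python
import importlib.util


def clip_available() -> bool:
    """Return True if a top-level module named 'clip' can be found."""
    # find_spec returns None when the module is not installed.
    return importlib.util.find_spec("clip") is not None


if clip_available():
    print("CLIP is installed.")
else:
    print("CLIP not found. Install it with:")
    print("  pip install git+https://github.com/openai/CLIP.git")
```
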

From Source

git clone https://github.com/ahmadjaved97/ImageAtlas.git
cd ImageAtlas
pip install -e .

🚀 Quick Start

Minimal Working Example

from imageatlas import ImageClusterer

# Initialize clusterer
clusterer = ImageClusterer(
    model='dinov2',           # State-of-the-art features
    clustering_method='kmeans',
    n_clusters=10,
    device='cuda'             # or 'cpu'
)

# Run clustering on your images
results = clusterer.fit("./path/to/images")

# Save results to JSON
results.to_json("./output/clustering_results.json")

# Create visual grids for each cluster
results.create_grids(
    image_dir="./path/to/images",
    output_dir="./output/grids"
)

# Organize images into cluster folders
results.create_cluster_folders(
    image_dir="./path/to/images",
    output_dir="./output/clusters"
)

That's it! Your images are now clustered, visualized, and organized.
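
The example above hard-codes device='cuda', which will presumably not work on a machine without a GPU. One way to pick the device at runtime is sketched below. It assumes PyTorch is the backend, which this README does not state explicitly, and `pick_device` is a hypothetical helper rather than part of ImageAtlas.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # assumption: PyTorch is the backend
    except ImportError:
        # Without PyTorch there is nothing to query, so fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")

# The result can then replace the hard-coded value in the Quick Start:
# clusterer = ImageClusterer(model='dinov2', clustering_method='kmeans',
#                            n_clusters=10, device=device)
```
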

Available Models & Algorithms

Feature Extraction Models

| Model | Variants |
| --- | --- |
| DINOv2 | vits14, vitb14, vitl14, vitg14 |
| ViT | b_16, b_32, l_16, l_32, h_14 |
| ResNet | 18, 34, 50, 101, 152 |
| EfficientNet | s, m, l |
| CLIP | RN50, RN101, ViT-B/32, ViT-B/16, ViT-L/14 |
| ConvNeXt | tiny, small, base, large |
| Swin | t, s, b, v2_t, v2_s, v2_b |
| MobileNetV3 | small, large |
| VGG16 | - |

Clustering Algorithms

| Algorithm | Parameters |
| --- | --- |
| K-Means | n_clusters |
| HDBSCAN | min_cluster_size, min_samples |
| GMM | n_components, covariance_type |

Dimensionality Reduction

| Method | Parameters |
| --- | --- |
| PCA | n_components, whiten |
| UMAP | n_components, n_neighbors, min_dist |
| t-SNE (in development) | n_components, perplexity |
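
The parameter lists above can be kept as a small lookup to catch a misspelled or misplaced keyword before a long run starts. This is only a sketch: the parameter names are copied from the tables, but the dictionary keys (other than 'kmeans', which appears in the Quick Start) and the helper `unknown_params` are assumptions, not ImageAtlas API.

```python
# Parameter names copied from the tables above. The dictionary keys are
# assumptions, except 'kmeans', which appears in the Quick Start.
CLUSTERING_PARAMS = {
    "kmeans": {"n_clusters"},
    "hdbscan": {"min_cluster_size", "min_samples"},
    "gmm": {"n_components", "covariance_type"},
}
REDUCTION_PARAMS = {
    "pca": {"n_components", "whiten"},
    "umap": {"n_components", "n_neighbors", "min_dist"},
    "tsne": {"n_components", "perplexity"},  # listed as in development
}


def unknown_params(table: dict, method: str, **kwargs) -> list:
    """Return the keyword names not listed for `method`, sorted."""
    if method not in table:
        raise ValueError(f"unknown method: {method!r}")
    return sorted(set(kwargs) - table[method])


# n_clusters belongs to K-Means, not HDBSCAN, so it is reported.
print(unknown_params(CLUSTERING_PARAMS, "hdbscan",
                     min_cluster_size=5, n_clusters=10))
```

An empty list means every keyword is listed for that method in the tables above.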

📝 Citation

If you use ImageAtlas in your research, please cite:

@software{imageatlas2024,
  author = {Javed, Ahmad},
  title = {ImageAtlas: A Toolkit for Organizing and Analyzing Image Datasets},
  year = {2024},
  url = {https://github.com/ahmadjaved97/ImageAtlas}
}
