Automated deep learning image tasks powered by timm and PyTorch Lightning
Train state-of-the-art vision models with minimal code
Documentation • Quick Start • Examples • API Reference
AutoTimm combines the power of timm (1000+ pretrained models) with PyTorch Lightning for a seamless training experience. Train image classifiers, object detectors, and segmentation models with any timm backbone. Go from idea to trained model in minutes, not hours.
Highlights
| 4 Vision Tasks | Classification, Object Detection, Semantic Segmentation, Instance Segmentation |
| 1000+ Backbones | Access ResNet, EfficientNet, ViT, ConvNeXt, Swin, and more from timm |
| Hugging Face Hub | Load timm-compatible models directly from HF Hub with hf-hub: prefix |
| HF Transformers | Direct integration with HuggingFace Transformers vision models (ViT, DeiT, BEiT, Swin) |
| AutoTrainer Compatible | All HF models work with AutoTrainer (checkpointing, tuning, multi-logger, etc.) |
| Advanced Architectures | DeepLabV3+, FCOS, Mask R-CNN style heads with feature pyramids |
| Explicit Metrics | Configure exactly what you track with MetricManager and torchmetrics |
| Multi-Logger Support | TensorBoard, MLflow, Weights & Biases, CSV — use them all at once |
| Auto-Tuning | Automatic learning rate and batch size finding before training |
| Flexible Transforms | Choose between torchvision (PIL) or albumentations (OpenCV) |
| Production Ready | Mixed precision, multi-GPU, gradient accumulation out of the box |
Installation
pip install autotimm
More installation options
# With specific extras
pip install autotimm[albumentations] # OpenCV-based transforms
pip install autotimm[detection] # Object detection (includes pycocotools for mAP metrics)
pip install autotimm[segmentation] # Segmentation tasks (includes albumentations + pycocotools)
pip install autotimm[tensorboard] # TensorBoard logging
pip install autotimm[wandb] # Weights & Biases
pip install autotimm[mlflow] # MLflow tracking
# Everything
pip install autotimm[all]
# Development
git clone https://github.com/theja-vanka/AutoTimm.git
cd AutoTimm
pip install -e ".[dev,all]"
Quick Start
Image Classification
from autotimm import AutoTrainer, ImageClassifier, ImageDataModule, MetricConfig
# Data
data = ImageDataModule(
data_dir="./data",
dataset_name="CIFAR10",
image_size=224,
batch_size=64,
)
# Metrics
metrics = [
MetricConfig(
name="accuracy",
backend="torchmetrics",
metric_class="Accuracy",
params={"task": "multiclass"},
stages=["train", "val", "test"],
prog_bar=True,
),
]
# Model & Train
model = ImageClassifier(
backbone="resnet18",
num_classes=10,
metrics=metrics,
lr=1e-3,
)
trainer = AutoTrainer(max_epochs=10)
trainer.fit(model, datamodule=data)
Semantic Segmentation
from autotimm import SemanticSegmentor, SegmentationDataModule, MetricConfig
# Data
data = SegmentationDataModule(
data_dir="./cityscapes",
format="cityscapes", # or "png", "coco", "voc"
image_size=512,
batch_size=8,
)
# Metrics
metrics = [
MetricConfig(
name="iou",
backend="torchmetrics",
metric_class="JaccardIndex",
params={
"task": "multiclass",
"num_classes": 19,
"average": "macro",
"ignore_index": 255,
},
stages=["val", "test"],
prog_bar=True,
),
]
# Model & Train
model = SemanticSegmentor(
backbone="resnet50",
num_classes=19,
head_type="deeplabv3plus", # or "fcn"
loss_type="combined", # CE + Dice
dice_weight=1.0,
metrics=metrics,
)
trainer = AutoTrainer(max_epochs=100)
trainer.fit(model, datamodule=data)
Object Detection
from autotimm import ObjectDetector, DetectionDataModule, MetricConfig
# Data
data = DetectionDataModule(
data_dir="./coco",
image_size=640,
batch_size=4,
)
# Metrics
metrics = [
MetricConfig(
name="mAP",
backend="torchmetrics",
metric_class="MeanAveragePrecision",
params={"box_format": "xyxy", "iou_type": "bbox"},
stages=["val", "test"],
prog_bar=True,
),
]
# Model & Train
model = ObjectDetector(
backbone="resnet50",
num_classes=80,
metrics=metrics,
)
trainer = AutoTrainer(max_epochs=100)
trainer.fit(model, datamodule=data)
Instance Segmentation
from autotimm import InstanceSegmentor, InstanceSegmentationDataModule, MetricConfig
# Data
data = InstanceSegmentationDataModule(
data_dir="./coco",
image_size=640,
batch_size=4,
)
# Metrics
metrics = [
MetricConfig(
name="mask_mAP",
backend="torchmetrics",
metric_class="MeanAveragePrecision",
params={"box_format": "xyxy", "iou_type": "segm"},
stages=["val", "test"],
prog_bar=True,
),
]
# Model & Train
model = InstanceSegmentor(
backbone="resnet50",
num_classes=80,
mask_loss_weight=1.0,
metrics=metrics,
)
trainer = AutoTrainer(max_epochs=100)
trainer.fit(model, datamodule=data)
See the full documentation for more examples and features →
Import Styles
AutoTimm supports flexible import styles for convenience:
# Direct imports
from autotimm import SemanticSegmentor, DiceLoss, MetricConfig
# Submodule aliases (NEW!)
from autotimm.task import SemanticSegmentor, InstanceSegmentor
from autotimm.loss import DiceLoss, CombinedSegmentationLoss
from autotimm.metric import MetricConfig, MetricManager
from autotimm.head import DeepLabV3PlusHead, MaskHead
# Namespace access
import autotimm
model = autotimm.task.SemanticSegmentor(...)
loss = autotimm.loss.DiceLoss(...)
# Original imports (still supported)
from autotimm.losses import DiceLoss
from autotimm.metrics import MetricConfig
from autotimm.tasks import SemanticSegmentor
Supported Tasks & Architectures
Classification
- Models: Any timm backbone (1000+ models)
- Head: Linear classification head with dropout
- Losses: CrossEntropy with label smoothing, Mixup support
- Datasets: Torchvision datasets, ImageFolder, custom loaders
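The Mixup support mentioned above blends pairs of images and their labels during training. As an illustration only (not AutoTimm's internal implementation), a minimal Mixup sketch in plain PyTorch looks like this:

```python
import torch

def mixup(images, labels, num_classes, alpha=0.2):
    """Blend each image/label pair with a randomly permuted partner."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed_images = lam * images + (1 - lam) * images[perm]
    # Labels become soft targets: a convex combination of two one-hot vectors
    one_hot = torch.nn.functional.one_hot(labels, num_classes).float()
    mixed_labels = lam * one_hot + (1 - lam) * one_hot[perm]
    return mixed_images, mixed_labels

images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 10, (8,))
mixed_images, mixed_labels = mixup(images, labels, num_classes=10)
```

The soft targets pair naturally with CrossEntropy plus label smoothing, since both replace hard one-hot labels with smoothed distributions.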
Object Detection
- Architecture: FCOS-style anchor-free detection
- Components: FPN, Detection Head (classification + bbox regression + centerness)
- Losses: Focal Loss, GIoU Loss, Centerness Loss
- Datasets: COCO format, custom annotations
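The GIoU loss used for bbox regression extends IoU with a penalty based on the smallest enclosing box, so it gives a useful gradient even when boxes do not overlap. A standalone sketch (not AutoTimm's internal code) for xyxy boxes:

```python
import torch

def giou_loss(pred, target):
    """Generalized IoU loss for (N, 4) xyxy boxes; 0 for a perfect match."""
    # Intersection rectangle
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / union.clamp(min=1e-7)
    # Smallest box enclosing both: the GIoU penalty term
    ex1 = torch.min(pred[:, 0], target[:, 0])
    ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2])
    ey2 = torch.max(pred[:, 3], target[:, 3])
    enclose = (ex2 - ex1) * (ey2 - ey1)
    giou = iou - (enclose - union) / enclose.clamp(min=1e-7)
    return (1 - giou).mean()

boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0]])
loss = giou_loss(boxes, boxes)  # identical boxes give zero loss
```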
Semantic Segmentation
- Architectures: DeepLabV3+ (ASPP + decoder), FCN
- Losses: CrossEntropy, Dice, Focal, Combined (CE + Dice), Tversky
- Datasets: PNG masks, COCO stuff, Cityscapes, Pascal VOC
- Metrics: IoU (Jaccard Index), pixel accuracy, per-class metrics
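The Combined (CE + Dice) loss pairs pixel-wise cross-entropy with an overlap term that is robust to class imbalance. A simplified plain-PyTorch sketch of the idea (it omits details such as `ignore_index` handling, which AutoTimm's loss supports):

```python
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss; logits: (N, C, H, W), target: (N, H, W) class indices."""
    num_classes = logits.shape[1]
    probs = logits.softmax(dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)  # reduce over batch and spatial dims, keep classes
    inter = (probs * one_hot).sum(dims)
    card = probs.sum(dims) + one_hot.sum(dims)
    dice = (2 * inter + eps) / (card + eps)
    return 1 - dice.mean()

def combined_loss(logits, target, dice_weight=1.0):
    return F.cross_entropy(logits, target) + dice_weight * dice_loss(logits, target)

logits = torch.randn(2, 19, 64, 64)  # 19 classes, as in Cityscapes
target = torch.randint(0, 19, (2, 64, 64))
loss = combined_loss(logits, target)
```

The `dice_weight` term mirrors the `dice_weight` argument shown in the `SemanticSegmentor` quick start above.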
Instance Segmentation
- Architecture: FCOS detection + Mask R-CNN style mask head
- Components: FPN, Detection Head, Mask Head with ROI Align
- Losses: Detection losses + Binary mask loss
- Datasets: COCO instance segmentation format
- Metrics: Mask mAP, bbox mAP
Examples
Ready-to-run scripts in the examples/ directory:
| Example | Description |
|---|---|
| classify_cifar10.py | Basic classification with MetricManager and auto-tuning |
| classify_custom_folder.py | Train on your own dataset |
| huggingface_hub_models.py | Using Hugging Face Hub models with AutoTimm |
| hf_hub_*.py | Comprehensive HF Hub integration examples (classification, detection, segmentation) |
| object_detection_coco.py | FCOS-style object detection on COCO dataset |
| object_detection_transformers.py | Transformer-based detection (ViT, Swin, DeiT) |
| object_detection_rtdetr.py | RT-DETR end-to-end detection (no NMS required) |
| semantic_segmentation_cityscapes.py | DeepLabV3+ segmentation on Cityscapes |
| instance_segmentation_coco.py | Mask R-CNN style instance segmentation |
| vit_finetuning.py | Two-phase Vision Transformer fine-tuning |
| multi_gpu_training.py | Distributed training with DDP |
| mlflow_tracking.py | Experiment tracking with MLflow |
Documentation
| Section | Description |
|---|---|
| Quick Start | Get up and running in 5 minutes |
| User Guide | In-depth guides for all features |
| HF Integration Overview | Compare HF Hub timm vs HF Transformers approaches |
| HF Hub Integration | Using Hugging Face Hub models |
| HF Transformers | HuggingFace Transformers vision models with Lightning |
| API Reference | Complete API documentation |
| Examples | Runnable code examples |
Explore Backbones
import autotimm
# Search 1000+ timm models
autotimm.list_backbones("*efficientnet*", pretrained_only=True)
autotimm.list_backbones("*vit*")
# Search Hugging Face Hub models
autotimm.list_hf_hub_backbones(model_name="resnet", limit=10)
autotimm.list_hf_hub_backbones(author="facebook", model_name="convnext")
# Inspect a model
backbone = autotimm.create_backbone("convnext_tiny")
print(f"Features: {backbone.num_features}, Params: {autotimm.count_parameters(backbone):,}")
# Use models from Hugging Face Hub
hf_backbone = autotimm.create_backbone("hf-hub:timm/resnet50.a1_in1k")
print(f"HF Hub model loaded: {hf_backbone.num_features} features")
Hugging Face Hub Integration
AutoTimm seamlessly integrates with Hugging Face Hub, allowing you to use thousands of community-contributed timm models:
from autotimm import ImageClassifier, list_hf_hub_backbones
# Discover models on HF Hub
models = list_hf_hub_backbones(model_name="resnet", limit=5)
print(models)
# ['hf-hub:timm/resnet50.a1_in1k', 'hf-hub:timm/resnet18.a1_in1k', ...]
# Use HF Hub model as backbone (just add 'hf-hub:' prefix)
model = ImageClassifier(
backbone="hf-hub:timm/resnet50.a1_in1k",
num_classes=10,
)
# Works with all tasks
from autotimm import SemanticSegmentor, ObjectDetector
seg_model = SemanticSegmentor(
backbone="hf-hub:timm/convnext_tiny.fb_in22k",
num_classes=19,
)
det_model = ObjectDetector(
backbone="hf-hub:timm/efficientnet_b0.ra_in1k",
num_classes=80,
)
Three Integration Approaches
AutoTimm supports multiple ways to work with HuggingFace models:
| Approach | Best For | AutoTrainer | Integration |
|---|---|---|---|
| HF Hub timm | CNNs, Production | ✅ Full | Native |
| HF Direct | Vision Transformers | ✅ Full | Manual |
| HF Auto | Prototyping | ✅ Full | Manual |
1. HF Hub timm Models (via AutoTimm) - Recommended for CNNs
   - Use timm models from HF Hub: `hf-hub:timm/resnet50.a1_in1k`
   - Native AutoTimm integration
   - Works with all tasks

2. HF Transformers Direct - Recommended for Vision Transformers
   - Use specific model classes: `ViTModel`, `DeiTModel`, `BeitModel`
   - Full control and transparency
   - Manual PyTorch Lightning integration

3. HF Transformers Auto - For quick prototyping
   - Use Auto classes: `AutoModel`, `AutoConfig`
   - Quick experimentation
   - Less explicit
Learn more about choosing the right approach →
Full AutoTrainer Support
All HuggingFace integration approaches work seamlessly with AutoTimm's AutoTrainer, including:
- ✅ Checkpoint monitoring and saving
- ✅ Early stopping callbacks
- ✅ Gradient accumulation
- ✅ Mixed precision training
- ✅ Automatic LR and batch size finding
- ✅ Multiple logger support
- ✅ ImageDataModule integration
from autotimm import AutoTrainer, ImageClassifier, ImageDataModule
import pytorch_lightning as pl
model = ImageClassifier(
backbone="hf-hub:timm/convnext_base.fb_in22k_ft_in1k",
num_classes=100,
)
trainer = AutoTrainer(
max_epochs=100,
precision="16-mixed",
callbacks=[
pl.callbacks.ModelCheckpoint(monitor="val/accuracy", mode="max"),
pl.callbacks.EarlyStopping(monitor="val/accuracy", patience=10),
],
)
trainer.fit(model, datamodule=ImageDataModule(data_dir="./data"))
Key Features:
- ✅ All three approaches fully compatible with PyTorch Lightning and AutoTrainer
- ✅ 47 automated tests with 100% pass rate
- ✅ Production-ready with checkpoint monitoring, early stopping, and mixed precision
- ✅ No special configuration needed - all features "just work"
Benefits:
- Centralized hosting: Access thousands of pretrained models
- Version control: Use specific model versions and configurations
- Model cards: View training details, datasets, and performance
- Community models: Share and use custom trained models
- Same API: Works exactly like standard timm models
Key Features
Multiple Loss Functions
Classification
- CrossEntropy with label smoothing
- Mixup augmentation
Detection
- Focal Loss (handles class imbalance)
- GIoU Loss (bbox regression)
- Centerness Loss (prediction quality)
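Focal Loss handles the extreme foreground/background imbalance in dense detection by down-weighting easy examples. A self-contained binary focal loss sketch (illustrative, not AutoTimm's internal implementation):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: scales BCE by (1 - p_t)^gamma so easy examples vanish."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)          # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(16, 80)    # 80 classes, as in COCO
targets = torch.zeros(16, 80)   # all background: the common imbalanced case
loss = focal_loss(logits, targets)
```

With `gamma=2.0`, a confidently correct prediction contributes almost nothing, so training focuses on the hard, misclassified anchors.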
Segmentation
- Dice Loss (overlap-based)
- Combined Loss (CE + Dice)
- Focal Loss (class imbalance)
- Tversky Loss (FP/FN weighting)
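The Tversky loss generalizes Dice by weighting false positives and false negatives separately, which is useful when one error type is costlier. A binary sketch of the formula (not AutoTimm's internal code):

```python
import torch

def tversky_loss(probs, target, alpha=0.5, beta=0.5, eps=1e-6):
    """Tversky loss on probability maps; alpha weights FP, beta weights FN.
    With alpha = beta = 0.5 it reduces to the Dice loss."""
    tp = (probs * target).sum()
    fp = (probs * (1 - target)).sum()
    fn = ((1 - probs) * target).sum()
    tversky = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1 - tversky

probs = torch.rand(1, 64, 64)
target = (torch.rand(1, 64, 64) > 0.5).float()
loss = tversky_loss(probs, target)
```

Raising `beta` above `alpha` penalizes missed foreground pixels more, which typically trades precision for recall.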
Flexible Data Loading
- Torchvision: PIL-based transforms (fast CPU)
- Albumentations: OpenCV-based transforms (advanced augmentations)
- Multiple Formats: COCO, Cityscapes, Pascal VOC, ImageFolder, PNG masks
- Custom Datasets: Easy integration with PyTorch DataLoaders
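Custom data can be wired in through a standard PyTorch `Dataset`/`DataLoader` pair. A minimal sketch with a hypothetical dataset class (the image loading is stubbed with a random tensor; a real dataset would decode the file at `path`):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class CSVLabelDataset(Dataset):
    """Hypothetical dataset backed by (path, label) rows, e.g. parsed from a CSV."""
    def __init__(self, rows, transform=None):
        self.rows = rows
        self.transform = transform

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        path, label = self.rows[idx]
        image = torch.randn(3, 224, 224)  # stand-in for loading the image at `path`
        if self.transform:
            image = self.transform(image)
        return image, label

rows = [(f"img_{i}.jpg", i % 10) for i in range(32)]
loader = DataLoader(CSVLabelDataset(rows), batch_size=8, shuffle=True)
images, labels = next(iter(loader))  # (8, 3, 224, 224) images, (8,) labels
```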
Advanced Training Features
- Auto-tuning: LR finder and batch size finder
- Multi-GPU: Distributed training with DDP
- Mixed Precision: Automatic mixed precision (AMP)
- Gradient Accumulation: Train larger batch sizes
- Early Stopping: Prevent overfitting
- Checkpointing: Save best models automatically
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
# Setup development environment
git clone https://github.com/theja-vanka/AutoTimm.git
cd AutoTimm
pip install -e ".[dev,all]"
# Run tests
pytest tests/ -v
Testing
All tests pass successfully (130 passed, 3 skipped).
# Run all tests
pytest tests/ -v
# Run specific test modules
pytest tests/test_classification.py
pytest tests/test_semantic_segmentation.py
pytest tests/test_segmentation_losses.py
# With coverage
pytest tests/ --cov=autotimm --cov-report=html
Recent Fixes:
- ✅ Fixed `RuntimeError` when calling `configure_optimizers()` without an attached trainer (semantic segmentation)
- ✅ Improved scheduler initialization to handle cases where the model is not yet attached to a trainer
Citation
If you use AutoTimm in your research, please cite:
@software{autotimm,
author = {Krishnatheja Vanka},
title = {AutoTimm: Automated Deep Learning for Computer Vision},
url = {https://github.com/theja-vanka/AutoTimm},
year = {2026}
}
License
Apache 2.0 — see LICENSE for details.
Built with timm and PyTorch Lightning
File details
Details for the file autotimm-0.5.4.tar.gz.
File metadata
- Download URL: autotimm-0.5.4.tar.gz
- Upload date:
- Size: 4.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `7d581c1c381a469804c57a0fae6d153a96ab2725da606b45f7eba0d2c31ce85c` |
| MD5 | `1fe866d3eb7281c0b4bd298a7df5503d` |
| BLAKE2b-256 | `6c53f3786c16df6f6866e5710b3b51e856933e1c3ed4010081ee7361da794e2c` |
File details
Details for the file autotimm-0.5.4-py3-none-any.whl.
File metadata
- Download URL: autotimm-0.5.4-py3-none-any.whl
- Upload date:
- Size: 87.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `429fe1b4e23e7a1db0bc39c0ee78e2144046215e76c81504b87ee8ec4ff510fe` |
| MD5 | `8af2729c8b48eb1df6017ff5bc380b23` |
| BLAKE2b-256 | `999f182650f5fac5c2299c109db8bc3dd8e6fdddfa760fd01fe3c74c9f1cfdc2` |