(Unofficial) PyTorch Image Models

PyTorch Image Models, etc

Introduction

For each competition, personal, or freelance project involving images + Convolutional Neural Networks, I build on top of an evolving collection of code and models. This repo contains a (somewhat) cleaned up and pared down iteration of that code. Hopefully it'll be of use to others.

The work of many others is present here. I've tried to make sure all source material is acknowledged:

Training/validation scripts evolved from early versions of the PyTorch Imagenet Examples
CUDA-specific performance enhancements have been pulled from NVIDIA's APEX Examples
Models are from a wide variety of sources
LR scheduler ideas from AllenNLP, FAIRseq, and SGDR: Stochastic Gradient Descent with Warm Restarts (https://arxiv.org/abs/1608.03983)
Random Erasing from Zhun Zhong (https://arxiv.org/abs/1708.04896)

Models

I've included a few of my favourite models, but this is not an exhaustive collection. You can't do better than Cadene's collection in that regard. Most models do have pretrained weights from their respective sources or original authors.

ResNet/ResNeXt (from torchvision with mods by myself)
- ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152, ResNeXt50 (32x4d), ResNeXt101 (32x4d and 64x4d)
- 'Bag of Tricks' / Gluon C, D, E, S variations (https://arxiv.org/abs/1812.01187)
- Instagram trained / ImageNet tuned ResNeXt101-32x8d to 32x48d from facebookresearch
DenseNet (from torchvision)
- DenseNet-121, DenseNet-169, DenseNet-201, DenseNet-161
Squeeze-and-Excitation ResNet/ResNeXt (from Cadene with some pretrained weight additions by myself)
- SENet-154, SE-ResNet-18, SE-ResNet-34, SE-ResNet-50, SE-ResNet-101, SE-ResNet-152, SE-ResNeXt-26 (32x4d), SE-ResNeXt50 (32x4d), SE-ResNeXt101 (32x4d)
Inception-ResNet-V2 and Inception-V4 (from Cadene)
Xception (from Cadene)
PNASNet & NASNet-A (from Cadene)
DPN (from me, weights hosted by Cadene)
- DPN-68, DPN-68b, DPN-92, DPN-98, DPN-131, DPN-107
Generic EfficientNet (from my standalone GenMobileNet) - A generic model that implements many of the efficient models that utilize similar DepthwiseSeparable and InvertedResidual blocks
- EfficientNet (B0-B7) (https://arxiv.org/abs/1905.11946) -- validated, compat with TF weights
- MixNet (https://arxiv.org/abs/1907.09595) -- validated, compat with TF weights
- MNASNet B1, A1 (Squeeze-Excite), and Small (https://arxiv.org/abs/1807.11626)
- MobileNet-V1 (https://arxiv.org/abs/1704.04861)
- MobileNet-V2 (https://arxiv.org/abs/1801.04381)
- MobileNet-V3 (https://arxiv.org/abs/1905.02244) -- pretrained model good, still no official impl to verify against
- ChamNet (https://arxiv.org/abs/1812.08934) -- specific arch details hard to find, currently an educated guess
- FBNet-C (https://arxiv.org/abs/1812.03443) -- TODO A/B variants
- Single-Path NAS (https://arxiv.org/abs/1904.02877) -- pixel1 variant

Use the --model argument to select a model in the train, validation, and inference scripts. Pass the all-lowercase name of the creation function for the model you want (for example, seresnet34 or mobilenetv3_100).

Features

Several (less common) features that I often use in my projects are included. Many of these additions are the reason I maintain my own set of models instead of using others' via pip:

All models have a common default configuration interface and API for
- accessing/changing the classifier - get_classifier and reset_classifier
- doing a forward pass on just the features - forward_features
- these make it easy to write consistent network wrappers that work with any of the models
All models have a consistent pretrained weight loader that adapts last linear if necessary, and from 3 to 1 channel input if desired
The train script works in several process/GPU modes:
- NVIDIA DDP w/ a single GPU per process, multiple processes with APEX present (AMP mixed-precision optional)
- PyTorch DistributedDataParallel w/ multi-gpu, single process (AMP disabled as it crashes when enabled)
- PyTorch w/ single GPU single process (AMP optional)
A dynamic global pool implementation that allows selecting from average pooling, max pooling, average + max, or concat([average, max]) at model creation. All global pooling is adaptive average by default and compatible with pretrained weights.
A 'Test Time Pool' wrapper that can wrap any of the included models and usually provides improved performance when doing inference with input images larger than the training size. The idea is adapted from the original DPN implementation that I ported (https://github.com/cypw/DPNs)
Training schedules and techniques that provide competitive results (Cosine LR, Random Erasing, Label Smoothing, etc)
Mixup (as in https://arxiv.org/abs/1710.09412) - currently implementing/testing
An inference script that dumps output to CSV is provided as an example
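
To make the global pooling choices listed above concrete, here is a minimal plain-Python sketch. It is not the library's implementation: the function name, the mode strings, and the reading of "average + max" as the mean of the two values are assumptions made for illustration.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def global_pool(features: FeatureMap, pool_type: str = "avg") -> List[float]:
    """Reduce every channel of a feature map to one value (two for concat)."""
    avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    mx = [max(max(row) for row in ch) for ch in features]
    if pool_type == "avg":
        return avg
    if pool_type == "max":
        return mx
    if pool_type == "avgmax":
        # Assumption: "average + max" is taken as the mean of the two results.
        return [0.5 * (a + m) for a, m in zip(avg, mx)]
    if pool_type == "catavgmax":
        # Concatenation keeps both results, doubling the feature count.
        return avg + mx
    raise ValueError(f"unknown pool_type: {pool_type}")


# Two channels, each 2x2.
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
for mode in ("avg", "max", "avgmax", "catavgmax"):
    print(mode, global_pool(fmap, mode))
```

Note that the concatenated variant returns twice as many features as the others, so any classifier layer placed after it would need a correspondingly wider input.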

Results

A CSV file containing an ImageNet-1K validation results summary for all included models with pretrained weights and default configurations is located here

Self-trained Weights

I've leveraged the training scripts in this repository to train a few of the models with missing weights to good levels of performance. These numbers are all for 224x224 training and validation image sizing with the usual 87.5% validation crop.

Model	Prec@1 (Err)	Prec@5 (Err)	Param #	Image Scaling	Image Size
efficientnet_b2	79.760 (20.240)	94.714 (5.286)	9.11M	bicubic	260
resnext50d_32x4d	79.674 (20.326)	94.868 (5.132)	25.1M	bicubic	224
mixnet_l	78.976 (21.024)	94.184 (5.816)	7.33M	bicubic	224
efficientnet_b1	78.692 (21.308)	94.086 (5.914)	7.79M	bicubic	240
resnext50_32x4d	78.512 (21.488)	94.042 (5.958)	25M	bicubic	224
resnet50	78.470 (21.530)	94.266 (5.734)	25.6M	bicubic	224
mixnet_m	77.256 (22.744)	93.418 (6.582)	5.01M	bicubic	224
seresnext26_32x4d	77.104 (22.896)	93.316 (6.684)	16.8M	bicubic	224
efficientnet_b0	76.912 (23.088)	93.210 (6.790)	5.29M	bicubic	224
resnet26d	76.68 (23.32)	93.166 (6.834)	16M	bicubic	224
mixnet_s	75.988 (24.012)	92.794 (7.206)	4.13M	bicubic	224
mobilenetv3_100	75.634 (24.366)	92.708 (7.292)	5.5M	bicubic	224
mnasnet_a1	75.448 (24.552)	92.604 (7.396)	3.89M	bicubic	224
resnet26	75.292 (24.708)	92.57 (7.43)	16M	bicubic	224
fbnetc_100	75.124 (24.876)	92.386 (7.614)	5.6M	bilinear	224
resnet34	75.110 (24.890)	92.284 (7.716)	22M	bilinear	224
seresnet34	74.808 (25.192)	92.124 (7.876)	22M	bilinear	224
mnasnet_b1	74.658 (25.342)	92.114 (7.886)	4.38M	bicubic	224
spnasnet_100	74.084 (25.916)	91.818 (8.182)	4.42M	bilinear	224
seresnet18	71.742 (28.258)	90.334 (9.666)	11.8M	bicubic	224
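
As a rough illustration of the 87.5% validation crop mentioned above, the sketch below assumes the common convention of resizing the shorter image side to crop size / 0.875 and then taking a center crop. The function name and the rounding choice are assumptions, not something taken from the validation script.

```python
import math


def resize_for_center_crop(crop_size: int, crop_pct: float = 0.875) -> int:
    """Shorter-side resize target so a center crop covers crop_pct of it."""
    return int(math.floor(crop_size / crop_pct))


# Crop sizes that appear in the Image Size column of the table above.
for size in (224, 240, 260):
    print(size, "->", resize_for_center_crop(size))
```

Under this assumption, a 224 crop corresponds to the familiar resize-to-256 step.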

Ported Weights

Model	Prec@1 (Err)	Prec@5 (Err)	Param # (M)	Image Scaling	Image Size	Source
tf_efficientnet_b7 *tfp	84.480 (15.520)	96.870 (3.130)	66.35	bicubic	600	Google
tf_efficientnet_b7	84.420 (15.580)	96.906 (3.094)	66.35	bicubic	600	Google
tf_efficientnet_b6 *tfp	84.140 (15.860)	96.852 (3.148)	43.04	bicubic	528	Google
tf_efficientnet_b6	84.110 (15.890)	96.886 (3.114)	43.04	bicubic	528	Google
tf_efficientnet_b5 *tfp	83.694 (16.306)	96.696 (3.304)	30.39	bicubic	456	Google
tf_efficientnet_b5	83.688 (16.312)	96.714 (3.286)	30.39	bicubic	456	Google
tf_efficientnet_b4	83.022 (16.978)	96.300 (3.700)	19.34	bicubic	380	Google
tf_efficientnet_b4 *tfp	82.948 (17.052)	96.308 (3.692)	19.34	bicubic	380	Google
tf_efficientnet_b3 *tfp	81.576 (18.424)	95.662 (4.338)	12.23	bicubic	300	Google
tf_efficientnet_b3	81.636 (18.364)	95.718 (4.282)	12.23	bicubic	300	Google
gluon_senet154	81.224 (18.776)	95.356 (4.644)	115.09	bicubic	224
gluon_resnet152_v1s	81.012 (18.988)	95.416 (4.584)	60.32	bicubic	224
gluon_seresnext101_32x4d	80.902 (19.098)	95.294 (4.706)	48.96	bicubic	224
gluon_seresnext101_64x4d	80.890 (19.110)	95.304 (4.696)	88.23	bicubic	224
gluon_resnext101_64x4d	80.602 (19.398)	94.994 (5.006)	83.46	bicubic	224
gluon_resnet152_v1d	80.470 (19.530)	95.206 (4.794)	60.21	bicubic	224
gluon_resnet101_v1d	80.424 (19.576)	95.020 (4.980)	44.57	bicubic	224
gluon_resnext101_32x4d	80.334 (19.666)	94.926 (5.074)	44.18	bicubic	224
gluon_resnet101_v1s	80.300 (19.700)	95.150 (4.850)	44.67	bicubic	224
tf_efficientnet_b2 *tfp	80.188 (19.812)	94.974 (5.026)	9.11	bicubic	260	Google
tf_efficientnet_b2	80.086 (19.914)	94.908 (5.092)	9.11	bicubic	260	Google
gluon_resnet152_v1c	79.916 (20.084)	94.842 (5.158)	60.21	bicubic	224
gluon_seresnext50_32x4d	79.912 (20.088)	94.818 (5.182)	27.56	bicubic	224
gluon_resnet152_v1b	79.692 (20.308)	94.738 (5.262)	60.19	bicubic	224
gluon_resnet101_v1c	79.544 (20.456)	94.586 (5.414)	44.57	bicubic	224
gluon_resnext50_32x4d	79.356 (20.644)	94.424 (5.576)	25.03	bicubic	224
gluon_resnet101_v1b	79.304 (20.696)	94.524 (5.476)	44.55	bicubic	224
tf_efficientnet_b1 *tfp	79.172 (20.828)	94.450 (5.550)	7.79	bicubic	240	Google
gluon_resnet50_v1d	79.074 (20.926)	94.476 (5.524)	25.58	bicubic	224
tf_mixnet_l *tfp	78.846 (21.154)	94.212 (5.788)	7.33	bilinear	224	Google
tf_efficientnet_b1	78.826 (21.174)	94.198 (5.802)	7.79	bicubic	240	Google
gluon_inception_v3	78.804 (21.196)	94.380 (5.620)	27.16	bicubic	299	MxNet Gluon
tf_mixnet_l	78.770 (21.230)	94.004 (5.996)	7.33	bicubic	224	Google
gluon_resnet50_v1s	78.712 (21.288)	94.242 (5.758)	25.68	bicubic	224
gluon_resnet50_v1c	78.010 (21.990)	93.988 (6.012)	25.58	bicubic	224
tf_inception_v3	77.856 (22.144)	93.644 (6.356)	27.16	bicubic	299	Tensorflow Slim
gluon_resnet50_v1b	77.578 (22.422)	93.718 (6.282)	25.56	bicubic	224
adv_inception_v3	77.576 (22.424)	93.724 (6.276)	27.16	bicubic	299	Tensorflow Adv models
tf_efficientnet_b0 *tfp	77.258 (22.742)	93.478 (6.522)	5.29	bicubic	224	Google
tf_mixnet_m *tfp	77.072 (22.928)	93.368 (6.632)	5.01	bilinear	224	Google
tf_mixnet_m	76.950 (23.050)	93.156 (6.844)	5.01	bicubic	224	Google
tf_efficientnet_b0	76.848 (23.152)	93.228 (6.772)	5.29	bicubic	224	Google
tf_mixnet_s *tfp	75.800 (24.200)	92.788 (7.212)	4.13	bilinear	224	Google
tf_mixnet_s	75.648 (24.352)	92.636 (7.364)	4.13	bicubic	224	Google
gluon_resnet34_v1b	74.580 (25.420)	91.988 (8.012)	21.80	bicubic	224
gluon_resnet18_v1b	70.830 (29.170)	89.756 (10.244)	11.69	bicubic	224

Models with *tfp next to them were scored with the --tf-preprocessing flag.
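
For readers unfamiliar with the Prec@1 / Prec@5 (Err) columns in the tables above, here is a small plain-Python sketch of how top-k precision and its complementary error can be computed. The function name and the toy scores are made up for illustration; this is not the code used to produce the tables.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]], targets: Sequence[int], k: int = 1) -> float:
    """Percentage of samples whose target is among the k highest-scoring classes."""
    correct = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            correct += 1
    return 100.0 * correct / len(targets)


# Four samples, three classes (toy data).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
targets = [1, 1, 2, 2]

prec1 = topk_accuracy(scores, targets, k=1)
prec2 = topk_accuracy(scores, targets, k=2)
# The value in parentheses is the error, i.e. 100 minus the precision.
print(f"Prec@1 {prec1:.3f} ({100 - prec1:.3f})")
print(f"Prec@2 {prec2:.3f} ({100 - prec2:.3f})")
```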

The tf_efficientnet, tf_mixnet models require an equivalent for 'SAME' padding as their arch results in asymmetric padding. I've added this in the model creation wrapper, but it does come with a performance penalty.
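
To show why 'SAME' padding can be asymmetric, the sketch below computes the per-side padding for one spatial dimension using TensorFlow's rule (output size is ceil(input / stride), with any odd leftover going to the end). The function name is hypothetical and this is not the wrapper used in this repo.

```python
import math
from typing import Tuple


def same_padding_1d(size: int, kernel: int, stride: int = 1, dilation: int = 1) -> Tuple[int, int]:
    """Return (before, after) padding so the output size is ceil(size / stride)."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + (kernel - 1) * dilation + 1 - size, 0)
    before = total // 2
    # When total is odd, the extra pixel is added after (bottom/right).
    return before, total - before


print(same_padding_1d(224, kernel=3, stride=1))  # symmetric case
print(same_padding_1d(224, kernel=3, stride=2))  # asymmetric case
```

A 3x3 stride-2 convolution on a 224 input needs one pixel of padding on one side only, which a single symmetric padding value cannot express. That is consistent with the need for an explicit padding step, and with the extra cost it brings.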

Usage

Environment

All development and testing has been done in Conda Python 3 environments on Linux x86-64 systems, specifically Python 3.6.x and 3.7.x. Little to no care has been taken to be Python 2.x friendly and I don't plan to support it. If you run into any challenges running on Windows, or other OS, I'm definitely open to looking into those issues so long as it's in a reproducible (read Conda) environment.

PyTorch versions 1.0 and 1.1 have been tested with this code.

I've tried to keep the dependencies minimal; the setup follows the PyTorch default install instructions for Conda:

conda create -n torch-env
conda activate torch-env
conda install -c pytorch pytorch torchvision cudatoolkit=10.0

Pip

This package can be installed via pip. Currently, the model factory (timm.create_model) is the most useful component to use via a pip install.

Install (after conda env/install):

pip install timm

Use:

>>> import timm
>>> m = timm.create_model('mobilenetv3_100', pretrained=True)
>>> m.eval()

Scripts

Train, validation, inference, and checkpoint cleaning scripts are included in the GitHub root folder. Scripts are not currently packaged in the pip release.

Training

The variety of training args is large and not all combinations of options (or even options) have been fully tested. For the training dataset folder, specify the base folder that contains a train folder and a validation folder.
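
A quick way to sanity-check the dataset layout before launching a long run is sketched below. The helper name is hypothetical, and the exact directory names ("train" and "validation") are assumed from the wording above rather than checked against the training script.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence


def missing_split_dirs(base: Path, splits: Sequence[str] = ("train", "validation")) -> List[str]:
    """Return the expected split folders that are absent under the base folder."""
    return [name for name in splits if not (base / name).is_dir()]


# Demonstration on a throwaway directory instead of a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "train").mkdir()
    print(missing_split_dirs(base))  # validation folder not created yet
    (base / "validation").mkdir()
    print(missing_split_dirs(base))  # nothing missing
```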

To train an SE-ResNet34 on ImageNet, locally distributed, 4 GPUs, one process per GPU w/ cosine schedule, random-erasing prob of 50% and per-pixel random value:

./distributed_train.sh 4 /data/imagenet --model seresnet34 --sched cosine --epochs 150 --warmup-epochs 5 --lr 0.4 --reprob 0.5 --remode pixel --batch-size 256 -j 4

NOTE: NVIDIA APEX should be installed to run in per-process distributed mode via DDP, or to enable AMP mixed precision with the --amp flag.
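
For intuition about what the cosine schedule with warmup in the command above does to the learning rate, here is a minimal sketch. It uses the single-cycle cosine annealing formula from the SGDR paper, preceded by a linear warmup. The function name, the warmup shape and starting value, the minimum LR of 0, and per-epoch (rather than per-step) updates are all assumptions; the actual scheduler in the training script may differ.

```python
import math


def lr_at_epoch(
    epoch: int,
    base_lr: float = 0.4,          # --lr 0.4
    epochs: int = 150,             # --epochs 150
    warmup_epochs: int = 5,        # --warmup-epochs 5
    warmup_start_lr: float = 1e-4, # assumed starting value
    min_lr: float = 0.0,           # assumed floor
) -> float:
    """Linear warmup to base_lr, then one half-cosine decay down to min_lr."""
    if epoch < warmup_epochs:
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / max(epochs - warmup_epochs, 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for e in (0, 1, 5, 50, 100, 149):
    print(e, round(lr_at_epoch(e), 5))
```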

Validation / Inference

Validation and inference scripts are similar in usage. One outputs metrics on a validation set and the other outputs top-k class ids to a CSV file. Specify the folder containing validation images, not the base folder as in the training script.

To validate with the model's pretrained weights (if they exist):

python validate.py /imagenet/validation/ --model seresnext26_32x4d --pretrained

To run inference from a checkpoint:

python inference.py /imagenet/validation/ --model mobilenetv3_100 --checkpoint ./output/model_best.pth.tar
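
The idea of dumping top-k class ids to CSV can be sketched in a few lines of standard-library Python. The column layout (filename followed by k class ids), the helper names, and the toy scores are assumptions for illustration; they are not taken from inference.py.

```python
import csv
import io
from typing import Iterable, List, Sequence, Tuple


def topk_ids(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def write_topk_csv(rows: Iterable[Tuple[str, Sequence[float]]], out, k: int = 5) -> None:
    """Write one CSV row per image: filename, then its top-k class ids."""
    writer = csv.writer(out)
    for filename, scores in rows:
        writer.writerow([filename, *topk_ids(scores, k)])


# An in-memory buffer stands in for an output file.
buf = io.StringIO()
predictions = [("img_0.jpg", [0.1, 0.6, 0.3]), ("img_1.jpg", [0.8, 0.05, 0.15])]
write_topk_csv(predictions, buf, k=2)
print(buf.getvalue())
```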

TODO

A number of additions are planned for various projects, including:

Benchmark model performance (speed + accuracy) across all models (make runnable as a script)
Add usage examples to comments, and good hyperparameters for training
Comments, cleanup, and the usual things that get pushed back

Release history

1.0.25

Feb 23, 2026

1.0.24

Jan 7, 2026

1.0.23

Jan 5, 2026

1.0.22

Nov 5, 2025

1.0.21

Oct 24, 2025

1.0.20

Sep 21, 2025

1.0.19

Jul 24, 2025

1.0.18

Jul 23, 2025

1.0.17

Jul 10, 2025

1.0.16

Jun 26, 2025

1.0.15

Feb 23, 2025

1.0.14

Jan 19, 2025

1.0.13

Jan 9, 2025

1.0.12

Dec 3, 2024

1.0.11

Oct 16, 2024

1.0.10

Oct 15, 2024

1.0.9

Aug 23, 2024

1.0.8

Jul 29, 2024

1.0.7

Jun 19, 2024

1.0.3

May 15, 2024

0.9.16

Feb 19, 2024

0.9.12

Nov 24, 2023

0.9.11

Nov 20, 2023

0.9.10

Nov 4, 2023

0.9.9

Nov 3, 2023

0.9.8

Oct 21, 2023

0.9.7

Sep 2, 2023

0.9.6

Aug 29, 2023

0.9.5

Aug 3, 2023

0.9.2

May 14, 2023

0.9.1

May 12, 2023

0.9.0

May 12, 2023

0.8.23.dev0 pre-release

May 10, 2023

0.8.21.dev0 pre-release

Apr 28, 2023

0.8.19.dev0 pre-release

Apr 6, 2023

0.8.17.dev0 pre-release

Mar 23, 2023

0.8.15.dev0 pre-release

Feb 27, 2023

0.8.13.dev0 pre-release

Feb 20, 2023

0.8.11.dev0 pre-release

Feb 10, 2023

0.8.10.dev0 pre-release

Feb 7, 2023

0.8.6.dev0 pre-release

Jan 12, 2023

0.8.3.dev0 pre-release

Dec 24, 2022

0.8.2.dev0 pre-release

Dec 24, 2022

0.8.0.dev0 pre-release

Dec 5, 2022

0.6.13

Mar 24, 2023

0.6.12

Nov 23, 2022

0.6.11

Oct 3, 2022

0.6.7

Jul 27, 2022

0.6.5

Jul 10, 2022

0.6.2.dev0 pre-release

May 15, 2022

0.5.4

Jan 17, 2022

0.4.12

Jun 30, 2021

0.4.9

May 18, 2021

0.4.5

Mar 8, 2021

0.3.4

Jan 6, 2021

0.3.3

Jan 4, 2021

0.3.2

Dec 7, 2020

0.3.1

Nov 1, 2020

0.3.0

Oct 30, 2020

0.2.1

Aug 13, 2020

0.1.30

Jun 17, 2020

0.1.28

Jun 12, 2020

0.1.26

May 4, 2020

0.1.24

May 4, 2020

0.1.22

Apr 28, 2020

0.1.20

Apr 9, 2020

0.1.18

Feb 22, 2020

0.1.16

Feb 3, 2020

0.1.14

Sep 18, 2019

0.1.12 (this version)

Aug 5, 2019

0.1.10

Jul 26, 2019

0.1.8

Jul 5, 2019

0.1.6

Jun 30, 2019

0.1.4

Jun 30, 2019

0.1.2

Jun 24, 2019

0.1.1

Jun 21, 2019

Download files

Source Distribution

timm-0.1.12.tar.gz (72.8 kB)

Uploaded Aug 5, 2019 Source

Built Distribution

timm-0.1.12-py3-none-any.whl (91.1 kB)

Uploaded Aug 5, 2019 Python 3

File details

Details for the file timm-0.1.12.tar.gz.

File metadata

Download URL: timm-0.1.12.tar.gz
Upload date: Aug 5, 2019
Size: 72.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3

File hashes

Hashes for timm-0.1.12.tar.gz
Algorithm	Hash digest
SHA256	`a60c8c435aa38f96c1cc7fc03710964fc72c195d544dbd4327f4846b3e964add`
MD5	`af26ca842fc982c09a03e33971f7993e`
BLAKE2b-256	`9d6e3d537d5be283c8e219302e00730abe409b595b10427447ab5094687d0d74`

File details

Details for the file timm-0.1.12-py3-none-any.whl.

File metadata

Download URL: timm-0.1.12-py3-none-any.whl
Upload date: Aug 5, 2019
Size: 91.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3

File hashes

Hashes for timm-0.1.12-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a91faa389aab86ebdba0a7e72db52170b64154f9b79f8d6c345caa1ad4c07b33`
MD5	`07c2a92e3e53aa8809eb9ecfc1614872`
BLAKE2b-256	`a4d8ba4d2bad66d97e6864abae4dfe2688e8049104712a7bb6b70dc3dc2508dc`
