TensorFlow port of PyTorch Image Models (timm) - image models with pretrained weights

TensorFlow Image Models

Introduction
Usage
Models
Profiling
License

Introduction

TensorFlow Image Models (tfimm) is a collection of image models with pretrained weights, obtained by porting architectures from timm to TensorFlow. The hope is that the number of available architectures will grow over time. For now, it contains vision transformers (ViT, DeiT, CaiT, PVT and Swin Transformers), MLP-Mixer models (MLP-Mixer, ResMLP, gMLP, PoolFormer and ConvMixer) and various ResNet flavours (ResNet, ResNeXt, ECA-ResNet, SE-ResNet) as well as the recent ConvNeXt.

This work would not have been possible without Ross Wightman's timm library and the work on PyTorch/TensorFlow interoperability in HuggingFace's transformers repository. I tried to make sure all source material is acknowledged. Please let me know if I have missed something.

Usage

Installation

The package can be installed via pip:

pip install tfimm

To load pretrained weights, timm needs to be installed separately.

Creating models

To load a pretrained model, use

import tfimm

model = tfimm.create_model("vit_tiny_patch16_224", pretrained="timm")

We can list available models with pretrained weights via

import tfimm

print(tfimm.list_models(pretrained="timm"))
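The list of names can be long, so it may help to narrow it down by architecture. The sketch below assumes that `list_models` returns a list of model-name strings; `filter_models` is a hypothetical helper written for illustration, not part of tfimm.

```python
from fnmatch import fnmatchcase


def filter_models(names, pattern):
    """Return the names matching a shell-style pattern, sorted alphabetically."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Usage, assuming tfimm and timm are installed:
#   import tfimm
#   vit_models = filter_models(tfimm.list_models(pretrained="timm"), "vit_*")
```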

Most models are pretrained on ImageNet or ImageNet-21k. If we want to use them for other tasks we need to change the number of classes in the classifier or remove the classifier altogether. We can do this by setting the nb_classes parameter in create_model. If nb_classes=0, the model will have no classification layer. If nb_classes is set to a value different from the default model config, the classification layer will be randomly initialized, while all other weights will be copied from the pretrained model.
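As a minimal sketch of the two options described above, the function below builds a headless feature extractor and a classifier with a new number of classes from the same pretrained weights. The function name `build_variants` and the default of 10 classes are illustrative assumptions; only `create_model`, `pretrained="timm"` and `nb_classes` come from the text above. Calling the function downloads weights and requires timm to be installed.

```python
def build_variants(model_name="vit_tiny_patch16_224", nb_classes=10):
    """Return (feature_extractor, classifier) built from the same pretrained weights.

    Illustrative helper, not part of tfimm.
    """
    import tfimm  # imported here because loading pretrained weights also needs timm

    # nb_classes=0: the model has no classification layer.
    feature_extractor = tfimm.create_model(
        model_name, pretrained="timm", nb_classes=0
    )
    # A class count different from the default config: the classification layer
    # is randomly initialized, all other weights come from the pretrained model.
    classifier = tfimm.create_model(
        model_name, pretrained="timm", nb_classes=nb_classes
    )
    return feature_extractor, classifier
```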

The preprocessing function for each model can be created via

import tensorflow as tf
import tfimm

preprocess = tfimm.create_preprocessing("vit_tiny_patch16_224", dtype="float32")
img = tf.ones((1, 224, 224, 3), dtype="uint8")
img_preprocessed = preprocess(img)

Saving and loading models

All models are subclassed from tf.keras.Model (they are not functional models). They can still be saved and loaded using the SavedModel format.

>>> import tensorflow as tf
>>> import tfimm
>>> model = tfimm.create_model("vit_tiny_patch16_224")
>>> type(model)
<class 'tfimm.architectures.vit.ViT'>
>>> model.save("/tmp/my_model")
>>> loaded_model = tf.keras.models.load_model("/tmp/my_model")
>>> type(loaded_model)
<class 'tfimm.architectures.vit.ViT'>

For this to work, the tfimm library needs to be imported before the model is loaded, because tfimm registers its custom models with Keras at import time. Otherwise, we obtain the following output

>>> import tensorflow as tf
>>> loaded_model = tf.keras.models.load_model("/tmp/my_model")
>>> type(loaded_model)
<class 'keras.saving.saved_model.load.Custom>ViT'>

Models

The following architectures are currently available:

CaiT (vision transformer) [github]
- Going deeper with Image Transformers [arXiv:2103.17239]
DeiT (vision transformer) [github]
- Training data-efficient image transformers & distillation through attention. [arXiv:2012.12877]
ViT (vision transformer) [github]
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. [arXiv:2010.11929]
- How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers. [arXiv:2106.10270]
- Includes models trained with the SAM optimizer: Sharpness-Aware Minimization for Efficiently Improving Generalization. [arXiv:2010.01412]
- Includes models from: ImageNet-21K Pretraining for the Masses [arXiv:2104.10972] [github]
Swin Transformer [github]
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. [arXiv:2103.14030]
- TensorFlow code adapted from Swin-Transformer-TF
MLP-Mixer and friends
- MLP-Mixer: An all-MLP Architecture for Vision [arXiv:2105.01601]
- ResMLP: Feedforward networks for image classification... [arXiv:2105.03404]
- Pay Attention to MLPs (gMLP) [arXiv:2105.08050]
ConvMixer [github]
- Patches Are All You Need? [ICLR 2022 submission]
Pyramid Vision Transformer [github]
- Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. [arXiv:2102.12122]
- PVTv2: Improved Baselines with Pyramid Vision Transformer [arXiv:2106.13797]
ConvNeXt [github]
- A ConvNet for the 2020s. [arXiv:2201.03545]
PoolFormer [github]
- PoolFormer: MetaFormer is Actually What You Need for Vision. [arXiv:2111.11418]
Pooling-based Vision Transformers (PiT)
- Rethinking Spatial Dimensions of Vision Transformers. [arXiv:2103.16302]
ResNet, ResNeXt, ECA-ResNet, SE-ResNet and friends
- Deep Residual Learning for Image Recognition. [arXiv:1512.03385]
- Exploring the Limits of Weakly Supervised Pretraining. [arXiv:1805.00932]
- Billion-scale Semi-Supervised Learning for Image Classification. [arXiv:1905.00546]
- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. [arXiv:1910.03151]
- Revisiting ResNets. [arXiv:2103.07579]
- Making Convolutional Networks Shift-Invariant Again. (anti-aliasing layer) [arXiv:1904.11486]
- Squeeze-and-Excitation Networks. [arXiv:1709.01507]
- Big Transfer (BiT): General Visual Representation Learning [arXiv:1912.11370]
- Knowledge distillation: A good teacher is patient and consistent [arXiv:2106.05237]

Profiling

To understand how big each of the models is, I have done some profiling to measure

- the maximum batch size that fits in GPU memory and
- the throughput in images/second for both inference and backpropagation on K80 and V100 GPUs. For V100, measurements were done for both float32 and mixed precision.

The results can be found in the results/profiling_{k80, v100}.csv files.
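One way to find the maximum batch size is to double the batch size until a run fails and then bisect between the last success and the first failure. The sketch below illustrates that idea only and is not necessarily the procedure used to produce the CSV files. `find_max_batch_size` and the `fits` callback are hypothetical names, and the search assumes that if a batch size fits, every smaller one fits too.

```python
def find_max_batch_size(fits, start=1, limit=4096):
    """Return the largest batch size in [start, limit] for which fits(b) is True.

    Returns 0 if even `start` does not fit.
    """
    if not fits(start):
        return 0
    lo, hi = start, None  # lo: largest size known to fit; hi: smallest known failure
    # Phase 1: double the batch size until it fails or the limit is reached.
    while hi is None and lo < limit:
        candidate = min(lo * 2, limit)
        if fits(candidate):
            lo = candidate
        else:
            hi = candidate
    if hi is None:
        return lo  # everything up to the limit fits
    # Phase 2: bisect between the last success and the first failure.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fits(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With TensorFlow, `fits` could run a single step on a batch of the given size and return False when `tf.errors.ResourceExhaustedError` is raised.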

For backpropagation, we use the mean of the model outputs as the loss

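import tensorflow as tf

# `model`, `x` (an input batch) and `optimizer` are assumed to be defined.
def backprop():
    with tf.GradientTape() as tape:
        output = model(x, training=True)
        loss = tf.reduce_mean(output)
    # Compute gradients after leaving the tape context.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))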
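Throughput in images/second can then be estimated by timing repeated calls to such a step function. The helper below is a minimal sketch, not the profiling code behind the published numbers; `measure_throughput` and its parameters are hypothetical names, and the warm-up calls are there to exclude one-off costs such as graph tracing.

```python
import time


def measure_throughput(step_fn, batch_size, n_warmup=2, n_iters=10):
    """Return images/second for step_fn, which processes one batch per call."""
    for _ in range(n_warmup):
        step_fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return batch_size * n_iters / elapsed
```

For example, `measure_throughput(backprop, batch_size=x.shape[0])` would time the function above. Because GPU work can run asynchronously, the step function should only return once its result is available, otherwise the measurement may be too optimistic.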

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
