Compact, concise and customizable deep learning computer vision

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Glasses 😎

alt

Compact, concise and customizable deep learning computer vision library

This is an early beta, code will change and pretrained weights are not available (I need to find a place to store them online, any advice?)

Doc is here

Installation

You can install glasses using pip by running

pip install glasses

Motivation

All the existing implementation of the most famous model are written with very bad coding practices, what today is called research code. I struggled myself to understand some of the implementation that in the end were just few lines of code.

Most of them are missing a global structure, they used tons of code repetition, they are not easily customizable and not tested. Since I do computer vision for living, so I needed a way to make my life easier.

Getting started

The API are shared across all models!

import torch
from glasses.nn.models import *
from torch import nn

model = ResNet.resnet18(pretrained=True)
model.summary() #thanks to torchsummary
# change activation
ResNet.resnet18(activation = nn.SELU)
# change number of classes
ResNet.resnet18(n_classes=100)
# freeze only the convolution weights
model = ResNet.resnet18(pretrained=True)
model.freeze(who=model.encoder)
# get the last layer, usuful to hook to it if you want to get the embeeded vector
model.encoder.blocks[-1]
# what about resnet with inverted residuals?
from glasses.nn.models.classification.mobilenet import InvertedResidualBlock
ResNet.resnet18(block = InvertedResidualBlock)

# change the decoder part
model = ResNet.resnet18(pretrained=True)
my_decoder = nn.Sequential(
    nn.AdaptiveAvgPool2d((1,1)),
    nn.Flatten(),
    nn.Linear(model.encoder.widths[-1], 512),
    nn.Dropout(0.2),
    nn.ReLU(),
    nn.Linear(512, 1000))

model.decoder = my_decoder

x = torch.rand((1,3,224,224))
model(x).shape #torch.Size([1, 1000])

Deep Customization

All models is composed by 4 parts:

Block
Layer
Encoder
Decoder

Block

Each model has its building block, they are noted by *Block. In each block, all the weights are in the .block field. This makes it very easy to customize one specific model.

from glasses.nn.models.classification.vgg import VGGBasicBlock
from glasses.nn.models.classification.resnet import ResNetBasicBlock, ResNetBottleneckBlock, ResNetBasicPreActBlock, ResNetBottleneckPreActBlock
from glasses.nn.models.classification.senet import SENetBasicBlock, SENetBottleneckBlock
from glasses.nn.models.classification.resnetxt import ResNetXtBottleNeckBlock
from glasses.nn.models.classification.densenet import DenseBottleNeckBlock
from glasses.nn.models.classification.wide_resnet import WideResNetBottleNeckBlock
from glasses.nn.models.classification.mobilenet import MobileNetBasicBlock
from glasses.nn.models.classification.efficientnet import EfficientNetBasicBlock

For example, if we want to add Squeeze and Excitation to the resnet bottleneck block, we can just

from glasses.nn.models.classification.se import SpatialSE
from  glasses.nn.models.classification.resnet import ResNetBottleneckBlock

class SEResNetBottleneckBlock(ResNetBottleneckBlock):
    def __init__(self, in_features: int, out_features: int, squeeze: int = 16, *args, **kwargs):
        super().__init__(in_features, out_features, *args, **kwargs)
        # all the weights are in block, we want to apply se after the weights
        self.block.add_module('se', SpatialSE(out_features, reduction=squeeze))

SEResNetBottleneckBlock(32, 64)

Then, we can use the class methods to create the new models following the existing architecture blueprint, for example, to create se_resnet50

ResNet.resnet50(block=ResNetBottleneckBlock)

The cool thing is each model has the same api, if I want to create a vgg13 with the ResNetBottleneckBlock I can just

model = VGG.vgg13(block=SEResNetBottleneckBlock)
model.summary()

Some specific model can require additional parameter to the block, for example MobileNetV2 also required a expansion parameter so our SEResNetBottleneckBlock won't work.

Layer

A Layer is a collection of blocks, it is used to stack multiple blocks together following some logic. For example, ResNetLayer

from glasses.nn.models.classification.resnet import ResNetLayer

ResNetLayer(64, 128, n=2)

Encoder

The encoder is what encoders a vector, so the convolution layers. It has always two very important parameters.

widths
depths

widths is the wide at each layer, so how much features there are depths is the depth at each layer, so how many blocks there are

For example, ResNetEncoder will creates multiple ResNetLayer based on the len of widths and depths. Let's see some example.

from glasses.nn.models.classification.resnet import ResNetEncoder
# 3 layers, with 32,64,128 features and 1,2,3 block each
ResNetEncoder(
    widths=[32,64,128],
    depths=[1,2,3])

Remember each model has always a .decoder field

from glasses.nn.models import ResNet

model = ResNet.resnet18()
model.encoder.widths[-1]

The encoder knows the number of output features, you can access them by

Decoder

The decoder takes the last feature from the .encoder and decode it. Usually it is just a linear layer. The ResNetDecoder looks like

from glasses.nn.models.classification.resnet import ResNetDecoder


ResNetDecoder(512, n_classes=1000)

This object oriented structure allows to reuse most of the code across the models

Models

The models so far

name	Parameters	Size (MB)
resnet18	11,689,512	44.59
resnet26	15,995,176	61.02
resnet34	21,797,672	83.15
resnet50	25,557,032	97.49
resnet101	44,549,160	169.94
resnet152	60,192,808	229.62
resnet200	64,673,832	246.71
resnext50_32x4d	25,028,904	95.48
resnext101_32x8d	88,791,336	338.71
wide_resnet50_2	68,883,240	262.77
wide_resnet101_2	126,886,696	484.03
se_resnet18	11,776,552	44.92
se_resnet34	21,954,856	83.75
se_resnet50	28,071,976	107.09
se_resnet101	49,292,328	188.04
se_resnet152	66,770,984	254.71
densenet121	7,978,856	30.44
densenet161	28,681,000	109.41
densenet169	14,149,480	53.98
densenet201	20,013,928	76.35
MobileNetV2	3,504,872	13.37
fishnet99	16,630,312	63.44
fishnet150	24,960,808	95.22
efficientnet_b0	5,288,548	20.17
efficientnet_b1	7,794,184	29.73
efficientnet_b2	9,109,994	34.75
efficientnet_b3	12,233,232	46.67
efficientnet_b4	19,341,616	73.78
efficientnet_b5	30,389,784	115.93
efficientnet_b6	43,040,704	164.19
efficientnet_b7	66,347,960	253.1
efficientnet_b8	87,413,142	333.45
efficientnet_l2	480,309,308	1832.23

Credits

Most of the weights were trained by other people and adapted to glasses. It is worth cite

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.0.6

Jan 25, 2021

0.0.5

Dec 26, 2020

0.0.4

Oct 30, 2020

This version

0.0.3

Oct 26, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

glasses-0.0.3.tar.gz (37.8 kB view hashes)

Uploaded Oct 26, 2020 Source

Built Distribution

glasses-0.0.3-py3-none-any.whl (46.1 kB view hashes)

Uploaded Oct 26, 2020 Python 3

Hashes for glasses-0.0.3.tar.gz

Hashes for glasses-0.0.3.tar.gz
Algorithm	Hash digest
SHA256	`4fd60380b298bd2e8a860f6285ef810cf87f01d0c95782d18b19624b81d9204a`
MD5	`d60d474a727829708d4adce1e875a0a6`
BLAKE2b-256	`887c366a223f1d15b92d83864e18fc308df3f7722f78db7627cab13b12203d43`

Hashes for glasses-0.0.3-py3-none-any.whl

Hashes for glasses-0.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1fc95e29458e5d8e752078bb3580bd5026d7a92584b811867aa4b869059266c6`
MD5	`6c0fc8a198ec04d30ac27b42cc377dfd`
BLAKE2b-256	`4626fe24208494ceaa51d345d10d91fc42f74c566594ba2cb50a5cffe23c6d54`