Use pre-trained models in PyTorch to extract vector embeddings for any image

Image 2 Vec with PyTorch

This repository is a fork of img2vec that removes some noisy warnings and adds support for PyTorch Metal.

Installation:

pip install img2vec-pytorch-2

Medium post on building the first version from scratch: https://becominghuman.ai/extract-a-feature-vector-for-any-image-with-pytorch-9717561d1d4c

Applications of image embeddings:

  • Ranking for recommender systems
  • Clustering images to different categories
  • Classification tasks
  • Image compression

Available models

Model name         Return vector length
Resnet-18          512
Alexnet            4096
Vgg-11             4096
Densenet           1024
efficientnet_b0    1280
efficientnet_b1    1280
efficientnet_b2    1408
efficientnet_b3    1536
efficientnet_b4    1792
efficientnet_b5    2048
efficientnet_b6    2304
efficientnet_b7    2560
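
As a small sanity check, the vector lengths listed above can be kept in a lookup table and compared against whatever a model returns. This is only a sketch: the dictionary keys are the labels from the table, not necessarily the strings accepted by the `model` parameter, and `has_expected_length` is a hypothetical helper that is not part of this package.

```python
# Vector lengths copied from the "Available models" table.
# Keys are the table labels (an assumption: they may differ from the
# strings the library accepts for its `model` parameter).
VECTOR_LENGTHS = {
    "Resnet-18": 512,
    "Alexnet": 4096,
    "Vgg-11": 4096,
    "Densenet": 1024,
    "efficientnet_b0": 1280,
    "efficientnet_b1": 1280,
    "efficientnet_b2": 1408,
    "efficientnet_b3": 1536,
    "efficientnet_b4": 1792,
    "efficientnet_b5": 2048,
    "efficientnet_b6": 2304,
    "efficientnet_b7": 2560,
}


def has_expected_length(model_label, vector):
    """Return True if `vector` has the length listed for `model_label`."""
    return len(vector) == VECTOR_LENGTHS[model_label]
```

A check like this can catch a mismatch early, for example when a non-default layer is selected and the embedding no longer has the default size.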

Installation

Tested on Python 3.6 and torchvision 0.11.0 (nightly, 2021-09-25)

Requires PyTorch: http://pytorch.org/

conda install -c pytorch-nightly torchvision

pip install img2vec-pytorch-2

PyTorch Metal is supported:

pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

Run tests

python -m unittest discover tests

Using img2vec as a library

from img2vec_pytorch import Img2Vec
from PIL import Image

# Initialize Img2Vec with GPU
img2vec = Img2Vec(device_preference=["cuda", "cpu"])

# Read in an image and make sure it is in RGB format
img = Image.open('test.jpg').convert('RGB')
# Get a vector from img2vec, returned as a torch FloatTensor
vec = img2vec.get_vec(img, tensor=True)
# Or submit a list
vectors = img2vec.get_vec(list_of_PIL_images)

For running the example, you will additionally need:
  • Pillow: pip install Pillow
  • scikit-learn: pip install scikit-learn

Running the example

git clone https://github.com/psilabs-dev/img2vec.git

cd img2vec/example

python test_img_similarity.py

Expected output

Which filename would you like similarities for?
cat.jpg
0.72832 cat2.jpg
0.641478 catdog.jpg
0.575845 face.jpg
0.516689 face2.jpg

Which filename would you like similarities for?
face2.jpg
0.668525 face.jpg
0.516689 cat.jpg
0.50084 cat2.jpg
0.484863 catdog.jpg

Try adding your own photos!
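
The scores in the expected output look like pairwise similarities between image vectors. A minimal sketch of such a ranking, assuming cosine similarity is the metric (the example script's exact implementation is not shown here), could look as follows. The function names and the 3-dimensional toy vectors are hypothetical stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_name, vectors):
    """Return (score, filename) pairs for all other images, best match first."""
    query = vectors[query_name]
    scores = [
        (cosine_similarity(query, vec), name)
        for name, vec in vectors.items()
        if name != query_name
    ]
    return sorted(scores, reverse=True)


# Toy vectors standing in for real image embeddings (made-up values).
toy_vectors = {
    "cat.jpg": [1.0, 0.0, 0.0],
    "cat2.jpg": [0.9, 0.1, 0.0],
    "face.jpg": [0.0, 1.0, 0.0],
}

ranking = rank_by_similarity("cat.jpg", toy_vectors)
for score, name in ranking:
    print(f"{score:.6f} {name}")
```

With real embeddings, each dictionary value would be the vector returned by `get_vec` for that file.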

Img2Vec Params

cuda = (True, False)   # Run on GPU?     default: False
model = ('resnet-18', 'efficientnet_b0', etc.)   # Which model to use?     default: 'resnet-18'

Advanced users


Read only file systems

If you use this library from an app running in a read-only environment (for example, a Docker container), specify a writable directory where the app can store pre-trained models.

export TORCH_HOME=/tmp/torch
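
The same variable can also be set from Python before any model is loaded. The sketch below assumes the system temporary directory is writable. The `torch` subdirectory name is only an example, and `setdefault` keeps any value that was already exported in the shell.

```python
import os
import tempfile

# Hypothetical cache location: a "torch" folder inside the system temp dir.
cache_dir = os.path.join(tempfile.gettempdir(), "torch")
os.makedirs(cache_dir, exist_ok=True)

# Only set TORCH_HOME if the environment has not defined it already.
os.environ.setdefault("TORCH_HOME", cache_dir)

print("Pre-trained models will be cached under:", os.environ["TORCH_HOME"])
# Import and construct Img2Vec only after this point, so that the
# model download uses the writable directory.
```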

Additional Parameters

layer = 'layer_name' or int   # For advanced users, which layer of the model to extract the output from.   default: 'avgpool'
layer_output_size = int   # Size of the output of your selected layer
gpu = (0, 1, etc.)   # Which GPU to use?     default: 0

Resnet-18

Defaults: (layer = 'avgpool', layer_output_size = 512)
Layer parameter must be a string representing the name of one of the layers below

conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
bn1 = nn.BatchNorm2d(64)
relu = nn.ReLU(inplace=True)
maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
layer1 = self._make_layer(block, 64, layers[0])
layer2 = self._make_layer(block, 128, layers[1], stride=2)
layer3 = self._make_layer(block, 256, layers[2], stride=2)
layer4 = self._make_layer(block, 512, layers[3], stride=2)
avgpool = nn.AvgPool2d(7)
fc = nn.Linear(512 * block.expansion, num_classes)
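
Reading the output of a named layer is typically done in PyTorch with a forward hook. The sketch below illustrates the general technique on a tiny stand-in network. Whether this package uses exactly this mechanism is an assumption, and the network, the `captured` dictionary, and the `save_output` function are hypothetical names for illustration only.

```python
import torch
import torch.nn as nn

# Tiny stand-in network (hypothetical); a real use would load a
# pre-trained torchvision model instead.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # plays the role of 'avgpool'
    nn.Flatten(),
    nn.Linear(8, 4),          # plays the role of 'fc'
)
model.eval()

captured = {}


def save_output(module, inputs, output):
    # Store a flattened copy of the chosen layer's output.
    captured["embedding"] = output.detach().flatten(start_dim=1)


# Attach the hook to the pooling layer, run one forward pass, then detach.
handle = model[2].register_forward_hook(save_output)
with torch.no_grad():
    model(torch.zeros(1, 3, 32, 32))
handle.remove()

embedding = captured["embedding"]
print(embedding.shape)  # one row per image, one column per channel
```

Here the pooling layer has 8 channels, so the captured embedding has 8 values per image. For Resnet-18 the corresponding size at `avgpool` is 512, which is why `layer_output_size` must match the selected layer.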

Alexnet

Defaults: (layer = 2, layer_output_size = 4096)
Layer parameter must be an integer representing one of the layers below

alexnet.classifier = nn.Sequential(
            7. nn.Dropout(),                  <- output_size = 9216
            6. nn.Linear(256 * 6 * 6, 4096),  <- output_size = 4096
            5. nn.ReLU(inplace=True),         <- output_size = 4096
            4. nn.Dropout(),                  <- output_size = 4096
            3. nn.Linear(4096, 4096),         <- output_size = 4096
            2. nn.ReLU(inplace=True),         <- output_size = 4096
            1. nn.Linear(4096, num_classes),  <- output_size = num_classes
        )
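
The numbering in the listing above runs from the end of the classifier, so the integer `layer` value behaves like a negative index. The sketch below illustrates that reading with plain strings in place of real modules. The mapping is inferred from the listing and `pick_layer` is a hypothetical helper, not the library's code.

```python
# Stand-ins for the classifier modules, in forward order.
classifier = [
    "Dropout",
    "Linear(9216, 4096)",
    "ReLU",
    "Dropout",
    "Linear(4096, 4096)",
    "ReLU",
    "Linear(4096, num_classes)",
]


def pick_layer(layers, layer):
    """Select a layer counting from the end: 1 is the last layer."""
    if not 1 <= layer <= len(layers):
        raise ValueError(f"layer must be between 1 and {len(layers)}")
    return layers[-layer]


print(pick_layer(classifier, 2))  # the default: the ReLU before the final Linear
print(pick_layer(classifier, 7))  # the first Dropout
```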

Vgg

Defaults: (layer = 2, layer_output_size = 4096)

vgg.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes),
        )

Densenet

Defaults: (layer = 1 from features, layer_output_size = 1024)

densenet.features = nn.Sequential(OrderedDict([
	('conv0', nn.Conv2d(3, num_init_features, kernel_size=7, stride=2,
						padding=3, bias=False)),
	('norm0', nn.BatchNorm2d(num_init_features)),
	('relu0', nn.ReLU(inplace=True)),
	('pool0', nn.MaxPool2d(kernel_size=3, stride=2, padding=1)),
]))

EfficientNet

Defaults: (layer = 1 from features, layer_output_size = 1280 for efficientnet_b0 model)

To-do

  • Benchmark speed and accuracy
  • Add ability to fine-tune on input data
  • Export documentation to a normal place
