Skip to main content

Use pre-trained models in PyTorch to extract vector embeddings for any image

Project description

Image 2 Vec with PyTorch

Medium post on building the first version from scratch: https://becominghuman.ai/extract-a-feature-vector-for-any-image-with-pytorch-9717561d1d4c

Applications of image embeddings:

  • Ranking for recommender systems
  • Clustering images to different categories
  • Classification tasks

Available models

  • Resnet-18 (CPU, GPU)
    • Returns vector length 512
  • Alexnet (CPU, GPU)
    • Returns vector length 4096

Installation

Tested on Python 3.6

Dependencies

Pytorch: http://pytorch.org/

Pillow: pip install Pillow

For running the example, you will additionally need:
  • Sklearn pip install scikit-learn

Running the example

git clone https://github.com/christiansafka/img2vec.git

cd img2vec/example

python test_img_to_vec.py

Expected output

Which filename would you like similarities for?
cat.jpg
0.72832 cat2.jpg
0.641478 catdog.jpg
0.575845 face.jpg
0.516689 face2.jpg

Which filename would you like similarities for?
face2.jpg
0.668525 face.jpg
0.516689 cat.jpg
0.50084 cat2.jpg
0.484863 catdog.jpg

Try adding your own photos!

Using img2vec as a library

from img_to_vec import Img2Vec
from PIL import Image

# Initialize Img2Vec with GPU
img2vec = Img2Vec(cuda=True)

# Read in an image
img = Image.open('test.jpg')
# Get a vector from img2vec
vec = img2vec.get_vec(img)
# Or submit a list
vectors = img2vec.get_vec(list_of_PIL_images)

Img2Vec Params

cuda = (True, False)   # Run on GPU?     default: False
model = ('resnet-18', 'alexnet')   # Which model to use?     default: 'resnet-18'

Advanced users


Additional Parameters

layer = 'layer_name' or int   # For advanced users, which layer of the model to extract the output from.   default: 'avgpool'
layer_output_size = int   # Size of the output of your selected layer

Resnet-18

Defaults: (layer = 'avgpool', layer_output_size = 512)
Layer parameter must be an string representing the name of a layer below

conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
bn1 = nn.BatchNorm2d(64)
relu = nn.ReLU(inplace=True)
maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
layer1 = self._make_layer(block, 64, layers[0])
layer2 = self._make_layer(block, 128, layers[1], stride=2)
layer3 = self._make_layer(block, 256, layers[2], stride=2)
layer4 = self._make_layer(block, 512, layers[3], stride=2)
avgpool = nn.AvgPool2d(7)
fc = nn.Linear(512 * block.expansion, num_classes)

Alexnet

Defaults: (layer = 2, layer_output_size = 4096)
Layer parameter must be an integer representing one of the layers below

alexnet.classifier = nn.Sequential(
            7. nn.Dropout(),                  < - output_size = 9216
            6. nn.Linear(256 * 6 * 6, 4096),  < - output_size = 4096
            5. nn.ReLU(inplace=True),         < - output_size = 4096
            4. nn.Dropout(),		      < - output_size = 4096
            3. nn.Linear(4096, 4096),	      < - output_size = 4096
            2. nn.ReLU(inplace=True),         < - output_size = 4096
            1. nn.Linear(4096, num_classes),  < - output_size = 4096
        )

To-do

  • Benchmark speed and accuracy
  • Add ability to fine-tune on input data
  • Export documentation to a normal place
  • Package for Pip

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

img2vec_pytorch-0.2.5.tar.gz (4.2 kB view details)

Uploaded Source

File details

Details for the file img2vec_pytorch-0.2.5.tar.gz.

File metadata

  • Download URL: img2vec_pytorch-0.2.5.tar.gz
  • Upload date:
  • Size: 4.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.9

File hashes

Hashes for img2vec_pytorch-0.2.5.tar.gz
Algorithm Hash digest
SHA256 668e2f8916b388d6bf749736a45c24dfcb541541a881b9ff71221d431275cb9a
MD5 cd3d3c86116ddc118ae676dedada2758
BLAKE2b-256 94e66f1e8d1918d2272c6b9b61b249bc255973f8b2aa15ed79d8e96fd7423a7a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page