Convert images to features with a convolutional neural network (CNN).
Project description
img2feat is an image feature extractor based on a convolutional neural network.
Installation
% pip install img2feat
Required libraries: numpy, torch, torchvision, opencv-python
class CNN
The CNN converts a list of numpy images to features. Each numpy image is assumed to be in OpenCV format, i.e. [Height, Width, BGR]. The shape of the output features is [length of the input image list, dim_feature of the CNN].
Available networks: alexnet, vgg11, vgg13, vgg16, vgg19, resnet18, resnet34, resnet101, resnet152, densenet121, densenet161, densenet169, densenet201, googlenet, mobilenet, vit_b_16
from img2feat import CNN
net = CNN('vgg11')
x = net( [img] )
Methods
available_networks() -> list of string
Return the list of names of available networks.
init( network='vgg11', gpu=False, img_size=(224,224) )
Constructor
network should be one of available_networks()
Set gpu to True if a GPU is available.
img_size is the size of the image input to the network, given as (width, height).
call( imgs ) -> feature (numpy float32)
It converts the list of images to features.
imgs is the list of images. Each image should be in OpenCV format, i.e. [Height, Width, BGR].
feature is the array of converted features, with shape [len(imgs), dim_feature].
Variables
dim_feature (int)
It is the dimension of the output feature.
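For illustration, a minimal usage sketch follows; the image file name sample.jpg is only a placeholder and not part of the package.

import cv2
from img2feat import CNN

net = CNN('vgg11')                 # gpu=False, img_size=(224, 224) by default
print(net.available_networks())    # names accepted by the constructor

img = cv2.imread('sample.jpg')     # OpenCV format: [Height, Width, BGR]
x = net([img])                     # numpy float32, shape [1, net.dim_feature]
print(x.shape, net.dim_feature)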
class PixelFeature
The PixelFeature converts images to per-pixel features. The feature is a numpy array of shape [Height, Width, dim_feature].
Available networks: vgg11, vgg13, vgg16, vgg19
from img2feat import PixelFeature
net = PixelFeature('vgg11')
x = net( [img] )
Methods
available_networks() -> list of string
Return the list of names of available networks.
init( network='vgg11', gpu=False )
Constructor
network should be one of available_networks()
Set gpu to True if a GPU is available.
call( imgs ) -> list of feature (numpy float32)
It converts the list of images to per-pixel features.
imgs is the list of images. Each image should be in OpenCV format, i.e. [Height, Width, BGR].
feature is a list of converted features, one per image, each with shape [Height, Width, dim_feature]. The height and width are the same as those of the input image.
Variables
dim_feature (int)
It is the dimension of the output feature.
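A minimal sketch of per-pixel feature extraction, again with sample.jpg as a placeholder image:

import cv2
from img2feat import PixelFeature

net = PixelFeature('vgg11')        # gpu=False by default
img = cv2.imread('sample.jpg')     # [Height, Width, BGR]

feats = net([img])                 # one array per input image
print(feats[0].shape)              # [Height, Width, net.dim_feature]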
class Mirror
The Mirror provides data augmentation by mirroring.
Methods
call( imgs ) -> augmented images
It returns the augmented images as a list. The odd entries are the original images and the even entries are the mirrored images.
Variables
nb_aug (int)
It is 2 (the original and the mirrored image).
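A minimal sketch combining Mirror with CNN; the no-argument constructor Mirror() is an assumption, since no constructor parameters are documented above, and sample.jpg is a placeholder.

import cv2
from img2feat import CNN, Mirror

aug = Mirror()                     # assumed to take no arguments
imgs = [cv2.imread('sample.jpg')]

aug_imgs = aug(imgs)               # len(imgs) * aug.nb_aug == 2 images
net = CNN('vgg11')
x = net(aug_imgs)                  # shape [2, net.dim_feature]
print(aug.nb_aug, x.shape)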
class TenCrop
The TenCrop provides the typical 10-crop data augmentation. First, images are resized so that the shorter side matches a given scale. Then the center, top-left, top-right, bottom-left, and bottom-right regions are cropped; with mirroring, this yields 10 crops per scale.
Methods
init( scales=[224, 256, 384, 480, 640], mirror=True, img_size=(224,224) )
Constructor.
scales is a list of scales; images are resized so that the shorter side equals each scale.
If mirror is True, the mirroring augmentation is also applied.
img_size is the cropping size.
call( imgs ) -> augmented images
It returns the augmented images.
Variables
img_size
It is the cropping size, (width, height).
nb_aug
It is the number of augmentations for a single image: len(scales) * 5 * 2 if mirror is True, and len(scales) * 5 otherwise.
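A minimal sketch of TenCrop; with two scales and mirroring, each image yields 2 * 5 * 2 = 20 crops, and sample.jpg is again a placeholder.

import cv2
from img2feat import CNN, TenCrop

aug = TenCrop(scales=[224, 256], mirror=True, img_size=(224, 224))
imgs = [cv2.imread('sample.jpg')]

aug_imgs = aug(imgs)               # len(scales) * 5 crops * 2 (mirror) per image
net = CNN('vgg11')
x = net(aug_imgs)                  # shape [aug.nb_aug, net.dim_feature]
print(aug.nb_aug, x.shape)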
package antbee
It is a utility package for the ants-and-bees dataset used in the PyTorch "Transfer Learning for Computer Vision" tutorial.
Methods
load( squared=True, root=None ) -> ( Itrain, Ytrain ), ( Itest, Ytest )
root is the root directory of the data. If it is None, the root directory is set as the package directory.
If squared is True, only squared images are loaded. If squared is False, all images are loaded.
Itrain, Itest are lists of images.
Ytrain, Ytest are numpy arrays of labels (0: ant, 1: bee).
load_squared_npy( name, root=None ) -> ( Xtrain, Ytrain ), ( Xtest, Ytest )
root is the root directory of the data. If it is None, the root directory is set as the package directory.
name is the name of the CNN network.
Xtrain, Xtest are numpy arrays of extracted features.
Ytrain, Ytest are numpy arrays of labels (0: ant, 1: bee).
Variables
str (list of string)
The class names: str[0] is 'ant', str[1] is 'bee'.
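A sketch of loading the dataset and extracting features; the import path from img2feat import antbee is an assumption.

from img2feat import CNN, antbee   # import path for antbee is assumed

(Itrain, Ytrain), (Itest, Ytest) = antbee.load(squared=True)

net = CNN('vgg11')
Xtrain = net(Itrain)               # [len(Itrain), net.dim_feature]
Xtest = net(Itest)
print(antbee.str[Ytrain[0]])       # 'ant' or 'bee'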
Sample Codes
sample1.py: Linear regression.
sample2.py: Data augmentation.
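The snippet below is not sample1.py itself, but a hedged sketch of a comparable pipeline: precomputed features plus a linear model, with scikit-learn (not a dependency of img2feat) doing the regression.

import numpy as np
from sklearn.linear_model import LinearRegression
from img2feat import antbee       # import path assumed, as above

(Xtrain, Ytrain), (Xtest, Ytest) = antbee.load_squared_npy('vgg11')

reg = LinearRegression().fit(Xtrain, Ytrain)
pred = (reg.predict(Xtest) > 0.5).astype(np.int64)   # threshold the regression output
print('accuracy:', (pred == Ytest).mean())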
Network References
AlexNet: One weird trick for parallelizing convolutional neural networks
VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
ResNet: Deep Residual Learning for Image Recognition
DenseNet: Densely Connected Convolutional Networks