torchvision

image and video datasets and models for torch deep learning

These details have been verified by PyPI

Maintainers

atalman ezyang facebook malfet seemethere soumith

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

Project description

torch-vision

This repository consists of:

vision.datasets : Data loaders for popular vision datasets
vision.models : Definitions for popular model architectures, such as AlexNet, VGG, and ResNet, along with pre-trained models.
vision.transforms : Common image transformations such as random crop, rotations etc.
vision.utils : Utilities such as saving a tensor (3 x H x W) as an image to disk, or creating a grid of images from a mini-batch.

Installation

Binaries:

conda install torchvision -c https://conda.anaconda.org/t/6N-MsQ4WZ7jo/soumith

From Source:

pip install -r requirements.txt
pip install .

Datasets

The following dataset loaders are available:

COCO (Captioning and Detection)
LSUN Classification
ImageFolder
Imagenet-12
CIFAR10 and CIFAR100

Datasets have the API:

__getitem__
__len__

They all subclass torch.utils.data.Dataset. Hence, they can all be loaded in parallel by worker processes (Python multiprocessing) using the standard torch.utils.data.DataLoader.

For example:

torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)
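To make the role of batch_size and shuffle more concrete, here is a minimal pure-Python sketch of batching over any object that supports __getitem__ and __len__. The function iterate_batches is hypothetical and is not how DataLoader is implemented: it runs in a single process, does no collation into tensors, and has no worker processes.

```python
import random


def iterate_batches(dataset, batch_size, shuffle=False, seed=None):
    """Yield lists of samples, relying only on __len__ and __getitem__."""
    indices = list(range(len(dataset)))
    if shuffle:
        # Shuffle the index order, not the dataset itself.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]


# A plain list already satisfies the two-method dataset API.
data = ["a", "b", "c", "d", "e"]
batches = list(iterate_batches(data, batch_size=2))
print(batches)
```

The last batch is shorter when the dataset size is not a multiple of batch_size.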

In the constructor, each dataset has a slightly different API as needed, but they all take the keyword args:

transform - a function that takes in an image and returns a transformed version. Common transforms such as ToTensor and RandomCrop can be composed together with transforms.Compose (see the Transforms section below)
target_transform - a function that takes in the target and transforms it. For example, take in the caption string and return a tensor of word indices.
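The dataset contract described above can be sketched without any dependencies. The class ToyCaptionDataset, its sample data, and the vocab mapping below are all hypothetical and not part of torchvision; a real dataset would subclass torch.utils.data.Dataset, but the two methods and the two keyword arguments would play the same roles.

```python
class ToyCaptionDataset(object):
    """Hypothetical stand-in for a torchvision dataset (illustration only)."""

    def __init__(self, samples, transform=None, target_transform=None):
        self.samples = samples                    # list of (image, target) pairs
        self.transform = transform                # applied to the image
        self.target_transform = target_transform  # applied to the target

    def __getitem__(self, index):
        img, target = self.samples[index]
        if self.transform is not None:
            img = self.transform(img)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return img, target

    def __len__(self):
        return len(self.samples)


# Assumed toy vocabulary mapping words to indices.
vocab = {"a": 0, "plane": 1, "mountain": 2}

dataset = ToyCaptionDataset(
    samples=[([0, 128, 255], "a plane"), ([255, 255, 0], "a mountain")],
    # Scale pixel values to [0, 1].
    transform=lambda img: [p / 255.0 for p in img],
    # Turn the caption string into a list of word indices.
    target_transform=lambda s: [vocab[w] for w in s.split()],
)

img, target = dataset[1]
print(len(dataset), img, target)
```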

COCO

This requires the COCO API to be installed

Captions:

dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])

Example:

import torchvision.datasets as dset
import torchvision.transforms as transforms
cap = dset.CocoCaptions(root = 'dir where images are',
                        annFile = 'json annotation file',
                        transform=transforms.ToTensor())

print('Number of samples: ', len(cap))
img, target = cap[3] # load 4th sample

print("Image Size: ", img.size())
print(target)

Output:

Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
u'A plane darts across a bright blue sky behind a mountain covered in snow',
u'A plane leaves a contrail above the snowy mountain top.',
u'A mountain that has a plane flying overheard in the distance.',
u'A mountain view with a plume of smoke in the background']

Detection:

dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])

LSUN

dset.LSUN(db_path, classes='train', [transform, target_transform])

db_path = root directory for the database files
classes =
'train' - all categories, training set
'val' - all categories, validation set
'test' - all categories, test set
['bedroom_train', 'church_train', ...] : a list of categories to load

CIFAR

dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)

dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)

root : root directory of the dataset, containing the folder cifar-10-batches-py
train : True = Training set, False = Test set
download : True = downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it does nothing.

ImageFolder

A generic data loader where the images are arranged in this way:

root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png

dset.ImageFolder(root="root folder path", [transform, target_transform])

It has the members:

self.classes - The class names as a list
self.class_to_idx - The class index corresponding to each class name
self.imgs - The list of (image path, class-index) tuples
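As a rough illustration of how these three members relate to the directory layout, the sketch below scans a root/class/file tree using only the standard library. The function scan_image_folder is hypothetical, and two choices in it are assumptions rather than documented behaviour: classes are indexed in sorted order, and only .png files are collected.

```python
import os
import tempfile


def scan_image_folder(root):
    """Build classes, class_to_idx and imgs from a root/<class>/<file> layout."""
    # Assumption: class indices follow the sorted order of the folder names.
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}
    imgs = []
    for name in classes:
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            if fname.lower().endswith(".png"):  # assumption: PNG files only
                imgs.append((os.path.join(folder, fname), class_to_idx[name]))
    return classes, class_to_idx, imgs


# Recreate part of the layout shown above with empty placeholder files.
root = tempfile.mkdtemp()
for rel in ["dog/xxx.png", "dog/xxy.png", "cat/123.png"]:
    path = os.path.join(root, *rel.split("/"))
    if not os.path.isdir(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    open(path, "w").close()

classes, class_to_idx, imgs = scan_image_folder(root)
print(classes)
print(class_to_idx)
print([(os.path.basename(p), idx) for p, idx in imgs])
```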

Imagenet-12

This is simply implemented with an ImageFolder dataset.

The data is preprocessed as described here

Here is an example.

Models

The models subpackage contains definitions for the following model architectures:

AlexNet: AlexNet variant from the “One weird trick” paper.
VGG: VGG-11, VGG-13, VGG-16, VGG-19 (with and without batch normalization)
ResNet: ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152

You can construct a model with random weights by calling its constructor:

import torchvision.models as models
resnet18 = models.resnet18()
alexnet = models.alexnet()

We provide pre-trained models for the ResNet variants and AlexNet, using the PyTorch model zoo. These can be constructed by passing pretrained=True:

import torchvision.models as models
resnet18 = models.resnet18(pretrained=True)
alexnet = models.alexnet(pretrained=True)

Transforms

Transforms are common image transforms. They can be chained together using transforms.Compose

transforms.Compose

Several transforms can be composed together. For example:

transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                          std = [ 0.229, 0.224, 0.225 ]),
])
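Conceptually, composing transforms means calling each one in order on the output of the previous one. The following is a minimal sketch of that idea, not the library's code: SimpleCompose is a hypothetical name, and the three lambdas are toy stand-ins operating on a flat list of pixel values rather than on a PIL.Image or tensor.

```python
class SimpleCompose(object):
    """Minimal sketch: apply a list of callables in order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img


# Toy "image": a flat list of pixel values in [0, 255].
pipeline = SimpleCompose([
    lambda img: img[::-1],                       # stand-in for a flip
    lambda img: [p / 255.0 for p in img],        # stand-in for scaling to [0, 1]
    lambda img: [(p - 0.5) / 0.5 for p in img],  # stand-in for normalization
])

result = pipeline([0, 255])
print(result)
```

Order matters: in this toy pipeline the normalization step assumes values already scaled to [0, 1], just as Normalize in the example above comes after ToTensor.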

Transforms on PIL.Image

Scale(size, interpolation=Image.BILINEAR)

Rescales the input PIL.Image to the given 'size'. 'size' will be the size of the smaller edge.

For example, if height > width, then the image will be rescaled to (size * height / width, size).

size: size of the smaller edge
interpolation: Default: PIL.Image.BILINEAR
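The size arithmetic for Scale can be worked through in a few lines. The helper scaled_size below is hypothetical and only computes the output dimensions; it returns them in (width, height) order, and truncating the longer edge to an integer is an assumption.

```python
def scaled_size(width, height, size):
    """Return (new_width, new_height) with the smaller edge equal to `size`."""
    if width <= height:
        # Width is the smaller edge: pin it to `size`, scale the height.
        return size, int(size * height / width)
    # Height is the smaller edge: pin it to `size`, scale the width.
    return int(size * width / height), size


landscape = scaled_size(640, 427, 256)
portrait = scaled_size(427, 640, 256)
print(landscape, portrait)
```

The aspect ratio is preserved up to integer truncation of the longer edge.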

CenterCrop(size) - center-crops the image to the given size

Crops the given PIL.Image at the center to have a region of the given size. size can be a tuple (target_height, target_width) or an integer, in which case the target will be a square of shape (size, size).
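To show how the crop region follows from the image size and the requested size, here is a small sketch that computes a centered crop box. The helper center_crop_box is hypothetical; the box uses the (left, upper, right, lower) convention, and rounding the offsets to the nearest integer is an assumption.

```python
def center_crop_box(width, height, size):
    """Return the (left, upper, right, lower) box of a centered crop."""
    if isinstance(size, int):
        target_h, target_w = size, size  # an integer means a square crop
    else:
        target_h, target_w = size        # tuple is (target_height, target_width)
    left = int(round((width - target_w) / 2.0))
    upper = int(round((height - target_h) / 2.0))
    return left, upper, left + target_w, upper + target_h


square_box = center_crop_box(640, 480, 224)
rect_box = center_crop_box(640, 480, (100, 200))
print(square_box, rect_box)
```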

RandomCrop(size, padding=0)

Crops the given PIL.Image at a random location to have a region of the given size. size can be a tuple (target_height, target_width) or an integer, in which case the target will be a square of shape (size, size). If padding is non-zero, then the image is first zero-padded on each side with padding pixels.

RandomHorizontalFlip()

Randomly horizontally flips the given PIL.Image with a probability of 0.5

RandomSizedCrop(size, interpolation=Image.BILINEAR)

Randomly crops the given PIL.Image to a random size of (0.08 to 1.0) of the original size and a random aspect ratio of 3/4 to 4/3 of the original aspect ratio.

This is popularly used to train the Inception networks.

size: size of the smaller edge
interpolation: Default: PIL.Image.BILINEAR
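The random sampling behind this kind of crop can be sketched as follows. This is an illustration under stated assumptions, not the library's code: sample_crop_size is a hypothetical helper, the aspect ratio is sampled for the crop's own width/height (a simplification), the number of retries is arbitrary, and the fallback to the largest square is an assumption. Resizing the crop to the requested size is omitted.

```python
import math
import random


def sample_crop_size(width, height, rng):
    """Sample a crop (w, h) covering 8%-100% of the area, aspect ratio 3/4-4/3."""
    area = width * height
    for _ in range(10):  # assumption: retry a few times, then fall back
        target_area = rng.uniform(0.08, 1.0) * area
        aspect_ratio = rng.uniform(3.0 / 4.0, 4.0 / 3.0)
        w = int(round(math.sqrt(target_area * aspect_ratio)))
        h = int(round(math.sqrt(target_area / aspect_ratio)))
        if 0 < w <= width and 0 < h <= height:
            return w, h
    # Fallback (assumption): use the largest square that fits.
    side = min(width, height)
    return side, side


rng = random.Random(0)  # seeded only to make the run repeatable
crops = [sample_crop_size(640, 427, rng) for _ in range(5)]
print(crops)
```

Because w * h is approximately target_area and w / h is approximately aspect_ratio, each accepted sample respects both constraints up to rounding.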

Pad(padding, fill=0)

Pads the given image on each side with padding number of pixels, and the padding pixels are filled with pixel value fill. If a 5x5 image is padded with padding=1, then it becomes 7x7.
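The 5x5 to 7x7 example can be checked with a small sketch on a nested list. The helper pad_2d is hypothetical and works on a single-channel image stored as a list of rows, not on a PIL.Image.

```python
def pad_2d(image, padding, fill=0):
    """Pad a 2D list of pixel values on every side with `fill`."""
    width = len(image[0]) + 2 * padding
    top = [[fill] * width for _ in range(padding)]
    bottom = [[fill] * width for _ in range(padding)]
    body = [[fill] * padding + list(row) + [fill] * padding for row in image]
    return top + body + bottom


image = [[1] * 5 for _ in range(5)]  # a 5x5 image of ones
padded = pad_2d(image, padding=1)
print(len(padded), len(padded[0]))
```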

Transforms on torch.*Tensor

Normalize(mean, std)

Given mean: (R, G, B) and std: (R, G, B), will normalize each channel of the torch.*Tensor, i.e. channel = (channel - mean) / std
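The per-channel formula can be worked through on plain lists. The function normalize below is a hypothetical pure-Python sketch, with each channel stored as a flat list of values; the mean and std values are the ones used in the Compose example, and the pixel values are made up so that the results are easy to check by hand.

```python
def normalize(channels, mean, std):
    """Apply (value - mean) / std to each channel independently."""
    return [
        [(v - m) / s for v in channel]
        for channel, m, s in zip(channels, mean, std)
    ]


# Three channels (R, G, B) with two pixel values each (made-up data).
# The first value of each channel equals the mean, the second is mean + std.
pixels = [[0.485, 0.714], [0.456, 0.680], [0.406, 0.631]]
out = normalize(pixels, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
print([[round(v, 3) for v in channel] for channel in out])
```

A value equal to the channel mean maps to 0, and a value one standard deviation above the mean maps to approximately 1.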

Conversion Transforms

ToTensor() - Converts a PIL.Image (RGB) or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
ToPILImage() - Converts a torch.*Tensor of range [0, 1] and shape C x H x W, or a numpy ndarray of dtype=uint8, range [0, 255] and shape H x W x C, to a PIL.Image of range [0, 255]
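The layout and range change performed by a ToTensor-style conversion can be illustrated on nested lists. The helper to_chw_float is hypothetical; it mirrors only the H x W x C to C x H x W reordering and the division by 255, and does not produce a torch tensor.

```python
def to_chw_float(image_hwc):
    """Convert an H x W x C nested list in [0, 255] to C x H x W in [0.0, 1.0]."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [[image_hwc[y][x][c] / 255.0 for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A 1 x 2 RGB image: one red pixel and one white pixel.
image = [[[255, 0, 0], [255, 255, 255]]]
tensor = to_chw_float(image)
print(tensor)
```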

Generic Transforms

Lambda(lambda)

Given a Python lambda, applies it to the input img and returns the result. For example:

transforms.Lambda(lambda x: x.add(10))

Utils

make_grid(tensor, nrow=8, padding=2)

Given a 4D mini-batch Tensor of shape (B x C x H x W), makes a grid of images
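As a rough guide to how a mini-batch maps onto a grid, the sketch below computes the number of rows and columns. It assumes that nrow is the maximum number of images placed in each row of the grid; grid_shape is a hypothetical helper, and the pixel layout (including how padding is applied) is left out.

```python
import math


def grid_shape(batch_size, nrow=8):
    """Return (rows, columns) for a grid with at most `nrow` images per row."""
    columns = min(nrow, batch_size)
    rows = int(math.ceil(batch_size / float(columns)))
    return rows, columns


full = grid_shape(20)   # more images than fit in one row
small = grid_shape(5)   # fewer images than nrow
print(full, small)
```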

save_image(tensor, filename, nrow=8, padding=2)

Saves a given Tensor into an image file.

If given a mini-batch tensor, will save the tensor as a grid of images.

Project details


Release history

0.25.0

Jan 21, 2026

0.24.1

Nov 12, 2025

0.24.0

Oct 15, 2025

0.23.0

Aug 6, 2025

0.22.1

Jun 4, 2025

0.22.0

Apr 23, 2025

0.21.0

Jan 29, 2025

0.20.1

Oct 29, 2024

0.20.0

Oct 17, 2024

0.19.1

Sep 4, 2024

0.19.0

Jul 24, 2024

0.18.1

Jun 5, 2024

0.18.0

Apr 24, 2024

0.17.2

Mar 27, 2024

0.17.1

Feb 22, 2024

0.17.0

Jan 30, 2024

0.16.2

Dec 14, 2023

0.16.1

Nov 15, 2023

0.16.0

Oct 4, 2023

0.15.2

May 8, 2023

0.15.1

Mar 15, 2023

0.15.0 yanked

Mar 15, 2023

Reason this release was yanked:

Contains an incorrect torch dependency

0.14.1

Dec 15, 2022

0.14.0

Oct 28, 2022

0.13.1

Aug 5, 2022

0.13.0

Jun 28, 2022

0.12.0

Mar 10, 2022

0.11.3

Jan 27, 2022

0.11.2

Dec 15, 2021

0.11.1

Oct 21, 2021

0.11.0 yanked

Oct 21, 2021

Reason this release was yanked:

Dependency issue, depends on a version of torch that does not exist on pypi

0.10.1

Sep 20, 2021

0.10.0

Jun 15, 2021

0.9.1

Mar 25, 2021

0.9.0

Mar 4, 2021

0.8.2

Dec 10, 2020

0.8.1

Oct 27, 2020

0.8.0

Oct 27, 2020

0.7.0

Jul 28, 2020

0.6.1

Jun 18, 2020

0.6.0

Apr 21, 2020

0.5.0

Jan 15, 2020

0.4.2

Nov 7, 2019

0.4.1.post2

Oct 22, 2019

0.4.1

Oct 10, 2019

0.4.0

Aug 8, 2019

0.3.0

May 22, 2019

0.2.2.post3 yanked

Mar 1, 2019

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.2.2.post2 yanked

Feb 28, 2019

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.2.2 yanked

Feb 27, 2019

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.2.1 yanked

Apr 24, 2018

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.2.0 yanked

Dec 5, 2017

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.1.9 yanked

Aug 6, 2017

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.1.8 yanked

Mar 29, 2017

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

This version

0.1.7 yanked

Jan 19, 2017

Reason this release was yanked:

So that users won't accidentally install this when using python 3.11

0.1.6 yanked

Jan 18, 2017

Reason this release was yanked:

0.1.6 is past its support date and confuses users on unsupported platforms

Download files


Source Distributions

No source distribution files are available for this release.

Built Distribution


torchvision-0.1.7-py2.py3-none-any.whl (23.3 kB view details)

Uploaded Jan 19, 2017 Python 2, Python 3

File details

Details for the file torchvision-0.1.7-py2.py3-none-any.whl.

File metadata

Download URL: torchvision-0.1.7-py2.py3-none-any.whl
Upload date: Jan 19, 2017
Size: 23.3 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No

File hashes

Hashes for torchvision-0.1.7-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`ee509d6b1cb10be6daad4b7e72712d914cae5f7edca61737795e0d7d41c17826`
MD5	`ebafe580c103f71e245d3d8c013d771b`
BLAKE2b-256	`0fc0262ab7e4ff08c3c1c74e02285bc225e8135b25ea527b762f3d690417a657`

