A utility repo for vision dataset access and management.

Vision Datasets

Introduction

This repo

defines the contract for datasets used for purposes such as training, visualization, and exploration
provides an API for organizing and accessing datasets: DatasetHub

Dataset Contracts

DatasetManifest wraps the information about a dataset, including the labelmap, images (width, height, path to image), and annotations.
ImageDataManifest encapsulates image-specific information, such as image id, path, labels, and width/height. Note that the image path can be
1. a local path (absolute c:\images\1.jpg or relative images\1.jpg),
2. a local path in a non-compressed zip file (absolute c:\images.zip@1.jpg or relative images.zip@1.jpg), or
3. a URL.
ManifestDataset is an iterable dataset class that consumes the information from DatasetManifest.

ManifestDataset can load data from all three kinds of paths. Options 1 and 2 are good for training, as they read data from local disk, while option 3 is good for data exploration if you have the data in Azure storage.
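The three kinds of image path described above could be told apart roughly as follows. This is only a sketch for illustration, not the library's own logic: the function names classify_image_path and split_zip_path are hypothetical, and the rule "a URL starts with http or https" is an assumption.

```python
from typing import Tuple
from urllib.parse import urlparse


def classify_image_path(path: str) -> str:
    """Return 'url', 'zip_member', or 'local' for an image path string."""
    # Only http/https count as URLs here; a Windows drive letter such as
    # 'c:' would otherwise be mistaken for a URL scheme.
    if urlparse(path).scheme in ("http", "https"):
        return "url"
    if ".zip@" in path:
        return "zip_member"
    return "local"


def split_zip_path(path: str) -> Tuple[str, str]:
    """Split 'images.zip@1.jpg' into ('images.zip', '1.jpg')."""
    cut = path.index(".zip@") + len(".zip")
    return path[:cut], path[cut + 1:]


for p in [r"c:\images\1.jpg", r"c:\images.zip@1.jpg", "https://example.com/1.jpg"]:
    print(p, "->", classify_image_path(p))
```

Splitting at ".zip@" rather than at the first "@" keeps folder names that happen to contain "@" intact.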

Currently, four basic types of data are supported: classification_multilabel, classification_multiclass, object_detection, and image_caption. The multitask type is a composite type, in which one set of images has multiple sets of annotations for different tasks, and each task can be of any basic type.

For a multitask dataset, the labels stored in ImageDataManifest are a dict mapping from task name to that task's labels. The labelmap stored in DatasetManifest is also a dict, mapping from task name to that task's labels.
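To make the two dicts concrete, here is a small sketch using plain Python data. The task names, class names, and the use of class indices as the per-task label representation are all assumptions chosen for illustration. The helper label_names is hypothetical and not part of the library.

```python
# Hypothetical labelmap for a two-task dataset: task name -> class names.
labelmap = {
    "vehicle_color": ["red", "blue", "white"],
    "vehicle_type": ["sedan", "truck"],
}

# Hypothetical labels of one image: task name -> that task's labels.
# Class indices are assumed here as the label representation.
image_labels = {
    "vehicle_color": [2],
    "vehicle_type": [0],
}


def label_names(labels, labelmap):
    """Resolve per-task class indices to class names."""
    return {task: [labelmap[task][i] for i in idxs] for task, idxs in labels.items()}


print(label_names(image_labels, labelmap))
# {'vehicle_color': ['white'], 'vehicle_type': ['sedan']}
```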

Creating DatasetManifest

In addition to loading a serialized DatasetManifest, this repo currently supports two data formats from which a DatasetManifest can be instantiated, IRIS and COCO, using DatasetManifest.create_dataset_manifest(dataset_info, usage, container_sas_or_root_dir).

DatasetInfo, the first argument, wraps the meta information about the dataset, such as its name and the locations of the images and annotation files. See the sections below for examples in each data format.

Once a DatasetManifest is created, you can create a ManifestDataset for accessing the dataset:

dataset = ManifestDataset(dataset_info, dataset_manifest, coordinates='relative')

Coco format

Here is an annotated example of what a DatasetInfo looks like for coco format when it is serialized into JSON:

    {
        "name": "sampled-ms-coco",
        "version": 1,
        "description": "A sampled ms-coco dataset.",
        "type": "object_detection",
        "format": "coco", // indicating the annotation data are stored in coco format
        "root_folder": "detection/coco2017_20200401", // a root folder for all files listed
        "train": {
            "index_path": "train.json", // coco json file for training, see next section for example
            "files_for_local_usage": [ // associated files including data such as images
                "train_images.zip"
            ]
        },
        "val": {
            "index_path": "val.json",
            "files_for_local_usage": [
                "test_images.zip"
            ]
        },
        "test": {
            "index_path": "test.json",
            "files_for_local_usage": [
                "test_images.zip"
            ]
        }
    }

Coco JSON - Image classification

Here is one example of the train.json, val.json, or test.json in the DatasetInfo above. Note that the "id" values for images, annotations, and categories should be consecutive integers starting from 1. Our lib might work with ids starting from 0, but many tools, such as CVAT and the official COCOAPI, will fail.

{
  "images": [{"id": 1, "width": 224.0, "height": 224.0, "file_name": "train_images.zip@siberian-kitten.jpg"},
              {"id": 2, "width": 224.0, "height": 224.0, "file_name": "train_images.zip@kitten 3.jpg"}],
              //  file_name is the image path, which supports three formats as described in previous section.
  "annotations": [
      {"id": 1, "category_id": 1, "image_id": 1},
      {"id": 2, "category_id": 1, "image_id": 2},
      {"id": 3, "category_id": 2, "image_id": 2}
  ],
  "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}]
}
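As a rough illustration of how such a file can be read, the sketch below checks the "consecutive ids starting from 1" rule and collects the category names of each image. It uses only the standard library and a comment-free copy of the example above. The helper names are hypothetical and this is not how the library itself parses the file.

```python
import json
from collections import defaultdict

coco = json.loads("""
{
  "images": [
    {"id": 1, "width": 224.0, "height": 224.0, "file_name": "train_images.zip@siberian-kitten.jpg"},
    {"id": 2, "width": 224.0, "height": 224.0, "file_name": "train_images.zip@kitten 3.jpg"}
  ],
  "annotations": [
    {"id": 1, "category_id": 1, "image_id": 1},
    {"id": 2, "category_id": 1, "image_id": 2},
    {"id": 3, "category_id": 2, "image_id": 2}
  ],
  "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}]
}
""")


def check_consecutive_ids(items):
    """True if the 'id' fields are exactly 1, 2, ..., len(items)."""
    ids = sorted(item["id"] for item in items)
    return ids == list(range(1, len(ids) + 1))


def labels_per_image(coco):
    """Map image id -> list of category names annotated on that image."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    result = defaultdict(list)
    for ann in coco["annotations"]:
        result[ann["image_id"]].append(names[ann["category_id"]])
    return dict(result)


for key in ("images", "annotations", "categories"):
    print(key, "ids consecutive from 1:", check_consecutive_ids(coco[key]))
print(labels_per_image(coco))  # {1: ['cat'], 2: ['cat', 'dog']}
```

Image 2 carries two annotations, which is how a multilabel classification sample is expressed in this layout.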

Coco JSON - Object detection

{
  "images": [{"id": 1, "width": 224.0, "height": 224.0, "file_name": "train_images.zip@siberian-kitten.jpg"},
              {"id": 2, "width": 224.0, "height": 224.0, "file_name": "train_images.zip@kitten 3.jpg"}],
  "annotations": [
      {"id": 1, "category_id": 1, "image_id": 1, "bbox": [10, 10, 100, 100]},
      {"id": 2, "category_id": 1, "image_id": 2, "bbox": [100, 100, 200, 200]},
      {"id": 3, "category_id": 2, "image_id": 2, "bbox": [20, 20, 200, 200]}
  ],
  "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}]
}

The bbox should be given in absolute pixel positions, in either ltwh format, [left, top, width, height], or ltrb format, [left, top, right, bottom]. ltwh is the default. To use ltrb, specify "bbox_format": "ltrb" in the coco json file.

Note that

ltrb used to be the default. If your coco annotations were prepared to work with this repo before version 0.1.2, please add "bbox_format": "ltrb" to your coco file.
Regardless of the format in which bboxes are stored in the coco file, when annotations are transformed into ImageDataManifest, the bbox is unified into ltrb: [left, top, right, bottom].
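The unification into ltrb amounts to simple arithmetic. Here is a minimal sketch, assuming the format name is passed in as a string. The function to_ltrb is a hypothetical helper, not the library's API.

```python
def to_ltrb(bbox, bbox_format="ltwh"):
    """Convert a bbox in 'ltwh' or 'ltrb' format to [left, top, right, bottom]."""
    if bbox_format == "ltrb":
        return list(bbox)
    if bbox_format == "ltwh":
        left, top, width, height = bbox
        # right = left + width, bottom = top + height
        return [left, top, left + width, top + height]
    raise ValueError(f"Unknown bbox format: {bbox_format!r}")


print(to_ltrb([10, 10, 100, 100]))          # [10, 10, 110, 110]
print(to_ltrb([10, 10, 100, 100], "ltrb"))  # [10, 10, 100, 100]
```

The same four numbers describe two different boxes depending on the format, which is why files written for the older ltrb default need the explicit "bbox_format" entry.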

Coco JSON - Image caption

Here is one example of the json file for image caption task.

{
  "images": [{"id": 1, "file_name": "train_images.zip@honda.jpg"},
              {"id": 2, "file_name": "train_images.zip@kitchen.jpg"}],
  "annotations": [
      {"id": 1, "image_id": 1, "caption": "A black Honda motorcycle parked in front of a garage."},
      {"id": 2, "image_id": 1, "caption": "A Honda motorcycle parked in a grass driveway."},
      {"id": 3, "image_id": 1, "caption": "A black Honda motorcycle with a dark burgundy seat."},
      {"id": 4, "image_id": 1, "caption": "Ma motorcycle parked on the gravel in front of a garage."},
      {"id": 5, "image_id": 1, "caption": "A motorcycle with its brake extended standing outside."},
      {"id": 6, "image_id": 2, "caption": "A picture of a modern looking kitchen area.\n"},
      {"id": 7, "image_id": 2, "caption": "A narrow kitchen ending with a chrome refrigerator."},
      {"id": 8, "image_id": 2, "caption": "A narrow kitchen is decorated in shades of white, gray, and black."},
      {"id": 9, "image_id": 2, "caption": "a room that has a stove and a icebox in it"},
      {"id": 10, "image_id": 2, "caption": "A long empty, minimal modern skylit home kitchen."}
  ]
}

Iris format

Here is an example with explanation of what a DatasetInfo looks like for iris format:

    {
        "name": "sampled-ms-coco",
        "version": 1,
        "description": "A sampled ms-coco dataset.",
        "type": "object_detection",
        "root_folder": "detection/coco2017_20200401",
        "format": "iris", // indicating the annotation data are stored in iris format
        "train": {
            "index_path": "train_images.txt", // index file for images and labels for training, example can be found in next section
            "files_for_local_usage": [
                "train_images.zip",
                "train_labels.zip"
            ]
        },
        "val": {
            "index_path": "val_images.txt",
            "files_for_local_usage": [
                "val_images.zip",
                "val_labels.zip"
            ]
        },
        "test": {
            "index_path": "test_images.txt",
            "files_for_local_usage": [
                "test_images.zip",
                "test_labels.zip"
            ]
        },
        "labelmap": "labels.txt", // includes tag names
        "image_metadata_path": "image_meta_info.txt" // includes info about image width and height
    }

Iris image classification format

Each row in the index file (index_path) is:

<image_filepath> <comma-separated-label-indices>

Note that the class/label index should start from zero.

Example:

train_images1.zip@1.jpg 0,1,2
train_images2.zip@1.jpg 2,3
...
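A line of this index file could be parsed along the following lines. This is a sketch under two assumptions: every line carries a label field, and the line is split at its last space so that an image path containing spaces stays in one piece. The function parse_ic_line is a hypothetical helper, not the library's parser.

```python
def parse_ic_line(line):
    """Parse '<image_filepath> <comma-separated-label-indices>' into (path, [indices])."""
    # Split once from the right: everything before the last space is the path.
    image_path, label_field = line.strip().rsplit(" ", 1)
    labels = [int(x) for x in label_field.split(",")]
    return image_path, labels


print(parse_ic_line("train_images1.zip@1.jpg 0,1,2"))
# ('train_images1.zip@1.jpg', [0, 1, 2])
print(parse_ic_line("train_images2.zip@1.jpg 2,3"))
# ('train_images2.zip@1.jpg', [2, 3])
```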

Iris object detection format

The index file for OD is slightly different from that for IC. Each row in the index file is:

<image_filepath> <label_filepath>

As with classification, the class/label index should start from 0.

Example for train_images.txt:

train_images.zip@1.jpg train_labels.zip@1.txt
train_images.zip@2.jpg train_labels.zip@2.txt
...

Format and example of a label file such as train_labels.zip@1.txt:

class_index left top right bottom

3 200 300 600 1200 // class_id, left, top, right, bottom
4 100 100 200 200
...
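The content of such a label file can be read with a few lines of Python. The sketch below assumes whitespace-separated fields and no inline comments in the real file. The function parse_od_label_text is a hypothetical helper for illustration, not the library's parser.

```python
def parse_od_label_text(text):
    """Parse lines of 'class_index left top right bottom' into (class_index, [l, t, r, b]) tuples."""
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(f"Expected 5 fields, got {len(fields)}: {line!r}")
        class_index = int(fields[0])
        left, top, right, bottom = (float(v) for v in fields[1:])
        boxes.append((class_index, [left, top, right, bottom]))
    return boxes


label_text = "3 200 300 600 1200\n4 100 100 200 200\n"
for class_index, box in parse_od_label_text(label_text):
    print(class_index, box)
```

Coordinates are parsed as floats so that both integer and fractional pixel positions are accepted.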

Multitask DatasetInfo

The DatasetInfo for multitask is not very different from that for a single task. The json contains a 'tasks' section, and the 'type' of the dataset is 'multitask'. Each entry under 'tasks' wraps the info specific to that task.

Below is an example for 'iris' format, but the general idea applies to 'coco' format as well.

{
    "name": "coco-vehicle-multitask",
    "version": 1,
    "type": "multitask",
    "root_folder": "classification/coco_vehicle_multitask_20210202",
    "format": "iris",
    "tasks": {
        "vehicle_color": {
            "type": "classification_multiclass",
            "train": {
                "index_path": "train_images_VehicleColor.txt",
                "files_for_local_usage": [
                    "train_images.zip"
                ]
            },
            "test": {
                "index_path": "test_images_VehicleColor.txt",
                "files_for_local_usage": [
                    "test_images.zip"
                ]
            },
            "labelmap": "labels_VehicleColor.txt"
        },
        "vehicle_type": {
            "type": "classification_multiclass",
            "train": {
                "index_path": "train_images_VehicleType.txt",
                "files_for_local_usage": [
                    "train_images.zip"
                ]
            },
            "test": {
                "index_path": "test_images_VehicleType.txt",
                "files_for_local_usage": [
                    "test_images.zip"
                ]
            },
            "labelmap": "labels_VehicleType.txt"
        }
    }
}

Dataset management and access

Once you have multiple datasets, it is more convenient to keep all the DatasetInfo in one place and instantiate a DatasetManifest, or even a ManifestDataset, using just the dataset name, usage (train, val, test), and version.

This repo offers the class DatasetHub for this purpose. Once it is instantiated with a json including the DatasetInfo for all datasets, you can retrieve a ManifestDataset as follows:

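import pathlib

from vision_datasets import DatasetHub  # import path assumed; adjust to your installed version

# blob_container_sas and local_dir must be defined beforehand;
# either one may be None, as described below.
dataset_infos_json_path = 'datasets.json'
dataset_hub = DatasetHub(pathlib.Path(dataset_infos_json_path).read_text())
stanford_cars = dataset_hub.create_manifest_dataset(blob_container_sas, local_dir, 'stanford-cars', version=1, usage='train')

# Each sample is an (image, targets, sample index string) tuple.
for img, targets, sample_idx_str in stanford_cars:
    img.show()
    img.close()
    print(targets)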

Note that this hub class works with data saved in both Azure Blob container and on local disk.

If local_dir:

is provided, the hub will look for the resources locally and, if they are not present, download the data (the files listed in "files_for_local_usage", the index files, and, for iris format, the metadata and labelmap) from blob_container_sas
is NOT provided (i.e. None), the hub will create a manifest dataset that directly consumes data from the blob indicated by blob_container_sas. Note that this does not work if the data are stored in zip files; you will have to unzip your data in the Azure blob. (Index files require no update if the image paths point to zip files, e.g. "a.zip@1.jpg".) This kind of Azure-based dataset is good for exploring large datasets, but can be slow for training.

When data exists on local disk, blob_container_sas can be None.

Training with PyTorch

Training with PyTorch is easy. After instantiating a ManifestDataset, simply pass it to vision_datasets.pytorch.torch_dataset.TorchDataset together with the transform, and you are good to go with the PyTorch DataLoader for training.
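The wrapper idea can be sketched without PyTorch installed. The class below is NOT TorchDataset's implementation. It is a hypothetical stand-in (TransformedDataset) that shows the map-style dataset protocol, that is, __len__ plus __getitem__, which the PyTorch DataLoader expects. The (image, target) transform signature and the list standing in for a ManifestDataset are assumptions.

```python
class TransformedDataset:
    """Map-style wrapper: applies a transform to each sample on access."""

    def __init__(self, dataset, transform=None):
        self.dataset = dataset
        self.transform = transform

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        # Samples are assumed to be (image, targets, sample index string) tuples.
        image, target, idx_str = self.dataset[index]
        if self.transform is not None:
            image, target = self.transform(image, target)
        return image, target, idx_str


# Stand-in data: strings instead of images, lists of class indices as targets.
samples = [("img-a", [0], "0"), ("img-b", [1, 2], "1")]


def upper_case(image, target):
    """Toy transform standing in for real image augmentation."""
    return image.upper(), target


wrapped = TransformedDataset(samples, transform=upper_case)
print(len(wrapped), wrapped[1])  # 2 ('IMG-B', [1, 2], '1')
```

Applying the transform lazily in __getitem__ means augmentation runs per sample as the loader fetches it, rather than once up front.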

Managing datasets with DatasetHub on cloud storage

If you are using DatasetHub to manage datasets in cloud storage, we recommend zipping the images (in uncompressed mode) into one or more zip files before uploading them, and updating the file paths in the index files from train\1.jpg to train.zip@1.jpg. You can do this with 7zip (set the compression level to 'store') on Windows or the zip command on Linux.

If you upload folders of images directly to cloud storage:

you will have to list all images in "files_for_local_usage", which can be millions of entries
downloading images one by one (even with multithreading) is much slower than downloading a few zip files

Also note that when you create a zip file train.zip, you might find that it contains a single train folder. This will cause file loading to fail if the path is train.zip@1.jpg, as the image is actually at train.zip@train\1.jpg. It is usually a good idea to avoid this extra folder layer when zipping and to double-check that it was not added by mistake.
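As an alternative to 7zip or the zip command, the same kind of archive can be produced with Python's standard zipfile module. The sketch below stores files uncompressed, writes them relative to the image folder so that no extra top-level folder appears, and includes a check for that extra layer. The function names are hypothetical and the temporary files exist only for the demonstration.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_images_flat(image_dir, zip_path):
    """Zip all files under image_dir in 'store' mode, with paths relative to image_dir."""
    image_dir = Path(image_dir)
    # zip_path should live outside image_dir, or the archive would include itself.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for file in sorted(image_dir.rglob("*")):
            if file.is_file():
                zf.write(file, arcname=file.relative_to(image_dir).as_posix())


def has_single_top_folder(zip_path):
    """True if every file in the zip sits under one and the same top-level folder."""
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]
    tops = {n.split("/", 1)[0] for n in names if "/" in n}
    return bool(names) and len(tops) == 1 and all("/" in n for n in names)


with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "train"
    train_dir.mkdir()
    (train_dir / "1.jpg").write_bytes(b"not a real image")
    archive = Path(tmp) / "train.zip"
    zip_images_flat(train_dir, archive)
    with zipfile.ZipFile(archive) as zf:
        print(zf.namelist())  # ['1.jpg']
    print(has_single_top_folder(archive))  # False
```

With this layout the image is addressed as train.zip@1.jpg, matching the index file paths recommended above.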

vision-datasets 0.1.9
