
Python API to work with the Visual Wake Words Dataset.

Project description

Visual Wake Words Dataset

Python library to work with the Visual Wake Words Dataset, comparable to pycocotools for the COCO dataset.

pyvww.utils.VisualWakeWords inherits from pycocotools.coco.COCO and can be used in a similar fashion.
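
Because it inherits from pycocotools.coco.COCO, a generated annotation file can be queried with the familiar COCO accessors. A minimal sketch (the annotation path is a placeholder, and the constructor is assumed to take the annotation file path just like the COCO class):

from pyvww.utils import VisualWakeWords

vww = VisualWakeWords(".../visualwakewords/annotations/instances_train.json")

# pycocotools-style accessors inherited from COCO
img_ids = vww.getImgIds()                              # all image ids
img = vww.loadImgs(img_ids[0])[0]                      # metadata dict for one image
anns = vww.loadAnns(vww.getAnnIds(imgIds=img["id"]))   # its annotations
print(img["file_name"], [a["category_id"] for a in anns])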

pyvww.pytorch.VisualWakeWordsClassification is a PyTorch Dataset that can be used like any image classification dataset.


Installation

The code is implemented in Python 3.7 and can be installed with pip:

pip install pyvww

Usage

The Visual Wake Words Dataset is derived from the publicly available COCO dataset. To download the COCO dataset, use the script scripts/download_mscoco.sh:

bash scripts/download_mscoco.sh path-to-COCO-dataset year

Here, year is an optional argument that can be either 2014 (default) or 2017.

The Visual Wake Words Dataset uses the minival image ids for evaluation and the remaining ~115k images of the COCO training/validation set (the maxitrain split) for training.

To create COCO annotation files for the maxitrain/minival split from the 2014 or 2017 annotations, use scripts/create_coco_train_minival_split.py:

TRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_train2014.json"
VAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_val2014.json"
DIR="path-to-mscoco-dataset/annotations/"
python scripts/create_coco_train_minival_split.py \
  --train_annotations_file="${TRAIN_ANNOTATIONS_FILE}" \
  --val_annotations_file="${VAL_ANNOTATIONS_FILE}" \
  --output_dir="${DIR}"

(Replace 2014 with 2017 if you downloaded the 2017 dataset.)

The Visual Wake Words dataset is created from the COCO dataset as follows. Each image is assigned a label of 1 or 0: the label 1 is assigned if the image contains at least one bounding box of the object of interest (e.g. person) whose area is greater than a given threshold (e.g. 0.5% of the image area); otherwise the label is 0.
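
For illustration, the labeling rule boils down to the following per-image check (a sketch against the original COCO annotations using pycocotools; the helper is hypothetical and not the actual implementation in scripts/create_visualwakewords_annotations.py):

from pycocotools.coco import COCO

def vww_label(coco, img_id, cat_id, threshold=0.005):
    """Return 1 if the image has at least one box of the target
    category covering more than `threshold` of the image area."""
    img = coco.loadImgs(img_id)[0]
    img_area = img["width"] * img["height"]
    ann_ids = coco.getAnnIds(imgIds=img_id, catIds=[cat_id], iscrowd=None)
    for ann in coco.loadAnns(ann_ids):
        _, _, w, h = ann["bbox"]              # [x, y, width, height]
        if (w * h) / img_area > threshold:
            return 1                          # object of interest present
    return 0                                  # background

coco = COCO("path-to-mscoco-dataset/annotations/instances_maxitrain.json")
person_id = coco.getCatIds(catNms=["person"])[0]
labels = {i: vww_label(coco, i, person_id) for i in coco.getImgIds()}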

To generate the new annotations, use the script scripts/create_visualwakewords_annotations.py.

MAXITRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_maxitrain.json"
MINIVAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_minival.json"
VWW_OUTPUT_DIR="new-path-to-visualwakewords-dataset/annotations/"
python scripts/create_visualwakewords_annotations.py \
  --train_annotations_file="${MAXITRAIN_ANNOTATIONS_FILE}" \
  --val_annotations_file="${MINIVAL_ANNOTATIONS_FILE}" \
  --output_dir="${VWW_OUTPUT_DIR}" \
  --threshold=0.005 \
  --foreground_class='person'

The generated annotations follow the COCO Data format.

{
  "info" : info, 
  "images" : [image], 
  "annotations" : [annotation], 
  "licenses" : [license],
}

info{
  "year" : int, 
  "version" : str, 
  "description" : str, 
  "url" : str, 
}

image{
  "id" : int, 
  "width" : int, 
  "height" : int, 
  "file_name" : str, 
  "license" : int, 
  "flickr_url" : str, 
  "coco_url" : str, 
  "date_captured" : datetime,
}

license{
  "id" : int, 
  "name" : str, 
  "url" : str,
}

annotation{
  "id" : int, 
  "image_id" : int, 
  "category_id" : int, 
  "area" : float, 
  "bbox" : [x,y,width,height], 
  "iscrowd" : 0 or 1,
}
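
As a quick sanity check, the generated file can be inspected with the json module alone (the path is illustrative; how the 0/1 label maps onto category_id is determined by the generation script, so verify against your own output):

import json
from collections import Counter

with open("new-path-to-visualwakewords-dataset/annotations/instances_train.json") as f:
    vww = json.load(f)

print(vww["info"]["description"])
print("images:", len(vww["images"]))
print("annotations:", len(vww["annotations"]))
print(Counter(a["category_id"] for a in vww["annotations"]))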

PyTorch Dataset

pyvww.pytorch.VisualWakeWordsClassification can be used like any other PyTorch image classification dataset, such as MNIST or ImageNet.

import torch
import pyvww

train_dataset = pyvww.pytorch.VisualWakeWordsClassification(
    root="path-to-mscoco-dataset/all",
    annFile=".../visualwakewords/annotations/instances_train.json")
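
From there the dataset can be wrapped in a standard DataLoader. The sketch below assumes a torchvision-style transform argument (not documented above) to turn the PIL images into batchable tensors; the 96x96 resolution is only an example:

import torch
import torchvision.transforms as T
import pyvww

transform = T.Compose([T.Resize((96, 96)), T.ToTensor()])
train_dataset = pyvww.pytorch.VisualWakeWordsClassification(
    root="path-to-mscoco-dataset/all",
    annFile=".../visualwakewords/annotations/instances_train.json",
    transform=transform)  # transform argument assumed, see note above

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=32, shuffle=True, num_workers=4)

images, labels = next(iter(train_loader))  # images: [32, 3, 96, 96]; labels 0 or 1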

Download files

Download the file for your platform.

Source Distribution

pyvww-0.1.1.tar.gz (4.4 kB)

Built Distribution

pyvww-0.1.1-py3-none-any.whl (8.9 kB)

File details

Details for the file pyvww-0.1.1.tar.gz.

File metadata

  • Download URL: pyvww-0.1.1.tar.gz
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.2

File hashes

Hashes for pyvww-0.1.1.tar.gz

  • SHA256: 5d4b9d3cf2c0ffed130c659c933fdb01278f2b0affd47818b971846e7736aa56
  • MD5: e561016cec189fecad88a1a8fad9d01b
  • BLAKE2b-256: e78ae1c8cf7e6f35c051d4de9c3b2285582bc7f81d882a55b5288332340a15c9


File details

Details for the file pyvww-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pyvww-0.1.1-py3-none-any.whl
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.2

File hashes

Hashes for pyvww-0.1.1-py3-none-any.whl

  • SHA256: eb721e3d5f73d176f6262e7c04c1d12fb92aa3fa8f95c2b5a9a3c77bd0662eca
  • MD5: 22e592f5b559925c5ddf11ceb9dc37bd
  • BLAKE2b-256: b3ab1bf6a32048b845bd84e445413279c36fd0716d65cbab848d3deb1abca5a2

