Tools for downloading computer vision datasets from Google's OpenImages dataset

openimages

Tools for downloading images and corresponding annotations from Google's OpenImages dataset.

Download images and annotations

The openimages package provides a download module with two download functions and a matching CLI (command line interface), installed as script entry points, for downloading images and their corresponding annotations from the OpenImages dataset.

Public API
  • openimages.download.download_images for downloading images only

    For example, to download all images for the two classes "Hammer" and "Scissors" into the directories "/dest/dir/Hammer/images" and "/dest/dir/Scissors/images":

    from openimages.download import download_images
    download_images("/dest/dir", ["Hammer", "Scissors",])
    
  • openimages.download.download_dataset for downloading images and corresponding annotations

    For example, to download all images and corresponding annotations in PASCAL VOC format for the two classes "Hammer" and "Scissors" into the directories "/dest/dir/Hammer/[images|pascal]" and "/dest/dir/Scissors/[images|pascal]":

    from openimages.download import download_dataset
    download_dataset("/dest/dir", ["Hammer", "Scissors",], annotation_format="pascal")
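
As a rough sketch of the directory layout described in the two examples above, the helper below computes where the files for each class are expected to land. The name expected_dirs is hypothetical and not part of openimages. It performs no downloads and only mirrors the paths named in the text.

```python
from pathlib import Path


def expected_dirs(base_dir, labels, annotation_format=None):
    """Map each class label to the directories described in the examples above.

    Hypothetical helper for illustration only; not part of the openimages API.
    """
    layout = {}
    for label in labels:
        class_dir = Path(base_dir) / label
        dirs = [class_dir / "images"]
        if annotation_format is not None:
            # Annotated datasets get a sibling directory named after the format.
            dirs.append(class_dir / annotation_format)
        layout[label] = dirs
    return layout


layout = expected_dirs("/dest/dir", ["Hammer", "Scissors"], annotation_format="pascal")
for label, dirs in layout.items():
    print(label, [d.as_posix() for d in dirs])
```

Calling the helper without annotation_format gives the images-only layout that corresponds to download_images.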
    
Command Line Interface

Two Python script entry points are installed when the package is installed into a Python environment, corresponding to the public API functions described above: oi_download_dataset and oi_download_images. These commands accept the following options:

| Option | Required | Description |
| --- | --- | --- |
| --base_dir <dir> | yes | directory into which images and annotations will be downloaded, with each class label having a separate subdirectory containing an "images" subdirectory for image files and (for annotated datasets) an <annotation_format> subdirectory for annotation files |
| --labels <label1> [<label_2> ...] | yes | space-separated list of class labels, at least one required, multi-word labels with spaces must be quoted |
| --format <annotation_format> | yes for an annotated dataset, not applicable for images only | required for downloading an annotated dataset, currently supported format specifiers are "darknet" and "pascal" |
| --csv_dir <dir> | no, but usually recommended | directory into which the CSV files specifying annotations and class labels are downloaded (if not already present) or read from (if present) |
| --exclusions <file> | no | text file containing image file IDs, one per line, for images to be excluded from the final dataset, useful in cases when images have been identified as problematic |
| --limit <int> | no | the upper limit on the number of images to be downloaded per label class |

NOTE:

If you will run these commands more than once, use the --csv_dir option to specify where the (rather large) CSV files containing bounding box information and class labels are saved. Later runs then read the files from that directory instead of downloading them again.
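
The options above can also be assembled programmatically, for instance when driving the commands from a script. The following is a minimal sketch under stated assumptions: build_command is a hypothetical helper rather than part of openimages, the paths are placeholders, and "Coffee cup" is only an illustrative multi-word label. It uses only the option names listed in the table and shows how a multi-word label ends up quoted.

```python
import shlex


def build_command(base_dir, labels, annotation_format=None, csv_dir=None,
                  exclusions=None, limit=None):
    """Assemble an argument list for oi_download_dataset or oi_download_images.

    Hypothetical helper for illustration only; not part of the openimages API.
    """
    # --format only applies to the annotated-dataset command.
    script = "oi_download_dataset" if annotation_format else "oi_download_images"
    args = [script, "--base_dir", base_dir, "--labels", *labels]
    if annotation_format:
        args += ["--format", annotation_format]
    if csv_dir:
        # Reusing the same directory lets later runs read the CSV files from disk.
        args += ["--csv_dir", csv_dir]
    if exclusions:
        args += ["--exclusions", exclusions]
    if limit is not None:
        args += ["--limit", str(limit)]
    return args


args = build_command("/data/oi", ["Scissors", "Coffee cup"],
                     annotation_format="pascal", csv_dir="/data/oi/csv", limit=100)
# shlex.join quotes the multi-word label for display in a shell.
print(shlex.join(args))
```

The returned list can be passed as-is to subprocess.run(args, check=True), which avoids shell quoting issues altogether. shlex.join requires Python 3.8 or later.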

Usage examples

Download images and PASCAL format annotations for the class labels "Scissors" and "Hammer", limiting the number of images to 100 and storing the CSV files under ~/openimages (reading the CSV files from there if they already exist):

$ oi_download_dataset --csv_dir ~/openimages --base_dir ~/openimages --labels Scissors Hammer --format pascal --limit 100

Download images only for the class label "Scissors", limiting the number of images to 100 and storing the CSV files under ~/openimages (reading the CSV files from there if they already exist):

$ oi_download_images --csv_dir ~/openimages --base_dir ~/openimages --labels Scissors --limit 100
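
The --exclusions option expects a text file with one image file ID per line. A minimal sketch of producing such a file is shown here. The helper write_exclusions is hypothetical, and the IDs are placeholders rather than real OpenImages identifiers.

```python
import tempfile
from pathlib import Path


def write_exclusions(path, image_ids):
    """Write unique, non-empty image IDs to a text file, one per line.

    Hypothetical helper for illustration only; not part of the openimages API.
    """
    unique_ids = sorted({image_id.strip() for image_id in image_ids if image_id.strip()})
    Path(path).write_text("".join(f"{image_id}\n" for image_id in unique_ids))
    return unique_ids


# Placeholder IDs, not real OpenImages identifiers.
bad_ids = ["0000aaaa1111bbbb", "2222cccc3333dddd", "0000aaaa1111bbbb"]
exclusions_path = Path(tempfile.mkdtemp()) / "exclusions.txt"
written = write_exclusions(exclusions_path, bad_ids)
print(exclusions_path.read_text())
```

The resulting file path can then be supplied to either command via --exclusions <file>.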
