
A small package to ease working with the YCB-Video dataset


ycbvideo

Python package for loading the data from the YCB-Video Dataset, a very large dataset made for computer vision tasks like 6D object pose estimation or semantic segmentation. You can find more information and a download link for the dataset here.

It gives access to the frames located in either the data or the data_syn folder. A frame here corresponds to all the information available for one point in time: not only the color image, but the color, depth and label images, the data from the *-meta.mat files and, for the frames in data, also the bounding box coordinates. Frames are grouped into frame sequences of consecutive frames. Frames and frame sequences can be specified by frame selection expressions.

Example:

Frame 42 from frame sequence 7 corresponds to the data from the following files

  • data/0007/000042-color.png
  • data/0007/000042-depth.png
  • data/0007/000042-label.png
  • data/0007/000042-box.txt
  • data/0007/000042-meta.mat

and can be selected with the frame selection expression 7/42, which specifies just this single frame. To select the 42nd frame of every available frame sequence, use */42. If you're only interested in the 42nd frame of the frame sequences 7, 10 and 17, use [7,10,17]/42. To select all frames from the 42nd up to the 55th instead of a single frame, specify a range. Ranges work similarly to slicing in Python: [7,10,17]/42:56 starts at the 42nd frame (inclusive) and stops just before the 56th frame (exclusive). By default, the step size is 1; 42:56:2 selects every other frame and 42:56:-1 gives you the frames in reverse order. You can provide several of these expressions and receive the frames in the order of your expressions. Have a look at the section Expressions in detail or the documentation of the Loader.frames() method for more ways and examples of how to specify the frames you're interested in.
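The file listing above follows a fixed naming pattern: a four-digit sequence directory and a six-digit frame prefix. As a minimal sketch of that pattern only, the helper below rebuilds the paths for a numbered frame sequence in data. The name frame_files is hypothetical and not part of the ycbvideo API; frames in data_syn are not covered.

```python
from pathlib import PurePosixPath
from typing import List

# Suffixes taken from the file listing above (numbered sequences in data/ only).
FRAME_FILE_SUFFIXES = ("color.png", "depth.png", "label.png", "box.txt", "meta.mat")


def frame_files(sequence: int, frame: int) -> List[str]:
    """Return the relative paths of all files belonging to one frame in data/.

    Hypothetical helper for illustration; not part of the ycbvideo package.
    """
    sequence_dir = PurePosixPath("data") / f"{sequence:04d}"
    return [str(sequence_dir / f"{frame:06d}-{suffix}") for suffix in FRAME_FILE_SUFFIXES]


for path in frame_files(7, 42):
    print(path)
```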

The data_syn directory is also treated as a frame sequence, so */42 also includes the 42nd frame from the data_syn frame sequence. To leave it out, use data/42, which gets you the 42nd frame of every frame sequence except data_syn. Conversely, data_syn/42 gets you just this single frame.

Because the dataset is huge (~273 GB), it wouldn't make much sense to load all the data into memory at once. Frames are therefore loaded one at a time. Especially if you're working with just a subset of the dataset, e.g. having only the frame sequences 0001 - 0010 on your disk, you'd want to be sure that the data is really available when you start working with the frames. For that reason, after you specify all the frames you need by frame selection expression(s), an automatic check ensures that all the frames are on your disk and that you didn't forget to put e.g. frame sequence 0010 there. This is especially helpful since the dataset is mostly used for machine learning tasks: getting a "file not found" error after hours of training could be very frustrating.

Installation

It's published on PyPI; just run pip install ycbvideo to install it. Python >= 3.8 is required.

Usage

First, import the package and create a loader. Provide the path to the dataset directory. Do not modify the data afterwards!

import ycbvideo

loader = ycbvideo.Loader('/path/to/data')

Accessing Frames

You can specify the frames by frame selection expressions, either via a list of such expressions or by providing a file containing those expressions, one expression per line.

  • Via a list

    frames = loader.frames(['data_syn/1',
                            '1/[2,4,5]',
                            '42/[3,4]',
                            '42:56/[2,3]',
                            '[2,3,4]/*',
                            '*/:56:-1',
                            '*/*'])
    
  • Via a file

    # frames.txt
    
    data_syn/1
    1/[2,4,5]
    42/[3,4]
    42:56/[2,3]
    [2,3,4]/*
    */:56:-1
    */*
    
    frames = loader.frames('/path/to/frames.txt')
    

    If you provide a relative path, it is assumed that the file is located inside the dataset directory, e.g. imagesets/train.txt.
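The path rule above can be pictured with a small sketch. read_expressions is a hypothetical helper, not the package's implementation; that blank lines are skipped is an assumption made here for illustration. The demo builds a throwaway dataset directory so the snippet runs on its own.

```python
import tempfile
from pathlib import Path
from typing import List


def read_expressions(dataset_root: Path, expression_file: str) -> List[str]:
    """Read one selection expression per line (hypothetical helper).

    A relative path is resolved against the dataset directory,
    an absolute path is used as given.
    """
    path = Path(expression_file)
    if not path.is_absolute():
        path = dataset_root / path
    lines = path.read_text().splitlines()
    # Assumption: blank lines carry no expression and are skipped.
    return [line.strip() for line in lines if line.strip()]


with tempfile.TemporaryDirectory() as root:
    dataset_root = Path(root)
    (dataset_root / "imagesets").mkdir()
    (dataset_root / "imagesets" / "train.txt").write_text(
        "data_syn/1\n1/[2,4,5]\n\n*/:56:-1\n"
    )
    expressions = read_expressions(dataset_root, "imagesets/train.txt")

print(expressions)
```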

The object returned by loader.frames() shares some behaviour with the Python list type: it supports iteration, the len() builtin and index-based element access:

  • Iterate over all elements:

    # create either an iterator using iter()
    iterator = iter(loader.frames(...))
    
    # or use a for loop
    for frame in loader.frames(...):
        # do something with the frame
        ...
    
  • Determine the number of frames:

    frames = loader.frames(...)
    
    len(frames)
    
  • Access the element at a specific index:

    frames = loader.frames(...)
    
    frame = frames[42]
    

Keep in mind that the returned object is not really a list and that it depends on the loader that returned it, so don't delete the loader or modify its internal state afterwards.
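To illustrate what "list-like but not a list" can mean, here is a minimal sketch of a lazy container that supports iteration, len() and index access while loading each frame only on demand. LazyFrames and fake_load are hypothetical names; this is not how ycbvideo is implemented, only one way such an object could behave.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Descriptor = Tuple[str, str]  # (sequence_index, frame_index)


class LazyFrames:
    """List-like view that loads a frame only when it is requested (illustrative)."""

    def __init__(self, descriptors: Sequence[Descriptor],
                 load: Callable[[Descriptor], object]) -> None:
        self._descriptors = list(descriptors)
        self._load = load  # stands in for the loader the object depends on

    def __len__(self) -> int:
        return len(self._descriptors)

    def __getitem__(self, index: int) -> object:
        return self._load(self._descriptors[index])

    def __iter__(self) -> Iterator[object]:
        for descriptor in self._descriptors:
            yield self._load(descriptor)


loaded: List[Descriptor] = []  # records which frames were actually loaded


def fake_load(descriptor: Descriptor) -> dict:
    """Stand-in for reading the frame's files from disk."""
    loaded.append(descriptor)
    return {"descriptor": descriptor}


frames = LazyFrames([("7", "42"), ("7", "43"), ("10", "42")], fake_load)
```

Creating the object and calling len() on it loads nothing; only indexing or iterating triggers fake_load, which is why the object stays tied to whatever performs the loading.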

If you want the frames to be shuffled, e.g. for training in machine learning, set the shuffle keyword argument to True. Optionally, you can set a seed to get the same shuffling result for each run:

import random

# set the seed to get a reproducible shuffling result
random.seed(42)

loader.frames(frames=..., shuffle=True)
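The effect of seeding can be checked without the dataset. The sketch below assumes, as the random.seed() call above suggests, that shuffling draws on Python's global random generator; shuffled is a hypothetical helper that operates on plain descriptor tuples.

```python
import random
from typing import List, Sequence, Tuple

Descriptor = Tuple[str, str]  # (sequence_index, frame_index)


def shuffled(items: Sequence[Descriptor], seed: int) -> List[Descriptor]:
    """Return a shuffled copy after seeding the global generator (illustrative)."""
    random.seed(seed)
    result = list(items)
    random.shuffle(result)
    return result


descriptors = [("7", str(index)) for index in range(1, 6)]

first_run = shuffled(descriptors, seed=42)
second_run = shuffled(descriptors, seed=42)

# Same seed, same order; the set of frames itself is unchanged.
print(first_run == second_run)
```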

Customizing iteration

To customize the iteration over frames, you can load the frames one by one from their descriptions instead of accessing them through loader.frames().

# get descriptors by specifying selection expressions
# for the corresponding frames
# expression_source can again be a list or a path to a file
descriptions = loader.get_descriptors(expression_source=...)

# or provide descriptions yourself
# a description is simply a tuple: (sequence_index, frame_index)
descriptions = [('42', '1'), ('5', '13'), ...]

# iterate over the frames specified by your descriptions
for description in descriptions:
    frame = loader.get_frame(description)

    # do something with the frame
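One use of this is grouping frames into batches. The following sketch works on descriptions alone, so it runs without the dataset; batches is a hypothetical helper, and the commented line shows where loader.get_frame() from the example above would come in.

```python
from typing import Iterator, List, Sequence, Tuple

Descriptor = Tuple[str, str]  # (sequence_index, frame_index)


def batches(descriptors: Sequence[Descriptor],
            batch_size: int) -> Iterator[List[Descriptor]]:
    """Yield consecutive groups of at most batch_size descriptions."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(descriptors), batch_size):
        yield list(descriptors[start:start + batch_size])


descriptions = [("42", "1"), ("42", "2"), ("5", "13"), ("5", "14"), ("5", "15")]

for batch in batches(descriptions, batch_size=2):
    # With a real loader: frames = [loader.get_frame(d) for d in batch]
    print(batch)
```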

Expressions in detail

Selection expressions consist of two parts: an expression for specifying one or more frame sequences and an expression for specifying one or more frames. A / combines both parts: <FRAME_SEQUENCE_SELECTION>/<FRAME_SELECTION>. Most expressions are valid for both frame sequences and frames:

  • 42: Selects a single element 42 ("Single element expression")
  • [42,56,47]: Selects a list of elements, the elements 42, 56 and 47 in exactly the order specified ("List expression")
  • 42:47:1: Selects the elements from element 42 (inclusive) up to element 47 (exclusive), i.e. the elements 42, 43, 44, 45 and 46 ("Range expression")
  • *: Selects all elements in ascending order ("Star expression")

Two "single element expressions" only apply to the selection of frame sequences:

  • data_syn: Selects the frame sequence data_syn (i.e. the data_syn directory)
  • data: Selects all frame sequences except the data_syn frame sequence (i.e. all the subdirectories in the data directory)

"List expressions" and "range expressions" can only contain "numbered" elements like 42 or 47, not data_syn or data.
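The grammar described above can be summarized in a short sketch that splits an expression at the / and names the kind of each part. split_expression and classify are hypothetical helpers and the regular expressions are assumptions derived from the examples in this section (for instance, no whitespace inside lists); the package's own parser may accept or reject other spellings.

```python
import re
from typing import Tuple

# Only valid in the frame sequence part, and only as single elements.
NAMED_SEQUENCES = ("data", "data_syn")


def split_expression(expression: str) -> Tuple[str, str]:
    """Split '<FRAME_SEQUENCE_SELECTION>/<FRAME_SELECTION>' into its two parts."""
    parts = expression.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected exactly one '/' in {expression!r}")
    return parts[0], parts[1]


def classify(part: str, is_sequence_part: bool) -> str:
    """Name the kind of expression: 'single', 'list', 'range' or 'star'."""
    if part == "*":
        return "star"
    if is_sequence_part and part in NAMED_SEQUENCES:
        return "single"
    if re.fullmatch(r"\d+", part):
        return "single"
    if re.fullmatch(r"\[\d+(,\d+)*\]", part):
        return "list"
    if re.fullmatch(r"\d*:\d*(:(-?\d+)?)?", part):
        return "range"
    raise ValueError(f"not a valid expression: {part!r}")


sequence_part, frame_part = split_expression("[7,10,17]/42:56")
kinds = (classify(sequence_part, True), classify(frame_part, False))
print(kinds)
```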

"Range expressions" are comparable to the slicing operation in Python. Given the expression <START>:<STOP>:<STEP>, all of START, STOP and STEP are optional. If START is omitted, the range starts at the smallest available element. If STOP is omitted, all remaining available successive elements are included in the range. If STEP is omitted, the step size is 1. START and STOP both have to be positive integers. STEP may also be a negative integer, which reverses the order of the specified elements. Step sizes other than 1 or -1 are also allowed, but a step size of 0 is not.
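A minimal sketch of these rules, resolving a range against the elements that are available on disk. resolve_range is a hypothetical helper, not the package's code. How a negative step other than -1 picks elements is not spelled out above, so the sketch makes an explicit assumption: it walks upwards from START in steps of |STEP| and then reverses the result.

```python
from typing import List, Optional


def resolve_range(available: List[int], start: Optional[int],
                  stop: Optional[int], step: int = 1) -> List[int]:
    """Resolve <START>:<STOP>:<STEP> against the available elements (illustrative)."""
    if step == 0:
        raise ValueError("a step size of 0 is not allowed")
    if not available:
        return []
    # Omitted START: begin at the smallest available element.
    lower = min(available) if start is None else start
    # Omitted STOP: include all remaining available elements.
    upper = max(available) + 1 if stop is None else stop
    present = set(available)
    selected = [element for element in range(lower, upper, abs(step))
                if element in present]
    # Assumption: a negative step reverses the ascending selection.
    if step < 0:
        selected.reverse()
    return selected


frames_on_disk = list(range(1, 101))
print(resolve_range(frames_on_disk, 42, 56))
print(resolve_range(frames_on_disk, 42, 56, 2))
print(resolve_range(frames_on_disk, 42, 56, -1))
```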

Missing or incomplete data

Missing data

Working with only a subset of the dataset is no problem.

  • Using a "star expression", you just get all available elements
  • Using a "range expression", elements inside your range might not be there. For instance, if elements 42, 43 and 45 are there but element 44 is missing and you specify 42:45, you get the elements 42 and 43. You only have to make sure that the elements you specify as start and/or stop are available.

So, in short, every time you "name" an element, it has to be there! Using the "list expression" [42,43,44,45] in the example above would immediately result in an error, since element 44 is not available. A "star expression" would not complain. Likewise, "range expressions" in which the start, the stop or both are omitted (e.g. 42:, :45 or :) will not complain as long as all "named" elements are available. Be aware that this also means you are responsible for making sure that all elements you expect to be selected are in fact on your disk.
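The "named elements" rule can be expressed as a simple membership check, shown here for the example above where elements 42, 43 and 45 are on disk. missing_named_elements is a hypothetical helper for illustration; None stands for an omitted start or stop.

```python
from typing import Iterable, List, Optional, Set


def missing_named_elements(available: Set[int],
                           named: Iterable[Optional[int]]) -> List[int]:
    """Return every explicitly named element that is not available (illustrative)."""
    return [element for element in named
            if element is not None and element not in available]


available = {42, 43, 45}

# List expression [42,43,44,45]: every listed element is named.
missing_in_list = missing_named_elements(available, [42, 43, 44, 45])

# Range expression 42:45: only start and stop are named, 44 is not.
missing_in_range = missing_named_elements(available, [42, 45])

# Range expression ':' names nothing at all.
missing_in_open_range = missing_named_elements(available, [None, None])

print(missing_in_list, missing_in_range, missing_in_open_range)
```

A non-empty result corresponds to the case in which the selection would be rejected.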

Incomplete (frame) data

If all files corresponding to a frame are on your disk, the frame is "complete" and can be loaded. If at least one of the files is missing, the frame is considered "incomplete" and cannot be loaded. This applies not only to "named" (frame) elements (as with missing data), but also to implicitly specified elements, as in a range expression. An attempt to select 42:45 when frames 42, 43 and 45 are there and complete but frame 44 is incomplete will therefore fail.
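As a rough sketch of what "complete" means for a frame in a numbered sequence of data, the helper below lists the files that are missing for a frame. missing_files is hypothetical and the suffix list is taken from the file listing near the top of this document; the package's own check may differ. The demo creates a temporary directory with one complete and one incomplete frame.

```python
import tempfile
from pathlib import Path
from typing import List

# Files expected per frame in data/<sequence>/, per the listing for frame 7/42.
SUFFIXES = ("color.png", "depth.png", "label.png", "box.txt", "meta.mat")


def missing_files(sequence_dir: Path, frame: int) -> List[str]:
    """Return the names of the frame's files that are not on disk (illustrative)."""
    names = [f"{frame:06d}-{suffix}" for suffix in SUFFIXES]
    return [name for name in names if not (sequence_dir / name).is_file()]


with tempfile.TemporaryDirectory() as root:
    sequence_dir = Path(root) / "data" / "0007"
    sequence_dir.mkdir(parents=True)
    # Frame 42 gets all of its files, frame 44 only the color image.
    for suffix in SUFFIXES:
        (sequence_dir / f"000042-{suffix}").touch()
    (sequence_dir / "000044-color.png").touch()

    missing_for_42 = missing_files(sequence_dir, 42)
    missing_for_44 = missing_files(sequence_dir, 44)

print(missing_for_42)  # empty list: frame 42 is complete
print(missing_for_44)  # non-empty list: frame 44 is incomplete
```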

By running

python -m ycbvideo /path/to/data

you can manually inspect and check the integrity of the portion of the dataset on your disk. It will show you which and how many frame sequences and frames are on your disk and which files might be missing.

Roadmap

  • Perform tests also on Windows
  • Make other data from the dataset easily accessible where useful
  • Test, build and publish the package by using GitHub Actions
