Synthetic dataset insights.

Dataset Insights

Unity Dataset Insights is a Python package for downloading, parsing, and analyzing synthetic datasets generated using the Unity Perception package.

Installation

Dataset Insights is published on PyPI. To install it, run pip install datasetinsights in a supported Python environment.
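
After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library (Python 3.8 or later); the helper name installed_version is made up for this example and is not part of datasetinsights.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("datasetinsights")
    print(version if version else "datasetinsights is not installed")
```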

Getting Started

Dataset Statistics

We provide a sample notebook to help you load synthetic datasets generated using the Perception package and visualize dataset statistics. We plan to support other sample Unity projects in the future.

Load Datasets

The Unity Perception package provides datasets under this schema. The datasetinsights package also provides convenient Python modules for parsing these datasets.

For example, you can load AnnotationDefinitions into a Python dictionary by providing the corresponding annotation definition ID:

from datasetinsights.datasets.unity_perception import AnnotationDefinitions

# dest is the local directory that contains the downloaded dataset
annotation_def = AnnotationDefinitions(data_root=dest, version="my_schema_version")
definition_dict = annotation_def.get_definition(def_id="my_definition_id")

Similarly, for MetricDefinitions:

from datasetinsights.datasets.unity_perception import MetricDefinitions

metric_def = MetricDefinitions(data_root=dest, version="my_schema_version")
definition_dict = metric_def.get_definition(def_id="my_definition_id")

The Captures table provides the collection of simulation captures and annotations. You can load these records directly as a Pandas DataFrame:

from datasetinsights.datasets.unity_perception import Captures

captures = Captures(data_root=dest, version="my_schema_version")
captures_df = captures.filter(def_id="my_definition_id")

The Metrics table can store simulation metrics for a capture or annotation. You can also load these records as a Pandas DataFrame:

from datasetinsights.datasets.unity_perception import Metrics

metrics = Metrics(data_root=dest, version="my_schema_version")
metrics_df = metrics.filter_metrics(def_id="my_definition_id")
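
Conceptually, both filter calls above select the records that belong to one definition ID. The sketch below illustrates that idea with plain Python and is not the package's implementation: the records, the definition_id field name, and the select_by_definition helper are all hypothetical and much simpler than the real tables.

```python
from typing import Dict, List

# Hypothetical, simplified records. Real tables have more fields, and the
# field names used here are assumptions made for this illustration only.
records: List[Dict[str, str]] = [
    {"id": "a1", "definition_id": "bounding_box"},
    {"id": "a2", "definition_id": "segmentation"},
    {"id": "a3", "definition_id": "bounding_box"},
]


def select_by_definition(rows: List[Dict[str, str]], def_id: str) -> List[Dict[str, str]]:
    """Keep only the rows whose definition ID matches def_id."""
    return [row for row in rows if row["definition_id"] == def_id]


matching = select_by_definition(records, "bounding_box")
print([row["id"] for row in matching])  # prints ['a1', 'a3']
```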

Download Datasets

You can download the datasets using the download command:

datasetinsights download --source-uri=<xxx> --output=$HOME/data

The download command supports HTTP(S) and GCS sources.
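
Since the source can be either an HTTP(S) URL or a GCS location, the URI scheme is what tells the two apart. The sketch below shows one way to map a source URI to the name of a downloader class from this section. It is an illustration only: the mapping table and the downloader_name_for helper are hypothetical, and the package's own dispatch logic may differ.

```python
from urllib.parse import urlparse

# Hypothetical mapping for illustration: URI scheme -> downloader class name
# as used in this README. The package's own dispatch may differ.
SCHEME_TO_DOWNLOADER = {
    "gs": "GCSDatasetDownloader",
    "http": "HTTPDatasetDownloader",
    "https": "HTTPDatasetDownloader",
}


def downloader_name_for(source_uri: str) -> str:
    """Return the downloader class name that matches the URI scheme."""
    scheme = urlparse(source_uri).scheme.lower()
    try:
        return SCHEME_TO_DOWNLOADER[scheme]
    except KeyError:
        raise ValueError(f"Unsupported source URI scheme: {scheme!r}") from None


print(downloader_name_for("gs://url/to/file.zip"))  # prints GCSDatasetDownloader
```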

Alternatively, you can download a dataset directly from the Python interface.

GCSDatasetDownloader can download a dataset from GCS locations.

from datasetinsights.io.downloader import GCSDatasetDownloader

source_uri = "gs://url/to/file.zip"  # or "gs://url/to/folder"
dest = "~/data"
downloader = GCSDatasetDownloader()
downloader.download(source_uri=source_uri, output=dest)

HTTPDatasetDownloader can download a dataset from any HTTP(S) URL.

from datasetinsights.io.downloader import HTTPDatasetDownloader

source_uri = "http://url.to.file.zip"
dest = "~/data"
downloader = HTTPDatasetDownloader()
downloader.download(source_uri=source_uri, output=dest)
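
The examples above pass dest = "~/data". This README does not say whether the downloaders expand "~" themselves, so one defensive option is to resolve the path before passing it along. The sketch below assumes nothing about datasetinsights; resolve_output_dir is a hypothetical helper built on the standard library.

```python
import os


def resolve_output_dir(path: str) -> str:
    """Expand a leading "~" (if any) and return an absolute path."""
    return os.path.abspath(os.path.expanduser(path))


dest = resolve_output_dir("~/data")
print(dest)  # an absolute path; the exact value depends on the current user
```

The resolved dest can then be passed as the output argument in the download calls shown above.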

Convert Datasets

To convert a synthetic dataset to COCO format, for the annotation types that COCO supports, run the convert command:

datasetinsights convert -i <input-directory> -o <output-directory> -f COCO-Instances

or

datasetinsights convert -i <input-directory> -o <output-directory> -f COCO-Keypoints

You will need to provide the 2D bounding box definition ID used in the synthetic dataset. Only 2D bounding box and human keypoint annotations are currently supported for COCO format.
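
When the conversion is scripted, it can help to validate the format name before invoking the CLI. The sketch below builds the argument list for the convert command using only the flags and format names shown above; build_convert_command and SUPPORTED_FORMATS are hypothetical names for this example, not part of datasetinsights.

```python
from typing import List

# The two format names shown in the convert commands in this README.
SUPPORTED_FORMATS = ("COCO-Instances", "COCO-Keypoints")


def build_convert_command(input_dir: str, output_dir: str, fmt: str) -> List[str]:
    """Build the argument list for `datasetinsights convert`."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format {fmt!r}; expected one of {SUPPORTED_FORMATS}")
    return ["datasetinsights", "convert", "-i", input_dir, "-o", output_dir, "-f", fmt]


print(build_convert_command("input", "output", "COCO-Instances"))
```

In an environment where datasetinsights is installed, the returned list could be passed to subprocess.run(cmd, check=True).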

Docker

You can use the pre-built Docker image unitytechnologies/datasetinsights to interact with datasets.

Documentation

You can find the API documentation on Read the Docs.

Contributing

Please let us know if you encounter a bug by filing an issue. To learn more about making a contribution to Dataset Insights, please see our Contribution page.

License

Dataset Insights is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

Citation

If you find this package useful, consider citing it using:

@misc{datasetinsights2020,
    title={Unity {D}ataset {I}nsights Package},
    author={{Unity Technologies}},
    howpublished={\url{https://github.com/Unity-Technologies/datasetinsights}},
    year={2020}
}
