!Alpha Version! - This repository contains code to make datasets stored on the corpora network drive of the chair compatible with the [tensorflow dataset api](https://www.tensorflow.org/api_docs/python/tf/data/Dataset)

Project description

Description

This repository contains code to make datasets stored on the corpora network drive of the chair compatible with the tensorflow dataset api.

Currently available Datasets

Dataset        URL
audioset       https://research.google.com/audioset/
ckplus         http://www.iainm.com/publications/Lucey2010-The-Extended/paper.pdf
faces          https://faces.mpdl.mpg.de/imeji/
is2021_ess     -
librispeech    https://www.openslr.org/12
nova_dynamic   https://github.com/hcmlab/nova

Example Usage

import os
import tensorflow as tf
import tensorflow_datasets as tfds
import hcai_datasets
from matplotlib import pyplot as plt

# Preprocessing function: convert the image tensor to a NumPy array.
# Extend this with any custom preprocessing steps.
def preprocess(x, y):
  img = x.numpy()
  return img, y

# Creating a dataset
ds, ds_info = tfds.load(
  'hcai_example_dataset',
  split='train',
  with_info=True,
  as_supervised=True,
  builder_kwargs={'dataset_dir': os.path.join('path', 'to', 'directory')}
)

# Input output mapping: apply the preprocessing function to every sample.
# tf.py_function runs preprocess eagerly, which is why .numpy() is available inside it.
ds = ds.map(lambda x, y: (tf.py_function(func=preprocess, inp=[x, y], Tout=[tf.float32, tf.int64])))

# Manually iterate over dataset
img, label = next(ds.as_numpy_iterator())

# Visualize
plt.imshow(img / 255.)
plt.show()
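
From here the dataset behaves like any other tf.data.Dataset and can be fed straight into a training loop. The pipeline below is only a sketch: the shuffle buffer and batch size are arbitrary choices, tf.data.AUTOTUNE assumes a recent TensorFlow 2 release, and batching assumes all images in the dataset share the same shape.

# Build a simple input pipeline on top of the mapped dataset
ds_train = ds.shuffle(buffer_size=256).batch(32).prefetch(tf.data.AUTOTUNE)

# Inspect the first batch
for img_batch, label_batch in ds_train.take(1):
  print(img_batch.shape, label_batch.shape)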

Example Usage: Nova Dynamic Data

import os
import hcai_datasets
import tensorflow_datasets as tfds

## Load Data
ds, ds_info = tfds.load(
  'hcai_nova_dynamic',
  split='dynamic_split',
  with_info=True,
  as_supervised=True,
  builder_kwargs={
    # Database Config
    'db_config_path': 'db.cfg',
    'db_config_dict': None,

    # Dataset Config
    'dataset': '<dataset_name>',
    'nova_data_dir': os.path.join('C:' + os.sep, 'Nova', 'Data'),  # include the separator so the drive letter forms an absolute path on Windows
    'sessions': ['<session_name>'],
    'roles': ['<role_one>', '<role_two>'],
    'schemes': ['<label_scheme_one>'],
    'annotator': '<annotator_id>',
    'data_streams': ['<stream_name>'],

    # Sample Config
    'frame_step': 1,
    'left_context': 0,
    'right_context': 0,
    'start': None,
    'end': None,
    #'flatten_samples': False,
    'supervised_keys': ['<role_one>.<stream_name>', '<label_scheme_one>'],

    # Additional Config
    'clear_cache': True
  }
)

data_it = ds.as_numpy_iterator()
ex_data = next(data_it)
print(ex_data)
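
Because the Nova builder assembles the dataset dynamically from the database configuration, it can be useful to check what was actually generated before training. ds_info is a regular tfds DatasetInfo object, so the usual attributes apply; the snippet below is just a quick sanity check.

# Inspect what the dynamic builder produced
print(ds_info.features)      # feature spec for the requested streams and schemes
print(list(ds_info.splits))  # available splits, e.g. the requested 'dynamic_split'

# With as_supervised=True every sample is an (input, label) tuple
for x, y in ds.take(3).as_numpy_iterator():
  print(type(x), type(y))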

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hcai-datasets-0.0.5.tar.gz (19.0 kB)

Uploaded Source

Built Distribution

hcai_datasets-0.0.5-py3-none-any.whl (30.8 kB)

Uploaded Python 3

File details

Details for the file hcai-datasets-0.0.5.tar.gz.

File metadata

  • Download URL: hcai-datasets-0.0.5.tar.gz
  • Upload date:
  • Size: 19.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.8.7

File hashes

Hashes for hcai-datasets-0.0.5.tar.gz
Algorithm Hash digest
SHA256 8c963beafa5370d7ebb2d3762b1efb61719ad135f4ee1260372ad281a3b57490
MD5 05f6704908e48495af7f7ef1165ff672
BLAKE2b-256 1cd2de8983b074f1cb10fdb4c567f32cac8e10634c1f13e2dc77f0cdb18cedd4

See more details on using hashes here.
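
To verify a downloaded archive against the digests listed above, the Python standard library is sufficient. A minimal sketch, assuming the archive was saved to the current working directory:

import hashlib

# Path to the downloaded source distribution (adjust as needed)
archive_path = 'hcai-datasets-0.0.5.tar.gz'
expected_sha256 = '8c963beafa5370d7ebb2d3762b1efb61719ad135f4ee1260372ad281a3b57490'

with open(archive_path, 'rb') as f:
  digest = hashlib.sha256(f.read()).hexdigest()

print('OK' if digest == expected_sha256 else 'Hash mismatch!')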

File details

Details for the file hcai_datasets-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: hcai_datasets-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 30.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.8.7

File hashes

Hashes for hcai_datasets-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 0bbd32b4571f90d009f287c2dc0cd07da00a48176cd3dcb0bf053e31fb304520
MD5 d82ad7ec4f8048039e4904e08df44c5c
BLAKE2b-256 f1627d669d620200d93a13b244230545fb5ca5a363ab45ea1f7fac91357f2254

See more details on using hashes here.
