!Alpha Version! - This repository contains code to make datasets stored on the corpora network drive of the chair compatible with the [tensorflow dataset api](https://www.tensorflow.org/api_docs/python/tf/data/Dataset)

Description

This repository contains code to make datasets stored on the corpora network drive of the chair compatible with the TensorFlow Dataset API.

Currently available Datasets

| Dataset | Status | Url |
| --- | --- | --- |
| ckplus | | http://www.iainm.com/publications/Lucey2010-The-Extended/paper.pdf |
| affectnet | | http://mohammadmahoor.com/affectnet/ |
| faces | | https://faces.mpdl.mpg.de/imeji/ |
| nova_dynamic | | https://github.com/hcmlab/nova |
| audioset | | https://research.google.com/audioset/ |
| is2021_ess | | - |
| librispeech | | https://www.openslr.org/12 |

Example Usage

import os
import tensorflow as tf
import tensorflow_datasets as tfds
import hcai_datasets
from matplotlib import pyplot as plt

# Preprocessing function
def preprocess(x, y):
  img = x.numpy()
  return img, y

# Creating a dataset
ds, ds_info = tfds.load(
  'hcai_example_dataset',
  split='train',
  with_info=True,
  as_supervised=True,
  builder_kwargs={'dataset_dir': os.path.join('path', 'to', 'directory')}
)

# Input output mapping
ds = ds.map(lambda x, y: (tf.py_function(func=preprocess, inp=[x, y], Tout=[tf.float32, tf.int64])))

# Manually iterate over dataset
img, label = next(ds.as_numpy_iterator())

# Visualize
plt.imshow(img / 255.)
plt.show()
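
The example above needs access to the dataset directory. To try the `tf.py_function` mapping step without it, the following is a minimal sketch that swaps in a small synthetic dataset. The image shape, the label values and the helper name `tf_preprocess` are assumptions made for illustration and are not part of `hcai_datasets`.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for a supervised image dataset: four 8x8 RGB images with
# integer labels. Shapes and values are illustrative assumptions only.
images = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
labels = np.arange(4, dtype=np.int64)
ds = tf.data.Dataset.from_tensor_slices((images, labels))


def preprocess(x, y):
    # Inside tf.py_function the arguments are eager tensors, so arbitrary
    # Python logic is allowed. Cast so the result matches Tout below.
    img = tf.cast(x, tf.float32) / 255.0
    return img, y


def tf_preprocess(x, y):
    # Hypothetical wrapper: runs preprocess eagerly inside the tf.data graph.
    img, label = tf.py_function(
        func=preprocess, inp=[x, y], Tout=[tf.float32, tf.int64]
    )
    # py_function drops static shape information; restore it for later stages.
    img.set_shape(x.shape)
    label.set_shape(y.shape)
    return img, label


ds = ds.map(tf_preprocess).batch(2)

# Manually pull one batch as NumPy arrays.
img_batch, label_batch = next(ds.as_numpy_iterator())
print(img_batch.shape, img_batch.dtype, label_batch)
```

Restoring the static shapes after `tf.py_function` is optional for manual iteration, but later stages such as Keras layers usually need it.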

Example Usage Nova Dynamic Data

import os
import hcai_datasets
import tensorflow_datasets as tfds
from sklearn.svm import LinearSVC
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
import warnings
warnings.simplefilter("ignore")

## Load Data
ds, ds_info = tfds.load(
  'hcai_nova_dynamic',
  split='dynamic_split',
  with_info=True,
  as_supervised=True,
  data_dir='.',
  read_config=tfds.ReadConfig(
    shuffle_seed=1337
  ),
  builder_kwargs={
    # Database Config
    'db_config_path': 'nova_db.cfg',
    'db_config_dict': None,

    # Dataset Config
    'dataset': '<dataset_name>',
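    # 'C:' + os.sep yields an absolute path; os.path.join('C:', 'Nova') would be drive-relative on Windows
    'nova_data_dir': os.path.join('C:' + os.sep, 'Nova', 'Data'),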
    'sessions': ['<session_name>'],
    'roles': ['<role_one>', '<role_two>'],
    'schemes': ['<label_scheme_one>'],
    'annotator': '<annotator_id>',
    'data_streams': ['<stream_name>'],

    # Sample Config
    'frame_step': 1,
    'left_context': 0,
    'right_context': 0,
    'start': None,
    'end': None,
    'flatten_samples': False, 
    'supervised_keys': ['<role_one>.<stream_name>', '<scheme_two>'],

    # Additional Config
    'clear_cache': True
  }
)

data_it = ds.as_numpy_iterator()
data_list = list(data_it)
data_list.sort(key=lambda x: int(x['frame'].decode('utf-8').split('_')[0]))
x = [v['<stream_name>'] for v in data_list]
y = [v['<scheme_two>'] for v in data_list]

x_np = np.ma.concatenate(x, axis=0)
y_np = np.array(y)

linear_svc = LinearSVC()
model = CalibratedClassifierCV(linear_svc,
                               method='sigmoid',
                               cv=3)
print('train_x shape: {} | train_x[0] shape: {}'.format(x_np.shape, x_np[0].shape))
model.fit(x_np, y_np)
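
The sort key in the example above decodes each sample's `frame` value, splits it on `_` and compares the integer prefix. The sketch below shows why that matters, using made-up sample dicts. The `frame` byte strings and the `value` field are hypothetical and only follow the `<index>_<suffix>` pattern implied by the sort key; the real identifiers produced by the dataset may look different.

```python
# Hypothetical samples; only the '<index>_<suffix>' pattern of 'frame' is
# taken from the sort key used in the example above.
samples = [
    {'frame': b'10_session_a', 'value': 'c'},
    {'frame': b'2_session_a', 'value': 'b'},
    {'frame': b'1_session_a', 'value': 'a'},
]


def frame_index(sample):
    """Return the integer prefix of the sample's frame identifier."""
    return int(sample['frame'].decode('utf-8').split('_')[0])


# Sorting by the parsed integer gives the order 1, 2, 10.
by_number = sorted(samples, key=frame_index)

# Sorting the raw byte strings compares byte by byte, so b'10_...' sorts
# before b'1_...' because b'0' is smaller than b'_'.
by_bytes = sorted(samples, key=lambda s: s['frame'])

print([s['value'] for s in by_number])  # ['a', 'b', 'c']
print([s['value'] for s in by_bytes])   # ['c', 'a', 'b']
```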
