Alpha version: this repository contains code that makes datasets stored on the chair's corpora network drive compatible with the [TensorFlow Dataset API](https://www.tensorflow.org/api_docs/python/tf/data/Dataset).
Description
This repository contains code that makes datasets stored on the chair's corpora network drive compatible with the TensorFlow Dataset API.
Currently available Datasets
Dataset | Status | URL |
---|---|---|
audioset | ❌ | https://research.google.com/audioset/ |
ckplus | ✅ | http://www.iainm.com/publications/Lucey2010-The-Extended/paper.pdf |
faces | ✅ | https://faces.mpdl.mpg.de/imeji/ |
is2021_ess | ❌ | - |
librispeech | ❌ | https://www.openslr.org/12 |
nova_dynamic | ✅ | https://github.com/hcmlab/nova |
Example Usage
import os
import tensorflow as tf
import tensorflow_datasets as tfds
import hcai_datasets
from matplotlib import pyplot as plt
# Preprocessing function (runs eagerly inside tf.py_function, so .numpy() is available)
def preprocess(x, y):
    img = x.numpy()
    return img, y

# Creating a dataset
ds, ds_info = tfds.load(
    'hcai_example_dataset',
    split='train',
    with_info=True,
    as_supervised=True,
    builder_kwargs={'dataset_dir': os.path.join('path', 'to', 'directory')}
)

# Input output mapping
ds = ds.map(lambda x, y: tf.py_function(func=preprocess, inp=[x, y], Tout=[tf.float32, tf.int64]))
# Manually iterate over dataset
img, label = next(ds.as_numpy_iterator())
# Visualize
plt.imshow(img / 255.)
plt.show()
Example Usage Nova Dynamic Data
import os
import hcai_datasets
import tensorflow_datasets as tfds
# Load data
ds, ds_info = tfds.load(
    'hcai_nova_dynamic',
    split='dynamic_split',
    with_info=True,
    as_supervised=True,
    builder_kwargs={
        # Database Config
        'db_config_path': 'db.cfg',
        'db_config_dict': None,
        # Dataset Config
        'dataset': '<dataset_name>',
        # 'C:\\' rather than 'C:' so that os.path.join yields an absolute path on Windows
        'nova_data_dir': os.path.join('C:\\', 'Nova', 'Data'),
        'sessions': ['<session_name>'],
        'roles': ['<role_one>', '<role_two>'],
        'schemes': ['<label_scheme_one>'],
        'annotator': '<annotator_id>',
        'data_streams': ['<stream_name>'],
        # Sample Config
        'frame_step': 1,
        'left_context': 0,
        'right_context': 0,
        'start': None,
        'end': None,
        #'flatten_samples': False,
        'supervised_keys': ['<role_one>.<stream_name>', '<scheme_two>'],
        # Additional Config
        'clear_cache': True
    }
)
data_it = ds.as_numpy_iterator()
ex_data = next(data_it)
print(ex_data)
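The `builder_kwargs` dictionary in the Nova example repeats the same role, stream, and scheme names in several places, so a typo in one of them is easy to miss. The following is a minimal sketch of a helper that assembles the dictionary and checks that the names agree. `stream_key` and `make_nova_kwargs` are hypothetical names that are not part of `hcai_datasets`. The check that the label key must appear in `schemes` is an assumption, not documented behaviour. Only the dictionary keys and the `<role>.<stream>` key format are taken from the example above.

```python
def stream_key(role, stream):
    """Join a role and a stream name in the '<role>.<stream>' form used above."""
    return f"{role}.{stream}"


def make_nova_kwargs(dataset, nova_data_dir, sessions, roles, schemes,
                     annotator, data_streams, input_role, input_stream,
                     label_scheme, db_config_path="db.cfg"):
    """Hypothetical helper: build builder_kwargs and check the names agree."""
    if input_role not in roles:
        raise ValueError(f"role {input_role!r} is not listed in roles")
    if input_stream not in data_streams:
        raise ValueError(f"stream {input_stream!r} is not listed in data_streams")
    # Assumption: the label key is expected to name one of the listed schemes.
    if label_scheme not in schemes:
        raise ValueError(f"scheme {label_scheme!r} is not listed in schemes")

    return {
        # Database config
        "db_config_path": db_config_path,
        "db_config_dict": None,
        # Dataset config
        "dataset": dataset,
        "nova_data_dir": nova_data_dir,
        "sessions": list(sessions),
        "roles": list(roles),
        "schemes": list(schemes),
        "annotator": annotator,
        "data_streams": list(data_streams),
        # Sample config (same values as in the example above)
        "frame_step": 1,
        "left_context": 0,
        "right_context": 0,
        "start": None,
        "end": None,
        "supervised_keys": [stream_key(input_role, input_stream), label_scheme],
        # Additional config
        "clear_cache": True,
    }
```

The returned dictionary can be passed to `tfds.load` as `builder_kwargs`, exactly like the hand-written one in the example.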