**!Alpha Version!**

## Description

This repository contains code to make datasets stored on the corpora network drive of the chair compatible with the [tensorflow dataset api](https://www.tensorflow.org/api_docs/python/tf/data/Dataset).
## Currently available datasets
Dataset | Status | URL |
---|---|---|
ckplus | ✅ | http://www.iainm.com/publications/Lucey2010-The-Extended/paper.pdf |
affectnet | ✅ | http://mohammadmahoor.com/affectnet/ |
faces | ✅ | https://faces.mpdl.mpg.de/imeji/ |
nova_dynamic | ✅ | https://github.com/hcmlab/nova |
audioset | ❌ | https://research.google.com/audioset/ |
is2021_ess | ❌ | - |
librispeech | ❌ | https://www.openslr.org/12 |
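
The examples below assume the package is installed. A minimal install sketch, assuming the package is fetched from PyPI under its distribution name `hcai-datasets`:

```bash
pip install hcai-datasets
```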
## Example Usage
```python
import os

import tensorflow as tf
import tensorflow_datasets as tfds
from matplotlib import pyplot as plt

import hcai_datasets

# Preprocessing function
def preprocess(x, y):
    img = x.numpy()
    return img, y

# Creating a dataset
ds, ds_info = tfds.load(
    'hcai_example_dataset',
    split='train',
    with_info=True,
    as_supervised=True,
    builder_kwargs={'dataset_dir': os.path.join('path', 'to', 'directory')}
)

# Input output mapping
ds = ds.map(lambda x, y: tf.py_function(func=preprocess, inp=[x, y], Tout=[tf.float32, tf.int64]))

# Manually iterate over dataset
img, label = next(ds.as_numpy_iterator())

# Visualize
plt.imshow(img / 255.)
plt.show()
```
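
Since the result is a regular `tf.data.Dataset`, the usual input-pipeline transformations apply. A minimal sketch of batching the mapped dataset; the batch size is an arbitrary assumption, not a project default:

```python
# Batch and prefetch like any other tf.data pipeline.
train_ds = ds.batch(32).prefetch(tf.data.AUTOTUNE)

# Peek at one batch; shapes depend on the dataset's image size.
for img_batch, label_batch in train_ds.take(1):
    print(img_batch.shape, label_batch.shape)
```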
## Example Usage Nova Dynamic Data
```python
import os
import warnings

import numpy as np
import tensorflow_datasets as tfds
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

import hcai_datasets

warnings.simplefilter("ignore")

## Load Data
ds, ds_info = tfds.load(
    'hcai_nova_dynamic',
    split='dynamic_split',
    with_info=True,
    as_supervised=True,
    data_dir='.',
    read_config=tfds.ReadConfig(
        shuffle_seed=1337
    ),
    builder_kwargs={
        # Database Config
        'db_config_path': 'nova_db.cfg',
        'db_config_dict': None,

        # Dataset Config
        'dataset': '<dataset_name>',
        'nova_data_dir': os.path.join('C:', 'Nova', 'Data'),
        'sessions': ['<session_name>'],
        'roles': ['<role_one>', '<role_two>'],
        'schemes': ['<scheme_one>', '<scheme_two>'],
        'annotator': '<annotator_id>',
        'data_streams': ['<stream_name>'],

        # Sample Config
        'frame_step': 1,
        'left_context': 0,
        'right_context': 0,
        'start': None,
        'end': None,
        'flatten_samples': False,
        'supervised_keys': ['<role_one>.<stream_name>', '<scheme_two>'],

        # Additional Config
        'clear_cache': True
    }
)

# Gather all samples and sort them by frame number
data_it = ds.as_numpy_iterator()
data_list = list(data_it)
data_list.sort(key=lambda x: int(x['frame'].decode('utf-8').split('_')[0]))
x = [v['<stream_name>'] for v in data_list]
y = [v['<scheme_two>'] for v in data_list]

x_np = np.ma.concatenate(x, axis=0)
y_np = np.array(y)

# Fit a linear SVM with sigmoid probability calibration
linear_svc = LinearSVC()
model = CalibratedClassifierCV(linear_svc,
                               method='sigmoid',
                               cv=3)
print('train_x shape: {} | train_x[0] shape: {}'.format(x_np.shape, x_np[0].shape))
model.fit(x_np, y_np)
```
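
The fitted `CalibratedClassifierCV` behaves like any scikit-learn classifier. A minimal sketch of scoring it; the held-out arrays are placeholders (e.g. built from a second session the same way as `x_np`/`y_np` above), not something the snippet produces:

```python
# Placeholder evaluation data; replace with a real held-out split.
x_test, y_test = x_np, y_np

probs = model.predict_proba(x_test)  # calibrated class probabilities
preds = model.predict(x_test)        # hard labels
print('accuracy: {:.3f}'.format(np.mean(preds == y_test)))
```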