Building HDF datasets for machine learning.
bci-dataset
Python library for organizing multiple EEG datasets using HDF.
Supports EEGLAB data!
*This library was created as a tool for combining datasets from the major BCI paradigms for deep learning.
Installation
pip install bci-dataset
How to Use
Add EEG Data
Supported Formats
- EEGLAB (.set)
  - Epoching (epoch splitting) must be done in EEGLAB beforehand.
- NumPy (ndarray)
Common Setup
from bci_dataset import DatasetUpdater
fpath = "./dataset.hdf"
fs = 500  # sampling rate
updater = DatasetUpdater(fpath, fs=fs)
updater.remove_hdf()  # delete the HDF file if it already exists
Add EEGLAB Data
import numpy as np
labels = ["left","right"]
eeglab_list = ["./sample.set"] # path list of eeglab files
# add EEGLAB (.set) files
for fp in eeglab_list:
    updater.add_eeglab(fp, labels)
Add NumPy Data
# dummy data
dummy_data = np.ones((12, 6000))  # channels × samples
dummy_indexes = [0, 1000, 2000, 3000, 4000, 5000]  # start index of each trial
dummy_labels = ["left", "right"] * 3  # label of each trial
dummy_size = 500  # number of samples in one trial
updater.add_numpy(dummy_data, dummy_indexes, dummy_labels, dummy_size)
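To make the roles of these arguments concrete, the sketch below shows one plausible way a continuous recording can be cut into trials from start indexes and a fixed trial size. It uses only NumPy. `split_trials` is a hypothetical helper, not part of bci-dataset, and how the library slices the data internally is assumed here rather than confirmed.

```python
import numpy as np


def split_trials(data, start_indexes, size):
    """Cut a (channels x samples) array into fixed-length trials.

    Hypothetical helper for illustration; not part of bci-dataset.
    """
    trials = []
    for start in start_indexes:
        stop = start + size
        if start < 0 or stop > data.shape[1]:
            raise ValueError(f"trial [{start}, {stop}) exceeds the recording")
        trials.append(data[:, start:stop])
    # Result shape: (n_trials, channels, size)
    return np.stack(trials)


data = np.ones((12, 6000))                    # channels x samples
indexes = [0, 1000, 2000, 3000, 4000, 5000]   # trial start indexes
labels = ["left", "right"] * 3                # one label per trial
trials = split_trials(data, indexes, size=500)
print(trials.shape, len(labels))  # (6, 12, 500) 6
```

Each start index needs exactly one label, and every trial window must fit inside the recording.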
Apply Preprocessing
If the "preprocess" method is run again with an existing group name, the existing group of that name is deleted first and then recreated by the new preprocessing.
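from sklearn.preprocessing import StandardScaler
def prepro_func(bx: np.ndarray):
    """
    preprocessing example
    bx : ch × samples
    """
    x = bx[12:15, :]  # keep channels 12-14 (the data must have at least 15 channels)
    # StandardScaler works column-wise, so transpose to standardize each channel
    return StandardScaler().fit_transform(x.T).T
updater.preprocess("custom", prepro_func)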
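For readers without scikit-learn, the per-channel standardization used in the example can be approximated with NumPy alone. This is a sketch that assumes the preprocessing function receives a (channels × samples) array, as the example's docstring states. `zscore_channels` is a hypothetical name, not part of bci-dataset.

```python
import numpy as np


def zscore_channels(bx: np.ndarray) -> np.ndarray:
    """Standardize each channel of a (channels x samples) array.

    Hypothetical NumPy-only stand-in for
    StandardScaler().fit_transform(x.T).T.
    """
    mean = bx.mean(axis=1, keepdims=True)
    std = bx.std(axis=1, keepdims=True)  # population std (ddof=0)
    std[std == 0] = 1.0                  # leave constant channels unscaled
    return (bx - mean) / std


rng = np.random.default_rng(0)
trial = rng.normal(loc=5.0, scale=3.0, size=(4, 500))
out = zscore_channels(trial)
print(out.shape)  # (4, 500)
```

A function like this could be passed to preprocess() in place of prepro_func, provided it returns an array in the same (channels × samples) layout.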
Contents of HDF
Note that "dataset" in the figure below refers to the HDF dataset (class).
hdf file
├ origin : group / raw data
│ ├ 1 : dataset
│ ├ 2 : dataset
│ ├ 3 : dataset
│ ├ 4 : dataset
│ ├ 5 : dataset
│ └ …
└ prepro : group / data after preprocessing
  ├ custom : group / "custom" is any group name
  │ ├ 1 : dataset
  │ ├ 2 : dataset
  │ ├ 3 : dataset
  │ ├ 4 : dataset
  │ ├ 5 : dataset
  │ └ …
  └ custom2 : group
    └ ...omit (1,2,3,4,…)
- Check the contents with software such as HDFView.
- Use "h5py" or similar to read the HDF file.
import h5py
with h5py.File(fpath) as h5:
    fs = h5["prepro/custom"].attrs["fs"]
    dataset_size = h5["prepro/custom"].attrs["count"]
    dataset79 = h5["prepro/custom/79"][()]  # ch × samples
    dataset79_label = h5["prepro/custom/79"].attrs["label"]
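For training, the per-trial datasets usually need to be gathered into a single array. The sketch below assumes the layout shown above (numbered datasets carrying a "label" attribute inside one group) and that all trials share the same shape. `stack_trials` and `load_group` are hypothetical helpers, not part of bci-dataset or h5py.

```python
import numpy as np


def stack_trials(group):
    """Collect every trial in an HDF group into (X, y).

    `group` is an h5py.Group (or any mapping with the same interface)
    whose members are numbered datasets with a "label" attribute.
    """
    xs, ys = [], []
    # Keys are strings such as "1", "2", ...; sort them numerically.
    for key in sorted(group.keys(), key=int):
        ds = group[key]
        label = ds.attrs["label"]
        if isinstance(label, bytes):  # some files store labels as bytes
            label = label.decode("utf-8")
        xs.append(ds[()])             # read the full array: ch x samples
        ys.append(label)
    return np.stack(xs), np.array(ys)


def load_group(fpath, group_name="prepro/custom"):
    """Open the file read-only and stack the trials of one group."""
    import h5py  # imported here so stack_trials stays usable without h5py

    with h5py.File(fpath, "r") as h5:
        return stack_trials(h5[group_name])
```

Calling `load_group(fpath)` would then return `X` with shape (trials, channels, samples) and `y` with one label per trial.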
Merge Dataset
To merge datasets, "dataset_name" must be set on each source.
If the channel order differs between datasets, it can be aligned by specifying ch_indexes.
Preprocessing groups in the sources are not carried over, so preprocess() must be run again after the merge.
Example: Merge source1 and source2 datasets
target = DatasetUpdater("new_dataset.h5",fs=fs)
target.remove_hdf() # reset hdf
s1 = DatasetUpdater("source1.h5",fs=fs,dataset_name="source1")
s2 = DatasetUpdater("source2.h5",fs=fs,dataset_name="source2")
s1_ch_indexes = [1, 60, 10, 5]  # channel indexes to use
target.merge_hdf(s1,ch_indexes=s1_ch_indexes)
target.merge_hdf(s2)
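A list such as s1_ch_indexes is easiest to derive from channel names. The following sketch uses hypothetical channel name lists to compute the indexes that put a source recording into a target channel order. It illustrates the idea with plain NumPy indexing and does not call bci-dataset. That the library expects 0-based indexes is an assumption here.

```python
import numpy as np


def channel_indexes(source_names, target_names):
    """Return indexes into `source_names` that yield `target_names` order."""
    lookup = {name: i for i, name in enumerate(source_names)}
    missing = [n for n in target_names if n not in lookup]
    if missing:
        raise KeyError(f"channels missing from source: {missing}")
    return [lookup[n] for n in target_names]


# Hypothetical montages: the source stores channels in a different order.
source_names = ["Fz", "C4", "Cz", "C3", "Pz"]
target_names = ["C3", "Cz", "C4"]

ch_indexes = channel_indexes(source_names, target_names)
print(ch_indexes)  # [3, 2, 1]

# Selecting rows with these indexes reorders a (channels x samples) array.
source_data = np.arange(5)[:, None] * np.ones((5, 4))
aligned = source_data[ch_indexes, :]
print(aligned[:, 0])  # [3. 2. 1.]
```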
Pull requests / Issues
If you need anything...