Keras-style data iterator for images contained in dataset files such as HDF5, or in PIL-readable image files. Images can be spread across several files.

Project description

Dataset Iterator

This repo contains Keras iterator classes for multi-channel (time-lapse) images contained in dataset files such as HDF5.

Dataset structure:

One dataset file can contain several sub-datasets (dataset_name0, dataset_name1, etc.); the iterator iterates through all of them as if they were concatenated.

.
├── ...
├── dataset_name0
│   ├── channel0
│   ├── channel1
│   └── ...
├── dataset_name1
│   ├── channel0
│   ├── channel1
│   └── ...
└── ...
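The "as if concatenated" behaviour can be sketched with a small index-mapping helper. This is a pure-Python illustration under assumed sub-dataset sizes, not the library's actual implementation:

```python
from bisect import bisect_right
from itertools import accumulate

def locate(global_idx, dataset_sizes):
    """Map a global index to (sub-dataset index, local index),
    treating the sub-datasets as one concatenated dataset."""
    offsets = list(accumulate(dataset_sizes))  # cumulative end offsets
    if global_idx < 0 or global_idx >= offsets[-1]:
        raise IndexError(global_idx)
    ds = bisect_right(offsets, global_idx)  # first offset strictly greater
    local = global_idx - (offsets[ds - 1] if ds > 0 else 0)
    return ds, local

# e.g. dataset_name0 holds 3 images and dataset_name1 holds 2:
sizes = [3, 2]
print([locate(i, sizes) for i in range(5)])
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)]
```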

Each dataset contains channels (channel0, channel1, ...) that must all have the same shape. All datasets must have the same number of channels, and shapes (except the first/batch dimension) must be equal across datasets.
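These shape constraints can be expressed as a small validation helper. Here plain nested dicts stand in for the HDF5 file and shapes are (n_images, height, width) tuples; this is an illustrative sketch, not part of the library:

```python
def validate(datasets):
    """datasets: {dataset_name: {channel_name: shape tuple}}.
    Checks that channels within a dataset share one shape, that all
    datasets expose the same channel names, and that shapes match
    across datasets except for the first (batch) dimension."""
    names = None
    ref_tail = None
    for ds_name, channels in datasets.items():
        shapes = set(channels.values())
        if len(shapes) != 1:
            raise ValueError(f"{ds_name}: channels differ in shape")
        if names is None:
            names = set(channels)
        elif set(channels) != names:
            raise ValueError(f"{ds_name}: channel names differ")
        tail = next(iter(shapes))[1:]  # drop the batch dimension
        if ref_tail is None:
            ref_tail = tail
        elif tail != ref_tail:
            raise ValueError(f"{ds_name}: image shape differs")
    return True

file = {
    "dataset_name0": {"channel0": (100, 64, 64), "channel1": (100, 64, 64)},
    "dataset_name1": {"channel0": (42, 64, 64), "channel1": (42, 64, 64)},
}
print(validate(file))  # True
```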

Groups

There can be more folder levels, for instance to keep train and test sets in the same file:

.
├── ...
├── experiment1                    
│   ├── train          
│   │   ├── raw
│   │   └── labels
│   └── test   
│       ├── raw
│       └── labels
├── experiment2                    
│   ├── train          
│   │   ├── raw
│   │   └── labels
│   └── test   
│       ├── raw
│       └── labels
└── ...
train_it = MultiChannelIterator(dataset_file_path=file_path, channel_keywords=["/raw", "/labels"], group_keyword="train")
test_it = MultiChannelIterator(dataset_file_path=file_path, channel_keywords=["/raw", "/labels"], group_keyword="test")
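The effect of group_keyword can be pictured as filtering dataset paths by substring. This is a hypothetical sketch of the selection idea, not the library's internal code:

```python
def select_channels(paths, channel_keyword, group_keyword=None):
    """Keep paths that end with the channel keyword and, if given,
    contain the group keyword somewhere in their path."""
    return [p for p in paths
            if p.endswith(channel_keyword)
            and (group_keyword is None or group_keyword in p)]

paths = [
    "/experiment1/train/raw", "/experiment1/train/labels",
    "/experiment1/test/raw", "/experiment1/test/labels",
    "/experiment2/train/raw", "/experiment2/train/labels",
]
print(select_channels(paths, "/raw", group_keyword="train"))
# ['/experiment1/train/raw', '/experiment2/train/raw']
```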

Image formats

  • These iterators use an object of class DatasetIO to access the data.
  • There are currently implementations of DatasetIO for .h5 files (H5pyIO) and for datasets composed of multiple image files readable by Pillow (MultipleFileIO).
  • One can also concatenate datasets from different files:
    • if a dataset is split into several files that contain the same channels, use ConcatenateDatasetIO
    • if a dataset has its channels in different files, use MultipleDatasetIO
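The two multi-file cases can be pictured with plain Python sequences standing in for files; this illustrates the idea only and is not the DatasetIO API:

```python
from itertools import chain

# ConcatenateDatasetIO case: one channel split across several files.
file_a = ["img0", "img1"]          # same channel, first file
file_b = ["img2"]                  # same channel, second file
concatenated = list(chain(file_a, file_b))

# MultipleDatasetIO case: each file holds a different channel.
raw_file = ["raw0", "raw1", "raw2"]
label_file = ["lab0", "lab1", "lab2"]
paired = list(zip(raw_file, label_file))  # one record per index

print(concatenated)  # ['img0', 'img1', 'img2']
print(paired[0])     # ('raw0', 'lab0')
```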

Demo

See this notebook for a demo:

Download files

Download the file for your platform.

Source Distribution

dataset_iterator-0.5.0.tar.gz (59.6 kB)


Built Distribution

dataset_iterator-0.5.0-py3-none-any.whl (67.9 kB)


File details

Details for the file dataset_iterator-0.5.0.tar.gz.

File metadata

  • Download URL: dataset_iterator-0.5.0.tar.gz
  • Upload date:
  • Size: 59.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for dataset_iterator-0.5.0.tar.gz
Algorithm Hash digest
SHA256 a25dd7eb9d6627926da0c77a73729ee458077e0534a8c3fbad7d9de4ff7382a6
MD5 e7ec0597e992aca1a4ac68faa8a8bb7f
BLAKE2b-256 c1580192c24bbc1dbccc21185ad7337a338cc32c0ec033921ac8003f2bf01e91


File details

Details for the file dataset_iterator-0.5.0-py3-none-any.whl.

File metadata

File hashes

Hashes for dataset_iterator-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 aa99e260dad154fc20b6b6817b911574997dc01b195c726d85230cd402d5ecc1
MD5 c1244f4a5df568fe4027b7e639dcf0e7
BLAKE2b-256 c3cba8f2839a28d3e03e0a6dd423deef87614493d267ed82b1c81b21f644f843

