
Keras-style data iterator for images stored in dataset files such as HDF5 files or PIL-readable image files. Images can be spread across several files.


Dataset Iterator

This repo contains Keras iterator classes for multi-channel (time-lapse) images stored in dataset files such as HDF5.

Dataset structure:

One dataset file can contain several sub-datasets (dataset_name0, dataset_name1, etc.). The iterator goes through all of them as if they were concatenated.

.
├── ...
├── dataset_name0
│   ├── channel0
│   ├── channel1
│   └── ...
├── dataset_name1
│   ├── channel0
│   ├── channel1
│   └── ...
└── ...
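
Iterating over several sub-datasets "as if they were concatenated" amounts to translating a global sample index into a sub-dataset and a local index within it. The sketch below illustrates that idea only; `locate` and the sizes are hypothetical and are not part of the library's API.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(global_index, dataset_sizes):
    """Map an index into the virtual concatenation to (dataset position, local index)."""
    offsets = list(accumulate(dataset_sizes))  # cumulative end offset of each sub-dataset
    if not 0 <= global_index < offsets[-1]:
        raise IndexError(global_index)
    ds = bisect_right(offsets, global_index)  # first sub-dataset whose end is past the index
    start = offsets[ds - 1] if ds > 0 else 0
    return ds, global_index - start


# Hypothetical sizes: dataset_name0 holds 3 images, dataset_name1 holds 5.
sizes = [3, 5]
print(locate(0, sizes))  # (0, 0)
print(locate(3, sizes))  # (1, 0)
print(locate(7, sizes))  # (1, 4)
```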

Each dataset contains channels (channel0, channel1, ...) that must all have the same shape. All datasets must have the same number of channels, and shapes must match across datasets except for the batch size.
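
The shape rules above can be checked before training. The following is a minimal sketch, assuming shapes are given as tuples with the batch axis first; `check_layout` and the example layout are hypothetical helpers for illustration, not functions provided by the library.

```python
def check_layout(shapes):
    """Validate {dataset_name: {channel_name: shape}} against the shape rules.

    Assumes the batch axis is the first axis of every shape.
    Raises ValueError on the first violated rule; returns the shared per-image shape.
    """
    reference = None  # (channel count, per-image shape) taken from the first dataset
    for ds_name, channels in shapes.items():
        channel_shapes = set(channels.values())
        if len(channel_shapes) != 1:
            raise ValueError(f"{ds_name}: channels differ in shape: {channels}")
        (shape,) = channel_shapes
        signature = (len(channels), shape[1:])  # drop the batch axis
        if reference is None:
            reference = signature
        elif signature != reference:
            raise ValueError(f"{ds_name}: {signature} does not match {reference}")
    return reference[1] if reference else None


layout = {
    "dataset_name0": {"channel0": (10, 64, 64), "channel1": (10, 64, 64)},
    "dataset_name1": {"channel0": (25, 64, 64), "channel1": (25, 64, 64)},
}
print(check_layout(layout))  # (64, 64)
```

Here the two datasets differ only in batch size (10 vs. 25), which is allowed.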

Groups

There can be more folder levels, for instance to keep train and test sets in the same file:

.
├── ...
├── experiment1                    
│   ├── train          
│   │   ├── raw
│   │   └── labels
│   └── test   
│       ├── raw
│       └── labels
├── experiment2                    
│   ├── train          
│   │   ├── raw
│   │   └── labels
│   └── test   
│       ├── raw
│       └── labels
└── ...

from dataset_iterator import MultiChannelIterator

# file_path: path to a dataset file laid out as above
train_it = MultiChannelIterator(dataset_file_path=file_path, channel_keywords=["/raw", "/labels"], group_keyword="train")
test_it = MultiChannelIterator(dataset_file_path=file_path, channel_keywords=["/raw", "/labels"], group_keyword="test")
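
To make the effect of `channel_keywords` and `group_keyword` concrete, here is a small sketch of one plausible matching rule: a path is kept for a channel if it ends with the channel keyword and contains the group keyword. This rule and the `select_channel_paths` helper are assumptions for illustration; the library's actual matching may differ.

```python
def select_channel_paths(paths, channel_keywords, group_keyword=None):
    """Return, for each channel keyword, the sorted paths that end with it
    and (if a group keyword is given) contain the group keyword."""
    selected = []
    for keyword in channel_keywords:
        matches = sorted(
            p for p in paths
            if p.endswith(keyword) and (group_keyword is None or group_keyword in p)
        )
        selected.append(matches)
    return selected


# Paths mirroring the experiment1/experiment2 tree above.
paths = [
    "experiment1/train/raw", "experiment1/train/labels",
    "experiment1/test/raw", "experiment1/test/labels",
    "experiment2/train/raw", "experiment2/train/labels",
    "experiment2/test/raw", "experiment2/test/labels",
]
train_raw, train_labels = select_channel_paths(paths, ["/raw", "/labels"], group_keyword="train")
print(train_raw)     # ['experiment1/train/raw', 'experiment2/train/raw']
print(train_labels)  # ['experiment1/train/labels', 'experiment2/train/labels']
```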

Image formats

  • These iterators use an object of class DatasetIO to access the data.
  • DatasetIO is currently implemented for .h5 files (H5pyIO) and for datasets composed of multiple image files readable by Pillow (MultipleFileIO).
  • Datasets from different files can also be combined:
    • if a dataset is split into several files that contain the same channels, use ConcatenateDatasetIO
    • if a dataset's channels are stored in different files, use MultipleDatasetIO
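
The difference between the two ways of combining files can be shown with a toy model. The classes below are hypothetical stand-ins written for this illustration; they are not the library's ConcatenateDatasetIO or MultipleDatasetIO and say nothing about their constructor signatures.

```python
class ToyConcatenate:
    """Same channels in every source, samples split across sources: samples are chained."""

    def __init__(self, sources):
        self.sources = sources  # each source: {channel_name: list of samples}

    def get(self, channel):
        return [sample for src in self.sources for sample in src[channel]]


class ToyMultiple:
    """Each channel lives in its own source: the channel name picks the source."""

    def __init__(self, channel_to_source):
        self.channel_to_source = channel_to_source

    def get(self, channel):
        return self.channel_to_source[channel][channel]


# Case 1: one dataset split into two files that both hold raw and labels.
file_a = {"raw": ["a0", "a1"], "labels": ["la0", "la1"]}
file_b = {"raw": ["b0"], "labels": ["lb0"]}
print(ToyConcatenate([file_a, file_b]).get("raw"))  # ['a0', 'a1', 'b0']

# Case 2: raw images in one file, labels in another.
raw_file = {"raw": ["r0", "r1"]}
label_file = {"labels": ["l0", "l1"]}
print(ToyMultiple({"raw": raw_file, "labels": label_file}).get("labels"))  # ['l0', 'l1']
```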

Demo

See this notebook for a demo:
