
Keras-style data iterator for images stored in dataset files such as HDF5 files or PIL-readable image files. Images can be spread across several files.


Dataset Iterator

This repo contains Keras iterator classes for multi-channel (time-lapse) images stored in dataset files such as HDF5.

Dataset structure:

One dataset file can contain several sub-datasets (dataset_name0, dataset_name1, etc.); the iterator iterates through all of them as if they were concatenated.

.
├── ...
├── dataset_name0
│   ├── channel0
│   ├── channel1
│   └── ...
├── dataset_name1
│   ├── channel0
│   ├── channel1
│   └── ...
└── ...
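
To make "as if they were concatenated" concrete, here is a minimal sketch of how a global image index could be mapped back to a sub-dataset and a local index. The function name `locate` and the dataset sizes are hypothetical and only serve the illustration; this is not the library's actual implementation.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(global_index, dataset_sizes):
    """Map an index in the concatenated view to (dataset position, local index)."""
    # Cumulative end offsets of each sub-dataset, e.g. [100, 150] for sizes [100, 50].
    offsets = list(accumulate(dataset_sizes))
    if not offsets or not 0 <= global_index < offsets[-1]:
        raise IndexError(global_index)
    # First sub-dataset whose end offset is strictly greater than the index.
    ds = bisect_right(offsets, global_index)
    start = offsets[ds - 1] if ds > 0 else 0
    return ds, global_index - start


# Hypothetical sizes: dataset_name0 holds 100 images, dataset_name1 holds 50.
sizes = [100, 50]
print(locate(0, sizes))    # (0, 0)
print(locate(99, sizes))   # (0, 99)
print(locate(100, sizes))  # (1, 0)
```

With such a mapping, batches can be drawn from indices 0 to 149 without the caller needing to know where one sub-dataset ends and the next begins.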

Each dataset contains channels (channel0, channel1, ...) that must all have the same shape. All datasets must have the same number of channels, and shapes must be equal across datasets except for the batch size.
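
The shape rules above can be sketched as a small check over a plain dictionary of shapes. The helper `check_layout`, the example shapes, and the assumption that the batch size is the first axis are illustrative only; the library may perform its own validation differently.

```python
def check_layout(layout):
    """Validate a {dataset: {channel: shape}} mapping against the rules above.

    Assumption: shapes are tuples whose first axis is the batch size.
    """
    n_channels = None
    image_shape = None
    for ds_name, channels in layout.items():
        shapes = list(channels.values())
        # Within one dataset, every channel must have exactly the same shape.
        if len(set(shapes)) != 1:
            raise ValueError(f"{ds_name}: channels differ in shape: {shapes}")
        if n_channels is None:
            # The first dataset defines the reference for all the others.
            n_channels, image_shape = len(shapes), shapes[0][1:]
        # Across datasets: same channel count, same shape apart from the batch axis.
        if len(shapes) != n_channels:
            raise ValueError(f"{ds_name}: expected {n_channels} channels, got {len(shapes)}")
        if shapes[0][1:] != image_shape:
            raise ValueError(f"{ds_name}: image shape {shapes[0][1:]} != {image_shape}")
    return True


# Hypothetical shapes: the batch sizes differ (100 vs 50), the image shape does not.
layout = {
    "dataset_name0": {"channel0": (100, 256, 32), "channel1": (100, 256, 32)},
    "dataset_name1": {"channel0": (50, 256, 32), "channel1": (50, 256, 32)},
}
print(check_layout(layout))  # True
```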

Groups

There can be more folder levels, for instance to store train and test sets in the same file:

.
├── ...
├── experiment1                    
│   ├── train          
│   │   ├── raw
│   │   └── labels
│   └── test   
│       ├── raw
│       └── labels
├── experiment2                    
│   ├── train          
│   │   ├── raw
│   │   └── labels
│   └── test   
│       ├── raw
│       └── labels
└── ...

from dataset_iterator import MultiChannelIterator

train_it = MultiChannelIterator(dataset_file_path=file_path, channel_keywords=["/raw", "/labels"], group_keyword="train")
test_it = MultiChannelIterator(dataset_file_path=file_path, channel_keywords=["/raw", "/labels"], group_keyword="test")
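
As an illustration of what selecting by keyword means for the layout above, the sketch below filters a list of dataset paths by group keyword and channel keyword. The helper `select_channels` and its matching rules (substring match for the group, suffix match for the channel) are assumptions made for this example, not the library's actual matching logic.

```python
def select_channels(paths, channel_keywords, group_keyword=None):
    """Group dataset paths by channel keyword, optionally restricted to one group."""
    selected = {kw: [] for kw in channel_keywords}
    for path in sorted(paths):
        # Assumed rule: keep only paths that contain the group keyword.
        if group_keyword is not None and group_keyword not in path:
            continue
        for kw in channel_keywords:
            # Assumed rule: a path belongs to a channel if it ends with its keyword.
            if path.endswith(kw):
                selected[kw].append(path)
    return selected


# Paths matching the tree above.
paths = [
    f"/{exp}/{split}/{channel}"
    for exp in ("experiment1", "experiment2")
    for split in ("train", "test")
    for channel in ("raw", "labels")
]
train = select_channels(paths, ["/raw", "/labels"], group_keyword="train")
print(train["/raw"])     # ['/experiment1/train/raw', '/experiment2/train/raw']
print(train["/labels"])  # ['/experiment1/train/labels', '/experiment2/train/labels']
```

Sorting the paths keeps the raw and labels lists aligned, so that the n-th entry of each channel refers to the same experiment.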

Image formats

  • The iterators access the data through an object of class DatasetIO.
  • DatasetIO is currently implemented for .h5 files (H5pyIO) and for datasets composed of multiple image files supported by Pillow (MultipleFileIO).
  • Datasets from different files can also be combined:
    • if a dataset is split into several files that contain the same channels, use ConcatenateDatasetIO
    • if the channels of a dataset are stored in different files, use MultipleDatasetIO
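
The difference between the two ways of combining files can be sketched with plain dictionaries that map a channel name to a list of images. The functions `concatenate_sources` and `merge_channel_sources` are hypothetical stand-ins written for this illustration; they are not the ConcatenateDatasetIO or MultipleDatasetIO classes, whose constructors are not described here.

```python
def concatenate_sources(sources):
    """Same channels in every source: append the images channel by channel."""
    channels = set(sources[0])
    if any(set(s) != channels for s in sources):
        raise ValueError("all sources must expose the same channels")
    return {ch: [img for s in sources for img in s[ch]] for ch in sources[0]}


def merge_channel_sources(sources):
    """Different channels in every source: gather them side by side."""
    merged = {}
    for s in sources:
        for ch, images in s.items():
            if ch in merged:
                raise ValueError(f"channel {ch!r} appears in more than one source")
            merged[ch] = images
    return merged


# Case 1: one dataset split into two files that both hold raw and labels.
part_a = {"raw": ["r0", "r1"], "labels": ["l0", "l1"]}
part_b = {"raw": ["r2"], "labels": ["l2"]}
print(concatenate_sources([part_a, part_b]))
# {'raw': ['r0', 'r1', 'r2'], 'labels': ['l0', 'l1', 'l2']}

# Case 2: raw images in one file, labels in another.
raw_only = {"raw": ["r0", "r1"]}
labels_only = {"labels": ["l0", "l1"]}
print(merge_channel_sources([raw_only, labels_only]))
# {'raw': ['r0', 'r1'], 'labels': ['l0', 'l1']}
```

The first case grows the number of images per channel; the second grows the number of channels while the number of images stays the same.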

Demo

See this notebook for a demo:

Provenance

The following attestation bundles were made for dataset_iterator-0.5.7.tar.gz:

Publisher: publish-to-pypi.yml on jeanollion/dataset_iterator
