DataManager

A simple library that simplifies data handling in deep learning environments, with an API not dissimilar to the PyTorch DataLoader.

Usage

    from dataManager import Manager as DataManager

    # Map a batch of indexes x to the matching (inputs, labels, ...) tuple
    dataFunction = lambda x: (reduceImageSize(inputImages,x), inputLabels[x], ...)
    # Initialize Data Providers
    dataManager = DataManager(
        data = dataFunction,
        bz = 32,
        stochasticSampling = True,
        indexingShape = [inputImages.shape[0]]  # number of samples
    )

    # provide the data for the batch at index 10 (batch indexes start at 0)
    i = 10
    dataManager(i, stochastic = False).shape
    # > (32, ...)

API

__init__(data, indexingShape = None, bz = 32, stochasticSampling = True)

  • data: numpy.array or function: Data to be used during training. If a numpy array is provided, it is wrapped in a function. A function passed as input must take a single argument: the indexes of the elements in the data batch.

  • indexingShape: array: Array of length 1 or 2 describing the index space from which each batch is drawn. With one dimension, each batch is a list of indexes; when stochasticSampling is false, the indexes run in order from 0 to (indexingShape - indexingShape % bz). With two dimensions, the indexes traverse the 2D matrix from left to right and from top to bottom. In both cases only as many indexes are provided as fit into full batches.

  • stochasticSampling: bool: If true, the indexes are sampled uniformly at random, with equal probability for each element. Within a batch all indexes are distinct, but uniqueness is not guaranteed across batches.
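The indexing rules described in the list above can be sketched in plain Python. This is an illustration only, not the library's code: the helper names `ordered_batch`, `ordered_batch_2d`, and `stochastic_batch` are hypothetical, and raising `IndexError` past the last full batch is an assumption, since the behavior in that case is not documented here.

```python
import random


def ordered_batch(n, step, bz):
    """Ordered indexes for batch `step`, counting full batches only."""
    usable = n - n % bz  # trailing elements that cannot fill a batch are dropped
    start = step * bz
    if start + bz > usable:
        # Assumption: the library's behavior past the last batch is not documented.
        raise IndexError("step is beyond the last full batch")
    return list(range(start, start + bz))


def ordered_batch_2d(shape, step, bz):
    """(row, col) pairs read left to right, then top to bottom."""
    rows, cols = shape
    flat = ordered_batch(rows * cols, step, bz)
    return [divmod(k, cols) for k in flat]


def stochastic_batch(n, bz, rng=random):
    """Draw bz distinct indexes uniformly; separate calls may overlap."""
    return rng.sample(range(n), bz)


print(ordered_batch(100, 2, 32)[:3])       # [64, 65, 66]
print(ordered_batch_2d((4, 5), 1, 4))      # [(0, 4), (1, 0), (1, 1), (1, 2)]
print(len(set(stochastic_batch(100, 32)))) # 32
```

With 100 elements and bz = 32, only indexes 0 through 95 are ever returned by the ordered path, which gives three full batches.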

__call__(i, stochastic = None)

  • i: integer: index of the current batch, starting from 0
  • stochastic: bool: whether stochastic sampling is used for this call
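To make the call semantics concrete, here is a minimal stand-in class. It is a sketch under stated assumptions, not the dataManager implementation: the class name `SketchManager` is hypothetical, only a one-dimensional indexingShape is handled, a plain list stands in for a numpy array, and treating `stochastic = None` as "use the constructor's stochasticSampling value" is an assumption that the text above does not confirm.

```python
import random


class SketchManager:
    """Illustrative stand-in; not the dataManager implementation."""

    def __init__(self, data, indexingShape=None, bz=32, stochasticSampling=True):
        if callable(data):
            if indexingShape is None:
                raise ValueError("indexingShape is needed when data is a function")
            self.data = data
        else:
            # Wrap an indexable container in a function of the index list.
            self.data = lambda idx: [data[k] for k in idx]
            if indexingShape is None:
                indexingShape = [len(data)]
        self.n = indexingShape[0]  # 1-D case only in this sketch
        self.bz = bz
        self.stochasticSampling = stochasticSampling

    def __call__(self, i, stochastic=None):
        # Assumption: None falls back to the constructor setting.
        use_random = self.stochasticSampling if stochastic is None else stochastic
        if use_random:
            idx = random.sample(range(self.n), self.bz)
        else:
            idx = list(range(i * self.bz, (i + 1) * self.bz))
        return self.data(idx)


manager = SketchManager(list(range(100)), bz=10, stochasticSampling=True)
print(manager(3, stochastic=False))  # the ten values 30 through 39
print(len(manager(3)))               # 10, drawn at random
```

Passing `stochastic = False` per call is what lets an evaluation loop read ordered batches from a manager that was configured for random sampling during training.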

Notes:

StochasticSampling overrides ordered indexing.

Undocumented Functions:

  • self.getBatch(self, step, bz): return an ordered batch of indexes at a given step
  • self.getStochasticBatch(self, shape, bz): return a randomly picked set of elements, each with equal probability. shape is the shape to sample from; it defaults to indexingShape if not given.
