A simple library for data handling in deep learning environments, with an API not dissimilar to the PyTorch DataLoader.
DataManager
Usage
from dataManager import Manager as DataManager
# The data function receives the batch indexes and returns the matching batch
dataFunction = lambda x: (reduceImageSize(inputImages, x), inputLabels[x], ...)
# Initialize Data Providers
dataManager = DataManager(
    data = dataFunction,
    bz = 32,
    stochasticSampling = True,
    indexingShape = [inputImages.shape[0]]  # number of samples to index over
)
# provide the data for batch index 10 (the eleventh batch, since indexing starts at 0)
i = 10
dataManager(i, stochastic = False).shape
# > (32, ...)
API
__init__(data, indexingShape = None, bz = 32, stochasticSampling = True)
- data: numpy.array or function
: Data to be used during training. If a numpy array is provided, it is wrapped in a function. A function passed as input should take a single argument: the indexes of the elements for the data batch.
- indexingShape: array
: Array with 1 or 2 entries, used to generate the indexes for each batch. If it has one entry, a list of indexes is provided; when stochasticSampling is false, these indexes are ordered from 0 to (indexingShape - indexingShape % bz). If it has two entries, the indexes are provided from left to right and from top to bottom of the 2D matrix. In both cases only as many indexes are provided as fit into full batches.
- stochasticSampling: bool
: If true, the indexes are sampled uniformly at random, with equal probability for each element. Within a batch all indexes are different, but uniqueness of indexes is not guaranteed across batches.
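The ordered indexing described for indexingShape can be sketched roughly as follows. This is an illustration of the documented behaviour, not the library's own code: the helper name `ordered_batch_indexes` is hypothetical, and raising `IndexError` for a step past the last full batch is an assumption, since the description does not say what happens in that case.

```python
import numpy as np

def ordered_batch_indexes(indexing_shape, step, bz):
    """Return the ordered indexes for batch `step` (hypothetical helper).

    indexing_shape has 1 or 2 entries. Only full batches are served, so the
    trailing (total % bz) elements are never indexed.
    """
    total = int(np.prod(indexing_shape))
    usable = total - total % bz          # largest multiple of bz that fits
    start = step * bz
    if start < 0 or start + bz > usable:
        # Assumption: out-of-range steps are an error in this sketch.
        raise IndexError("step %d is outside the full batches" % step)
    flat = np.arange(start, start + bz)
    if len(indexing_shape) == 1:
        return flat
    # Row-major order: left to right, then top to bottom of the 2D matrix.
    rows, cols = np.unravel_index(flat, tuple(indexing_shape))
    return np.stack([rows, cols], axis=1)

print(ordered_batch_indexes([100], 1, 32)[:3])   # first indexes of batch 1
print(ordered_batch_indexes([3, 4], 1, 4))       # (row, col) pairs of row 1
```

With 100 elements and bz = 32, only three full batches exist in this sketch, so the last 4 elements are skipped.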
__call__(i, stochastic = None)
- i: integer
: Index of the current batch, starting from 0.
- stochastic: bool
: Whether stochastic sampling is used for this call.
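To make the relation between the batch index i and the data function more concrete, here is a small sketch of how a numpy array might be wrapped in a function and then queried with the indexes of one ordered batch. The helper `as_data_function` and the array `images` are hypothetical names chosen for illustration; the library's internal wrapping may differ.

```python
import numpy as np

def as_data_function(data):
    """Wrap an array in a function of the batch indexes (hypothetical helper).

    Callables are returned unchanged, mirroring the documented rule that
    `data` may be either a numpy array or a function.
    """
    if callable(data):
        return data
    array = np.asarray(data)
    return lambda indexes: array[indexes]

# Hypothetical dataset: 100 images of 8x8 pixels.
images = np.zeros((100, 8, 8))
data_fn = as_data_function(images)

bz = 32
i = 2                                       # third batch, indexing starts at 0
indexes = np.arange(i * bz, (i + 1) * bz)   # ordered indexes of batch i
batch = data_fn(indexes)
print(batch.shape)  # (32, 8, 8)
```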
Notes:
Stochastic sampling, when enabled, overrides ordered indexing.
Undocumented Functions:
- self.getBatch(self, step, bz): return an ordered batch of indexes at a given step
- self.getStochasticBatch(self, shape, bz): return a randomly picked set of elements, each chosen with equal probability. shape is the shape to sample indexes from; it defaults to indexingShape if nothing is given.
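A stochastic batch with the documented properties (equal probability per element, no repeated index within a batch, possible repeats across batches) could be drawn along these lines. This is a sketch under assumptions, not the library's getStochasticBatch: the function name `stochastic_batch_indexes` and the optional `rng` argument are hypothetical.

```python
import numpy as np

def stochastic_batch_indexes(indexing_shape, bz, rng=None):
    """Draw bz distinct indexes uniformly at random (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    total = int(np.prod(indexing_shape))
    # replace=False keeps indexes unique within this batch only; separate
    # calls are independent, so indexes may repeat across batches.
    flat = rng.choice(total, size=bz, replace=False)
    if len(indexing_shape) == 1:
        return flat
    # Convert flat positions to (row, col) pairs for a 2D indexing shape.
    rows, cols = np.unravel_index(flat, tuple(indexing_shape))
    return np.stack([rows, cols], axis=1)

rng = np.random.default_rng(0)
first = stochastic_batch_indexes([100], 32, rng)
second = stochastic_batch_indexes([100], 32, rng)
print(len(set(first.tolist())))  # 32
```

Sampling without replacement requires bz to be no larger than the number of available elements; otherwise `rng.choice` raises a `ValueError`.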