pdp · PyPI

Build fast data processing pipelines easily

Project description

Why?

Many tasks in machine learning, deep learning, and other fields require complex, time-consuming data processing. Ideally, this processing should run in parallel with the main process, preparing data for use (by a neural net, for instance). PDP provides a simple interface for organizing a data processing pipeline out of simple blocks that cover the most typical needs.

Use cases

Neural net training, where you need to train the net, load data from disk, and augment it. PDP lets you do all of these at the same time without using the threading module directly.
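To make the use case concrete, here is a minimal sketch of the kind of threading boilerplate that PDP is meant to hide. It uses only the standard library and is not PDP's API; `load_and_augment`, `prefetch`, and `train` are hypothetical names chosen for illustration.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the data stream


def load_and_augment(item):
    # Hypothetical stand-in for reading a sample from disk and augmenting it.
    return item * 2


def _producer(items, out_queue):
    # Runs in a background thread and fills the bounded queue.
    for item in items:
        out_queue.put(load_and_augment(item))
    out_queue.put(_SENTINEL)


def prefetch(items, buffer_size=4):
    """Yield processed items while a background thread prepares the next ones."""
    out_queue = queue.Queue(maxsize=buffer_size)
    worker = threading.Thread(target=_producer, args=(items, out_queue), daemon=True)
    worker.start()
    while True:
        result = out_queue.get()
        if result is _SENTINEL:
            break
        yield result
    worker.join()


def train(items):
    # The main thread consumes batches as soon as they are ready.
    seen = []
    for batch in prefetch(items):
        seen.append(batch)  # a real training step would go here
    return seen
```

The bounded queue keeps the loader from running arbitrarily far ahead of the consumer, which caps memory use.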

Examples

Examples are in the examples folder of the repository.

Is it fast?

Speed and parallel execution are the top priority. Right now, threads are used to exchange information between pipeline stages, because exchanging data between threads is more memory- and CPU-efficient than exchanging it between processes. Python's threads are limited by the GIL, but this does not affect performance for IO-bound tasks, and NumPy typically releases the GIL during heavy array computation. Since data augmentation is likely to be done with NumPy operations, the GIL should not significantly affect performance.
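The claim about IO-bound work can be checked with a small standard-library experiment. This is a sketch, not a benchmark of PDP: `time.sleep` stands in for a blocking read (it releases the GIL, as real IO does), and `fake_io_task`, `run_sequential`, and `run_threaded` are hypothetical helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    # Stand-in for a blocking disk or network read; sleeping releases the GIL.
    time.sleep(delay)
    return delay


def run_sequential(delays):
    # Run the tasks one after another and report the elapsed wall-clock time.
    start = time.perf_counter()
    results = [fake_io_task(d) for d in delays]
    return results, time.perf_counter() - start


def run_threaded(delays):
    # Run the tasks in one thread each so that their waits overlap.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io_task, delays))
    return results, time.perf_counter() - start
```

With several equal delays, the threaded run should take roughly as long as one task, while the sequential run takes the sum of all of them.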

Installation

pip install pdp

Release history

0.3.0 (this version): Sep 20, 2018

0.2.1: Jan 11, 2018

0.2: Jan 11, 2018

0.1: Oct 11, 2017

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pdp-0.3.0.tar.gz (5.2 kB)

Uploaded Sep 20, 2018 Source

File details

Details for the file pdp-0.3.0.tar.gz.

File metadata

Download URL: pdp-0.3.0.tar.gz
Upload date: Sep 20, 2018
Size: 5.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: Python-urllib/3.7

File hashes

Hashes for pdp-0.3.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `248b9e714efbaccd3d1d7f70a67a315aad8e54aecd9756320aadd1a951cf5171` |
| MD5 | `63bcd72870c1619effd5d5f371a1dcd2` |
| BLAKE2b-256 | `e4f95e4886980fd2a86013055142f9f3c9f94d3205495a0603a55d5eae32ac9d` |
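A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's `hashlib`; the helper names `sha256_of_file` and `verify_download` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed for pdp-0.3.0.tar.gz.
EXPECTED_SHA256 = "248b9e714efbaccd3d1d7f70a67a315aad8e54aecd9756320aadd1a951cf5171"


def sha256_of_file(path, chunk_size=65536):
    # Hash the file in chunks so that large files are not loaded into memory at once.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    # True only if the file's digest matches the expected value exactly.
    return sha256_of_file(path) == expected
```

For example, `verify_download("pdp-0.3.0.tar.gz")` returns `True` only if the local file matches the published digest.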

