Simple data processing tool.

Project description

Datapiper is a flexible, easy-to-use library for constructing and running simple batch data processing pipelines.

Give Datapiper your list of data processing callables and it will construct a runnable data pipeline for you.

If you instantiate the pipe with an (iterable) data source, you get a generator that reads from the source and yields the processed data:

>>> operations = [lambda context, data: data+1]
>>> datasource = [1,2,3]
>>> p = Piper(operations, source=datasource)
>>> print(p)
pipe: source > <lambda>
>>> [r for r in p]
[2, 3, 4]
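
To make the source mode concrete, here is a minimal sketch of how a list of callables can be chained into a generator. It is not Datapiper's implementation: `run_from_source` is a hypothetical stand-in written for illustration, and it assumes only the `(context, data)` call signature shown above.

```python
def run_from_source(operations, source, context=None):
    """Hypothetical stand-in for a source-driven pipe (not Datapiper's code)."""
    for item in source:
        # Feed each item through every operation, in order.
        for operation in operations:
            item = operation(context, item)
        yield item


operations = [
    lambda context, data: data + 1,   # first stage: increment
    lambda context, data: data * 10,  # second stage: scale
]
results = list(run_from_source(operations, source=[1, 2, 3]))
print(results)  # [20, 30, 40]
```

Because the stand-in is a generator, nothing is read from the source until the caller starts iterating.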

If you instead instantiate it with a (callable) data sink, you get a coroutine that accepts data from a producer and delivers the processed data to the sink:

>>> operations = [lambda context, data: data+1]
>>> results = []
>>> def datasink(data):
...     results.append(data)
...
>>> p = Piper(operations, sink=datasink)
>>> print(p)
pipe: <lambda> > sink
>>> for v in (1,2,3):
...     p.send(v)
...
>>> results
[2, 3, 4]
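
The sink mode can be pictured as a generator-based coroutine. The sketch below is an illustration only, not Datapiper's implementation: `run_into_sink` is a hypothetical name, and the only thing taken from the example above is the `(context, data)` signature and the `send` protocol.

```python
def run_into_sink(operations, sink, context=None):
    """Hypothetical stand-in for a sink-driven pipe (not Datapiper's code)."""
    while True:
        item = yield  # wait for the producer to send a value
        for operation in operations:
            item = operation(context, item)
        sink(item)  # deliver the processed value


results = []
pipe = run_into_sink([lambda context, data: data + 1], sink=results.append)
next(pipe)  # prime the generator so it is paused at the first `yield`
for value in (1, 2, 3):
    pipe.send(value)
pipe.close()
print(results)  # [2, 3, 4]
```

A plain generator has to be primed with `next()` before the first `send`; the Datapiper example above calls `send` directly on the pipe.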

The context parameter passed to the data operation callables is meant for sharing state between them. It can be initialized to the desired value(s) by passing it to the Piper class as an (optional) keyword argument. The context can be anything; a dictionary is recommended.
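
As an illustration of sharing state through a dictionary context, the sketch below has one operation write to the context and another read from it. The helper `apply_operations` and the keys `"seen"` and `"offset"` are hypothetical and exist only for this example; they are not part of Datapiper.

```python
def apply_operations(operations, context, data):
    """Hypothetical helper: pass one item through each operation in turn."""
    for operation in operations:
        data = operation(context, data)
    return data


def count_items(context, data):
    # Record how many items have passed through; data is returned unchanged.
    context["seen"] = context.get("seen", 0) + 1
    return data


def add_offset(context, data):
    # Read a setting that was placed in the context up front.
    return data + context["offset"]


context = {"offset": 100}
operations = [count_items, add_offset]
results = [apply_operations(operations, context, v) for v in (1, 2, 3)]
print(results)          # [101, 102, 103]
print(context["seen"])  # 3
```

A dictionary works well here because operations can add keys without knowing about each other.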

Please see the tests for more examples.

History

0.1.0 (2017-10-31)

  • First release on PyPI.

Download files

Source Distribution

datapiper-0.1.0.zip (23.7 kB view details)

File details

Details for the file datapiper-0.1.0.zip.

File metadata

  • Download URL: datapiper-0.1.0.zip
  • Upload date:
  • Size: 23.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for datapiper-0.1.0.zip

Algorithm    Hash digest
SHA256       362afe8010344ab845ab4cf12b2c615631464a3fa6c1055e07a1bac996be26f3
MD5          d1f362a6de740974bbc23ceeac70eccc
BLAKE2b-256  3c246251fe73e543b8d49deb3f3c750b417f84d087c7a6da5c2943aaf57aa3e8
