Extract Transform Load (ETL) toolkit for Python

Project description

Toolkit for data integration work using connected transformations. Unlike Java-based tools such as Talend or Pentaho Data Integration, this is a DIY framework; if you’re looking for a WYSIWYG ETL engine, you should probably go back to those tools.

Create a harness.

>>> from rdc.etl.harness.threaded import ThreadedHarness as Harness
>>> harness = Harness()

Create some data transformations.

>>> from rdc.etl.transform.extract import Extract
>>> extract = Extract(stream_data=({'foo': 'bar'}, {'foo': 'baz'}))
>>> from rdc.etl.transform.simple import SimpleTransform
>>> transform = SimpleTransform()
>>> transform.add('foo').filter('upper')
>>> from rdc.etl.transform.util import Log
>>> load = Log()

Tie everything together.

>>> harness.add_chain(extract, transform, load)
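To make the idea of a chain concrete, here is a minimal plain-Python sketch of the same extract, transform, load flow built from generators. It does not use rdc.etl and is not how the library is implemented; the names `extract_rows`, `upper_field`, `load_rows` and `run_chain` are hypothetical, and it assumes the 'upper' filter simply upper-cases the field value.

```python
def extract_rows(rows):
    """Yield a copy of each input row (hypothetical stand-in for an extract step)."""
    for row in rows:
        yield dict(row)


def upper_field(rows, field):
    """Yield rows with one field upper-cased (assumed meaning of the 'upper' filter)."""
    for row in rows:
        out = dict(row)
        out[field] = str(out[field]).upper()
        yield out


def load_rows(rows, sink):
    """Append each row to a sink list (hypothetical stand-in for a load step)."""
    for row in rows:
        sink.append(row)
        yield row


def run_chain(source, *stages):
    """Feed the output of each stage into the next, then drain the stream."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return list(stream)


collected = []
result = run_chain(
    extract_rows([{'foo': 'bar'}, {'foo': 'baz'}]),
    lambda rows: upper_field(rows, 'foo'),
    lambda rows: load_rows(rows, collected),
)
```

Each stage only sees the rows produced by the stage before it, which is the property a chain of connected transformations relies on.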

Run.

>>> harness()
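How ThreadedHarness schedules its work is not described here. As an illustration only, the sketch below assumes one thread per stage with queues between them, using just the standard library; `run_threaded`, `stage_worker`, `upper_foo` and `tag_seen` are hypothetical names and not part of rdc.etl.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def stage_worker(func, inbox, outbox):
    """Apply func to each item from inbox, forward the result, then pass on the sentinel."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def run_threaded(rows, *funcs):
    """Run each function in its own thread, with a queue between neighbouring stages."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage_worker, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for thread in threads:
        thread.start()

    # Feed the first queue, then signal that no more rows will arrive.
    for row in rows:
        queues[0].put(row)
    queues[0].put(_DONE)

    # Drain the last queue until the sentinel comes through.
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)

    for thread in threads:
        thread.join()
    return results


def upper_foo(row):
    """Return a copy of the row with 'foo' upper-cased."""
    return {**row, 'foo': row['foo'].upper()}


def tag_seen(row):
    """Return a copy of the row with a marker field added."""
    return {**row, 'seen': True}


threaded_result = run_threaded([{'foo': 'bar'}, {'foo': 'baz'}], upper_foo, tag_seen)
```

Because each stage has a single worker reading from a FIFO queue, rows leave the pipeline in the order they entered it.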

This is a work in progress; the API may change until 1.0 is released.

Download files

Source Distribution

rdc.etl-1.0.0a2.tar.gz (26.0 kB)
