Extract, Transform, Load (ETL) toolkit for Python
Project description
Toolkit for data integration work, built from connected transformations. Unlike Java-based tools such as Talend or Pentaho Data Integration, this is a DIY framework; if you are looking for a WYSIWYG ETL engine, one of those tools will probably suit you better.
Create a harness.
>>> from rdc.etl.harness.threaded import ThreadedHarness as Harness
>>> harness = Harness()
Create some data transformations.
>>> from rdc.etl.transform.extract import Extract
>>> extract = Extract(stream_data=({'foo': 'bar'}, {'foo': 'baz'}))
>>> from rdc.etl.transform.simple import SimpleTransform
>>> transform = SimpleTransform()
>>> transform.add('foo').filter('upper')
>>> from rdc.etl.transform.util import Log
>>> load = Log()
Tie everything together.
>>> harness.add_chain(extract, transform, load)
Run.
>>> harness()
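To make the data flow of such a chain concrete, here is a minimal sketch in plain Python generators. It does not use rdc.etl and is not how the library is implemented: every function name below is a hypothetical stand-in, the sketch runs in a single thread (the harness above is threaded), and it assumes that filter('upper') upper-cases the value of the field, which this description does not state explicitly.

```python
# Plain-Python sketch of an extract -> transform -> load chain.
# None of these names come from rdc.etl; they are hypothetical stand-ins.


def extract(rows):
    """Yield a copy of each input record, one at a time."""
    for row in rows:
        yield dict(row)


def upper_field(records, field):
    """Yield a copy of each record with `field` upper-cased.

    Assumption: this mirrors what filter('upper') does on the 'foo' field.
    """
    for record in records:
        out = dict(record)
        out[field] = out[field].upper()
        yield out


def log(records, sink):
    """Print each record and keep it in `sink` so callers can inspect it."""
    for record in records:
        print(record)
        sink.append(record)


def run_chain(rows):
    """Wire the three steps together and return the records that were logged."""
    loaded = []
    log(upper_field(extract(rows), "foo"), loaded)
    return loaded


result = run_chain(({'foo': 'bar'}, {'foo': 'baz'}))
```

Each step consumes the output of the previous one lazily, so records pass through the chain one at a time instead of being materialised between steps.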
This is a work in progress; the API may change until 1.0 is released.
Download files
Source Distribution
rdc.etl-1.0.0a3.tar.gz (23.3 kB)