Extract, Transform, Load (ETL) toolkit for Python
Project description
Toolkit for data integration work, built around connected transformations. Unlike Java-based tools such as Talend or Pentaho Data Integration, this is a DIY framework. If you are looking for a WYSIWYG ETL engine, you should probably go back to those tools.
Not so relevant example:
>>> from rdc.etl.harness.threaded import ThreadedHarness as Harness
>>> harness = Harness()
>>> from rdc.etl.transform.extract import Extract
>>> extract = Extract(stream_data=({'foo': 'bar'}, {'foo': 'baz'}))
>>> from rdc.etl.transform.simple import SimpleTransform
>>> transform = SimpleTransform()
>>> transform.add('foo').filter('upper')
>>> from rdc.etl.transform.util import Log
>>> load = Log()
>>> harness.chain_add(extract, transform, load)
>>> harness()
This is a work in progress. Although it is used in a few different production systems, it may or may not fit your needs, and for now you should expect to have to dive into the code, as neither documentation nor tests are there to help.
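Since the package's own documentation is sparse, the idea of "connected transformations" can be illustrated without the library at all. The following is a minimal plain-Python sketch, not rdc.etl's implementation: the functions `extract`, `upper_field` and `load` are hypothetical stand-ins, and it assumes that the `upper` filter in the example above upper-cases the value of the `foo` field.

```python
# Plain-Python sketch of an extract -> transform -> load chain.
# None of these names come from rdc.etl; they are hypothetical stand-ins.

def extract(rows):
    """Yield a copy of each input row (the 'extract' step)."""
    for row in rows:
        yield dict(row)

def upper_field(rows, field):
    """Yield each row with the value of `field` upper-cased (the 'transform' step)."""
    for row in rows:
        out = dict(row)
        out[field] = str(out[field]).upper()
        yield out

def load(rows, sink):
    """Append every row to `sink` and return it (the 'load' step)."""
    for row in rows:
        sink.append(row)
    return sink

# Same input data as the example above.
source = ({'foo': 'bar'}, {'foo': 'baz'})

# Connect the steps: each one consumes the output of the previous one.
result = load(upper_field(extract(source), 'foo'), [])
print(result)  # [{'foo': 'BAR'}, {'foo': 'BAZ'}]
```

Each step here is a generator, so rows flow through the chain one at a time. The example above uses a threaded harness to run its transformations, which this sketch does not attempt to reproduce.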
Source Distribution
rdc.etl-1.0.0a1.tar.gz (23.0 kB)