[![Documentation Status](https://readthedocs.org/projects/bert-etl/badge/?version=latest)](https://bert-etl.readthedocs.io/en/latest/?badge=latest)
# Bert
A microframework for simple ETL solutions.
## Architecture
At its core, bert-etl uses DynamoDB Streams to communicate between Lambda functions. `bert-etl.yaml` controls how the initial Lambda function is called: by periodic events, SNS topics, or S3 bucket events (planned). Passing an event to bert-etl is straightforward from zappa or from a generic AWS Lambda function you've hooked up to API Gateway.
There are currently no plans to attach API Gateway to `bert-etl.yaml`, because great software (like zappa) already does this.
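To illustrate the stream-driven handoff described above, here is a minimal sketch of a generic Lambda handler reading a DynamoDB Streams event. It is not bert-etl's internal code: the names `extract_new_items`, `handler`, and `sample_event` are hypothetical, and the sketch assumes the stream is configured to include new images.

```python
from typing import Any, Dict, List


def extract_new_items(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the NewImage of every INSERT record in a DynamoDB Streams event."""
    items = []
    for record in event.get('Records', []):
        # Stream records carry an eventName of INSERT, MODIFY, or REMOVE.
        if record.get('eventName') != 'INSERT':
            continue
        # NewImage holds the item in DynamoDB's typed JSON, e.g. {'S': 'text'}.
        # It is only present when the stream view type includes new images.
        items.append(record['dynamodb']['NewImage'])
    return items


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Hypothetical Lambda entry point for the next stage of a pipeline."""
    items = extract_new_items(event)
    # A real stage would transform each item and write the results onward.
    return {'processed': len(items)}


# A hand-written event in the shape Lambda delivers for DynamoDB Streams.
sample_event = {
    'Records': [
        {'eventName': 'INSERT', 'dynamodb': {'NewImage': {'path': {'S': 'a.wav'}}}},
        {'eventName': 'REMOVE', 'dynamodb': {'OldImage': {'path': {'S': 'b.wav'}}}},
    ]
}
print(handler(sample_event, None))
```

Only the inserted record is processed here; the removal is skipped because it carries no new item to pass along.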
## Begin with
Let's begin with an example that loads data from a file server and then loads it into NumPy arrays:

```
$ virtualenv -p $(which python3) env
$ source env/bin/activate
$ pip install bert-etl
$ pip install librosa # for demo project
$ docker run -p 6379:6379 -d redis # bert-etl runs on redis to share data across CPUs
$ bert-runner.py -n demo
$ PYTHONPATH='.' bert-runner.py -m demo -j sync_sounds -f
```
## Release Notes
### 0.3.0
Added Error Management. When an error occurs, bert-runner logs the error and re-runs the job. If the same error happens often enough, the job is aborted.
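The log, re-run, and abort behaviour can be pictured with a minimal sketch. This is not bert-runner's implementation: `run_with_retries`, the `max_repeats` threshold, and the two demo jobs are hypothetical names chosen for illustration.

```python
import logging
from collections import Counter
from typing import Callable

logger = logging.getLogger('retry-sketch')


def run_with_retries(job: Callable[[], None], max_repeats: int = 3) -> bool:
    """Re-run job until it succeeds; abort once one error has repeated max_repeats times."""
    seen: Counter = Counter()
    while True:
        try:
            job()
            return True
        except Exception as err:
            # Identify "the same error" by its type and message (an assumption).
            key = (type(err).__name__, str(err))
            seen[key] += 1
            logger.error('job failed (%d/%d): %r', seen[key], max_repeats, err)
            if seen[key] >= max_repeats:
                return False  # the job is aborted


attempts = {'count': 0}


def flaky_job() -> None:
    """Fails on the first two calls, then succeeds."""
    attempts['count'] += 1
    if attempts['count'] < 3:
        raise ValueError('temporary failure')


def broken_job() -> None:
    """Fails with the same error on every call."""
    raise RuntimeError('always fails')


flaky_ok = run_with_retries(flaky_job)    # succeeds on the third attempt
broken_ok = run_with_retries(broken_job)  # aborted after three identical errors
print(flaky_ok, broken_ok)
```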
### 0.2.1
Added Release Notes
### 0.2.0
Added Redis Service auto run. Using Docker, Redis will be pulled and started in the background.
Added Redis Service channels, for cases where you want to run two ETL jobs on the same machine.
## Fund Bounty Target Upgrades
Bert provides a boilerplate framework for writing concurrent ETL code with Python's multiprocessing module. One function starts the process, piping data into a Redis backend that is then consumed by the next function. The queues are named for their role within each function: Work (start) and Done (end). Please consider contributing to Bert Bounty Targets to improve this documentation:
https://www.patreon.com/jbcurtin
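
The Work and Done queue handoff described in this section can be sketched in a single process. This is an illustration only: it swaps in the standard library's in-memory `queue.Queue` for the Redis backend, leaves out multiprocessing, and `run_stage` is a hypothetical helper rather than part of Bert's API.

```python
import queue
from typing import Callable


def run_stage(func: Callable[[int], int],
              work_queue: 'queue.Queue[int]',
              done_queue: 'queue.Queue[int]') -> None:
    """Drain work_queue, apply func to each item, and put the result on done_queue."""
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            break
        done_queue.put(func(item))


# The first function's Work queue is seeded with the input data.
first_work: 'queue.Queue[int]' = queue.Queue()
for value in (1, 2, 3):
    first_work.put(value)

# The first function's Done queue doubles as the second function's Work queue.
first_done: 'queue.Queue[int]' = queue.Queue()
second_done: 'queue.Queue[int]' = queue.Queue()

run_stage(lambda x: x * 10, first_work, first_done)
run_stage(lambda x: x + 1, first_done, second_done)

results = []
while not second_done.empty():
    results.append(second_done.get())
print(results)  # [11, 21, 31]
```

Chaining the queues this way means each function only needs to know where its input arrives and where its output goes.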
## Roadmap
- Create configuration file, `bert-etl.yaml`
- Support conda venv
- Support pyenv venv
- Support DynamoDB flush
- Support multiple invocations per AWS account
- Support undeploy for AWS Lambda
- Support Bottle functions in AWS Lambda
## Tutorial Roadmap
- Introduce Bert API
- Explain `bert.binding`
- Explain `comm_binder`
- Explain `work_queue`
- Explain `done_queue`
- Explain `ologger`
- Explain `DEBUG` and how turning it off allows for x concurrent processes
- Show an example of how to load timeseries data, calculate the mean, and display the final output of the mean
- Expand the example to show how to scale the application implicitly
- Show how to run locally using Redis
- Show how to run locally without Redis, using DynamoDB instead
- Show how to run remotely using AWS Lambda and DynamoDB
- Talk about DynamoDB and eventual consistency