[![Documentation Status](https://readthedocs.org/projects/bert-etl/badge/?version=latest)](https://bert-etl.readthedocs.io/en/latest/?badge=latest)

# Bert

A microframework for simple ETL solutions

## Begin with

Let's begin with an example of loading data from a file server and then loading it into NumPy arrays.

```
$ virtualenv -p $(which python3) env
$ source env/bin/activate
$ pip install bert-etl
$ pip install librosa # for demo project
$ docker run -p 6379:6379 -d redis # bert-etl runs on redis to share data across CPUs
$ bert-runner.py -n demo
$ PYTHONPATH='.' bert-runner.py -m demo -j sync_sounds -f
```

## Release Notes

### 0.3.0

  • Added Error Management. When an error occurs, bert-runner logs the error and re-runs the job. If the same error happens often enough, the job is aborted.
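The log, re-run, and abort behaviour described in this note can be illustrated with a minimal sketch. This is not bert-runner's implementation: `run_with_retries`, the `max_repeats` threshold, and the rule that two errors are "the same" when their type and message match are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("retry_sketch")


def run_with_retries(job, max_repeats=3):
    """Re-run `job` after each failure.

    Abort once the same error (same type and message, an assumption
    of this sketch) has been seen `max_repeats` times.
    """
    seen = {}
    while True:
        try:
            return job()
        except Exception as err:
            key = (type(err).__name__, str(err))
            seen[key] = seen.get(key, 0) + 1
            # Log every failure before deciding whether to re-run.
            logger.error("job failed (%d/%d): %r", seen[key], max_repeats, err)
            if seen[key] >= max_repeats:
                raise RuntimeError(
                    f"job aborted after {seen[key]} identical errors"
                ) from err


# Hypothetical job that fails twice, then succeeds on the third run.
attempts = {"count": 0}


def flaky_job():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("temporary failure")
    return "done"


result = run_with_retries(flaky_job)
```

Counting identical errors rather than total failures lets a job survive a mix of unrelated transient problems while still stopping when one fault keeps recurring.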

### 0.2.1

  • Added Release Notes

### 0.2.0

  • Added Redis Service auto-run. Using Docker, Redis is pulled and started in the background.

  • Added Redis Service channels. Sometimes you'll want to run two ETL jobs on the same machine.

## Fund Bounty Target Upgrades

Bert provides a boilerplate framework that lets you write concurrent ETL code using Python's multiprocessing module. One function starts the process, piping data into a Redis backend, where it is consumed by the next function. The queues are named for their role relative to the function: the Work queue (start) and the Done queue (end). Please consider contributing to Bert Bounty Targets to improve this documentation.

https://www.patreon.com/jbcurtin
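The Work/Done queue hand-off described above can be sketched in plain Python. This is only an illustration of the pattern, not Bert's API: the `deque` objects stand in for the Redis-backed queues, the stages run one after another in a single process rather than concurrently, and `emit_records`, `annotate_records`, and `run_pipeline` are hypothetical names.

```python
from collections import deque


def emit_records(work_queue, done_queue):
    """First stage: starts the pipeline, so it ignores its Work queue."""
    for name in ("a.wav", "b.wav", "c.wav"):
        done_queue.append({"filename": name})


def annotate_records(work_queue, done_queue):
    """Second stage: consumes what the previous stage put on its Done queue."""
    while work_queue:
        record = work_queue.popleft()
        done_queue.append({**record, "name_length": len(record["filename"])})


def run_pipeline(stages):
    """Chain stages so each Done queue becomes the next stage's Work queue."""
    work_queue = deque()
    for stage in stages:
        done_queue = deque()
        stage(work_queue, done_queue)
        work_queue = done_queue
    return list(work_queue)


results = run_pipeline([emit_records, annotate_records])
```

Because each stage only touches its own Work and Done queues, the stages stay independent of one another, which is what makes it possible to move the queues into a shared backend and run the stages in separate processes.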

## Roadmap

  • Create configuration file, bert-etl.yaml

  • Support conda venv

  • Support pyenv venv

  • Support DynamoDB flush

  • Support multiple invocations per AWS account

  • Support undeploy AWS Lambda

  • Support Bottle functions in AWS Lambda

## Tutorial Roadmap

  • Introduce Bert API

  • Explain bert.binding

  • Explain comm_binder

  • Explain work_queue

  • Explain done_queue

  • Explain ologger

  • Explain DEBUG and how turning it off allows for x-concurrent processes

  • Show an example of how to load timeseries data, calculate the mean, and display the final output of the mean

  • Expand the example to show how to scale the application implicitly

  • Show how to run locally using Redis

  • Show how to run locally without Redis, using DynamoDB instead

  • Show how to run remotely using AWS Lambda and DynamoDB

  • Talk about DynamoDB and eventual consistency


