sqlbucket

SQLBucket - A tool for writing SQL ETL flows and ETL integrity checks.


SQLBucket was built to help write, orchestrate and validate SQL ETL. It lets you set variables and adds control flow such as if/else and for loops, so your queries can be dynamic. It also includes a very simple integration testing framework that validates the results of your ETL pipelines with SQL checks.

It is lightweight and can run as a standalone service or as part of your workflow manager environment (Airflow, Luigi, etc.).

Installing

Install and update using pip:

pip install -U sqlbucket

SQLBucket supports Python 3.6 and 3.7. It probably works on 3.8 as well, although that has not been tested yet.

A Simple Example

To start working with SQLBucket, you need a ‘projects’ folder that will contain all your SQL ETL. SQLBucket uses the following folder structure as its root folder, and you can have as many projects as you want.

projects/
    |-- project1/
    |-- project2/
    |-- project3/
        ...

Each project must contain a folder called queries, which holds the SQL files to be run, and a config.yaml, which sets the order in which those queries are processed. SQLBucket uses the Jinja2 templating library to let you set variables in your SQL and to give you control flow such as for loops. To see in more depth what can be done, see the documentation on how to write SQL in SQLBucket.
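As a rough illustration of what Jinja2 templating brings to a SQL file, the sketch below renders a query with the jinja2 library directly. It does not go through SQLBucket itself, and the table name, column names, variable names and the `render_query` helper are all hypothetical.

```python
from jinja2 import Template

# Hypothetical query text: the table, columns and variables are illustrative only.
SQL_TEMPLATE = """
select
    user_id
    {% for metric in metrics %}
    , sum({{ metric }}) as total_{{ metric }}
    {% endfor %}
from events
where event_date >= '{{ start_date }}'
{% if country %}
  and country = '{{ country }}'
{% endif %}
group by user_id
"""


def render_query(**variables):
    """Render the SQL template with the given variables (plain Jinja2, not SQLBucket)."""
    template = Template(SQL_TEMPLATE, trim_blocks=True, lstrip_blocks=True)
    return template.render(**variables)


# The for loop expands one aggregate per metric; the if block is skipped
# here because country is None.
rendered = render_query(
    metrics=["revenue", "sessions"],
    start_date="2019-11-01",
    country=None,
)
print(rendered)
```

Passing a non-empty `country` would add the extra filter to the where clause, which is the kind of dynamic query the variables and control flow are meant for.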

In practice, this is how a project folder would look.

projects/
    |-- my_super_project/
        |-- config.yaml
        |-- queries/
            |-- my_super_insert_query.sql
            |-- some_other_query.sql
        |-- integrity/
            |-- test1.sql
            |-- test2.sql
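The layout above can be created by hand, or with a few lines of Python. This is a minimal sketch using only the standard library; the `scaffold_project` helper is hypothetical and not part of SQLBucket, and config.yaml is left empty because its contents are not covered here.

```python
import tempfile
from pathlib import Path


def scaffold_project(root: Path, project_name: str) -> Path:
    """Create an empty project skeleton under <root>/projects/<project_name>."""
    project = root / "projects" / project_name
    (project / "queries").mkdir(parents=True, exist_ok=True)
    (project / "integrity").mkdir(parents=True, exist_ok=True)
    # Placeholder only: the ordering of queries still has to be written in it.
    (project / "config.yaml").touch()
    return project


# A temporary directory stands in for your real working directory.
root = Path(tempfile.mkdtemp())
project_dir = scaffold_project(root, "my_super_project")

# List what was created, relative to the root.
for path in sorted(project_dir.rglob("*")):
    print(path.relative_to(root))
```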

The integrity folder lets you write checks in SQL that will run at the end of your ETL to validate your data. See the documentation on integrity for a more detailed explanation of testing the integrity of your ETL.
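To make the idea of a SQL check concrete, here is a small sketch that runs two checks against an in-memory SQLite database. It only illustrates the general pattern of a query that answers pass or fail. The `passed` column name, the `orders` table and the `run_checks` helper are assumptions made for this example, so refer to the integrity documentation for the convention SQLBucket actually expects.

```python
import sqlite3

# In-memory database with a tiny, made-up table to validate.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table orders (order_id integer, amount real);
    insert into orders values (1, 10.0), (2, 25.5), (3, null);
    """
)

# Each check is a query returning a single boolean-like value.
# The column name "passed" is an assumption for this sketch.
CHECKS = {
    "no_duplicate_order_ids":
        "select count(*) = count(distinct order_id) as passed from orders",
    "no_null_amounts":
        "select count(*) = 0 as passed from orders where amount is null",
}


def run_checks(connection, checks):
    """Run every check query and map its name to True (passed) or False."""
    results = {}
    for name, sql in checks.items():
        (passed,) = connection.execute(sql).fetchone()
        results[name] = bool(passed)
    return results


results = run_checks(conn, CHECKS)
for name, passed in results.items():
    print(f"{name}: {'OK' if passed else 'FAILED'}")
```

With the sample rows above, the duplicate check passes and the null check fails, because one order has no amount.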

This is how you would launch a project:

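from sqlbucket import SQLBucket

# Map each connection name to a database URL.
connections = {
    'db_test': 'postgresql://user:password@host:5439/database'
}

bucket = SQLBucket(connections=connections)

# Load a project, choosing the connection to run against and the
# variables made available to the SQL templates.
project = bucket.load_project(
    project_name='fat_etl',
    connection_name='db_test',
    variables={'foo': 1}
)

# Run the ETL queries, then the integrity checks.
project.run()
project.run_integrity()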

Running this triggers log output.

You can also launch a project using the sqlbucket command line interface.

Contributing

For guidance on how to make a contribution to SQLBucket, see the contributing guidelines.

