Tiny Block Operations for Data Pipelines
tiny-blocks
Tiny Blocks to build large and complex pipelines!
Tiny-Blocks is a library for streaming operations, composed using the >>
operator. This allows for easy extract, transform and load operations.
Pipeline Components: Sources, Pipes, and Sinks
This library relies on a fundamental streaming abstraction consisting of three parts: extract, transform, and load. You can view a pipeline as an extraction from a source, followed by zero or more transformations (pipes), followed by a load into a sink. Visually, this looks like:
source >> pipe1 >> pipe2 >> pipe3 >> ... >> pipeN >> sink
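To make the composition idea concrete, here is a minimal sketch of how a `>>` chain can be built in plain Python by overloading `__rshift__`. It is not the Tiny-Blocks implementation: the `Source`, `Pipe`, and `Sink` classes and the `consume` method are hypothetical names chosen for illustration only.

```python
from typing import Callable, Iterable, Iterator


class Source:
    """Hypothetical source: wraps any iterable of items."""

    def __init__(self, items: Iterable):
        self.items = items

    def __iter__(self) -> Iterator:
        return iter(self.items)

    def __rshift__(self, other):
        # `source >> pipe` returns a new Source that applies the pipe lazily;
        # `source >> sink` drains the stream and returns the sink's result.
        return other.consume(self)


class Pipe:
    """Hypothetical pipe: applies a function to each item of the stream."""

    def __init__(self, func: Callable):
        self.func = func

    def consume(self, upstream: Iterable) -> Source:
        # A generator expression keeps the stream lazy: nothing is
        # computed until a sink iterates over it.
        return Source(self.func(item) for item in upstream)


class Sink:
    """Hypothetical sink: collects the stream into a list."""

    def consume(self, upstream: Iterable) -> list:
        return list(upstream)


result = Source([1, 2, 3]) >> Pipe(lambda x: x + 1) >> Pipe(lambda x: x * 10) >> Sink()
print(result)  # [20, 30, 40]
```

Because `>>` is left-associative in Python, the chain is evaluated from the source outward, and no item is processed until the sink pulls it through the pipes.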
Installation
Install it using pip:
pip install tiny-blocks
Basic usage example
from tiny_blocks.extract import FromCSV
from tiny_blocks.transform import DropDuplicates
from tiny_blocks.transform import Fillna
from tiny_blocks.load import ToSQL
from tiny_blocks import Pipeline
# ETL Blocks
from_csv = FromCSV(path='/path/to/file.csv')
drop_duplicates = DropDuplicates()
fill_na = Fillna(value="Hola Mundo")
to_sql = ToSQL(dsn_conn='psycopg2+postgres://...')
# Run the Pipeline
with Pipeline(name="Pipeline") as pipe:
    pipe >> from_csv >> drop_duplicates >> fill_na >> to_sql
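For intuition about what the two transform steps in this example do to a stream of rows, the sketch below reproduces the same idea with plain generator functions from the standard library. It is only an illustration under assumptions: `drop_duplicates` and `fill_na` here are hypothetical stand-alone functions, rows are modeled as dictionaries, and `None` stands in for a missing value. None of this is taken from the Tiny-Blocks source.

```python
from typing import Dict, Iterable, Iterator, Optional

# Assumption: a row is a mapping from column name to an optional string.
Row = Dict[str, Optional[str]]


def drop_duplicates(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield each distinct row once, keeping its first occurrence."""
    seen = set()
    for row in rows:
        # Sort by column name so the key does not depend on dict order.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            yield row


def fill_na(rows: Iterable[Row], value: str) -> Iterator[Row]:
    """Replace missing (None) values with the given default."""
    for row in rows:
        yield {col: (value if val is None else val) for col, val in row.items()}


rows = [
    {"name": "Ana", "city": None},
    {"name": "Ana", "city": None},
    {"name": "Luis", "city": "Lima"},
]

cleaned = list(fill_na(drop_duplicates(rows), value="Hola Mundo"))
print(cleaned)
# [{'name': 'Ana', 'city': 'Hola Mundo'}, {'name': 'Luis', 'city': 'Lima'}]
```

Each function consumes and produces an iterator, so rows flow through one at a time rather than being loaded all at once, which is the property a streaming pipeline relies on.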
Documentation
Please see the project documentation for details.