Tiny Block Operations for Data Pipelines
tiny-blocks
Tiny Blocks to build large and complex ETL data pipelines!
Tiny-Blocks is a library for data engineering operations.
Each pipeline is made of tiny blocks glued together with the >> operator.
This library relies on a fundamental streaming abstraction consisting of three
parts: extract, transform, and load. You can view a pipeline
as an extraction, followed by zero or more transformations, followed by a sink.
Visually, this looks like:
extract -> transform1 -> transform2 -> ... -> transformN -> load
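One way to picture how blocks could be glued with >> is plain operator overloading through Python's `__rshift__` method. The sketch below is not the library's implementation: the `Source`, `Transform`, and `Sink` classes and the `attach` method are hypothetical names, and chunks are plain lists of dicts standing in for DataFrames so the example runs with the standard library only.

```python
from typing import Callable, Iterable, Iterator, List

Chunk = List[dict]  # assumption: stand-in for a DataFrame chunk


class Source:
    """Hypothetical extract block that yields chunks lazily."""

    def __init__(self, chunks: Iterable[Chunk]) -> None:
        self.chunks = chunks

    def stream(self) -> Iterator[Chunk]:
        yield from self.chunks

    def __rshift__(self, other):
        # source >> transform returns a new lazy Source;
        # source >> sink consumes the stream and returns the sink.
        return other.attach(self)


class Transform:
    """Hypothetical transform block wrapping a chunk -> chunk function."""

    def __init__(self, func: Callable[[Chunk], Chunk]) -> None:
        self.func = func

    def attach(self, upstream: Source) -> Source:
        # Nothing is computed here; chunks are transformed when pulled.
        return Source(self.func(chunk) for chunk in upstream.stream())


class Sink:
    """Hypothetical load block that collects rows in memory."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def attach(self, upstream: Source) -> "Sink":
        for chunk in upstream.stream():
            self.rows.extend(chunk)
        return self


def fill_missing(chunk: Chunk) -> Chunk:
    """Replace None values with a placeholder string."""
    return [{k: ("n/a" if v is None else v) for k, v in row.items()} for row in chunk]


source = Source([[{"a": 1}, {"a": None}], [{"a": 3}]])
sink = Sink()
source >> Transform(fill_missing) >> sink
print(sink.rows)
```

Because >> is left-associative, the expression is evaluated as (source >> transform) >> sink, so data only starts flowing once the sink is attached.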
You can also use fan-in and fan-out for more complex operations.
extract1 -> transform1 -> |-> transform2 -> ... -> | -> transformN -> load1
extract2 ---------------> |                        | -> load2
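The fan-in and fan-out shape in the diagram can be approximated with plain generators. This is a simplified sketch under stated assumptions, not the library's API: `fan_in` and `fan_out` are hypothetical helpers, fan-in is modeled as draining the sources one after another (the library may instead merge or join chunks), and sinks are ordinary lists.

```python
from itertools import chain
from typing import Iterable, Iterator, List

Chunk = List[int]  # assumption: stand-in for a DataFrame chunk


def fan_in(*streams: Iterable[Chunk]) -> Iterator[Chunk]:
    """Combine several chunk streams into one by draining them in order."""
    return chain(*streams)


def double(stream: Iterable[Chunk]) -> Iterator[Chunk]:
    """Example transform: double every value, one chunk at a time."""
    for chunk in stream:
        yield [value * 2 for value in chunk]


def fan_out(stream: Iterable[Chunk], *sinks: List[Chunk]) -> None:
    """Pull each chunk once from upstream and hand a copy to every sink."""
    for chunk in stream:
        for sink in sinks:
            sink.append(list(chunk))  # copy so sinks do not share mutable state


extract1 = iter([[1, 2], [3]])
extract2 = iter([[10]])
load1: List[Chunk] = []
load2: List[Chunk] = []

fan_out(double(fan_in(extract1, extract2)), load1, load2)
print(load1)
print(load2)
```

Each chunk is read from upstream exactly once, so memory use stays bounded by the chunk size rather than by the size of the sources.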
Tiny-Blocks uses generators to stream data. Each chunk is a pandas DataFrame.
The chunk size (chunksize), or buffer size, is adjustable per pipeline.
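To illustrate what chunked streaming with an adjustable chunk size means, here is a minimal standard-library sketch. The `read_chunks` helper is hypothetical and yields lists of row dicts rather than DataFrames; pandas offers the same pattern through the chunksize argument of read_csv, which returns an iterator of DataFrames.

```python
import csv
import io
from itertools import islice
from typing import Iterator, List, TextIO


def read_chunks(handle: TextIO, chunksize: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `chunksize` rows from a CSV file."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunksize))
        if not chunk:
            return  # source exhausted
        yield chunk


# In-memory CSV used in place of a real file for the example.
data = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
sizes = [len(chunk) for chunk in read_chunks(data, chunksize=2)]
print(sizes)
```

Only one chunk is held in memory at a time, which is why a smaller chunk size trades throughput for a lower memory footprint.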
Installation
Install it using pip
pip install tiny-blocks
Basic usage
from tiny_blocks.extract import FromCSV
from tiny_blocks.transform import Fillna
from tiny_blocks.load import ToSQL
# ETL Blocks
from_csv = FromCSV(path='/path/to/source.csv')
fill_na = Fillna(value="Hola Mundo")
to_sql = ToSQL(dsn_conn='psycopg2+postgres://...', table_name="sink")
# Pipeline
from_csv >> fill_na >> to_sql
Examples
For more complex examples, please visit the notebooks folder.
Documentation
Please visit this link for documentation.