Tiny Block Operations for Data Pipelines

tiny-blocks

Tiny Blocks to build large and complex ETL data pipelines!

Tiny-Blocks is a library for data engineering operations. Each pipeline is built from tiny blocks chained with the >> operator. The library relies on a streaming abstraction with three parts: extract, transform, and load. A pipeline is an extraction, followed by zero or more transformations, followed by a load into a sink. Visually, this looks like:

extract -> transform1 -> transform2 -> ... -> transformN -> load
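
To make the chaining idea concrete, here is a minimal, standard-library sketch of how a >> operator can connect an extract, some transforms, and a load. It is not the tiny-blocks implementation: the Stream, Extract, Transform, and Load classes below are hypothetical stand-ins, and plain Python lists stand in for the chunks.

```python
from typing import Callable, Iterable, Iterator, List

Chunk = List[int]


class Stream:
    """Hypothetical wrapper: a lazy iterator of chunks that supports `>>`."""

    def __init__(self, chunks: Iterable[Chunk]) -> None:
        self.chunks = chunks

    def __rshift__(self, block):
        # `stream >> block` hands the upstream chunks to the next block.
        return block(self.chunks)


class Extract(Stream):
    """Hypothetical source: slices a list into chunks of `chunksize` items."""

    def __init__(self, values: List[int], chunksize: int) -> None:
        def generate() -> Iterator[Chunk]:
            for start in range(0, len(values), chunksize):
                yield values[start:start + chunksize]

        super().__init__(generate())


class Transform:
    """Hypothetical transform: applies `func` to every chunk, lazily."""

    def __init__(self, func: Callable[[Chunk], Chunk]) -> None:
        self.func = func

    def __call__(self, chunks: Iterable[Chunk]) -> Stream:
        # Returning a Stream keeps the chain open for further `>>` calls.
        return Stream(self.func(chunk) for chunk in chunks)


class Load:
    """Hypothetical sink: drains the stream into an in-memory list."""

    def __init__(self) -> None:
        self.sink: List[int] = []

    def __call__(self, chunks: Iterable[Chunk]) -> List[int]:
        # Nothing upstream runs until the sink starts iterating here.
        for chunk in chunks:
            self.sink.extend(chunk)
        return self.sink


extract = Extract([1, 2, 3, 4, 5], chunksize=2)
double = Transform(lambda chunk: [x * 2 for x in chunk])
keep_big = Transform(lambda chunk: [x for x in chunk if x > 4])
load = Load()

# extract -> transform1 -> transform2 -> load
result = extract >> double >> keep_big >> load
print(result)
```

In this sketch the work is pulled through the chain one chunk at a time, and only when the load block iterates. That laziness is the property a generator-based pipeline relies on.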

You can also fan in and fan out for more complex operations:

extract1 -> transform1 -> |-> transform2 -> ... -> | -> transformN -> load1
extract2 ---------------> |                        | -> load2
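
The diagram above can be approximated with plain generators. The sketch below is an illustration under assumptions, not the library's API: fan-in is modelled as concatenating two chunk streams with itertools.chain, and fan-out as duplicating one stream with itertools.tee. The helper name extract and the sample data are hypothetical, and tiny-blocks may merge sources differently (for example, by joining DataFrames).

```python
import itertools
from typing import Iterator, List


def extract(values: List[int], chunksize: int) -> Iterator[List[int]]:
    """Hypothetical source: yield `values` in chunks of `chunksize` items."""
    for start in range(0, len(values), chunksize):
        yield values[start:start + chunksize]


# extract1 -> transform1 (only the first source is transformed before merging)
source_1 = ([x * 10 for x in chunk] for chunk in extract([1, 2, 3], chunksize=2))
# extract2 goes straight to the merge point
source_2 = extract([4, 5], chunksize=2)

# Fan-in: one stream that yields the chunks of source_1, then those of source_2.
merged = itertools.chain(source_1, source_2)

# transform2 runs on the merged stream.
transformed = ([x + 1 for x in chunk] for chunk in merged)

# Fan-out: two independent iterators over the same chunks.
branch_1, branch_2 = itertools.tee(transformed, 2)

# load1 flattens the chunks; load2 stores one sum per chunk.
sink_1 = [x for chunk in branch_1 for x in chunk]
sink_2 = [sum(chunk) for chunk in branch_2]

print(sink_1)
print(sink_2)
```

One design caveat: itertools.tee buffers every chunk that one branch has consumed and the other has not. Draining the branches one after another, as done here, holds the whole stream in memory, so a real fan-out would interleave the sinks.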

Tiny-Blocks uses generators to stream data. Each chunk is a pandas DataFrame. The chunk size (buffer size) is adjustable per pipeline.
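
As a rough illustration of chunked streaming, the sketch below reads a CSV in chunks with a generator and fills missing values one chunk at a time. It uses only the standard library so that it runs anywhere: lists of dicts stand in for the pandas DataFrames that tiny-blocks passes between blocks. The function names read_chunks and fill_missing and the sample data are hypothetical.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List, TextIO

Row = Dict[str, str]

CSV_TEXT = "name,city\nAna,Lima\nLuis,\nEva,Quito\n"


def read_chunks(handle: TextIO, chunksize: int) -> Iterator[List[Row]]:
    """Yield at most `chunksize` rows at a time from a CSV file handle."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunksize))
        if not chunk:
            return
        yield chunk


def fill_missing(chunks: Iterable[List[Row]], value: str) -> Iterator[List[Row]]:
    """Replace empty fields with `value`, one chunk at a time."""
    for chunk in chunks:
        yield [
            {key: (field if field != "" else value) for key, field in row.items()}
            for row in chunk
        ]


# Only `chunksize` rows are held in memory per step of the pipeline.
stream = fill_missing(read_chunks(io.StringIO(CSV_TEXT), chunksize=2), "Hola Mundo")
chunks = list(stream)

print([len(chunk) for chunk in chunks])
print(chunks[0][1])
```

Changing chunksize changes how many rows travel together through the pipeline, not the final result. Smaller chunks lower peak memory use at the cost of more iterations.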

Installation

Install it using pip:

pip install tiny-blocks

Basic usage

from tiny_blocks.extract import FromCSV
from tiny_blocks.transform import Fillna
from tiny_blocks.load import ToSQL

# ETL Blocks
from_csv = FromCSV(path='/path/to/source.csv')
fill_na = Fillna(value="Hola Mundo")
to_sql = ToSQL(dsn_conn='postgresql+psycopg2://...', table_name='sink')

# Pipeline
from_csv >> fill_na >> to_sql

Examples

For more complex examples, please visit the notebooks folder.

Documentation

Please visit this link for documentation.

Download files

Source Distribution

tiny_blocks-0.1.15.tar.gz (13.9 kB)

Uploaded Source

Built Distribution

tiny_blocks-0.1.15-py3-none-any.whl (24.7 kB)

Uploaded Python 3

File details

Details for the file tiny_blocks-0.1.15.tar.gz.

File metadata

  • Download URL: tiny_blocks-0.1.15.tar.gz
  • Upload date:
  • Size: 13.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.14 CPython/3.9.10 Darwin/21.6.0

File hashes

Hashes for tiny_blocks-0.1.15.tar.gz

Algorithm    Hash digest
SHA256       963b99a1ea84e55a56aec99f10de102efd66756554e3831eb680142b7544d82e
MD5          901aa9d1c0a7ccaba247fecfe4040945
BLAKE2b-256  9a469e401478d8636c9ea0f96c6ff22b1313eda591dcb8b4ab8b7b6ce0bd314e

File details

Details for the file tiny_blocks-0.1.15-py3-none-any.whl.

File metadata

  • Download URL: tiny_blocks-0.1.15-py3-none-any.whl
  • Upload date:
  • Size: 24.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.14 CPython/3.9.10 Darwin/21.6.0

File hashes

Hashes for tiny_blocks-0.1.15-py3-none-any.whl

Algorithm    Hash digest
SHA256       59322ee078cb053939e768e80adf599cc401bdab6c6c37012cf73c8c56a389f9
MD5          0138d0b0a56c1a91f396238549b8a672
BLAKE2b-256  93bfce32b3ca2b8faeeae5deadc51db281f6e15ef2de708d98ddd81719f32151
