Python-only framework to easily build ETLs.

cupyd

                                                  __     
                                                 /\ \    
  ___       __  __      _____       __  __       \_\ \   
 /'___\    /\ \/\ \    /\ '__`\    /\ \/\ \      /'_` \  
/\ \__/    \ \ \_\ \   \ \ \L\ \   \ \ \_\ \    /\ \L\ \ 
\ \____\    \ \____/    \ \ ,__/    \/`____ \   \ \___,_\
 \/____/     \/___/      \ \ \/      `/___/> \   \/__,_ /
                          \ \_\         /\___/           
                           \/_/         \/__/

Python framework to create your own ETLs.

Features

  • Simple but powerful syntax.
  • Modular approach that encourages re-using components across different ETLs.
  • Parallelism out of the box, with no need to write multiprocessing code.
  • Broad compatibility:
    • Runs on Unix, Windows and macOS.
    • Requires Python >= 3.9.
  • Lightweight:
    • No dependencies for its core version.
    • The API version will require Falcon, a minimalist ASGI/WSGI framework that doesn't require other packages to work.
    • The Dashboard (full) version will require Falcon and Dash.

Usage

In this example we compute the factorials of 50,000 integers using multiprocessing and store the results in two separate lists: one for even results and another for odd ones.

from __future__ import annotations  # keeps `int | None` annotations valid on Python 3.9

import math
from collections.abc import Iterator
from typing import Any

from cupyd import ETL, Extractor, Transformer, Loader, Filter


class IntegerExtractor(Extractor):

    def __init__(self, total_items: int):
        super().__init__()
        self.total_items = total_items

        # generated integers will be passed onto each worker in buckets of size 10
        self.configuration.bucket_size = 10

    def extract(self) -> Iterator[int]:
        # extract() is a generator, so it yields integers one at a time
        for item in range(self.total_items):
            yield item


class Factorial(Transformer):

    def transform(self, item: int) -> int:
        return math.factorial(item)


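class EvenOnly(Filter):

    def filter(self, item: int) -> int | None:
        # item & 1 is 1 for odd numbers, so keep the item only when it is 0
        return None if item & 1 else item


class OddOnly(Filter):

    def filter(self, item: int) -> int | None:
        # keep the item only when its lowest bit is set (odd number)
        return item if item & 1 else None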


class ListLoader(Loader):

    def __init__(self):
        super().__init__()
        self.configuration.run_in_main_process = True
        self.items = []

    def start(self):
        self.items = []

    def load(self, item: Any):
        self.items.append(item)


if __name__ == "__main__":
    # 1. Define the ETL Nodes
    ext = IntegerExtractor(total_items=50_000)
    factorial = Factorial()
    even_only = EvenOnly()
    odd_only = OddOnly()
    even_ldr = ListLoader()
    odd_ldr = ListLoader()

    # 2. Connect the Nodes to determine the data flow. Notice the ETL branches after the
    # factorial is computed
    ext >> factorial >> [even_only >> even_ldr, odd_only >> odd_ldr]

    # 3. Run the ETL with 8 workers (multiprocessing Processes)
    etl = ETL(extractor=ext)
    etl.run(workers=8, show_progress=True, monitor_performance=True)

    # 4. You can access the results stored in both Loaders after the ETL is finished
    even_factorials = even_ldr.items
    odd_factorials = odd_ldr.items
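
As a sanity check on what the pipeline above should produce, here is a minimal single-process sketch of the same data flow written with the standard library only. It does not use cupyd and is not how cupyd works internally; the helper names `buckets` and `run_sequentially` are made up for this illustration, and the bucketing step only mimics the idea behind `bucket_size`.

```python
import math
from itertools import islice
from typing import Iterable, Iterator


def buckets(items: Iterable[int], size: int) -> Iterator[list[int]]:
    """Group items into lists of at most `size` elements (illustrative helper)."""
    iterator = iter(items)
    while bucket := list(islice(iterator, size)):
        yield bucket


def run_sequentially(total_items: int, bucket_size: int = 10) -> tuple[list[int], list[int]]:
    """Single-process stand-in for extract -> factorial -> even/odd split."""
    even_factorials: list[int] = []
    odd_factorials: list[int] = []
    for bucket in buckets(range(total_items), bucket_size):
        for item in bucket:
            result = math.factorial(item)
            # Route each result to exactly one branch, based on its parity.
            (odd_factorials if result & 1 else even_factorials).append(result)
    return even_factorials, odd_factorials


if __name__ == "__main__":
    even, odd = run_sequentially(total_items=25)
    print(len(even), odd)  # prints: 23 [1, 1]
```

Because n! is even for every n >= 2, the odd list should contain only 0! and 1! (both equal to 1), and every other result should land in the even list.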

For more information, go to the examples directory.
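
The `>>` chaining used in step 2 of the example relies on Python's operator overloading. The sketch below shows one way such a syntax could be built in plain Python, including branching into a list. It is an illustration only: the `Node` and `Chain` classes are hypothetical and are not cupyd's actual classes or implementation.

```python
class Node:
    """Hypothetical pipeline node; it only tracks its downstream connections."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.outputs: list = []

    def __rshift__(self, other: "Node | Chain | list") -> "Chain":
        # Start a chain at this node, then connect it to whatever follows.
        return Chain(self, [self]) >> other


class Chain:
    """Result of `a >> b`: remembers the first node and the current last node(s)."""

    def __init__(self, head: Node, tails: list) -> None:
        self.head = head
        self.tails = tails

    def __rshift__(self, other: "Node | Chain | list") -> "Chain":
        # A list on the right-hand side means the flow branches.
        targets = other if isinstance(other, list) else [other]
        new_tails: list = []
        for target in targets:
            chain = target if isinstance(target, Chain) else Chain(target, [target])
            # Connect every current tail to the first node of the target chain.
            for tail in self.tails:
                tail.outputs.append(chain.head)
            new_tails.extend(chain.tails)
        return Chain(self.head, new_tails)


if __name__ == "__main__":
    ext, factorial = Node("ext"), Node("factorial")
    even_only, odd_only = Node("even_only"), Node("odd_only")
    even_ldr, odd_ldr = Node("even_ldr"), Node("odd_ldr")

    ext >> factorial >> [even_only >> even_ldr, odd_only >> odd_ldr]
    print([node.name for node in factorial.outputs])  # prints: ['even_only', 'odd_only']
```

Returning a `Chain` rather than the right-hand node is what lets `even_only >> even_ldr` be attached by its first node when it appears inside the branch list.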


💘 (Project under construction)
