Core ETL pipeline framework for mkpipe.

MkPipe

MkPipe is a modular, open-source ETL (Extract, Transform, Load) tool that allows you to integrate various data sources and sinks easily. It is designed to be extensible with a plugin-based architecture that supports extractors, transformers, and loaders.

Features

  • Extract data from multiple sources (e.g., PostgreSQL, MongoDB).
  • Transform data using custom Python logic and Apache Spark.
  • Load data into various sinks (e.g., ClickHouse, PostgreSQL, Parquet).
  • Plugin-based architecture that supports future extensions.
  • Cloud-native architecture that can be deployed on Kubernetes and other environments.
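
As a rough illustration of what a plugin-based design can look like, the sketch below keeps a name-to-class registry for extractors. Only the name EXTRACTORS comes from the usage example in this README; register_extractor, InMemoryExtractor, and the shape of the value returned by extract() are hypothetical and are not MkPipe's actual implementation.

```python
from typing import Callable, Dict, List, Type

# Hypothetical registry: maps a plugin name to the class that implements it.
EXTRACTORS: Dict[str, Type] = {}


def register_extractor(name: str) -> Callable[[Type], Type]:
    """Class decorator that records an extractor under `name` (illustrative only)."""
    def decorator(cls: Type) -> Type:
        EXTRACTORS[name] = cls
        return cls
    return decorator


@register_extractor("in_memory")
class InMemoryExtractor:
    """Toy extractor that returns fixed rows instead of querying a database."""

    def extract(self) -> List[dict]:
        return [{"id": 1}, {"id": 2}]


# Look the plugin up by name, as a pipeline runner might.
extractor_cls = EXTRACTORS.get("in_memory")
rows = extractor_cls().extract() if extractor_cls else []
print(rows)
```

A registry of this kind lets new sources be added as separate packages without changing the core, which is the property the feature list describes.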

Installation

Install the core package and any extractors you need with pip.

Install the core package:

pip install mkpipe

Install the Postgres extractor:

pip install mkpipe-extractor-postgres

Install additional extractors or loaders as needed. More extractors and loaders may be added in the future, and contributions are welcome.
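
To confirm that the packages above were installed, one option is to query the installed distribution metadata from Python. This is a minimal sketch using only the standard library; the helper name installed_version is made up for this example, and the distribution names are the ones used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report the status of the core package and the Postgres extractor.
for dist in ("mkpipe", "mkpipe-extractor-postgres"):
    found = installed_version(dist)
    print(f"{dist}: {found or 'not installed'}")
```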

Usage

The following Python script looks up the Postgres extractor in the plugin registry and runs it:

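from mkpipe_core.plugins.registry import EXTRACTORS

def test_postgres_extractor():
    # Look up the extractor class registered under the name "postgres".
    postgres_extractor = EXTRACTORS.get("postgres")
    if not postgres_extractor:
        # The extractor plugin is probably not installed.
        print("Postgres extractor not found!")
        return
    # Instantiate the extractor and run the extraction step.
    instance = postgres_extractor()
    instance.extract()

if __name__ == "__main__":
    test_postgres_extractor()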

The configuration file, elt.yaml, specifies the extractors, transformers, and loaders.
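
The registry lookup in the script above can be wrapped in a small helper that fails loudly instead of printing a message. This is a sketch under two assumptions taken from that script: the registry behaves like a mapping from names to classes, and an extractor class can be instantiated without arguments. The names run_extractor and FakeExtractor are hypothetical and exist only for this example.

```python
from typing import Any, Mapping


def run_extractor(registry: Mapping[str, Any], name: str) -> Any:
    """Look up an extractor class by name, instantiate it, and call extract()."""
    extractor_cls = registry.get(name)
    if extractor_cls is None:
        available = ", ".join(sorted(registry)) or "none"
        raise KeyError(f"Extractor {name!r} not found (available: {available})")
    return extractor_cls().extract()


class FakeExtractor:
    """Stand-in used only to exercise the helper without a database."""

    def extract(self) -> str:
        return "extracted"


print(run_extractor({"fake": FakeExtractor}, "fake"))
```

With the real registry, the call would presumably be run_extractor(EXTRACTORS, "postgres"), provided the Postgres extractor package is installed.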

Documentation

For more detailed documentation, please visit the GitHub repository.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
