A PySpark ETL Framework
Overview
PySetl is a framework that aims to improve the readability and structure of PySpark ETL projects. It is also designed to take advantage of Python’s typing syntax to reduce runtime errors, both through linting tools and by verifying types at runtime, thereby enhancing stability for large ETL pipelines.
To accomplish this, PySetl provides the following tools:
pysetl.config: Type-safe configuration.
pysetl.storage: Agnostic and extensible data source connections.
pysetl.workflow: Pipeline management and dependency injection.
PySetl is designed with Python typing syntax at its core. Hence, we strongly suggest using typedspark and pydantic for development.
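The idea behind type-safe configuration can be illustrated without PySetl itself. The following is a minimal sketch using only the standard library: a frozen dataclass whose fields are checked against their annotations at construction time. `CsvSourceConfig` and its fields are hypothetical names chosen for illustration; they are not part of pysetl.config, and the actual API may look quite different.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass(frozen=True)
class CsvSourceConfig:
    """Hypothetical settings for a CSV source (not a PySetl class)."""

    path: str
    header: bool = True
    delimiter: str = ","

    def __post_init__(self) -> None:
        # Resolve the annotations and verify each value at runtime.
        # This simple check only handles plain classes such as str or bool.
        hints = get_type_hints(type(self))
        for field in fields(self):
            value = getattr(self, field.name)
            expected = hints[field.name]
            if not isinstance(value, expected):
                raise TypeError(
                    f"{field.name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )


if __name__ == "__main__":
    config = CsvSourceConfig(path="data/input.csv", delimiter=";")
    print(config)
```

A linter or type checker can flag `CsvSourceConfig(path=123)` before the job runs, and the runtime check catches the same mistake if it slips through. A library such as pydantic covers far more cases (nested models, coercion, optional fields) than this sketch does.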
Why use PySetl?
Model complex data pipelines.
Reduce risk in production through type-safe development.
Improve the structure and readability of large projects.
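To make the notion of dependency injection in a pipeline more concrete, here is a toy sketch in plain Python. Each step declares what it needs through its parameter type hints, and the pipeline hands it the matching value produced earlier. All names here (`Pipeline`, `provide`, `add_step`, `RawOrders`, `CleanOrders`) are hypothetical and do not describe the pysetl.workflow API; the sketch also runs steps in the order they were added rather than resolving an execution order.

```python
from dataclasses import dataclass
from typing import Callable, get_type_hints


@dataclass
class RawOrders:
    rows: list


@dataclass
class CleanOrders:
    rows: list


class Pipeline:
    """Toy pipeline: inputs are injected by type, outputs are stored by return type."""

    def __init__(self) -> None:
        self._steps = []
        self._deliverables = {}

    def provide(self, value: object) -> "Pipeline":
        # Register an external input under its concrete type.
        self._deliverables[type(value)] = value
        return self

    def add_step(self, step: Callable) -> "Pipeline":
        self._steps.append(step)
        return self

    def run(self) -> dict:
        for step in self._steps:
            hints = get_type_hints(step)
            output_type = hints.pop("return")
            missing = [tp for tp in hints.values() if tp not in self._deliverables]
            if missing:
                raise LookupError(f"{step.__name__} is missing inputs: {missing}")
            # Inject each parameter with the deliverable of the annotated type.
            kwargs = {name: self._deliverables[tp] for name, tp in hints.items()}
            self._deliverables[output_type] = step(**kwargs)
        return self._deliverables


def clean(raw: RawOrders) -> CleanOrders:
    # Keep only orders with a positive amount.
    return CleanOrders([row for row in raw.rows if row["amount"] > 0])


def total(orders: CleanOrders) -> float:
    return float(sum(row["amount"] for row in orders.rows))


if __name__ == "__main__":
    raw = RawOrders([{"amount": 10.0}, {"amount": -5.0}, {"amount": 2.5}])
    results = Pipeline().provide(raw).add_step(clean).add_step(total).run()
    print(results[float])
```

Because the wiring is driven by type hints, a step that asks for an input nobody produces fails with a clear error instead of silently receiving the wrong data.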
Installation
PySetl is available on PyPI:
pip install pysetl
PySetl doesn’t list pyspark as a dependency, since most environments already provide their own Spark installation. Nevertheless, you can install pyspark together with PySetl by running:
pip install "pysetl[pyspark]"
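If it is unclear whether the current environment already provides pyspark, a quick check like the one below can help decide which install command to use. This is a small sketch using the standard library's `importlib.util.find_spec`; the helper name `pyspark_available` is made up for illustration.

```python
from importlib.util import find_spec


def pyspark_available() -> bool:
    """Return True if a pyspark package can be imported in this environment."""
    return find_spec("pyspark") is not None


if __name__ == "__main__":
    if pyspark_available():
        print("pyspark found; 'pip install pysetl' should be enough.")
    else:
        print('pyspark not found; consider: pip install "pysetl[pyspark]"')
```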
Acknowledgments
PySetl is a port of SETL. We want to fully acknowledge that this package is heavily inspired by the work of the SETL team; we just adapted things to work in Python.
File details
Details for the file pysetl-0.1.7rc0.tar.gz.
File metadata
- Download URL: pysetl-0.1.7rc0.tar.gz
- Upload date:
- Size: 32.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.12.0 Darwin/23.0.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3c9e838a201e150d902e8494ad1f2fa5ed0d800c073130c3d354bc0f14a43e72
MD5 | 0cd9a7bb7767ddead7fdea49ef6193a7
BLAKE2b-256 | 970a78c8ba2027042c39715017b4871c47f833d1df870408933e1d76d1a33dfc
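The digests above can be used to verify a downloaded file. The following sketch, using Python's `hashlib`, compares a local file against the SHA256 digest listed for pysetl-0.1.7rc0.tar.gz. The helper names and the local file location are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for pysetl-0.1.7rc0.tar.gz.
EXPECTED_SHA256 = "3c9e838a201e150d902e8494ad1f2fa5ed0d800c073130c3d354bc0f14a43e72"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    archive = Path("pysetl-0.1.7rc0.tar.gz")
    if archive.exists():
        print("digest matches" if matches_expected(archive) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Note that pip can perform the same verification during installation when hashes are pinned in a requirements file and its hash-checking mode is enabled.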
File details
Details for the file pysetl-0.1.7rc0-py3-none-any.whl.
File metadata
- Download URL: pysetl-0.1.7rc0-py3-none-any.whl
- Upload date:
- Size: 51.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.12.0 Darwin/23.0.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | c0e2f1d64ba3cf79ae3a87208cfe8be9c569a2bfebf8c0ac988c8be1d5782949
MD5 | d95799c82dfa13fa05c93a5c24cd5f1e
BLAKE2b-256 | 04e38455af95c37e469ddfeecc9e2298a4ab4f660627d45a5ada7bc8cf0f9fc1