sefazetllib is a library that provides a simplified and abstracted way to construct ETL/ELT pipelines.
sefazetllib
Documentation: https://main.d32to2oidohzrl.amplifyapp.com/
Source code: AWS CodeCommit
Features
- Easy to use and understand library for constructing ETL/ELT pipelines.
- Compatibility with popular data processing frameworks, such as pandas and PySpark.
- Support for file formats such as CSV and Parquet.
- Provides the ability to extract, transform and load data with customizable configurations.
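To make the idea of a chained, configurable extract/transform/load pipeline concrete, here is a toy sketch in plain Python. It is not sefazetllib code: `MiniPipeline` and all of its methods are hypothetical names invented for illustration, and the real library's API is the one shown under Usage.

```python
from typing import Any, Callable, Dict, List


class MiniPipeline:
    """Toy fluent ETL pipeline (hypothetical; not part of sefazetllib)."""

    def __init__(self) -> None:
        # Each step receives a shared context that maps reference names to data.
        self._steps: List[Callable[[Dict[str, Any]], None]] = []

    def extract(self, reference: str, source: Callable[[], Any]) -> "MiniPipeline":
        # Store whatever the source returns under the given reference name.
        self._steps.append(lambda ctx: ctx.__setitem__(reference, source()))
        return self

    def transform(
        self, reference: str, func: Callable[[Any], Any], output: str
    ) -> "MiniPipeline":
        # Apply func to one reference and store the result under another.
        self._steps.append(lambda ctx: ctx.__setitem__(output, func(ctx[reference])))
        return self

    def load(self, reference: str, sink: Callable[[Any], Any]) -> "MiniPipeline":
        # Hand the referenced data to a sink (a file writer, a list, ...).
        self._steps.append(lambda ctx: sink(ctx[reference]))
        return self

    def run(self) -> Dict[str, Any]:
        # Execute the steps in the order they were declared.
        ctx: Dict[str, Any] = {}
        for step in self._steps:
            step(ctx)
        return ctx


loaded: List[Any] = []
context = (
    MiniPipeline()
    .extract("people", lambda: [("tom", 10), ("nick", 15), ("juli", 14)])
    .transform("people", lambda rows: [r for r in rows if r[1] >= 14], output="teens")
    .load("teens", loaded.extend)
    .run()
)
```

Each method returns the pipeline itself, which is what allows the steps to be written as a single chained expression.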
Requirements
sefazetllib requires the following to run:
Installation
Use pip to install sefazetllib:
```shell
pip install sefazetllib
```
Usage
Here is an example of how to use sefazetllib:
```python
from typing import Tuple

from pandas import DataFrame

from sefazetllib import Builder
from sefazetllib.etl import ETL
from sefazetllib.extract import ExtractLocal
from sefazetllib.factory.platform import PlatformFactory
from sefazetllib.load import LoadLocal
from sefazetllib.transform import Transform
from sefazetllib.utils.key import SurrogateKey


@Builder
class TestingDataFrame(Transform):
    # Transform step: returns a reference name paired with a small pandas DataFrame.
    def execute(self) -> Tuple[str, DataFrame]:
        return (
            "dataframe",
            DataFrame(
                [["tom", 10], ["nick", 15], ["juli", 14]], columns=["Name", "Age"]
            ),
        )


# Declare the pipeline: the transform above, a Parquet load of the "dataframe"
# reference into the "load_test" entity, and a Parquet extract of "load_test.parquet".
(
    ETL()
    .setPlatform(PlatformFactory("Pandas").create(name="test_pandas"))
    .transform(TestingDataFrame)
    .load(
        LoadLocal()
        .setFileFormat("parquet")
        .setEntity("load_test")
        .setMode("overwrite")
        .setReference("dataframe")
        .setDuplicates(True)
        .setKey(SurrogateKey().setColumns(["Name", "Age"]).setDistribute(False))
    )
    .extract(
        ExtractLocal()
        .setFileFormat("parquet")
        .setUrl("load_test.parquet")
        .setReference("extract_test")
    )
)
```
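The usage example configures a `SurrogateKey` over the `Name` and `Age` columns, but how sefazetllib derives that key is not shown here. One common general technique is to hash the concatenated column values, and the sketch below illustrates only that technique in plain Python. The function name `surrogate_key`, the `|` separator, and the choice of SHA-256 are all assumptions made for illustration, not the library's implementation.

```python
import hashlib
from typing import Any, Dict, List, Sequence


def surrogate_key(row: Dict[str, Any], columns: Sequence[str], sep: str = "|") -> str:
    """Hypothetical helper: hash the chosen column values into a stable key."""
    # Join the values in a fixed column order so the same row always
    # produces the same key.
    joined = sep.join(str(row[column]) for column in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


rows: List[Dict[str, Any]] = [
    {"Name": "tom", "Age": 10},
    {"Name": "nick", "Age": 15},
    {"Name": "juli", "Age": 14},
]

# Attach a key derived from Name and Age to every row.
keyed = [{**row, "key": surrogate_key(row, ["Name", "Age"])} for row in rows]
```

A hash-based key is deterministic, so reloading the same data yields the same keys. The trade-off is that values containing the separator could collide, which a real implementation would need to handle.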
Testing
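To run the unit tests, run the following command (`py` is the Python launcher on Windows; on other platforms use `python` or `python3` instead):
```shell
py -m unittest tests/main.py -v
```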
License
sefazetllib is released under the Apache-2.0 license.
Download files
Source Distribution
sefazetllib-0.1.55.tar.gz (33.1 kB)
Built Distribution
Hashes for sefazetllib-0.1.55-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ffee573cc8d38e96d5633b7436cf38d95a0fcc588bdf9df03acc513f8016e916
MD5 | 8ab1cb152412187d5081bb0261cb74d7
BLAKE2b-256 | 9071e8a99b5a59fe5d736c82c59ee3333ed2a0c01269ebec7459d8c8e198fe7c