spark_datax_schema_tools
spark_datax_schema_tools is a Python library of helper tools for working with DataX schemas from PySpark.
Installation
The package is published on PyPI, so installation is a single command:
pip install spark-datax-schema-tools
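To confirm that the install succeeded, the standard library can report the installed version. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of spark_datax_schema_tools.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="spark-datax-schema-tools"):
    """Return the installed version string of dist_name, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None
```

Calling `installed_version()` after `pip install` should return a version string; `None` means the package is not visible to the interpreter in use.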
Usage
The library provides wrapper functions that take DataX schemas as input.
Example 1: generate dummy data
==============================
from spark_datax_schema_tools import generate_components
from pyspark.sql import SparkSession

# Start a local Spark session that uses every available core.
spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

# Generate dummy data for the table described in the Excel summary.
df2 = generate_components(spark=spark,
                          path_excel="/content/Summary RQ22021-HF1.xlsx",
                          uuaa_name="NZTG",
                          table_name="t_nztg_trade_core_inf_bo_eom")
df2.show2()
Example 2: generate transmission detail with schema JSON
========================================================
from spark_datax_schema_tools import generate_transmission_holding
from pyspark.sql import SparkSession

# Start a local Spark session that uses every available core.
spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

# Generate the transmission detail; this variant also passes path_excel.
df2 = generate_transmission_holding(spark=spark,
                                    uuaa_name="NZTG",
                                    table_name="t_nztg_trade_core_inf_bo_eom",
                                    table_version="0",
                                    frequency="monthly",
                                    group="CIB",
                                    solution_model="CDD",
                                    path_excel="Summary RQ22021-HF1.xlsx")
Example 3: generate transmission detail without schema JSON
===========================================================
from spark_datax_schema_tools import generate_transmission_holding
from pyspark.sql import SparkSession

# Start a local Spark session that uses every available core.
spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

# Generate the transmission detail; this variant omits path_excel.
df2 = generate_transmission_holding(spark=spark,
                                    uuaa_name="NZTG",
                                    table_name="t_nztg_trade_core_inf_bo_eom",
                                    table_version="0",
                                    frequency="monthly",
                                    group="CIB",
                                    solution_model="CDD")
Function parameters
===================
generate_transmission_holding accepts the following values:
- frequency: "daily" or "monthly"
- group: "CIB", "CLIENT_SOLUTIONS", "CORE_BANKING", "GLOBAL_DATA" or "RISK_FINANCE"
- solution_model: "CIB" or "CDD"
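The allowed values above can be checked before Spark is started, so that a typo fails fast. The sketch below is an illustration only: `ALLOWED_VALUES` and `check_transmission_params` are hypothetical names, not part of spark_datax_schema_tools, and it assumes the values are case-sensitive exactly as listed. Whether the library performs its own validation is not stated here.

```python
# Allowed values copied from the parameter list for generate_transmission_holding.
ALLOWED_VALUES = {
    "frequency": ("daily", "monthly"),
    "group": ("CIB", "CLIENT_SOLUTIONS", "CORE_BANKING", "GLOBAL_DATA", "RISK_FINANCE"),
    "solution_model": ("CIB", "CDD"),
}


def check_transmission_params(**params):
    """Return params unchanged, or raise ValueError for a value outside its allowed set.

    Parameters that have no entry in ALLOWED_VALUES (for example table_name)
    are passed through without any check.
    """
    for name, allowed in ALLOWED_VALUES.items():
        if name in params and params[name] not in allowed:
            raise ValueError(f"{name}={params[name]!r} is not one of {list(allowed)}")
    return params
```

The checked dictionary can then be unpacked into the real call, for example `generate_transmission_holding(spark=spark, **check_transmission_params(frequency="monthly", group="CIB", solution_model="CDD", uuaa_name="NZTG", table_name="t_nztg_trade_core_inf_bo_eom", table_version="0"))`.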
License
New features v1.0
BugFix
- choco install visualcpp-build-tools
Reference
- Jonathan Quiza github.
- Jonathan Quiza RumiMLSpark.
- Jonathan Quiza linkedin.
File details
Details for the file spark_datax_schema_tools-0.0.43.tar.gz.
File metadata
- Download URL: spark_datax_schema_tools-0.0.43.tar.gz
- Upload date:
- Size: 15.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 64c7231efd9060e4cbc607b3eabb4efd07abdf53ecd20d7fb880ce5a9d03e0f6
MD5 | 88c62c9b36821dd09269e042d98c9aa0
BLAKE2b-256 | 0d53b337ecbfdd6e33c7ee56bffc469fb56b435b634a12d841d7e9f90e506e27
File details
Details for the file spark_datax_schema_tools-0.0.43-py3-none-any.whl.
File metadata
- Download URL: spark_datax_schema_tools-0.0.43-py3-none-any.whl
- Upload date:
- Size: 15.8 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5b27c3ce863bcd1fa5233ed3c4d2a0ed22f57f863e11653893faf9f31696f37b
MD5 | 697251a7eb050992b1bd7c43c35681c5
BLAKE2b-256 | 8b7889dfe8bc85584655e45ca1cdb9501d18db51149a9add38442a695d0f361e
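A downloaded file can be compared against the SHA256 digests listed above using only the standard library. This is a minimal sketch; the function names are hypothetical, and the local path of the downloaded file is whatever the reader saved it as.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True when the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, `matches_published_digest("spark_datax_schema_tools-0.0.43.tar.gz", "64c7231efd9060e4cbc607b3eabb4efd07abdf53ecd20d7fb880ce5a9d03e0f6")` should return `True` for an intact download.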