spark_datax_schema_tools
spark_datax_schema_tools is a Python library that implements helper tools for DataX schemas.
Installation
The package is published on PyPI, so installation consists of running:
pip install spark-datax-schema-tools
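One way to confirm that the installation succeeded is to ask Python for the version of the installed distribution. The sketch below uses only the standard library (Python 3.8 or later); the helper name `installed_version` is hypothetical and not part of spark_datax_schema_tools.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report whether the package from the pip command above is available.
found = installed_version("spark-datax-schema-tools")
print(found if found else "spark-datax-schema-tools is not installed")
```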
Usage
The library provides wrapper functions that take DataX schemas as input.
Example 1: generate dummy data
==============================
from spark_datax_schema_tools import generate_components
from pyspark.sql import SparkSession

# Start a local Spark session
spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

df2 = generate_components(spark=spark,
                          path_excel="/content/Summary RQ22021-HF1.xlsx",
                          uuaa_name="NZTG",
                          table_name="t_nztg_trade_core_inf_bo_eom")
df2.show2()
Example 2: generate transmission detail with schema JSON
========================================================
from spark_datax_schema_tools import generate_transmission_holding
from pyspark.sql import SparkSession

# Start a local Spark session
spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

df2 = generate_transmission_holding(spark=spark,
                                    uuaa_name="NZTG",
                                    table_name="t_nztg_trade_core_inf_bo_eom",
                                    table_version="0",
                                    frequency="monthly",
                                    group="CIB",
                                    solution_model="CDD",
                                    path_excel="Summary RQ22021-HF1.xlsx")
Example 3: generate transmission detail without schema JSON
===========================================================
from spark_datax_schema_tools import generate_transmission_holding
from pyspark.sql import SparkSession

# Start a local Spark session
spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

# Same call as in the previous example, but without path_excel
df2 = generate_transmission_holding(spark=spark,
                                    uuaa_name="NZTG",
                                    table_name="t_nztg_trade_core_inf_bo_eom",
                                    table_version="0",
                                    frequency="monthly",
                                    group="CIB",
                                    solution_model="CDD")
Function parameters
===================
Accepted values for generate_transmission_holding:
    frequency: ["daily", "monthly"]
    group: ["CIB", "CLIENT_SOLUTIONS", "CORE_BANKING", "GLOBAL_DATA", "RISK_FINANCE"]
    solution_model: ["CIB", "CDD"]
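The accepted values above can be checked before calling generate_transmission_holding, so that a typo fails early with a clear message. This is a minimal sketch: `ALLOWED_VALUES` and `check_transmission_args` are hypothetical names, not part of spark_datax_schema_tools, and whether the library performs its own validation is not stated here.

```python
# Allowed values copied from the parameter list above.
ALLOWED_VALUES = {
    "frequency": ("daily", "monthly"),
    "group": ("CIB", "CLIENT_SOLUTIONS", "CORE_BANKING", "GLOBAL_DATA", "RISK_FINANCE"),
    "solution_model": ("CIB", "CDD"),
}


def check_transmission_args(frequency, group, solution_model):
    """Raise ValueError if any argument is outside its documented set of values."""
    given = {"frequency": frequency, "group": group, "solution_model": solution_model}
    for name, value in given.items():
        if value not in ALLOWED_VALUES[name]:
            raise ValueError(
                f"{name}={value!r} is not one of {list(ALLOWED_VALUES[name])}"
            )
    return given


# The returned dict can be unpacked into the library call as keyword arguments.
kwargs = check_transmission_args("monthly", "CIB", "CDD")
print(kwargs)
```

The validated dictionary could then be passed along as `generate_transmission_holding(spark=spark, uuaa_name=..., table_name=..., table_version=..., **kwargs)`.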
License
New features v1.0
BugFix
- On Windows, install the Visual C++ build tools with Chocolatey: choco install visualcpp-build-tools
Reference
- Jonathan Quiza github.
- Jonathan Quiza RumiMLSpark.
- Jonathan Quiza linkedin.
Hashes for spark_datax_schema_tools-0.0.42.tar.gz
Algorithm | Hash digest
---|---
SHA256 | a8a983a55166c9160dba0f70b041435bd5c6e1e36d9238b77cad3fe373e8e29a
MD5 | dbd668dc74cb4a098b8d050eeeac8a55
BLAKE2b-256 | 56c14d1cb2c1b5b97412155ce68a998908cb25ef5f2297cfbe7e4c7cf8bbf9dc
Hashes for spark_datax_schema_tools-0.0.42-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d09259bb36d940e5ba1ae9cc7be030c06be1b0451c5d64c33dde8c9816051b19
MD5 | 79d2f9f6fb5c4d54c2b10e0182f7aa58
BLAKE2b-256 | 71e97c801375ea3d3fffbcc90d6699920615d86ee007d19e19193f892d8c7bb8
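A downloaded file can be compared against the SHA256 digests listed above with the standard library. The sketch below assumes the wheel was saved to the current directory under its published file name; `sha256_of_file` and `matches_published_digest` are hypothetical helper names.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True when the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Assumption: the wheel sits in the current directory; skip the check otherwise.
wheel = "spark_datax_schema_tools-0.0.42-py3-none-any.whl"
expected = "d09259bb36d940e5ba1ae9cc7be030c06be1b0451c5d64c33dde8c9816051b19"
if os.path.isfile(wheel):
    print("digest matches:", matches_published_digest(wheel, expected))
```

Reading in chunks keeps memory use constant regardless of the file size.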