spark_dql_tools
spark_dql_mvp_tools
spark_dql_mvp_tools is a Python library that implements quality rules in the sandbox.
Installation
The package is published on PyPI, so installation consists of running pip; the exact commands are listed under Sandbox below.
Usage
A wrapper for creating Hammurabi MVPs.
Sandbox
Installation
!yes | pip uninstall spark-dql-mvp-tools
pip install spark-dql-mvp-tools --user --upgrade
Imports
import os
import pyspark
from pyspark.sql import functions as func
from spark_generated_rules_tools import dq_path_workspace
from spark_generated_rules_tools import dq_generated_mvp
import spark_dataframe_tools
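Before running the imports in a fresh sandbox session, it can help to confirm that the required modules are actually installed. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of spark_dql_mvp_tools, and the module names are the ones used in the import list above.

```python
from importlib.util import find_spec

# Module names taken from the import list above.
REQUIRED_MODULES = (
    "pyspark",
    "spark_generated_rules_tools",
    "spark_dataframe_tools",
)


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found on the Python path.

    Hypothetical helper for illustration; it only locates modules and does
    not import them.
    """
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If anything is reported as missing, rerun the installation commands and restart the session before importing.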
Variables
user_sandbox = "P030772"
Creating Workspace
dq_path_workspace(user_sandbox=user_sandbox)
Run
table_raw_name = 't_klau_moe_adj_id_mthly_info'
table_master_name = 't_pmfi_moe_adj_id_mthly_info'
periodicity = 'Daily'
target_staging_path = '/in/staging/datax/klau/my_file_{?YEAR_MONTH}.csv'
is_uuaa_tag = False
dq_generated_mvp(table_master_name=table_master_name,
                 table_raw_name=table_raw_name,
                 periodicity=periodicity,
                 target_staging_path=target_staging_path,
                 is_uuaa_tag=is_uuaa_tag)
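The `target_staging_path` above contains a `{?YEAR_MONTH}` token. This README does not say how or where that token is resolved, so the sketch below is only an illustration of what a concrete path could look like. It assumes the token is replaced by the run date formatted as `YYYYMM`; both that format and the helper `resolve_year_month` are hypothetical and not part of the library.

```python
from datetime import date

# Token as it appears in target_staging_path above.
YEAR_MONTH_TOKEN = "{?YEAR_MONTH}"


def resolve_year_month(path_template, run_date):
    """Replace the {?YEAR_MONTH} token with run_date formatted as YYYYMM.

    Assumption: YYYYMM is the expected format. The real resolution may
    happen elsewhere and may use a different format.
    """
    return path_template.replace(YEAR_MONTH_TOKEN, run_date.strftime("%Y%m"))


template = "/in/staging/datax/klau/my_file_{?YEAR_MONTH}.csv"
print(resolve_year_month(template, date(2023, 1, 15)))
# prints /in/staging/datax/klau/my_file_202301.csv
```

A check like this is useful for previewing which staging file a given run would point at before generating the rules.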
License
New features v1.0
BugFix
- choco install visualcpp-build-tools
Reference
- Jonathan Quiza github.
- Jonathan Quiza RumiMLSpark.