spark_dql_tools
spark_dql_mvp_tools
spark_dql_mvp_tools is a Python library that implements quality rules in a sandbox environment.
Installation
The code is packaged for PyPI and is installed with pip. The exact commands are listed under Sandbox.
Usage
A wrapper for creating Hammurabi MVPs.
Sandbox
Installation
!yes | pip uninstall spark-dql-mvp-tools
!pip install spark-dql-mvp-tools --user --upgrade
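To confirm that the install step took effect, a minimal sketch using the standard library's `importlib.metadata` can report the installed version. The helper name `installed_version` is hypothetical, and the distribution name passed to it is an assumption taken from the uninstall command above; adjust it if the package is registered under a different name.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution name assumed from the uninstall command; it may differ in practice.
print(installed_version("spark-dql-mvp-tools"))
```

A result of `None` would suggest that the package is not visible to the current interpreter, for example because the notebook kernel has not been restarted after a `--user` install.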
Imports
import os
import pyspark
from pyspark.sql import functions as func
from spark_generated_rules_tools import dq_path_workspace
from spark_generated_rules_tools import dq_generated_mvp
import spark_dataframe_tools
Variables
user_sandbox="P030772"
Creating Workspace
dq_path_workspace(user_sandbox=user_sandbox)
Run
table_raw_name = 't_klau_moe_adj_id_mthly_info'
table_master_name = 't_pmfi_moe_adj_id_mthly_info'
periodicity = 'Daily'
target_staging_path = '/in/staging/datax/klau/my_file_{?YEAR_MONTH}.csv'
is_uuaa_tag = False
dq_generated_mvp(table_master_name=table_master_name,
                 table_raw_name=table_raw_name,
                 periodicity=periodicity,
                 target_staging_path=target_staging_path,
                 is_uuaa_tag=is_uuaa_tag)
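When several tables need rules generated, the call above could be repeated in a loop. The following is only a sketch: `run_mvp_for_tables` is a hypothetical helper that is not part of the library, it takes the generator function as an argument, and it uses only the keyword arguments shown in the Run example. It also assumes that every table pair shares the same periodicity and staging path, which may not hold in practice.

```python
from typing import Callable, Iterable, List, Tuple


def run_mvp_for_tables(
    generate: Callable[..., object],
    table_pairs: Iterable[Tuple[str, str]],
    periodicity: str,
    target_staging_path: str,
    is_uuaa_tag: bool = False,
) -> List[Tuple[str, str]]:
    """Call `generate` once per (raw, master) pair and return the pairs processed."""
    processed: List[Tuple[str, str]] = []
    for table_raw_name, table_master_name in table_pairs:
        # Same keyword arguments as the single-table call in the Run section.
        generate(
            table_master_name=table_master_name,
            table_raw_name=table_raw_name,
            periodicity=periodicity,
            target_staging_path=target_staging_path,
            is_uuaa_tag=is_uuaa_tag,
        )
        processed.append((table_raw_name, table_master_name))
    return processed
```

In the sandbox this might be invoked as `run_mvp_for_tables(dq_generated_mvp, [(table_raw_name, table_master_name)], periodicity, target_staging_path)`. Passing the function in, rather than importing it inside the helper, keeps the sketch usable with a stub when the library is not installed.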
License
New features v1.0
BugFix
- choco install visualcpp-build-tools
Reference
- Jonathan Quiza github.
- Jonathan Quiza RumiMLSpark.