Awesome spark_jdbc_profiler created by hgbink
Spark JDBC Profiler
Spark JDBC Profiler is a collection of utility functions for profiling source databases over Spark JDBC connections.
Install it from PyPI
pip install spark_jdbc_profiler
Usage
from spark_jdbc_profiler.whole_db_profiler.mysql_db_profiler import *
from spark_jdbc_profiler.segmentation_profiler.segmentation_gen import *
jdbcUsername = "test_user"
jdbcPassword = "test_pass"
jdbcHostname = "mariadb"
jdbcPort = "3306"
jdbcDatabase = "test"
jdbcUrl = f"jdbc:mysql://{jdbcHostname}:{jdbcPort}/{jdbcDatabase}?zeroDateTimeBehavior=ROUND"
connectionProperties = {"user": jdbcUsername, "password": jdbcPassword}
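# `spark` must be an existing SparkSession, for example the one a notebook or the pyspark shell provides
df = profile_whole_db(spark, jdbcUrl, connectionProperties)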
df.show(n=20)
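The URL and connection properties above are plain strings and dictionaries, so they can be built by small helpers when several databases need profiling. The following is a minimal sketch, not part of spark_jdbc_profiler: the function names `build_mysql_jdbc_url` and `build_connection_properties` are hypothetical, and the optional `driver` entry assumes you want to name the JDBC driver class explicitly through Spark's `driver` option.

```python
from urllib.parse import urlencode


def build_mysql_jdbc_url(host, port, database, **options):
    """Return a jdbc:mysql:// URL; keyword options become query parameters.

    Hypothetical helper for illustration, not part of spark_jdbc_profiler.
    """
    url = f"jdbc:mysql://{host}:{port}/{database}"
    if options:
        # urlencode escapes any special characters in option values
        url += "?" + urlencode(options)
    return url


def build_connection_properties(user, password, driver=None):
    """Return the properties dictionary used in the usage example above."""
    props = {"user": user, "password": password}
    if driver is not None:
        # Assumption: the driver class, for example "com.mysql.cj.jdbc.Driver"
        # for MySQL Connector/J 8, is available on the Spark classpath.
        props["driver"] = driver
    return props


jdbc_url = build_mysql_jdbc_url(
    "mariadb", 3306, "test", zeroDateTimeBehavior="ROUND"
)
connection_properties = build_connection_properties("test_user", "test_pass")
print(jdbc_url)
```

The resulting `jdbc_url` and `connection_properties` can be passed to `profile_whole_db` in place of `jdbcUrl` and `connectionProperties`.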
Development
Read the CONTRIBUTING.md file.
Hashes for spark_jdbc_profiler-1.0.4.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 9fcd8ed68f65aca20aa923f494a461e0ae64f180ee75b185db0f498a58b2b6e3
MD5 | d1a2860929c84ad956200caa5c578882
BLAKE2b-256 | 3942003e936be4a058c43781abe1eac697fd4e92579151bde8906fa60d1cd573
Hashes for spark_jdbc_profiler-1.0.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e3e13032212e691197e3d437dae0f198a644fff10778d910f1eaa9eefdba6042
MD5 | 393c142f6f740e4394849b9255901c28
BLAKE2b-256 | b3399c1c12164e762c892d8b9e11412d977f9b92ae9194278ea2f13e0c83b615