A way to use Great Expectations in Google Colab and other notebook environments.

GXKent

A simple library that makes it easy to run Great Expectations in both Python notebook and CLI environments.

Idea

Kent is the county in which much of the Charles Dickens classic Great Expectations is set, which makes it a sensible name for a container of expectations. The central issue that Kent resolves is making sure that pandas dataframes are available and populated with data in both of our data contexts: CLI and notebooks.

Basic Usage

from gxkent import GXKent

Kent = GXKent() # Databricks usage
# Google Colab usage
# Kent = GXKent(password_worksheet='your_worksheet_name_here') 
# Command line usage
# Kent = GXKent(env_path='/where/you/put/your/.env') 

Kent.is_print_on_success = False

# {table_name} is a placeholder: substitute the table you want to check
sql_text = """
SELECT count(DISTINCT a.npi) AS new_npi_cnt
FROM default_npi_setting_count.{table_name} a
WHERE a.npi NOT IN (
    SELECT DISTINCT b.npi
    FROM default_npi_setting_count.persetting_2021_12 b
    );
"""

gxDF = Kent.gx_df_from_sql(sql_text)

Kent.capture_expectation(
    expectation_name='Between year comparison {this_year} {that_year}',
    expectation_result=gxDF.expect_column_max_to_be_between('new_npi_cnt', 112671, 253511)
)

Kent.capture_expectation(
    expectation_name='Between year comparison {this_year} {that_year}',
    expectation_result=gxDF.expect_column_min_to_be_between('new_npi_cnt', 11671, 23511)
)

Kent.capture_expectation(
    expectation_name='Between year comparison {this_year} {that_year}',
    expectation_result=gxDF.expect_column_mean_to_be_between('new_npi_cnt', 50000, 60000)
)

# Prints the results to the console! 
Kent.print_test_results()
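
To make the capture-then-report flow above more concrete, here is a minimal sketch of that pattern. It is not GXKent's implementation: the ExpectationLog class and its capture and report methods are hypothetical names, it assumes each expectation result is dict-like with a boolean "success" entry (as Great Expectations validation results are), and it assumes that is_print_on_success = False means successful expectations are left out of the report.

```python
from dataclasses import dataclass, field


@dataclass
class ExpectationLog:
    """Hypothetical stand-in for a container of named expectation results."""

    is_print_on_success: bool = True
    results: list = field(default_factory=list)

    def capture(self, name, result):
        # Assumption: `result` is dict-like and carries a boolean "success" key.
        self.results.append((name, bool(result["success"])))

    def report(self):
        """Return one line per result, skipping successes when asked to."""
        lines = []
        for name, ok in self.results:
            if ok and not self.is_print_on_success:
                continue
            lines.append(f"{'PASS' if ok else 'FAIL'}: {name}")
        return lines


log = ExpectationLog(is_print_on_success=False)
log.capture("max in range", {"success": True})
log.capture("min in range", {"success": False})
report_lines = log.report()
print("\n".join(report_lines))
```

Collecting results first and reporting at the end means one failed expectation does not stop the remaining checks from running.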

CLI Usage

To work from the command line, GXKent needs a .env file containing your database credentials. As usual, the .env file should be listed in your .gitignore so that the credentials are never committed.

Here are the contents expected in the .env file:

GX_USERNAME=your_gx_mysql_username
GX_PASSWORD=your_gx_mysql_password
DB_DATABASE=starting_database
DB_PORT=3306
DB_HOST=localhost

Once your .env file has the right contents, tell GXKent where it lives when you create your object:

Kent = GXKent(env_path='/where/you/put/your/.env')

Substitute your own database connection details in the .env file. For now, GXKent only supports MySQL. Patches that use SQLAlchemy properly to support other databases are welcome.
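
As an illustration of how the five .env keys fit together, the sketch below parses KEY=VALUE text and assembles a SQLAlchemy-style MySQL URL from it. This is not how GXKent itself reads the file: parse_env and mysql_url are hypothetical helpers, and the mysql+pymysql driver prefix is an assumption.

```python
from urllib.parse import quote_plus


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def mysql_url(env):
    """Build a SQLAlchemy-style MySQL URL from the keys GXKent expects."""
    # Escape the user name and password so characters like @ or : are safe.
    user = quote_plus(env["GX_USERNAME"])
    password = quote_plus(env["GX_PASSWORD"])
    return (
        f"mysql+pymysql://{user}:{password}"
        f"@{env['DB_HOST']}:{env['DB_PORT']}/{env['DB_DATABASE']}"
    )


sample = """
GX_USERNAME=your_gx_mysql_username
GX_PASSWORD=your_gx_mysql_password
DB_DATABASE=starting_database
DB_PORT=3306
DB_HOST=localhost
"""
env = parse_env(sample)
url = mysql_url(env)
```

Keeping the connection string in one place like this is also where support for other databases would most naturally plug in.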

Google Colab Usage

To use Google Colab notebooks safely, it is critical not to save your credentials in the notebook itself. Instead, store them in a Google Sheets spreadsheet and connect your Colab notebook to that spreadsheet.

To use GXKent this way, pass in the name of the worksheet that holds the credentials:

Kent = GXKent(password_worksheet='your_worksheet_name_here')

Your Google Drive spreadsheet should contain a header row followed by one row of values:

username       password       server       port  database
your_username  your_password  your_server  3306  your_database
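
The two-row layout maps naturally onto a dictionary keyed by column name. The sketch below shows one way to do that mapping, assuming the sheet's cells arrive as a list of rows of strings. credentials_from_rows is a hypothetical helper and does not describe how GXKent reads the spreadsheet.

```python
REQUIRED_COLUMNS = ["username", "password", "server", "port", "database"]


def credentials_from_rows(rows):
    """Map a header row plus the first data row to a credentials dict."""
    if len(rows) < 2:
        raise ValueError("expected a header row and one data row")
    header = [cell.strip().lower() for cell in rows[0]]
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    creds = dict(zip(header, rows[1]))
    # Sheet cells are strings here, so convert the port to an integer.
    creds["port"] = int(creds["port"])
    return creds


rows = [
    ["username", "password", "server", "port", "database"],
    ["your_username", "your_password", "your_server", "3306", "your_database"],
]
creds = credentials_from_rows(rows)
```

Checking for the expected column names up front turns a misnamed header into a clear error rather than a failed database connection later.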

Authors

Fred Trotter and Jose Cortina

Download files

Source distribution: gxkent-0.2.8.tar.gz (12.9 kB)
Built distribution: gxkent-0.2.8-py3-none-any.whl (13.3 kB, Python 3)