Lightweight assertions inspired by the great-expectations library

This library is inspired by the Great Expectations library. It makes various expectations found in Great Expectations available as assertions in Python's built-in unittest framework.
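To illustrate the general idea of an expectation-style assertion built on unittest, here is a minimal sketch. The class names and the method body are hypothetical and written only for this illustration; they are not the library's actual implementation, and the "table" is a plain list of rows so the sketch runs with the standard library alone.

```python
import io
import unittest


class RowCountAssertions(unittest.TestCase):
    """Hypothetical base class; not the library's implementation."""

    def expect_table_row_count_to_equal(self, table, expected):
        # len() works for a list of rows and also for a pandas DataFrame.
        actual = len(table)
        self.assertEqual(
            actual,
            expected,
            f"expected row count {expected} but found {actual}",
        )


class RowCountTests(RowCountAssertions):
    def test_row_count(self):
        rows = [{"col_1": 100}, {"col_1": 200}, {"col_1": 300}]
        self.expect_table_row_count_to_equal(rows, 3)


# Run the single test quietly and keep the result object for inspection.
suite = unittest.TestLoader().loadTestsFromTestCase(RowCountTests)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Because the expectation is an ordinary method on a TestCase subclass, a failure is reported by unittest like any other failed assertion.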

Install

pip install great-assertions

Code example Pandas

from great_assertions import GreatAssertions
import pandas as pd

class GreatAssertionTests(GreatAssertions):
    def test_expect_table_row_count_to_equal(self):
        df = pd.DataFrame({"col_1": [100, 200, 300], "col_2": [10, 20, 30]})
        self.expect_table_row_count_to_equal(df, 3)

Code example PySpark

from great_assertions import GreatAssertions
from pyspark.sql import SparkSession

class GreatAssertionTests(GreatAssertions):

    def setUp(self):
        self.spark = SparkSession.builder.getOrCreate()

    def test_expect_table_row_count_to_equal(self):
        df = self.spark.createDataFrame(
            [
                {"col_1": 100, "col_2": 10},
                {"col_1": 200, "col_2": 20},
                {"col_1": 300, "col_2": 30},
            ]
        )
        self.expect_table_row_count_to_equal(df, 3)

List of available assertions

Assertion                                          Pandas  PySpark
expect_table_row_count_to_equal                    ✅      ✅
expect_table_row_count_to_be_greater_than          ✅      ✅
expect_table_row_count_to_be_less_than             ✅      ✅
expect_table_has_no_duplicate_rows                 ✅      ✅
expect_column_value_to_equal                       ✅      ✅
expect_column_values_to_be_between                 ✅      ✅
expect_column_values_to_match_regex                ✅      ✅
expect_column_values_to_be_in_set                  ✅      ✅
expect_column_values_to_be_of_type                 ✅      ✅
expect_table_columns_to_match_ordered_list         ✅      ✅
expect_table_columns_to_match_set                  ✅      ✅
expect_date_range_to_be_more_than                  ✅      ✅
expect_date_range_to_be_less_than                  ✅      ✅
expect_date_range_to_be_between                    ✅      ✅
expect_column_mean_to_be_between                   ✅      ✅
expect_column_value_counts_percent_to_be_between   ✅      ✅
expect_frame_equal                                 ✅      ✅
expect_column_has_no_duplicate_rows                ✅      ✅

Assertion Descriptions

For a description of each assertion, see Assertion Definitions.

Running the tests

Running the tests still requires unittest. The following options have been tested with the examples provided.

Option 1

import unittest
suite = unittest.TestLoader().loadTestsFromTestCase(GreatAssertionTests)
runner = unittest.TextTestRunner(verbosity=2)
runner.run(suite)

Option 2

if __name__ == '__main__':
    unittest.main()

Pie Charts and Tables

For a more visual representation of the results when running in Databricks or Jupyter notebooks, the results can be output as tables or as a bar or pie chart.

import unittest
from great_assertions import GreatAssertionResult, GreatAssertions

class DisplayTest(GreatAssertions):
    def test_pass1(self):
        assert True is True

    def test_fail(self):
        assert "Hello" == "World"

suite = unittest.TestLoader().loadTestsFromTestCase(DisplayTest)
test_runner = unittest.runner.TextTestRunner(resultclass = GreatAssertionResult)
result = test_runner.run(suite)

# Horizontal bar chart (result.to_pie() is also available)
result.to_barh()

# Results table
result.to_results_table()

# Full results table
result.to_full_results_table()

Running with XML-Runner

Running with xml-runner works the same way as it is normally used. However, methods such as to_results_table will not be available, because they depend on a different resultclass.

import xmlrunner
suite = unittest.TestLoader().loadTestsFromTestCase(DisplayTest)
test_runner = xmlrunner.XMLRunner(output="test-results")
test_runner.run(suite)

Notes

If you get an arrows function warning when running in Databricks, it is because many of the assertions use the toPandas() method. The plan is to remove the pandas conversion in favour of pure PySpark code. If this is a problem, please raise an issue so this work can be prioritised. For now, it is advisable to make sure the datasets are not too big, as large datasets can cause the driver to crash.
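One way to follow the advice about dataset size is to check the row count before running assertions on a frame. The helper below is a hypothetical sketch, not part of the library. It assumes only that the frame exposes a count() method returning the number of rows, as a PySpark DataFrame does; FakeFrame is a stand-in so the example runs without Spark.

```python
def assert_row_count_at_most(df, max_rows):
    """Raise before an expensive conversion if the frame is too large.

    Assumes df.count() returns the number of rows (true for PySpark
    DataFrames, not for pandas DataFrames).
    """
    row_count = df.count()
    if row_count > max_rows:
        raise ValueError(f"DataFrame has {row_count} rows; limit is {max_rows}")
    return row_count


class FakeFrame:
    """Stand-in for a Spark DataFrame, used only for this illustration."""

    def __init__(self, n_rows):
        self.n_rows = n_rows

    def count(self):
        return self.n_rows


small_count = assert_row_count_at_most(FakeFrame(3), max_rows=10)

try:
    assert_row_count_at_most(FakeFrame(11), max_rows=10)
    too_big_rejected = False
except ValueError:
    too_big_rejected = True
```

The limit itself depends on the driver's memory and the width of the rows, so any threshold is something to tune for the environment.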

Development

To create a development environment, create and activate a virtualenv, then make a development installation:

virtualenv ve
source ve/bin/activate

To run tests, just use pytest

(ve) pytest

