Lightweight assertions inspired by the great-expectations library
This library is inspired by the Great Expectations library. It makes many of the expectations found in Great Expectations available through Python's built-in unittest assertions.
Install
pip install great-assertions
Code example Pandas
from great_assertions import GreatAssertions
import pandas as pd
class GreatAssertionTests(GreatAssertions):
    def test_expect_table_row_count_to_equal(self):
        df = pd.DataFrame({"col_1": [100, 200, 300], "col_2": [10, 20, 30]})
        self.expect_table_row_count_to_equal(df, 3)
Code example PySpark
from great_assertions import GreatAssertions
from pyspark.sql import SparkSession
class GreatAssertionTests(GreatAssertions):
    def setUp(self):
        self.spark = SparkSession.builder.getOrCreate()

    def test_expect_table_row_count_to_equal(self):
        df = self.spark.createDataFrame(
            [
                {"col_1": 100, "col_2": 10},
                {"col_1": 200, "col_2": 20},
                {"col_1": 300, "col_2": 30},
            ]
        )
        self.expect_table_row_count_to_equal(df, 3)
List of available assertions

| Assertion | Pandas | PySpark |
|---|---|---|
| expect_table_row_count_to_equal | ✓ | ✓ |
| expect_table_row_count_to_be_greater_than | ✓ | ✓ |
| expect_table_row_count_to_be_less_than | ✓ | ✓ |
| expect_table_has_no_duplicate_rows | ✓ | ✓ |
| expect_column_value_to_equal | ✓ | ✓ |
| expect_column_values_to_be_between | ✓ | ✓ |
| expect_column_values_to_match_regex | ✓ | ✓ |
| expect_column_values_to_be_in_set | ✓ | ✓ |
| expect_column_values_to_be_of_type | ✓ | ✓ |
| expect_table_columns_to_match_ordered_list | ✓ | ✓ |
| expect_table_columns_to_match_set | ✓ | ✓ |
| expect_date_range_to_be_more_than | ✓ | ✓ |
| expect_date_range_to_be_less_than | ✓ | ✓ |
| expect_date_range_to_be_between | ✓ | ✓ |
| expect_column_mean_to_be_between | ✓ | ✓ |
| expect_column_value_counts_percent_to_be_between | ✓ | ✓ |
| expect_frame_equal | ✓ | ✓ |
| expect_column_has_no_duplicate_rows | ✓ | ✓ |
Assertion Descriptions
For a description of each assertion, see Assertion Definitions.
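A minimal sketch combining a few of the assertions listed above on a Pandas DataFrame. The argument order shown for the column assertions (DataFrame first, then the column name and the expected values) is an assumption inferred from the row-count example; see Assertion Definitions for the authoritative signatures.

from great_assertions import GreatAssertions
import pandas as pd

class OrderTests(GreatAssertions):
    def test_orders_are_valid(self):
        df = pd.DataFrame({
            "status": ["open", "closed", "open"],
            "amount": [10.5, 20.0, 15.25],
        })
        # Assumed signatures: DataFrame, column name, then the expectation
        self.expect_table_row_count_to_equal(df, 3)
        self.expect_column_values_to_be_between(df, "amount", 10, 25)
        self.expect_column_values_to_be_in_set(df, "status", {"open", "closed"})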
Running the tests
Executing the tests still requires unittest; the following options have been tested with the examples provided.
Option 1
import unittest
suite = unittest.TestLoader().loadTestsFromTestCase(GreatAssertionTests)
runner = unittest.TextTestRunner(verbosity=2)
runner.run(suite)
Option 2
import unittest

if __name__ == '__main__':
    unittest.main()
Pie Charts and Tables
For a more visual representation of the results when using Databricks or Jupyter notebooks, the results can be output as tables or charts.
import unittest
from great_assertions import GreatAssertionResult, GreatAssertions

class DisplayTest(GreatAssertions):
    def test_pass1(self):
        assert True is True

    def test_fail(self):
        assert "Hello" == "World"

suite = unittest.TestLoader().loadTestsFromTestCase(DisplayTest)
test_runner = unittest.runner.TextTestRunner(resultclass=GreatAssertionResult)
result = test_runner.run(suite)

result.to_barh()  # Also available: result.to_pie()
result.to_results_table()
result.to_full_results_table()
Running with XML-Runner
To run with xml-runner, there is no difference to how it is normally used; however, you will not be able to use methods such as to_results_table, as these rely on a different resultclass.
import unittest
import xmlrunner

suite = unittest.TestLoader().loadTestsFromTestCase(DisplayTest)
test_runner = xmlrunner.XMLTestRunner(output="test-results")
test_runner.run(suite)
Notes
If you get an Arrow-related warning when running in Databricks, it is because a toPandas() conversion is used for many of the assertions. The plan is to replace the pandas conversion with pure PySpark code; if this is an issue for you, please raise an issue so the work can be prioritised. For now, it is advisable to make sure the datasets are not too big, as a large conversion can cause the driver to crash; one option is to cap the rows, as sketched below.
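A minimal sketch of that precaution, assuming a hypothetical parquet path and an arbitrary row threshold (both are illustrative, not part of the library):

from great_assertions import GreatAssertions
from pyspark.sql import SparkSession

class BigDataTests(GreatAssertions):
    def setUp(self):
        self.spark = SparkSession.builder.getOrCreate()

    def test_amount_range(self):
        df = self.spark.read.parquet("/data/orders")  # hypothetical path
        # Cap the rows before asserting: many assertions convert the
        # DataFrame via toPandas(), and a very large result can crash the driver.
        sample = df.limit(100_000)  # assumed threshold; tune to driver memory
        self.expect_column_values_to_be_between(sample, "amount", 0, 100)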
Development
To create a development environment, create and activate a virtualenv, then make a development installation:

virtualenv ve
source ve/bin/activate
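Assuming a standard setuptools layout (an assumption; the exact command may differ), the development installation is typically an editable install:

(ve) pip install -e .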
To run tests, just use pytest
(ve) pytest