Testframework for PySpark DataFrames
pyspark-testframework
⏳ Work in progress
The goal of the pyspark-testframework is to provide a simple way to create tests for PySpark DataFrames. The test results are returned in DataFrame format as well.
Example
Input DataFrame:
primary_key | email |
---|---|
1 | info@woonstadrotterdam.nl |
2 | infowoonstadrotterdam.nl |
3 | @woonstadrotterdam.nl |
4 | dev@woonstadrotterdam.nl |
5 | Null |
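For readers who want to reproduce the example, the input DataFrame above could be built roughly as follows. This is a minimal sketch, not part of the package: the names `ROWS`, `SCHEMA`, and `make_input_df` are hypothetical, and it assumes PySpark is installed and a local Spark session can be started.

```python
# Rows mirror the input table above; None stands for the Null email.
ROWS = [
    (1, "info@woonstadrotterdam.nl"),
    (2, "infowoonstadrotterdam.nl"),
    (3, "@woonstadrotterdam.nl"),
    (4, "dev@woonstadrotterdam.nl"),
    (5, None),
]

# DDL-style schema string (hypothetical helper constant, not from the package).
SCHEMA = "primary_key int, email string"


def make_input_df(spark=None):
    """Build the example DataFrame, creating a Spark session if none is given."""
    # Imported lazily so the row data can be inspected without PySpark installed.
    from pyspark.sql import SparkSession

    spark = spark or SparkSession.builder.getOrCreate()
    return spark.createDataFrame(ROWS, schema=SCHEMA)
```

Calling `df = make_input_df()` then yields a DataFrame with the columns `primary_key` and `email`.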
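from testframework.tests import RegexTest

# Pattern for a basic email address: local part, "@", domain, and a TLD of at least two letters
email_regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

mail_tester = RegexTest(
    name="ValidEmail",
    pattern=email_regex
)

# df is the input DataFrame shown above
test_result = mail_tester.test(
    df=df,
    col="email",
    nullable=False
)

test_result.show()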
primary_key | email__ValidEmail |
---|---|
1 | True |
2 | False |
3 | False |
4 | True |
5 | False |
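To see why each row passes or fails, the same check can be mirrored in plain Python. This sketch is not the package's implementation; `is_valid_email` is a hypothetical helper, and it assumes that `nullable=False` means a null value counts as a failed test, as the result for row 5 suggests.

```python
import re

# Same pattern as in the example above.
EMAIL_REGEX = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"


def is_valid_email(value, nullable=False):
    """Return True if value matches EMAIL_REGEX; None passes only if nullable."""
    if value is None:
        return nullable
    return re.search(EMAIL_REGEX, value) is not None


emails = [
    "info@woonstadrotterdam.nl",  # valid
    "infowoonstadrotterdam.nl",   # no "@"
    "@woonstadrotterdam.nl",      # empty local part
    "dev@woonstadrotterdam.nl",   # valid
    None,                         # null, rejected because nullable=False
]

results = [is_valid_email(e) for e in emails]
print(results)  # [True, False, False, True, False]
```

The printed list lines up with the `email__ValidEmail` column in the result table.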