Testframework for PySpark DataFrames
pyspark-testframework
⏳ Work in progress
The goal of the pyspark-testframework is to provide a simple way to create tests for PySpark DataFrames. The test results are returned in DataFrame format as well.
Example
Input DataFrame:
primary_key | email |
---|---|
1 | info@woonstadrotterdam.nl |
2 | infowoonstadrotterdam.nl |
3 | @woonstadrotterdam.nl |
4 | dev@woonstadrotterdam.nl |
5 | Null |
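from testframework.tests import RegexTest

# Pattern that a valid email address must match
email_regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

# Define the test once; it can then be applied to a column
mail_tester = RegexTest(
    name="ValidEmail",
    pattern=email_regex
)

# df is the input DataFrame shown above; nulls count as failures
test_result = mail_tester.test(
    df=df,
    col="email",
    nullable=False
)

test_result.show()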
primary_key | email__ValidEmail |
---|---|
1 | True |
2 | False |
3 | False |
4 | True |
5 | False |
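To see why rows 2, 3 and 5 fail, the same pattern can be checked in plain Python with the standard `re` module. This is only a sketch of the per-row logic, not the testframework's implementation and not Spark code: the helper `is_valid_email` is hypothetical, and the behaviour for `nullable=True` (a null passes) is an assumption, since the example above only shows `nullable=False`.

```python
import re
from typing import Optional

# Same pattern as in the example above.
EMAIL_REGEX = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def is_valid_email(value: Optional[str], nullable: bool = False) -> bool:
    """Hypothetical helper mimicking the per-row outcome of the result table."""
    if value is None:
        # Assumption: a null only passes when nullable=True.
        return nullable
    return EMAIL_REGEX.match(value) is not None


# The rows of the input DataFrame, with Null represented as None.
rows = [
    (1, "info@woonstadrotterdam.nl"),
    (2, "infowoonstadrotterdam.nl"),   # no "@"
    (3, "@woonstadrotterdam.nl"),      # empty local part
    (4, "dev@woonstadrotterdam.nl"),
    (5, None),                         # null, rejected because nullable=False
]

results = {pk: is_valid_email(email) for pk, email in rows}
for pk, ok in results.items():
    print(pk, ok)
```

Row 2 lacks an `@`, row 3 has nothing before the `@`, and row 5 is null while nulls are not allowed, which matches the `False` values in the result table.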