Test framework for PySpark DataFrames

Project description

pyspark-testframework

Work in progress

The goal of pyspark-testframework is to provide a simple way to write tests for PySpark DataFrames. Test results are themselves returned as a DataFrame.

Example

Input DataFrame:

primary_key  email
1            info@woonstadrotterdam.nl
2            infowoonstadrotterdam.nl
3            @woonstadrotterdam.nl
4            dev@woonstadrotterdam.nl
5            Null
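
The example below uses a DataFrame named `df` without showing how it is created. A minimal sketch of one way to build it from the table above follows. The names `INPUT_ROWS`, `INPUT_SCHEMA`, and `make_input_df` are hypothetical helpers for illustration and are not part of pyspark-testframework, and the `Null` entry is assumed to be a real null rather than the string "Null".

```python
# Rows copied from the input table above.
# Assumption: "Null" in the table denotes a missing value, so None is used here.
INPUT_ROWS = [
    (1, "info@woonstadrotterdam.nl"),
    (2, "infowoonstadrotterdam.nl"),
    (3, "@woonstadrotterdam.nl"),
    (4, "dev@woonstadrotterdam.nl"),
    (5, None),
]

# DDL-style schema string accepted by SparkSession.createDataFrame.
INPUT_SCHEMA = "primary_key INT, email STRING"


def make_input_df(spark):
    """Build the example DataFrame from an existing SparkSession (hypothetical helper)."""
    return spark.createDataFrame(INPUT_ROWS, schema=INPUT_SCHEMA)


# Usage, assuming PySpark is installed:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   df = make_input_df(spark)
```

Taking the session as an argument keeps the helper free of any Spark setup, so it can reuse whatever session the surrounding test suite already provides.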
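
from testframework.tests import RegexTest

# Basic email pattern: local part, "@", domain, and a TLD of at least two letters.
email_regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

# Define a reusable test; its name becomes part of the result column name.
mail_tester = RegexTest(
    name="ValidEmail",
    pattern=email_regex
)

# Run the test on the "email" column of the input DataFrame `df` shown above.
# With nullable=False, null values fail the test.
test_result = mail_tester.test(
    df=df,
    col="email",
    nullable=False
)

test_result.show()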

Output DataFrame:

primary_key  email__ValidEmail
1            True
2            False
3            False
4            True
5            False
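
To see why each row passes or fails, the same pattern can be checked outside Spark with Python's `re` module. This is only an illustrative sketch, not how pyspark-testframework is implemented: the helper `is_valid` is hypothetical, Spark uses Java regular expressions rather than Python's (the two agree for this pattern), and treating `nullable=True` as "nulls pass" is an assumption. Only the `nullable=False` behaviour is shown in the output above.

```python
import re

EMAIL_REGEX = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

# Same rows as the input table; None stands in for the null email.
ROWS = [
    (1, "info@woonstadrotterdam.nl"),
    (2, "infowoonstadrotterdam.nl"),   # no "@"
    (3, "@woonstadrotterdam.nl"),      # empty local part
    (4, "dev@woonstadrotterdam.nl"),
    (5, None),                         # null value
]


def is_valid(value, pattern=EMAIL_REGEX, nullable=False):
    """Hypothetical stand-in for the regex check applied to a single value."""
    if value is None:
        # Assumption: nulls pass only when nullable=True.
        return nullable
    return re.search(pattern, value) is not None


results = {key: is_valid(email) for key, email in ROWS}
print(results)
# → {1: True, 2: False, 3: False, 4: True, 5: False}
```

Rows 2 and 3 fail because the pattern requires at least one character before the `@`, and row 5 fails because nulls are rejected when `nullable=False`.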
