
Test framework for PySpark DataFrames



pyspark-testframework

Work in progress








pyspark-testframework aims to provide a simple way to write tests for PySpark DataFrames. Test results are returned as DataFrames as well.

Example

Input DataFrame:

primary_key  email
1            info@woonstadrotterdam.nl
2            infowoonstadrotterdam.nl
3            @woonstadrotterdam.nl
4            dev@woonstadrotterdam.nl
5            Null

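
The test code in this example expects a DataFrame named `df` that holds the table above. The following is a minimal sketch of how such a DataFrame could be built locally. The `EXAMPLE_ROWS` name, the `build_example_df` helper, the column types, and the local SparkSession settings are assumptions made for illustration; they are not part of pyspark-testframework.

```python
# Rows of the example input table; None stands in for the Null email.
EXAMPLE_ROWS = [
    (1, "info@woonstadrotterdam.nl"),
    (2, "infowoonstadrotterdam.nl"),
    (3, "@woonstadrotterdam.nl"),
    (4, "dev@woonstadrotterdam.nl"),
    (5, None),
]


def build_example_df():
    """Build the example input DataFrame (hypothetical helper)."""
    # Imported here so the row data can be inspected without a Spark install.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")  # assumption: a local, single-threaded session
        .appName("testframework-example")
        .getOrCreate()
    )
    # Schema given as a DDL string; the column types are assumed from the table.
    return spark.createDataFrame(
        EXAMPLE_ROWS, schema="primary_key int, email string"
    )
```

With PySpark installed, `df = build_example_df()` would provide the `df` used in the test call.
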
from testframework.tests import RegexTest

email_regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"

mail_tester = RegexTest(
    name="ValidEmail",
    pattern=email_regex
)

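# df is the input DataFrame shown above.
# With nullable=False, a Null email counts as a failed test.
test_result = mail_tester.test(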
    df=df,
    col="email",
    nullable=False
)

test_result.show()

primary_key  email__ValidEmail
1            True
2            False
3            False
4            True
5            False
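
To make the result column easier to verify by hand, here is a plain-Python sketch of the per-row logic that the example implies: a value passes when the pattern matches, and a missing value passes only when `nullable` is true. The `passes_regex` function is hypothetical and uses the standard `re` module. It is not how pyspark-testframework is implemented, and Spark's Java regex engine can differ from Python's in edge cases.

```python
import re
from typing import Optional

EMAIL_REGEX = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"


def passes_regex(value: Optional[str], pattern: str, nullable: bool) -> bool:
    """Hypothetical stand-in for the per-row check in the example."""
    if value is None:
        # A missing value passes only if the column is allowed to be null.
        return nullable
    # The pattern is anchored with ^ and $, so search behaves like a full match.
    return re.search(pattern, value) is not None


emails = [
    "info@woonstadrotterdam.nl",
    "infowoonstadrotterdam.nl",   # no "@"
    "@woonstadrotterdam.nl",      # empty local part
    "dev@woonstadrotterdam.nl",
    None,                         # the Null row
]

results = [passes_regex(e, EMAIL_REGEX, nullable=False) for e in emails]
print(results)  # [True, False, False, True, False]
```

The printed list lines up with the `email__ValidEmail` column for primary keys 1 through 5.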


