A project to ensure data quality using Python

PyQualitas

This project aims to develop a Python library that ensures data quality. It is inspired by deequ and dataflare, which also target data quality.

Requirements:

  1. PySpark - Version 3.3.0
  2. Pandas - Version 1.5.0
  3. Jinja2 - Version 3.1.2
  4. Slack-SDK - Version 3.19.3
  5. PyMSTeams - Version 0.2.2
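
As a quick way to compare an environment against the list above, here is a minimal sketch using the standard library's importlib.metadata. The distribution names are assumed PyPI names, and the function check_requirements is a hypothetical helper, not part of pyQualitas. It also treats the listed versions as exact pins, which is an assumption; the project may work with other releases.

```python
from importlib import metadata

# Versions as listed under Requirements. The distribution names are assumed
# PyPI names and may need adjusting for your environment.
REQUIRED = {
    "pyspark": "3.3.0",
    "pandas": "1.5.0",
    "Jinja2": "3.1.2",
    "slack-sdk": "3.19.3",
    "pymsteams": "0.2.2",
}


def check_requirements(required):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # distribution is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_requirements(REQUIRED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at exactly the listed version.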

Installation:

The package can be installed as follows:

"pip install pyQualitas"

The test version of this package can be installed as follows:

"pip install -i https://test.pypi.org/simple/ pyQualitas"

Use Cases:

This library was created to help QA engineers ensure data quality. Given the volume of data and the frequency of releases in the industry, the Quality Assurance team carries an enormous responsibility to verify and sign off on the quality of the data an application generates.

This is very hard to achieve with manual testing. Scheduling automated validation helps meet timelines and ensures high data quality with less effort.

The library includes various tests that come in handy during regression testing. Because the project is implemented in Python, the learning curve is shorter than for comparable libraries written in Scala.
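
To make the idea of automated validation concrete, here is a small sketch of two common data quality checks, completeness and uniqueness. It does not use the pyQualitas API, which is described in the wiki. All function names here (completeness, is_unique, run_checks) are hypothetical, and rows are plain dictionaries rather than the PySpark or pandas structures listed under Requirements, so the example has no dependencies.

```python
def completeness(rows, column):
    """Fraction of rows whose value in `column` is not None."""
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


def is_unique(rows, column):
    """True when no non-null value in `column` appears more than once."""
    values = [row[column] for row in rows if row.get(column) is not None]
    return len(values) == len(set(values))


def run_checks(rows, checks):
    """Run (name, predicate) pairs against rows and return {name: passed}."""
    return {name: bool(predicate(rows)) for name, predicate in checks}


# Illustrative data: a duplicated id and a missing email.
sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]

checks = [
    ("id is complete", lambda rows: completeness(rows, "id") == 1.0),
    ("id is unique", lambda rows: is_unique(rows, "id")),
    ("email at least 90% complete", lambda rows: completeness(rows, "email") >= 0.9),
]

report = run_checks(sample, checks)
for name, passed in report.items():
    print(f"{'PASS' if passed else 'FAIL'}: {name}")
```

A scheduled job could run a list of checks like this after each release and alert the team when any entry in the report is False.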

The documentation can be found at the following link:

https://github.com/IamVenkatesh/pyQualitas/wiki
