A data quality (DQ) framework

Project description

dqframework is a Python package that provides a framework for data quality assessment and monitoring.

This package is designed with Polars in mind; support beyond Polars may be added in the future.

Another framework?

This framework works much like other data quality frameworks, but it was designed with the following goals in mind:

  • Observability: The framework should make the data quality of a dataset easy to understand and to act upon.
  • Extensibility: The framework should be easy to extend with new checks and new ways of monitoring data quality.
  • Performance: The framework should handle large datasets and monitor their data quality efficiently.
  • Ease of use: The framework should be easy to use and understand for data engineers, data scientists, and other data professionals.
  • Monitoring and Reporting: The framework should track the data quality of a dataset over time and report on it.

How to use it?

This framework is centered around three main concepts:

  • Pipeline: An object that comprises multiple Checks and runs them on a dataframe.
  • Check: An object that comprises multiple Validators and checks a certain set of properties of a dataframe. It has severity levels and can be used to monitor the data quality of a dataset.
  • Validator: A function that validates a certain property of a dataframe.
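
To make the relationship between the three concepts concrete, here is a minimal sketch in plain Python. It is not the dqframework API: every class, function, and severity name below is hypothetical, and a dict of columns stands in for a Polars dataframe so that the example runs without any dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Stand-in for a dataframe (assumption): column name -> list of values.
Frame = Dict[str, List[object]]
# A validator is a function that validates one property of a frame.
Validator = Callable[[Frame], bool]


def has_column(name: str) -> Validator:
    """Build a validator that passes when the column exists."""
    return lambda frame: name in frame


def no_nulls(name: str) -> Validator:
    """Build a validator that passes when the column holds no None values.

    A missing column passes vacuously, so pair this with has_column.
    """
    return lambda frame: all(value is not None for value in frame.get(name, []))


@dataclass
class Check:
    """Groups validators under a name and a severity level (hypothetical)."""
    name: str
    severity: str  # hypothetical levels: "warning" or "error"
    validators: List[Validator] = field(default_factory=list)

    def run(self, frame: Frame) -> bool:
        # The check passes only if every validator passes.
        return all(validator(frame) for validator in self.validators)


@dataclass
class Pipeline:
    """Runs several checks against the same frame (hypothetical)."""
    checks: List[Check] = field(default_factory=list)

    def run(self, frame: Frame) -> Dict[str, dict]:
        # Map each check name to its outcome and its severity.
        return {
            check.name: {"passed": check.run(frame), "severity": check.severity}
            for check in self.checks
        }


frame: Frame = {"id": [1, 2, 3], "email": ["a@x.io", None, "c@x.io"]}
pipeline = Pipeline(checks=[
    Check("id_is_complete", "error", [has_column("id"), no_nulls("id")]),
    Check("email_is_complete", "warning", [has_column("email"), no_nulls("email")]),
])
report = pipeline.run(frame)
print(report)
```

In this sketch the severity is only carried through to the report; what a real pipeline does with a failed "error" check versus a failed "warning" check is a design decision the package itself defines.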

Installation

To install the package, you can use pip:

pip install dqframework
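
One way to confirm that the installation worked is to ask Python for the installed version of the distribution. The sketch below uses only the standard library; the helper name installed_version is an illustrative choice, not part of dqframework.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string when dqframework is installed, otherwise None.
print(installed_version("dqframework"))
```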

Download files

Source Distribution

dqframework-0.6.1.tar.gz (12.5 kB)

Uploaded Source

Built Distribution

dqframework-0.6.1-py3-none-any.whl (16.9 kB)

Uploaded Python 3

File details

Details for the file dqframework-0.6.1.tar.gz.

File metadata

  • Download URL: dqframework-0.6.1.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.22

File hashes

Hashes for dqframework-0.6.1.tar.gz
Algorithm Hash digest
SHA256 46bf0f148b2996c2ffd8c8b3206d5ce933602fd990cc2508a3a4b46865434e85
MD5 8c381ac918e34f532a9cad8a5823811f
BLAKE2b-256 5287b08edd7db973e527fdf45fe37736c302a1a10144537654d3e7d90160633a
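
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the helper names are illustrative, and it assumes the source distribution was saved to the current directory under its published file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed for dqframework-0.6.1.tar.gz.
EXPECTED = "46bf0f148b2996c2ffd8c8b3206d5ce933602fd990cc2508a3a4b46865434e85"

# Assumed download location; the check is skipped if the file is not there.
sdist = Path("dqframework-0.6.1.tar.gz")
if sdist.exists():
    print(matches_published_hash(sdist, EXPECTED))
```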

File details

Details for the file dqframework-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: dqframework-0.6.1-py3-none-any.whl
  • Upload date:
  • Size: 16.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.22

File hashes

Hashes for dqframework-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 0c4e3865c6339623e5503cf058a84c412cfc7e1f4381e0f83bf9f710ee2cba35
MD5 7ce0695616b1324ba75481a499689010
BLAKE2b-256 f75a9316c4192ded20564b136dbe85e9e946e5169f52d494c69f16d4a416803e
