Profile tabular datasets, manage automatic validation for new datasets, and automatically handle quality issues.

qprofiler

qprofiler is a Python package that builds a data quality profile from your development (train) dataset(s) and saves it as a reference. That reference is then used to create quality check tests and automatic handling cases for production (test) datasets.
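The profile-then-check idea can be sketched in plain Python. This is a conceptual illustration only and not the qprofiler API: `build_profile`, `check_against_profile`, the sample rows, and the JSON reference file are all hypothetical names and choices made for this sketch, which uses only the standard library.

```python
import json
import tempfile
from pathlib import Path


def is_number(value):
    """True for int/float values, excluding bool (which subclasses int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def build_profile(rows):
    """Summarise each column of the train rows: type name, null presence, numeric range."""
    profile = {}
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        entry = {
            "dtype": type(present[0]).__name__ if present else None,
            "has_nulls": len(present) < len(values),
        }
        if present and all(is_number(v) for v in present):
            entry["min"] = min(present)
            entry["max"] = max(present)
        profile[col] = entry
    return profile


def check_against_profile(rows, profile):
    """Return a list of human-readable issues found in new rows."""
    issues = []
    for i, row in enumerate(rows):
        for col, spec in profile.items():
            if col not in row:
                issues.append(f"row {i}: missing column {col!r}")
                continue
            value = row[col]
            if value is None:
                if not spec["has_nulls"]:
                    issues.append(f"row {i}: unexpected null in {col!r}")
                continue
            if spec["dtype"] is not None and type(value).__name__ != spec["dtype"]:
                issues.append(f"row {i}: {col!r} has type {type(value).__name__}")
                continue
            if "min" in spec and not spec["min"] <= value <= spec["max"]:
                issues.append(
                    f"row {i}: {col!r}={value} outside [{spec['min']}, {spec['max']}]"
                )
    return issues


# Hypothetical train (development) data used to build the reference profile.
train_rows = [{"age": 31, "city": "Cairo"}, {"age": 45, "city": "Giza"}]

with tempfile.TemporaryDirectory() as tmp:
    reference_path = Path(tmp) / "reference_profile.json"
    # Save the profile as a reference, then reload it as a later job would.
    reference_path.write_text(json.dumps(build_profile(train_rows)))
    reference = json.loads(reference_path.read_text())

# Hypothetical production (test) data checked against the saved reference.
production_rows = [{"age": 38, "city": "Luxor"}, {"age": 120, "city": None}]
issues = check_against_profile(production_rows, reference)
for issue in issues:
    print(issue)
```

The sketch stores the reference as JSON only to stay within the standard library; the format qprofiler itself uses is not stated here.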

Installation

The source code is currently hosted on GitHub at: dprofiler-github

Binary installers for the latest released version are available on PyPI.

# PyPI
pip install qprofiler

Dependencies

  • Polars (>=0.19.0, <0.20.0)
  • PyYAML (>=6.0.1, <7.0.0)
  • Pathlib (>=1.0, <2.0)
  • ruamel.yaml (>=0.17.32, <0.18.0)
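To see whether an existing environment already satisfies these ranges, a small check along the following lines may help. It is a sketch, not part of qprofiler: the `REQUIRED` table, `parse`, `in_range`, and `report` are hypothetical helpers written for this example, the distribution names are assumed to match the PyPI names of the packages listed above, and the version parsing is deliberately simplified (numeric components only).

```python
import re
from importlib import metadata

# (distribution name, minimum inclusive, maximum exclusive), copied from the list above.
REQUIRED = [
    ("polars", "0.19.0", "0.20.0"),
    ("PyYAML", "6.0.1", "7.0.0"),
    ("pathlib", "1.0", "2.0"),
    ("ruamel.yaml", "0.17.32", "0.18.0"),
]


def parse(version):
    """Turn '0.19.12' into (0, 19, 12), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad to three components so '1.0' compares equal to '1.0.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_range(version, low, high):
    """True when low <= version < high, compared numerically."""
    return parse(low) <= parse(version) < parse(high)


def report():
    """Print the status of each required distribution in the current environment."""
    for name, low, high in REQUIRED:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        status = "ok" if in_range(installed, low, high) else "outside range"
        print(f"{name} {installed}: {status} (wanted >={low}, <{high})")


if __name__ == "__main__":
    report()
```

Installing with pip normally resolves these constraints automatically, so a check like this is mainly useful when debugging an environment that was assembled by hand.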

Usage

Check the notebook, which covers how to use the DataProfiler module to profile datasets and how to use the QTest module to create quality check tests.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Licence

MIT

Download files

Source Distribution

qprofiler-0.2.0.tar.gz (8.4 kB)

Uploaded Source

Built Distribution

qprofiler-0.2.0-py3-none-any.whl (10.3 kB)

Uploaded Python 3
