Pure Data Framework
Pure Data
Developed by students of Simulator ML (Karpov.Courses)
Pure Data is a tool designed to help organize data quality checks in your projects. You simply define the data you want to test, the list of test metrics and success criteria, run the test, and get a report with the results.
Pure Data includes:
- a set of metrics that you can use to check the accuracy of your data;
- a Report class, which runs a list of metrics and summarizes which of them pass, fail, or raise errors.
How to install
pip install pure-data
Key Functionality
Pure Data provides a range of metrics for monitoring the accuracy and reliability of your data.
You can apply individual metrics directly to your data, or use the Report class to build a checklist of metrics and get a summary of their results.
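The pass / fail / error summary mentioned above can be illustrated with a small plain-Python sketch. This is not Pure Data's implementation: `run_checks`, the check names, and the sample rows are hypothetical stand-ins, and the real Report class works on tables and metric objects rather than on lambdas.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
# Hypothetical checklist entry: (name, check). A check returns True (pass)
# or False (fail); any exception it raises is recorded as an error.
Check = Tuple[str, Callable[[List[Row]], bool]]


def run_checks(rows: List[Row], checks: List[Check]) -> List[Tuple[str, str]]:
    """Run every check and record its status instead of stopping at the first problem."""
    results = []
    for name, check in checks:
        try:
            status = "pass" if check(rows) else "fail"
        except Exception:
            status = "error"
        results.append((name, status))
    return results


rows = [{"column_1": 0}, {"column_1": 5}, {"column_1": 7}, {"column_1": 0}]

checks = [
    ("has_rows", lambda r: len(r) > 0),
    # Share of zeros must not exceed 30%; here it is 2/4 = 0.5, so this fails.
    ("few_zeros", lambda r: sum(x["column_1"] == 0 for x in r) / len(r) <= 0.3),
    # "column_2" does not exist, so this check raises KeyError.
    ("missing_column", lambda r: all(x["column_2"] is not None for x in r)),
]

results = run_checks(rows, checks)
summary = Counter(status for _, status in results)

for name, status in results:
    print(f"{name}: {status}")
```

Catching exceptions per check is the design point here: one broken check is reported as an error while the remaining checks still run.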
Usage
Below is a brief example of how you can use Pure Data to verify your data.
Import the Report class and the metrics module, from which you can pick any metrics you need.
from pure.report import Report
import pure.metrics as m
First, define the tables (a name and the data for each) that you want to work with, and create a checklist of metrics.
Each metric returns a dict with several meta fields. In the checklist, you can specify which fields of the metric result must stay within given limits. In this example, we set limits on the "total" field for the first metric and on the "delta" field for the second.
# data is the dataset to check (a pandas DataFrame when engine='pandas')
tables = {"simple_table": data}
checklist = [
    ("simple_table", m.CountTotal(), {"total": (1, 1e6)}),
    ("simple_table", m.CountZeros("column_1"), {"delta": (0, 0.3)}),
]
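To make the meaning of the limits concrete, here is a minimal plain-Python sketch of comparing a metric result dict against a limits dict of the same shape as in the checklist. It is an illustration under assumptions, not the library's code: `within_limits` is a hypothetical helper, the sample result values are invented, the "count" field is assumed, and treating both bounds as inclusive is also an assumption.

```python
from typing import Dict, Tuple

Limits = Dict[str, Tuple[float, float]]


def within_limits(result: Dict[str, float], limits: Limits) -> Dict[str, bool]:
    """For each limited field, report whether low <= value <= high (inclusive bounds assumed)."""
    return {
        field: low <= result[field] <= high
        for field, (low, high) in limits.items()
    }


# Hypothetical metric outputs; only the "total" and "delta" fields are named in the text.
count_total_result = {"total": 200}
count_zeros_result = {"total": 200, "count": 80, "delta": 0.4}

total_check = within_limits(count_total_result, {"total": (1, 1e6)})
delta_check = within_limits(count_zeros_result, {"delta": (0, 0.3)})

print(total_check)  # {'total': True}
print(delta_check)  # {'delta': False}
```

Only the fields named in the limits dict are checked; any other fields of the result are ignored.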
Then pass the tables and the checklist to Report:
report = Report(tables=tables, checklist=checklist, engine='pandas')
[Figure: example of the resulting report dataframe]
A more detailed example covering the key functionality of the package is available here:
https://github.com/uberkinder/Pure-Data/blob/usage_example/examples/simple_example.ipynb
Source Distribution
File details
Details for the file pure-data-0.1.6.6.tar.gz.
File metadata
- Download URL: pure-data-0.1.6.6.tar.gz
- Upload date:
- Size: 24.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0ed6fbfc55f44b8c953703171cec4430fa7cb8f59f86963c9a468264295713f |
| MD5 | 40774ce14f3167ae0463833e216f5981 |
| BLAKE2b-256 | 33b7a1d4c1a875ed6ff53691ac3a80abe9a48ca31c1c10678c27dbaf90e5f205 |