Create data quality rules and apply them to datasets.

Project description

DQAI (Data Quality Artificial Intelligence)

DQAI is a Python class that uses the OpenAI Chat API to analyze a dataset and generate data quality rules specific to that data.
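
To make "data quality rule" concrete, here is a minimal sketch of the kind of check such a tool might produce and how it could be applied with pandas. The column names, the rule names, and the apply_rules helper are hypothetical illustrations; they are not taken from DQAI's actual output or API.

```python
import pandas as pd

# Hypothetical rules: each name maps to a function that returns a boolean
# Series (True where the row passes). This is NOT DQAI's real output format.
RULES = {
    "age_in_range": lambda df: df["age"].between(0, 120),
    "email_present": lambda df: df["email"].notna(),
}


def apply_rules(df: pd.DataFrame, rules=RULES) -> pd.DataFrame:
    """Return one boolean column per rule, aligned with the rows of df."""
    return pd.DataFrame({name: rule(df) for name, rule in rules.items()})


# Small made-up dataset: the second row violates both rules.
sample = pd.DataFrame(
    {"age": [34, -5, 61], "email": ["a@example.com", None, "c@example.com"]}
)
report = apply_rules(sample)
```

Each column of report marks which rows pass one rule, which is roughly the shape a per-rule results table could take.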

Usage

  1. Install the necessary dependencies.
  2. Set up your OpenAI API key or use the provided default key.
  3. Prepare your dataset in a suitable format (e.g., CSV).
  4. Instantiate the DQAI class.
  5. Call the invoke_from_dataset method, passing the dataset as input.
  6. DQAI generates Python code based on the dataset and executes it.
  7. The generated rules and the results are saved in the current directory as "generated_code.py" and "rulesapplication.csv", respectively.
  8. The generated rules can be obtained by calling the _get_rules_from_file method.
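
For step 2, one common approach is to read the key from an environment variable instead of hard-coding it in a script. The sketch below assumes the variable is named OPENAI_API_KEY, and the load_openai_key helper is hypothetical; it is not part of DQAI.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or empty."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned string could then be passed to the DQAI constructor in place of a literal key.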

Example:

import pandas as pd
from dqai import DQAI

# Read the dataset from a CSV file
path = "path/to/your/dataset.csv"
data = pd.read_csv(path)

# Instantiate DQAI and generate data quality rules
key = "OPEN_AI_KEY"  # placeholder: replace with your OpenAI API key
dqai = DQAI(key)
result = dqai.invoke_from_dataset(data)

# Access the generated rules and results
rules = result["0"]
results_df = result["1"]

Download files

Source Distribution: dq_ai_module-1.1.0.tar.gz (3.0 kB)

Built Distribution: dq_ai_module-1.1.0-py3-none-any.whl (3.1 kB, Python 3)