Data Quality powered by AI
Weiser
Data Quality Framework
Introduction
Weiser is a data quality framework designed to help you ensure the integrity and accuracy of your data. It provides a set of tools and checks to validate your data and detect anomalies. It also includes a dashboard to visualize the results of the checks.
Installation
To install Weiser, use the following command:
pip install weiser-ai
Usage
Run example checks
Connections are defined in the datasources section of the config file; see examples/example.yaml.
Run checks in verbose mode:
weiser run examples/example.yaml -v
Compile checks only in verbose mode:
weiser compile examples/example.yaml -v
Run dashboard
cd weiser-ui
pip install -r requirements.txt
streamlit run app.py
Configuration
Simple count check definition
- name: test row_count
  dataset: orders
  type: row_count
  condition: gt
  threshold: 0
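To make the condition and threshold fields concrete, here is a minimal Python sketch of how a measured value could be compared against a threshold. It is an illustration only, not Weiser's implementation: the function `passes` is hypothetical, only `gt` and `between` appear in this README, the other condition names are assumed, and treating `between` as inclusive is also an assumption.

```python
import operator

# Only "gt" and "between" appear in this README; the other names are assumed.
CONDITIONS = {
    "gt": operator.gt,
    "ge": operator.ge,  # assumed name
    "lt": operator.lt,  # assumed name
    "le": operator.le,  # assumed name
}


def passes(value, condition, threshold):
    """Return True when the measured value satisfies the check (hypothetical helper)."""
    if condition == "between":
        low, high = threshold
        return low <= value <= high  # inclusive bounds are an assumption
    return CONDITIONS[condition](value, threshold)


check = {"name": "test row_count", "type": "row_count", "condition": "gt", "threshold": 0}
print(passes(42, check["condition"], check["threshold"]))  # True: 42 > 0
print(passes(0, check["condition"], check["threshold"]))   # False: 0 is not > 0
```

With `condition: gt` and `threshold: 0`, the check above passes for any non-empty table and fails for an empty one.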
Custom SQL definition
- name: test numeric
  dataset: orders
  type: numeric
  measure: sum(budgeted_amount::numeric::float)
  condition: gt
  threshold: 0
Target multiple datasets with the same check definition
- name: test row_count
  dataset: [orders, vendors]
  type: row_count
  condition: gt
  threshold: 0
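One way to read a list-valued dataset is as shorthand for one check per dataset. The sketch below shows that expansion in Python; `expand_datasets` is a hypothetical helper written for this illustration, and how Weiser expands the list internally is not described in this README.

```python
def expand_datasets(check):
    """Yield one single-dataset copy of the check per listed dataset (hypothetical helper)."""
    datasets = check["dataset"]
    if isinstance(datasets, str):
        datasets = [datasets]  # a single name behaves like a one-item list
    for dataset in datasets:
        yield {**check, "dataset": dataset}


check = {
    "name": "test row_count",
    "dataset": ["orders", "vendors"],
    "type": "row_count",
    "condition": "gt",
    "threshold": 0,
}
expanded = list(expand_datasets(check))
for single in expanded:
    print(single["dataset"], single["type"], single["condition"], single["threshold"])
```

Each expanded entry keeps the same type, condition, and threshold, so the two datasets are judged by identical rules.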
Check individual group by values in a check
- name: test row_count groupby
  dataset: vendors
  type: row_count
  dimensions:
    - tenant_id
  condition: gt
  threshold: 0
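The effect of `dimensions` can be pictured as a grouped count where the condition is applied to each group separately. The following Python sketch uses made-up rows in place of the vendors table; `row_count_by` is a hypothetical helper, and the real check presumably runs as a SQL query against the datasource rather than in Python.

```python
from collections import Counter


def row_count_by(rows, dimensions):
    """Count rows per distinct combination of the dimension columns."""
    return Counter(tuple(row[d] for d in dimensions) for row in rows)


# Hypothetical rows standing in for the vendors table.
vendors = [
    {"vendor_id": 1, "tenant_id": "a"},
    {"vendor_id": 2, "tenant_id": "a"},
    {"vendor_id": 3, "tenant_id": "b"},
]

counts = row_count_by(vendors, ["tenant_id"])
# Apply condition "gt" with threshold 0 to each group separately.
results = {group: count > 0 for group, count in counts.items()}
for group, passed in results.items():
    print(group, counts[group], "pass" if passed else "fail")
```

In this sketch a tenant with no rows never appears in the grouped output, so a per-group count can only flag groups that exist in the data.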
Time aggregation check with granularity
- name: test numeric gt sum yearly
  dataset: orders
  type: sum
  measure: budgeted_amount::numeric::float
  condition: gt
  threshold: 0
  time_dimension:
    name: _updated_at
    granularity: year
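With a time dimension and a yearly granularity, the measure is aggregated once per year and each yearly total is compared with the threshold. A minimal Python sketch of that bucketing, using made-up order rows, might look like this; `sum_by_year` is a hypothetical helper, and the sketch does not reflect how Weiser builds the query.

```python
from collections import defaultdict
from datetime import datetime


def sum_by_year(rows, measure, time_dimension):
    """Sum the measure column per calendar year of the time dimension column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[time_dimension].year] += row[measure]
    return dict(totals)


# Hypothetical rows standing in for the orders table.
orders = [
    {"budgeted_amount": 100.0, "_updated_at": datetime(2023, 3, 1)},
    {"budgeted_amount": 50.0, "_updated_at": datetime(2023, 11, 15)},
    {"budgeted_amount": 75.0, "_updated_at": datetime(2024, 1, 2)},
]

totals = sum_by_year(orders, "budgeted_amount", "_updated_at")
print(totals)  # {2023: 150.0, 2024: 75.0}

# Condition "gt" with threshold 0, evaluated once per year.
print({year: total > 0 for year, total in totals.items()})
```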
Custom SQL expression for dataset and filter usage
- name: test numeric completed
  dataset: >
    SELECT * FROM orders o LEFT JOIN orders_status os ON o.order_id = os.order_id
  type: numeric
  measure: sum(budgeted_amount::numeric::float)
  condition: gt
  threshold: 0
  filter: status = 'FULFILLED'
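To show how the dataset, measure, and filter fields could fit together, the sketch below renders a check into a single SQL string. This is purely illustrative: `render_sql` is a hypothetical helper, the alias `t` and the column name `value` are assumptions, and the SQL that `weiser compile` actually produces may look quite different.

```python
def render_sql(check):
    """Render a check dict into one SQL query string (illustrative only)."""
    dataset = check["dataset"].strip()
    # Wrap a SELECT statement as a subquery; use a plain table name directly.
    if dataset.upper().startswith("SELECT"):
        source = f"({dataset}) AS t"
    else:
        source = dataset
    sql = f"SELECT {check['measure']} AS value FROM {source}"
    if check.get("filter"):
        sql += f" WHERE {check['filter']}"
    return sql


check = {
    "name": "test numeric completed",
    "dataset": "SELECT * FROM orders o LEFT JOIN orders_status os ON o.order_id = os.order_id\n",
    "type": "numeric",
    "measure": "sum(budgeted_amount::numeric::float)",
    "filter": "status = 'FULFILLED'",
}
query = render_sql(check)
print(query)
```

The point of the sketch is that the filter applies to the joined result, which is why `status` from `orders_status` can be referenced in it.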
Anomaly detection check
- name: test anomaly
  # anomaly test should always target metrics metadata dataset
  dataset: metrics
  type: anomaly
  # References Orders row count.
  check_id: c5cee10898e30edd1c0dde3f24966b4c47890fcf247e5b630c2c156f7ac7ba22
  condition: between
  # long tails of normal distribution for Z-score.
  threshold: [-3.5, 3.5]
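The threshold pair bounds a Z-score, so a value is flagged when it sits too many standard deviations from the historical values of the referenced check. The Python sketch below uses the plain mean and sample standard deviation form of the Z-score on made-up history; this README does not say whether Weiser uses this form or a robust variant, so `z_score` and `is_anomaly` are hypothetical helpers and the formula is an assumption.

```python
from statistics import mean, stdev


def z_score(history, latest):
    """Standard Z-score of the latest value against past values (assumed formula)."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 points
    if sigma == 0:
        if latest == mu:
            return 0.0
        return float("inf") if latest > mu else float("-inf")
    return (latest - mu) / sigma


def is_anomaly(history, latest, threshold=(-3.5, 3.5)):
    """Flag the latest value when its Z-score falls outside the threshold range."""
    low, high = threshold
    return not (low <= z_score(history, latest) <= high)


# Hypothetical past row counts recorded for the Orders row count check.
history = [1000, 1020, 980, 1010, 990]
print(is_anomaly(history, 1005))  # False: close to the historical mean
print(is_anomaly(history, 2000))  # True: far outside the usual range
```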
Contributing
We welcome contributions!
License
This project is licensed under the Apache 2.0 License. See the LICENSE file for more details.
Download files
File details
Details for the file weiser_ai-0.1.6.tar.gz.
File metadata
- Download URL: weiser_ai-0.1.6.tar.gz
- Upload date:
- Size: 17.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: pdm/2.12.4 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 539e054bf3d1c9b0f149228799b288f0d3100b53bfb34d17dadaf5be9faa62d1
MD5 | 03642d96a32b15c5db4a6293154ec872
BLAKE2b-256 | c6e5dc7f52192865910f4b98ae040f218bd37f90a619e037b553a8c0c00b0b94
File details
Details for the file weiser_ai-0.1.6-py3-none-any.whl.
File metadata
- Download URL: weiser_ai-0.1.6-py3-none-any.whl
- Upload date:
- Size: 22.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: pdm/2.12.4 CPython/3.10.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | ac7a1d8d11d425adb4e5478699086de934fe97dac548c64c94ef6a4c9e5b0947
MD5 | b44e6c2edfb1ddf84e0d13e1e36eda3c
BLAKE2b-256 | d9a13d29fd10625549f341d1daae1b6eb499476772a94eb7884ce5608f92c179