Data quality inspection tool. Identify issues before your CTO detects them!
Project description
data_watchtower
A data monitoring and validation tool.
Find the problems before your CTO does.
Installation
pip install data-watchtower
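Data loaders
A data loader loads data into memory so that validators can check it.
Validators
A validator checks whether the data loaded by a data loader meets expectations.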
Built-in validators
- ExpectColumnValuesToNotBeNull
- ExpectColumnRecentlyUpdated
- ExpectColumnStdToBeBetween
- ExpectColumnMeanToBeBetween
- ExpectColumnNullRatioToBeBetween
- ExpectRowCountToBeBetween
- ExpectColumnDistinctValuesToContainSet
- ExpectColumnDistinctValuesToEqualSet
- ExpectColumnDistinctValuesToBeInSet
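To make the names above more concrete, here is a minimal plain-Python sketch of what three of these checks might compute. It is an illustration only, not data_watchtower's implementation: the helper functions `row_count_between`, `null_ratio_between`, and `distinct_values_in_set`, as well as the list-of-dicts data shape, are assumptions made for this example.

```python
from typing import Any, Dict, Iterable, List, Optional

Row = Dict[str, Any]  # hypothetical in-memory row format, not the library's


def row_count_between(rows: List[Row],
                      min_value: Optional[int] = None,
                      max_value: Optional[int] = None) -> bool:
    """True if the row count lies in [min_value, max_value]; None means unbounded."""
    n = len(rows)
    if min_value is not None and n < min_value:
        return False
    if max_value is not None and n > max_value:
        return False
    return True


def null_ratio_between(rows: List[Row], column: str,
                       min_value: float = 0.0, max_value: float = 1.0) -> bool:
    """True if the share of None values in `column` lies in [min_value, max_value]."""
    if not rows:
        ratio = 0.0  # assumption: an empty dataset counts as having no nulls
    else:
        ratio = sum(1 for row in rows if row.get(column) is None) / len(rows)
    return min_value <= ratio <= max_value


def distinct_values_in_set(rows: List[Row], column: str,
                           value_set: Iterable[Any]) -> bool:
    """True if every non-null distinct value of `column` is a member of value_set."""
    distinct = {row.get(column) for row in rows if row.get(column) is not None}
    return distinct <= set(value_set)


rows = [
    {"name": "alice", "grade": "A"},
    {"name": None, "grade": "B"},
    {"name": "carol", "grade": "A"},
    {"name": "dave", "grade": "C"},
]

print(row_count_between(rows, min_value=3))
print(null_ratio_between(rows, "name", max_value=0.25))
print(distinct_values_in_set(rows, "grade", {"A", "B"}))
```

With this sample data the first two checks pass and the third fails, because the grade `C` is not in the allowed set.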
Custom validators
...
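Macros
Custom macros let a monitoring item reference user-defined variables, such as dates or configuration files.
Scope
- The Watchtower name
- Validator parameters
- Data loader parameters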
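The following sketch shows one way `${name}` placeholders could be expanded. The macro map shape (a plain value, or a dict with an `impl` callable) mirrors the example in this README. How the library itself performs the substitution is not documented here, so the standard-library `string.Template` is used as a stand-in, and the helper names `resolve_macros` and `expand` are hypothetical.

```python
import datetime
from string import Template
from typing import Any, Dict


def resolve_macros(macro_map: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a macro map into plain values, calling any 'impl' callables."""
    resolved = {}
    for name, value in macro_map.items():
        if isinstance(value, dict) and callable(value.get("impl")):
            resolved[name] = value["impl"]()  # dynamic macro, evaluated now
        else:
            resolved[name] = value  # static macro, used as-is
    return resolved


def expand(text: str, macro_map: Dict[str, Any]) -> str:
    """Replace ${name} placeholders; unknown names are left untouched."""
    return Template(text).safe_substitute(resolve_macros(macro_map))


macros = {
    "today": {"impl": lambda: datetime.datetime.today().strftime("%Y-%m-%d")},
    "column": "name",
}

print(expand("SELECT * FROM score where date='${today}'", macros))
print(expand("${column}", macros))
```

Using `safe_substitute` rather than `substitute` is a design choice for this sketch: an undefined macro stays visible in the output instead of raising an exception.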
Custom macros
...
Supported databases
- MySQL
- PostgreSQL
- SQLite
- ...
Example
import datetime
from data_watchtower import (DbServices, Watchtower, DatabaseLoader,
                             ExpectRowCountToBeBetween, ExpectColumnValuesToNotBeNull)
dw_test_data_db_url = "sqlite:///test.db"
dw_backend_db_url = "sqlite:///data.db"
# Custom macro map
custom_macro_map = {
    'today': {'impl': lambda: datetime.datetime.today().strftime("%Y-%m-%d")},
    'start_date': '2024-04-01',
    'column': 'name',
}
# Set up the data loader that loads the data to be validated
query = "SELECT * FROM score where date='${today}'"
data_loader = DatabaseLoader(query=query, connection=dw_test_data_db_url)
data_loader.load()
# Create the monitoring item
wt = Watchtower(name='score of ${today}', data_loader=data_loader, custom_macro_map=custom_macro_map)
# Add validators
params = ExpectRowCountToBeBetween.Params(min_value=20, max_value=None)
wt.add_validator(ExpectRowCountToBeBetween(params))
params = ExpectColumnValuesToNotBeNull.Params(column='${column}')
wt.add_validator(ExpectColumnValuesToNotBeNull(params))
result = wt.run()
print(result['success'])
# Save the monitoring configuration and the monitoring result
db_svr = DbServices(dw_backend_db_url)
# Create the tables
db_svr.create_tables()
# Save the monitoring configuration
db_svr.add_watchtower(wt)
# Save the monitoring result
db_svr.save_result(wt, result)
# Recompute the success status of the monitoring item
db_svr.update_watchtower_success_status(wt)
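The example above reads from a `score` table in `test.db`, but the README does not show how that table is created. The sketch below prepares some test data with the standard-library `sqlite3` module. The schema is an assumption: only the `date` and `name` columns are implied by the example, while the `score` column, the row contents, and the helper name `create_score_table` are hypothetical.

```python
import datetime
import sqlite3


def create_score_table(db_path: str, rows: int = 25) -> str:
    """Create a `score` table with `rows` rows dated today; return today's date."""
    today = datetime.datetime.today().strftime("%Y-%m-%d")
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS score "
                "(date TEXT, name TEXT, score REAL)"
            )
            conn.executemany(
                "INSERT INTO score (date, name, score) VALUES (?, ?, ?)",
                [(today, f"student_{i}", 60.0 + i) for i in range(rows)],
            )
    finally:
        conn.close()
    return today
```

Calling `create_score_table("test.db")` before running the example should give the row-count check (at least 20 rows) and the not-null check on `name` something to validate.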
Download files
Source Distribution
data_watchtower-0.0.5.tar.gz (14.4 kB)
Built Distribution
data_watchtower-0.0.5-py3-none-any.whl (19.5 kB)
File details
Details for the file data_watchtower-0.0.5.tar.gz.
File metadata
- Download URL: data_watchtower-0.0.5.tar.gz
- Upload date:
- Size: 14.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.9.13 Windows/10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 58abce16f1b3d8bfedb14e57dcee6cebb0dc0faae16526c084190042fd2859db
MD5 | ef6fac54f17dd9dd2e62cfaa3d5759eb
BLAKE2b-256 | b8fa917eae7dfa06263c614f6219cd4ac2f202d1938b9e6c07b7ef8035c2aa67
File details
Details for the file data_watchtower-0.0.5-py3-none-any.whl.
File metadata
- Download URL: data_watchtower-0.0.5-py3-none-any.whl
- Upload date:
- Size: 19.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.9.13 Windows/10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1db11a06ab776476d1779e9ebf04e1d2ecce252dd1e7255c3c0d8e0017cde9de
MD5 | e62f1459bfaf31f1e37270f0271467c4
BLAKE2b-256 | 86afa20c7a130abf8b42b149cada25455853839c254872c008301599e91aebd8