
Data quality inspection tool. Identify issues before your CTO detects them!

Project description

data_watchtower

A data monitoring and validation tool.

Find problems before your CTO does.

Installation

pip install data-watchtower

Data loaders

A data loader loads data into memory so that validators can check it.
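To make the loader idea concrete, here is a minimal sketch using only the standard library. `SqliteQueryLoader` is a hypothetical class written for illustration; it is not the package's `DatabaseLoader`, whose internals may differ.

```python
import sqlite3


class SqliteQueryLoader:
    """Hypothetical loader: runs a query and keeps the rows in memory."""

    def __init__(self, query, connection):
        self.query = query
        self.connection = connection  # an open sqlite3.Connection
        self.rows = None

    def load(self):
        # Fetch every row and convert it to a dict keyed by column name.
        cursor = self.connection.execute(self.query)
        columns = [col[0] for col in cursor.description]
        self.rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
        return self.rows


# Small in-memory table so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE score (name TEXT, date TEXT, value REAL)")
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?)",
    [("alice", "2024-04-01", 90.0), ("bob", "2024-04-01", None)],
)

loader = SqliteQueryLoader("SELECT * FROM score ORDER BY name", conn)
rows = loader.load()
print(len(rows))
```

Once `load()` has run, the rows live in plain Python structures, so a validator does not need to know where they came from.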

Validators

A validator checks whether the data loaded by a data loader meets expectations.
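As a rough illustration of what a validator does, the function below checks a row count against optional bounds. `expect_row_count_between` is a hypothetical helper, and the `success` key in its result is an assumption modeled on the `result['success']` lookup in this README's example.

```python
def expect_row_count_between(rows, min_value=None, max_value=None):
    """Hypothetical validator: passes when len(rows) lies within the bounds.

    A bound of None means that side is unbounded.
    """
    count = len(rows)
    above_min = min_value is None or count >= min_value
    below_max = max_value is None or count <= max_value
    return {"success": above_min and below_max, "observed": count}


loaded_rows = [{"name": f"row_{i}"} for i in range(25)]
result = expect_row_count_between(loaded_rows, min_value=20, max_value=None)
print(result["success"])
```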

Built-in validators

  • ExpectColumnValuesToNotBeNull
  • ExpectColumnRecentlyUpdated
  • ExpectColumnStdToBeBetween
  • ExpectColumnMeanToBeBetween
  • ExpectColumnNullRatioToBeBetween
  • ExpectRowCountToBeBetween
  • ExpectColumnDistinctValuesToContainSet
  • ExpectColumnDistinctValuesToEqualSet
  • ExpectColumnDistinctValuesToBeInSet
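The class names listed above suggest what each check computes. The sketch below shows one plausible reading of two of them in plain Python. The helper names are hypothetical, and the exact semantics (for example, whether nulls are ignored when comparing distinct values) are assumptions rather than documented behavior of the package.

```python
def null_ratio(values):
    """Fraction of values that are None; 0.0 for an empty column (assumption)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def distinct_values_in_set(values, allowed):
    """True when every distinct non-null value belongs to `allowed`.

    Ignoring nulls here is an assumption made for this sketch.
    """
    distinct = {v for v in values if v is not None}
    return distinct <= set(allowed)


column = ["a", None, "b", None]
ratio = null_ratio(column)
in_set = distinct_values_in_set(column, {"a", "b", "c"})
print(ratio, in_set)
```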

Custom loaders

...

Custom macros let a monitoring item reference user-defined variables, such as dates or values from a configuration file.

Where macros apply

  • The Watchtower name
  • Validator parameters
  • Data loader parameters
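Macro expansion can be pictured as simple placeholder substitution. The sketch below uses the standard library's `string.Template`, which happens to share the `${name}` syntax used in this README's example. `expand_macros` is a hypothetical helper, and the package's real expansion logic may work differently; the two macro shapes handled here (a plain value, or a dict with a callable under `impl`) are taken from that example.

```python
from string import Template


def expand_macros(text, macro_map):
    """Hypothetical helper: replace ${name} placeholders using macro_map."""
    values = {}
    for name, spec in macro_map.items():
        if isinstance(spec, dict) and callable(spec.get("impl")):
            # Dynamic macro: call the function to get the current value.
            values[name] = spec["impl"]()
        else:
            # Static macro: use the value as given.
            values[name] = spec
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(text).safe_substitute(values)


macros = {
    "today": {"impl": lambda: "2024-04-01"},  # fixed date keeps the sketch deterministic
    "column": "name",
}
query = expand_macros("SELECT * FROM score where date='${today}'", macros)
print(query)
```

Because the same function can be applied to any string, it covers all three places listed above: the Watchtower name, validator parameters, and data loader parameters.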

Custom macros

...

Supported databases

  • MySQL
  • PostgreSQL
  • SQLite
  • ...

Example

import datetime
from data_watchtower import (DbServices, Watchtower, DatabaseLoader,
                             ExpectRowCountToBeBetween, ExpectColumnValuesToNotBeNull)

dw_test_data_db_url = "sqlite:///test.db"
dw_backend_db_url = "sqlite:///data.db"

# Custom macro map
custom_macro_map = {
    'today': {'impl': lambda: datetime.datetime.today().strftime("%Y-%m-%d")},
    'start_date': '2024-04-01',
    'column': 'name',
}
# Set up the data loader that loads the data to validate
query = "SELECT * FROM score where date='${today}'"
data_loader = DatabaseLoader(query=query, connection=dw_test_data_db_url)
data_loader.load()
# Create the monitoring item
wt = Watchtower(name='score of ${today}', data_loader=data_loader, custom_macro_map=custom_macro_map)
# Add validators
params = ExpectRowCountToBeBetween.Params(min_value=20, max_value=None)
wt.add_validator(ExpectRowCountToBeBetween(params))

params = ExpectColumnValuesToNotBeNull.Params(column='${column}')
wt.add_validator(ExpectColumnValuesToNotBeNull(params))

result = wt.run()
print(result['success'])

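# Save the monitoring configuration and the monitoring result
db_svr = DbServices(dw_backend_db_url)
# Create the tables
db_svr.create_tables()
# Save the monitoring configuration
db_svr.add_watchtower(wt)
# Save the monitoring result
db_svr.save_result(wt, result)
# Recompute the success status of the monitoring item
db_svr.update_watchtower_success_status(wt)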
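The example above expects a `score` table with at least `date` and `name` columns. To try it locally, a fixture along the following lines could be used. This is only a sketch: `create_score_fixture` is a hypothetical helper, and the `value` column and the generated row contents are made up for illustration.

```python
import datetime
import os
import sqlite3
import tempfile


def create_score_fixture(db_path, row_count=25):
    """Hypothetical helper: create a `score` table with rows dated today.

    Returns the number of rows stored for today's date.
    """
    today = datetime.datetime.today().strftime("%Y-%m-%d")
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("DROP TABLE IF EXISTS score")
        conn.execute("CREATE TABLE score (name TEXT, date TEXT, value REAL)")
        conn.executemany(
            "INSERT INTO score (name, date, value) VALUES (?, ?, ?)",
            [(f"student_{i}", today, 60.0 + i) for i in range(row_count)],
        )
        conn.commit()
        cursor = conn.execute("SELECT COUNT(*) FROM score WHERE date = ?", (today,))
        return cursor.fetchone()[0]
    finally:
        conn.close()


# Demonstration in a throwaway directory so no file is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    inserted = create_score_fixture(os.path.join(tmp_dir, "test.db"))
    print(inserted)
```

Calling `create_score_fixture("test.db")` in the working directory before running the example would give the `ExpectRowCountToBeBetween` check (minimum 20 rows) and the not-null check on `name` something to pass against.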

