Validate MIP data-model folders using DuckDB.

data-validator

data-validator validates MIP data-model folders using DuckDB.

Standalone package setup

From the repository root:

cd data-validator/mip-data-validator
poetry install

Command

poetry run data-validator validate-data-model /path/to/data_model_folder

Run as a Python module:

poetry run python -m data_validator validate-data-model /path/to/data_model_folder

Set the thread count (optional):

poetry run data-validator validate-data-model /path/to/data_model_folder --threads 8

Collect all errors and emit NDJSON:

poetry run data-validator validate-data-model /path/to/data_model_folder --report-all --format ndjson
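NDJSON output is one JSON object per line, so it can be read back with the standard library. The sketch below assumes nothing about the record schema: the `file` and `message` keys in the sample are hypothetical placeholders, not the validator's documented fields.

```python
import json


def parse_ndjson(text):
    """Parse NDJSON text into a list of Python objects, skipping blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        records.append(json.loads(line))
    return records


# Hypothetical sample; the real record keys may differ.
sample = (
    '{"file": "dataset1.csv", "message": "example"}\n'
    "\n"
    '{"file": "dataset2.csv", "message": "example"}\n'
)
records = parse_ndjson(sample)
```

In practice the text would come from the command's standard output or from a file written with --output.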

Write HTML report to a file:

poetry run data-validator validate-data-model /path/to/data_model_folder --report-all --format html --output report.html

If --format html is used without --output, the report is automatically written under /tmp and the path is printed.
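When the validator is called from a script, the options above can be assembled into an argument list. This is a minimal sketch: `build_command` is a hypothetical helper, and it uses only the flags shown in this README.

```python
def build_command(folder, threads=None, report_all=False, fmt=None, output=None):
    """Build the argv list for the validate-data-model command."""
    cmd = ["poetry", "run", "data-validator", "validate-data-model", str(folder)]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if report_all:
        cmd.append("--report-all")
    if fmt is not None:
        cmd += ["--format", fmt]
    if output is not None:
        cmd += ["--output", str(output)]
    return cmd


cmd = build_command(
    "/path/to/data_model_folder",
    report_all=True,
    fmt="html",
    output="report.html",
)
```

The resulting list can be passed to `subprocess.run(cmd)`. How the tool signals validation failures through its exit code is not documented here, so check the return code against your own runs before relying on it.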

Folder Layout

/path/to/data_model_folder/
  CDEsMetadata.json
  dataset1.csv
  dataset2.csv
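A quick pre-check of the layout can catch a wrong path before validation starts. The sketch below assumes, based only on the layout shown above, that the folder holds a CDEsMetadata.json file next to its CSV files; `describe_folder` is a hypothetical helper and not part of the package.

```python
import tempfile
from pathlib import Path


def describe_folder(folder):
    """Return (has_metadata, csv_file_names) for a data-model folder."""
    folder = Path(folder)
    has_metadata = (folder / "CDEsMetadata.json").is_file()
    csv_files = sorted(p.name for p in folder.glob("*.csv"))
    return has_metadata, csv_files


# Demonstration on a temporary folder that mirrors the layout above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CDEsMetadata.json").write_text("{}")
    (root / "dataset1.csv").write_text("")
    (root / "dataset2.csv").write_text("")
    has_metadata, csv_files = describe_folder(root)
```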

Validation Notes

  • CSV validation queries the files directly with DuckDB and fuses multiple checks into shared aggregate queries, which reduces scan overhead.
  • Dataset codes must be unique across all CSV files in the folder; this is enforced in SQL on normalized codes (trimmed and lowercased).
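The two notes above can be illustrated with plain SQL. The sketch below is not the package's implementation: it uses the standard-library sqlite3 module as a stand-in for DuckDB so that it runs without extra dependencies, loads hypothetical rows into a table instead of querying CSV files, and assumes that uniqueness means a normalized dataset code may appear in only one file. The column names and the age range are illustrative.

```python
import sqlite3

# Hypothetical rows standing in for the contents of two CSV files.
ROWS = [
    ("dataset1.csv", "ADNI", 71),
    ("dataset1.csv", "ADNI", None),
    ("dataset2.csv", " adni ", 140),
    ("dataset2.csv", "PPMI", 64),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_rows (source_file TEXT, dataset TEXT, age INTEGER)")
con.executemany("INSERT INTO data_rows VALUES (?, ?, ?)", ROWS)

# Fused aggregate: several checks are computed in a single pass over the data.
total, missing_age, age_out_of_range = con.execute(
    """
    SELECT
        COUNT(*),
        SUM(CASE WHEN age IS NULL THEN 1 ELSE 0 END),
        SUM(CASE WHEN age < 0 OR age > 120 THEN 1 ELSE 0 END)
    FROM data_rows
    """
).fetchone()

# Uniqueness across files on normalized codes (trim + lower).
duplicates = con.execute(
    """
    SELECT LOWER(TRIM(dataset)), COUNT(DISTINCT source_file)
    FROM data_rows
    GROUP BY LOWER(TRIM(dataset))
    HAVING COUNT(DISTINCT source_file) > 1
    """
).fetchall()
```

Here "ADNI" and " adni " normalize to the same code, so the second query reports it as present in two files, while the first query returns the row count and both violation counts from one scan.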
