Python library designed to compute data quality metrics for Machine Learning
DQM-ML CLI Wrapper
This package is the main CLI entry point for DQM-ML. It consolidates the modular DQM-ML packages behind a single command-line interface.
Installation
# Basic installation (core only)
pip install dqm-ml
# Installation with optional components
pip install "dqm-ml[all]" # Everything
pip install "dqm-ml[job]" # core + job
pip install "dqm-ml[pytorch]" # core + pytorch
pip install "dqm-ml[images]" # core + images
pip install "dqm-ml[notebooks]" # Jupyter support
Quick Start
Process a Dataset
Run a data quality pipeline from a configuration file:
dqm-ml process -p config.yaml
List Available Plugins
Show all registered metrics and data loaders:
dqm-ml list
Check Version
dqm-ml version
Commands
| Command | Description |
|---|---|
| process | Execute a data quality pipeline from a YAML config |
| list | Show all available plugins (metrics, loaders) |
| version | Display version information |
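When the CLI has to be driven from a Python script (for example in a scheduled job), the three commands above can be wrapped with `subprocess`. The following is a minimal sketch, not part of DQM-ML: `build_command` and `run` are hypothetical helper names, only the subcommands and the `-p` flag shown in this document are used, and it is an assumption that `dqm-ml` signals failure with a non-zero exit code.

```python
import shutil
import subprocess
from typing import List, Optional

# Subcommands taken from the Commands table; nothing else is assumed to exist.
KNOWN_COMMANDS = ("process", "list", "version")


def build_command(command: str, config_path: Optional[str] = None) -> List[str]:
    """Return the argv list for a dqm-ml invocation without running it."""
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown dqm-ml command: {command!r}")
    argv = ["dqm-ml", command]
    if command == "process":
        if config_path is None:
            raise ValueError("'process' needs a path to a YAML config")
        # -p is the only flag shown in the Quick Start section.
        argv += ["-p", config_path]
    return argv


def run(command: str, config_path: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run dqm-ml; raises CalledProcessError on a non-zero exit code (assumed)."""
    if shutil.which("dqm-ml") is None:
        raise FileNotFoundError("dqm-ml is not on PATH; install it with pip first")
    return subprocess.run(
        build_command(command, config_path),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Only prints the command line; call run(...) to actually execute it.
    print(build_command("process", "config.yaml"))
```

Keeping command construction separate from execution makes the wrapper easy to test without having DQM-ML installed.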
Configuration
DQM-ML uses YAML configuration files to define:
- Data sources (dataloaders)
- Metrics to compute (metrics_processor)
- Output settings (outputs)
Completeness Example
dataloaders:
  train:
    type: parquet
    path: data/train.parquet
metrics_processor:
  completeness:
    type: completeness
    input_columns: [col_a, col_b]
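To give an intuition for what a completeness score expresses, here is a small standard-library sketch. It assumes the common definition (the share of non-missing values per column) and treats `None` and `NaN` as missing; DQM-ML's own computation and its handling of missing values may differ. The function name `completeness` and the sample rows are illustrative only.

```python
import math
from typing import Dict, Iterable, List, Mapping


def _is_missing(value: object) -> bool:
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def completeness(rows: Iterable[Mapping[str, object]], columns: List[str]) -> Dict[str, float]:
    """Return the fraction of non-missing values for each requested column."""
    counts = {col: 0 for col in columns}
    total = 0
    for row in rows:
        total += 1
        for col in columns:
            if not _is_missing(row.get(col)):
                counts[col] += 1
    if total == 0:
        raise ValueError("cannot score an empty dataset")
    return {col: counts[col] / total for col in columns}


# Hypothetical rows using the column names from the example config.
rows = [
    {"col_a": 1, "col_b": "x"},
    {"col_a": None, "col_b": "y"},
    {"col_a": 3, "col_b": None},
    {"col_a": 4, "col_b": "z"},
]
scores = completeness(rows, ["col_a", "col_b"])
print(scores)  # {'col_a': 0.75, 'col_b': 0.75}
```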
Representativeness Example
dataloaders:
  train:
    type: parquet
    path: data/train.parquet
metrics_processor:
  representativeness:
    type: representativeness
    input_columns: [feature_x, feature_y]
    distribution: "normal"
    metrics: ["chi-square", "kolmogorov-smirnov"]
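The `kolmogorov-smirnov` entry refers to a goodness-of-fit test against the chosen distribution. The sketch below shows the textbook one-sample Kolmogorov-Smirnov statistic against a normal reference, using only the standard library. It illustrates the statistic, not DQM-ML's implementation: the function name `ks_statistic_normal` is hypothetical, and how DQM-ML chooses the reference mean and standard deviation is not stated here, so both are explicit parameters. The chi-square variant is not covered.

```python
from statistics import NormalDist
from typing import Sequence


def ks_statistic_normal(sample: Sequence[float], mu: float = 0.0, sigma: float = 1.0) -> float:
    """One-sample Kolmogorov-Smirnov statistic against Normal(mu, sigma)."""
    if not sample:
        raise ValueError("sample must not be empty")
    reference = NormalDist(mu, sigma)
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        cdf = reference.cdf(x)
        # Compare the reference CDF with the empirical CDF just after
        # and just before the i-th ordered observation.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Evenly spaced standard-normal quantiles match the reference by construction.
n = 100
well_matched = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
# Shifting every value by 2 moves the sample away from the reference.
shifted = [x + 2.0 for x in well_matched]

print(ks_statistic_normal(well_matched))
print(ks_statistic_normal(shifted))
```

A value near 0 means the empirical distribution follows the reference closely; values approaching 1 indicate a strong mismatch.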
Domain Gap Example
dataloaders:
  source:
    type: parquet
    path: data/source.parquet
  target:
    type: parquet
    path: data/target.parquet
metrics_processor:
  domain_gap:
    type: domain_gap
    INPUT:
      embedding_col: "features"
    DELTA:
      metric: "mmd_linear"
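The name `mmd_linear` suggests Maximum Mean Discrepancy with a linear kernel. As a rough illustration, the biased squared-MMD estimate with the kernel k(x, y) = x · y reduces to the squared Euclidean distance between the mean embeddings of the two datasets. The sketch below computes that quantity with the standard library. It is an assumption that DQM-ML computes this form; it may use an unbiased estimator, a square root, or a different normalisation. The function names and sample embeddings are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(embeddings: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    return [sum(vec[k] for vec in embeddings) / len(embeddings) for k in range(dim)]


def mmd_linear_squared(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Biased squared MMD with the linear kernel k(x, y) = x . y.

    For this kernel the estimate equals the squared Euclidean
    distance between the two mean embeddings.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    if len(mu_s) != len(mu_t):
        raise ValueError("source and target embeddings differ in dimension")
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


source = [[0.0, 0.0], [2.0, 0.0]]  # mean (1, 0)
target = [[4.0, 4.0], [4.0, 4.0]]  # mean (4, 4)
print(mmd_linear_squared(source, target))  # 25.0
```

Because only the means are compared, a linear-kernel MMD is cheap to compute but cannot detect differences in spread or shape between the two embedding distributions.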
Visual Features Example
dataloaders:
  images:
    type: parquet
    path: data/images.parquet
metrics_processor:
  visual:
    type: visual_metric
    input_columns: ["image_data"]
    grayscale: true
Multiple Metrics Example
dataloaders:
  train:
    type: parquet
    path: data/train.parquet
metrics_processor:
  completeness:
    type: completeness
    input_columns: [col_a, col_b]
  representativeness:
    type: representativeness
    input_columns: [feature_x]
    distribution: "normal"