A Python library designed to provide pipelining tools for the dqm-ml library, which computes data quality metrics for machine learning.

Project description

DQM-ML Job

This package provides the orchestration engine for DQM-ML V2. It handles the lifecycle of data processing, from loading to metric computation and output writing.

Key Components

DatasetPipeline

The main orchestrator that:

  • Loads the configuration.
  • Discovers plugins via entry points.
  • Executes the streaming loop.
  • Manages memory and I/O efficiency.
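
The four responsibilities above can be illustrated with a minimal sketch. This is not the package's actual implementation: the entry-point group name, the `RowCount` metric, and the `run_pipeline`, `discover_plugins`, and `iter_batches` helpers are all hypothetical names chosen for illustration. Only `importlib.metadata.entry_points` is a real standard-library call.

```python
from importlib.metadata import entry_points
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Hypothetical entry-point group name, used only for this illustration.
PLUGIN_GROUP = "dqm_example.metrics"


def discover_plugins(group: str = PLUGIN_GROUP) -> Dict[str, Callable[..., Any]]:
    """Load every object registered under an entry-point group."""
    eps = entry_points()
    # Newer Pythons expose .select(); older ones return a plain dict.
    selected = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}


class RowCount:
    """Toy streaming metric: keeps a running total, never the batches."""

    def __init__(self) -> None:
        self.total = 0

    def update(self, batch: List[dict]) -> None:
        self.total += len(batch)

    def result(self) -> int:
        return self.total


def iter_batches(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield fixed-size batches so only one batch is held in memory."""
    batch: List[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def run_pipeline(config: dict, rows: Iterable[dict]) -> Dict[str, Any]:
    """Read the config, discover plugins, stream batches, collect results."""
    metrics: Dict[str, Any] = {"row_count": RowCount()}
    for name, factory in discover_plugins().items():
        metrics[name] = factory()  # plugins registered via entry points
    for batch in iter_batches(rows, config["batch_size"]):
        for metric in metrics.values():
            metric.update(batch)
    return {name: metric.result() for name, metric in metrics.items()}


# The input is a generator, so rows are consumed lazily, batch by batch.
results = run_pipeline({"batch_size": 2}, ({"x": i} for i in range(5)))
print(results["row_count"])
```

Because each metric only keeps running state, memory use in this sketch depends on the batch size rather than on the dataset size.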

Protocols

  • DataLoader: A factory for creating data selections (e.g., Parquet, CSV loaders).
  • DataSelection: Represents a specific subset of data and provides an iterator over batches.
  • OutputWriter: Persists computed features or metrics to disk.
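
To make the division of roles concrete, here is a sketch of how three such protocols could be expressed with `typing.Protocol`. The method names (`select`, `write`), the `Batch` type, and the in-memory and JSON Lines implementations are assumptions made for illustration; the package's real signatures may differ.

```python
import json
from typing import Any, Dict, Iterator, List, Protocol, runtime_checkable

Batch = List[Dict[str, Any]]  # stand-in batch type for this sketch


@runtime_checkable
class DataSelection(Protocol):
    """A specific subset of data, iterable batch by batch."""

    def __iter__(self) -> Iterator[Batch]: ...


@runtime_checkable
class DataLoader(Protocol):
    """A factory that creates data selections (hypothetical method name)."""

    def select(self, name: str) -> DataSelection: ...


@runtime_checkable
class OutputWriter(Protocol):
    """Persists computed values (hypothetical method name)."""

    def write(self, name: str, values: Dict[str, Any]) -> None: ...


class ListSelection:
    """In-memory selection that slices a list of rows into batches."""

    def __init__(self, rows: Batch, batch_size: int) -> None:
        self.rows = rows
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[Batch]:
        for start in range(0, len(self.rows), self.batch_size):
            yield self.rows[start:start + self.batch_size]


class ListLoader:
    """In-memory loader keyed by selection name."""

    def __init__(self, tables: Dict[str, Batch], batch_size: int = 2) -> None:
        self.tables = tables
        self.batch_size = batch_size

    def select(self, name: str) -> ListSelection:
        return ListSelection(self.tables[name], self.batch_size)


class JsonLinesWriter:
    """Appends one JSON object per call to a file on disk."""

    def __init__(self, path: str) -> None:
        self.path = path

    def write(self, name: str, values: Dict[str, Any]) -> None:
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps({"name": name, **values}) + "\n")


loader = ListLoader({"train": [{"x": 1}, {"x": 2}, {"x": 3}]})
batches = list(loader.select("train"))
```

Structural typing means a class satisfies a protocol simply by providing the expected methods, so third-party loaders or writers in this sketch would not need to inherit from a base class.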

Built-in Loaders

  • parquet: Optimized loading using PyArrow.
  • csv: Flexible loading using Pandas.

Download files

Source Distribution

dqm_ml_job-1.1.6.tar.gz (12.0 kB)

Uploaded Source

Built Distribution

dqm_ml_job-1.1.6-py3-none-any.whl (14.5 kB)

Uploaded Python 3

File details

Details for the file dqm_ml_job-1.1.6.tar.gz.

File metadata

  • Download URL: dqm_ml_job-1.1.6.tar.gz
  • Size: 12.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.28 (uv publish, on Ubuntu 24.04 "noble")

File hashes

Hashes for dqm_ml_job-1.1.6.tar.gz

  • SHA256: b7fc58129f087ed762d2ae12bb31d1cfda6c7a52287d9d8a00b43b7914fe1457
  • MD5: b100653c88d1df45f073d67e2f439ac1
  • BLAKE2b-256: 34aa2ad4821d23be849fb49d27175b2a95b213d1b7c764e039b34ba86ec9ae95

File details

Details for the file dqm_ml_job-1.1.6-py3-none-any.whl.

File metadata

  • Download URL: dqm_ml_job-1.1.6-py3-none-any.whl
  • Size: 14.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.28 (uv publish, on Ubuntu 24.04 "noble")

File hashes

Hashes for dqm_ml_job-1.1.6-py3-none-any.whl

  • SHA256: b29862f544e752967f1766513d992ff554288cbf41306744ec944b077f4ce70e
  • MD5: f449752a01a9ffa51096e789541987cf
  • BLAKE2b-256: 3f8fa8841b9837d7eaf50707c3d74a31d4dbc96036a62011963d7080080cae11
