ML readiness scoring for tabular datasets

Project description

datascore

ML readiness scoring for tabular datasets.

Point it at a DataFrame and get a structured report telling you whether your data is ready for ML training — and if not, exactly why.

Install

pip install datascore

Usage

from datascore import score

report = score(df, target="churn")
report.show()

Output

datascore Report

Rows: 7043 | Features: 21 | Target: Churn
Score: 85/100 — READY

WARNINGS

  • High cardinality: customerID has 7043 unique values
  • High cardinality: TotalCharges has 6531 unique values
  • High skew in SeniorCitizen: 1.8332

INFO

  • No constant features detected
  • No infinite values detected
  • Class balance: 73/27

What it checks

Category | Checks
Completeness | Missing values, high missing rate per column
Integrity | Duplicate rows, constant features, infinite values
ML Readiness | Class imbalance, target leakage risk, high cardinality
Distribution | Skew, outliers per column
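
To make a few of these categories concrete, here is a minimal sketch of what such checks could look like. It is not datascore's implementation: the function names, the 0.9 cardinality ratio, and the use of a plain dict of lists in place of a DataFrame are all assumptions chosen for illustration.

```python
from collections import Counter
import math


def present(values):
    """Drop None and NaN entries (hypothetical helper, not part of datascore)."""
    return [v for v in values
            if v is not None and not (isinstance(v, float) and math.isnan(v))]


def class_balance(target_values):
    """Percentage share of each class, most common first."""
    counts = Counter(target_values)
    total = len(target_values)
    return {label: round(100 * n / total) for label, n in counts.most_common()}


def profile(columns, target, cardinality_ratio=0.9):
    """Return findings for each feature column.

    `cardinality_ratio` is an assumed threshold: a column is flagged when
    its unique values make up at least this share of its non-missing values.
    """
    findings = []
    for name, values in columns.items():
        if name == target:
            continue
        kept = present(values)
        missing = len(values) - len(kept)
        if missing:
            findings.append(
                f"Missing values: {name} is {missing / len(values):.0%} missing")
        unique = len(set(kept))
        if unique <= 1:
            findings.append(f"Constant feature: {name}")
        elif unique / len(kept) >= cardinality_ratio:
            findings.append(
                f"High cardinality: {name} has {unique} unique values")
    return findings


# Toy data standing in for a DataFrame (column name -> list of values).
columns = {
    "customer_id": ["a1", "a2", "a3", "a4"],
    "plan": ["basic", "basic", "basic", "basic"],
    "tenure": [1, None, 12, 12],
    "churn": ["no", "no", "no", "yes"],
}
findings = profile(columns, target="churn")
balance = class_balance(columns["churn"])
```

On the toy data, `customer_id` is flagged for high cardinality, `plan` as constant, and `tenure` for missing values, while `balance` reports a 75/25 split.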

Scoring

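The score starts at 100. Each blocker deducts 15 points, and each warning deducts 5. The sample report above lists three warnings and no blockers, so it scores 100 - 3 × 5 = 85.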

Score | Verdict
80-100 | READY
50-79 | NEEDS WORK
0-49 | NOT READY
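
The scoring rule and verdict bands can be sketched in a few lines. This is an illustration of the arithmetic described above, not datascore's source: the function name `readiness_score` is hypothetical, and clamping the score at 0 is an assumption inferred from the 0-49 band.

```python
def readiness_score(n_blockers, n_warnings):
    """Return (score, verdict) from counts of blockers and warnings."""
    raw = 100 - 15 * n_blockers - 5 * n_warnings
    score = max(0, raw)  # assumption: the score never drops below 0
    if score >= 80:
        verdict = "READY"
    elif score >= 50:
        verdict = "NEEDS WORK"
    else:
        verdict = "NOT READY"
    return score, verdict


# The sample report: no blockers, three warnings.
print(readiness_score(0, 3))  # (85, 'READY')
```

Under these rules a single blocker plus four warnings gives 65 (NEEDS WORK), and four blockers alone give 40 (NOT READY).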

Why not Great Expectations or Pandera?

Those tools validate data against rules you define. datascore tells you what the problems are without you having to know what to look for first. Assessment, not validation.

Download files

Source Distribution

datascore-0.1.0.tar.gz (6.4 kB view details)

Uploaded Source

Built Distribution

datascore-0.1.0-py3-none-any.whl (6.6 kB view details)

Uploaded Python 3

File details

Details for the file datascore-0.1.0.tar.gz.

File metadata

  • Download URL: datascore-0.1.0.tar.gz
  • Upload date:
  • Size: 6.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for datascore-0.1.0.tar.gz
Algorithm | Hash digest
SHA256 | 206fc95fce669cafd3469c73f41de0fcd7de5e41cce0e9c9aea503f63a51b71b
MD5 | bc8a1fd08d820870adeb9ed6ac415f13
BLAKE2b-256 | fa896a46e92427de0675aec45cb45853821e7c665479314e49e255a505e70b60

File details

Details for the file datascore-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: datascore-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for datascore-0.1.0-py3-none-any.whl
Algorithm | Hash digest
SHA256 | 784728f10011f988e1bb02c5151864b21624c974f8fc6bb7891b0f199dc9e80a
MD5 | 9e57a2ae02e93ced0dddd7bbe19e142c
BLAKE2b-256 | 585cbe9572900f7c13b7d007c4f4ae961b812837babc20a4b1f36fa613ed4e47
