Skip to main content

Automated EDA, ML readiness scoring, and data quality checks — locally in your browser.

Project description

Athena — ML Diagnostics Platform

Upload a CSV. Get a full diagnostic report. Download a preprocessing script.

Athena is a machine learning readiness tool that analyzes tabular datasets and surfaces what matters before you start modeling — leakage risks, class imbalance, outliers, feature redundancy, NLP readiness, and more.


How It Works

  1. Upload a CSV — up to 200k rows, labeled or unlabeled
  2. Select Supervised or Unsupervised mode. For supervised, specify the target column
  3. Choose an analysis profile: Standard, Finance, Healthcare, or NLP
  4. Athena runs a full diagnostic pass and returns a structured report
  5. Explore results across five tabs: Overview, Features, Quality, ML Diagnostics, and NLP
  6. Download a ready-to-run Python preprocessing script tailored to your dataset, or export the full report as HTML

Features

  • ML Readiness Score — composite score across data health, trainability, and leakage risk
  • Leakage Detection — flags identifier columns and suspiciously correlated features
  • Baseline Probe — 3-fold cross-validated LightGBM baseline with learning curves
  • Outlier and Skew Analysis — per-feature skewness, kurtosis, log-transform recommendations
  • NLP Readiness — detects free-text columns, vocabulary analysis, embedding recommendations
  • Drift Comparison — upload train and test splits to get per-column distribution drift scores
  • Feature Redundancy — correlation matrix analysis for highly redundant feature pairs
  • Analysis Profiles — Standard, Finance, Healthcare, NLP
  • Preprocessing Script — one-click export of a production-ready Python preprocessing script

Stack

  • Frontend — React, TypeScript, Vite, Recharts
  • Backend — FastAPI, Python, LightGBM, scikit-learn, pandas

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

athena_eda-0.2.1.tar.gz (1.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

athena_eda-0.2.1-py3-none-any.whl (1.2 MB view details)

Uploaded Python 3

File details

Details for the file athena_eda-0.2.1.tar.gz.

File metadata

  • Download URL: athena_eda-0.2.1.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.2.1.tar.gz
Algorithm Hash digest
SHA256 883289176c0836c03a7e0918da79986bd9cb76063f4abae7ceae66968efccb71
MD5 461e15690e0b4929b906831c5648cd06
BLAKE2b-256 17c68864bc6db9a2f94af8248c4713081a892adb28ef3af0391f1af57b56b223

See more details on using hashes here.

File details

Details for the file athena_eda-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: athena_eda-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 1.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 bdc397b8cb591702ed0f81eedc65222757a83ac038be03c50255c300eb5c3841
MD5 31f68aa0d954dfcc8a68e7483e646411
BLAKE2b-256 17d0769ff9e7933b6f4b7d3bc1183cb94b60cc802ab2c0609ea520a8f67aefd8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page