Skip to main content

Automated EDA, ML readiness scoring, and data quality checks — locally in your browser.

Project description

Athena — ML Diagnostics Platform

Upload a CSV. Get a full diagnostic report. Download a preprocessing script.

Athena is a machine learning readiness tool that analyzes tabular datasets and surfaces what matters before you start modeling — leakage risks, class imbalance, outliers, feature redundancy, NLP readiness, and more.


How It Works

  1. Upload a CSV — up to 200k rows, labeled or unlabeled
  2. Select Supervised or Unsupervised mode. For supervised, specify the target column
  3. Choose an analysis profile: Standard, Finance, Healthcare, or NLP
  4. Athena runs a full diagnostic pass and returns a structured report
  5. Explore results across five tabs: Overview, Features, Quality, ML Diagnostics, and NLP
  6. Download a ready-to-run Python preprocessing script tailored to your dataset, or export the full report as HTML

Features

  • ML Readiness Score — composite score across data health, trainability, and leakage risk
  • Leakage Detection — flags identifier columns and suspiciously correlated features
  • Baseline Probe — 3-fold cross-validated LightGBM baseline with learning curves
  • Outlier and Skew Analysis — per-feature skewness, kurtosis, log-transform recommendations
  • NLP Readiness — detects free-text columns, vocabulary analysis, embedding recommendations
  • Drift Comparison — upload train and test splits to get per-column distribution drift scores
  • Feature Redundancy — correlation matrix analysis for highly redundant feature pairs
  • Analysis Profiles — Standard, Finance, Healthcare, NLP
  • Preprocessing Script — one-click export of a production-ready Python preprocessing script

Stack

  • Frontend — React, TypeScript, Vite, Recharts
  • Backend — FastAPI, Python, LightGBM, scikit-learn, pandas

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

athena_eda-0.2.0.tar.gz (1.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

athena_eda-0.2.0-py3-none-any.whl (1.2 MB view details)

Uploaded Python 3

File details

Details for the file athena_eda-0.2.0.tar.gz.

File metadata

  • Download URL: athena_eda-0.2.0.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.2.0.tar.gz
Algorithm Hash digest
SHA256 e019c3bac8fc569f6fb3fb190c1138a5c75a6bccff3bddb24c3898a60a453bcf
MD5 e08ceee6f69708414e5a297c41cd0760
BLAKE2b-256 0d23ccc48b7d15888ddcf6b97cd08e50f28067d57dd266b0d0b70ceed454388e

See more details on using hashes here.

File details

Details for the file athena_eda-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: athena_eda-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 1.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 229c27432a36e3d103fdf4c5ee49090a8d0be570b05c2f3e86ebdd1ea47f0653
MD5 fb2d02413b015c49e696da9b4d84ed45
BLAKE2b-256 344ee6585e4c82f9a14241cc7533b917cdeb399238ef574c9b58d14f2a3496d7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page