Skip to main content

Automated EDA, ML readiness scoring, and data quality checks — locally in your browser.

Project description

Athena — ML Diagnostics Platform

Upload a CSV. Get a full diagnostic report. Download a preprocessing script.

Athena is a machine learning readiness tool that analyzes tabular datasets and surfaces what matters before you start modeling — leakage risks, class imbalance, outliers, feature redundancy, NLP readiness, and more.


How It Works

  1. Upload a CSV — up to 200k rows, labeled or unlabeled
  2. Select Supervised or Unsupervised mode. For supervised, specify the target column
  3. Choose an analysis profile: Standard, Finance, Healthcare, or NLP
  4. Athena runs a full diagnostic pass and returns a structured report
  5. Explore results across five tabs: Overview, Features, Quality, ML Diagnostics, and NLP
  6. Download a ready-to-run Python preprocessing script tailored to your dataset, or export the full report as HTML

Features

  • ML Readiness Score — composite score across data health, trainability, and leakage risk
  • Leakage Detection — flags identifier columns and suspiciously correlated features
  • Baseline Probe — 3-fold cross-validated LightGBM baseline with learning curves
  • Outlier and Skew Analysis — per-feature skewness, kurtosis, log-transform recommendations
  • NLP Readiness — detects free-text columns, vocabulary analysis, embedding recommendations
  • Drift Comparison — upload train and test splits to get per-column distribution drift scores
  • Feature Redundancy — correlation matrix analysis for highly redundant feature pairs
  • Analysis Profiles — Standard, Finance, Healthcare, NLP
  • Preprocessing Script — one-click export of a production-ready Python preprocessing script

Stack

  • Frontend — React, TypeScript, Vite, Recharts
  • Backend — FastAPI, Python, LightGBM, scikit-learn, pandas

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

athena_eda-0.1.0.tar.gz (212.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

athena_eda-0.1.0-py3-none-any.whl (215.6 kB view details)

Uploaded Python 3

File details

Details for the file athena_eda-0.1.0.tar.gz.

File metadata

  • Download URL: athena_eda-0.1.0.tar.gz
  • Upload date:
  • Size: 212.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.1.0.tar.gz
Algorithm Hash digest
SHA256 94f0d30c3b060d111fad753d739e654ff0e800d27972db560f09533223fe0028
MD5 10af8274f05001ae92c820d6b0bab6fc
BLAKE2b-256 9aea629699262372c4c3548f47bc28ee7fc4a33e8e1551a58d613a22fd002d85

See more details on using hashes here.

File details

Details for the file athena_eda-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: athena_eda-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 215.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0f5f4d19e5e9e7afcda7c6ebd64b192af388a2dc8b7cd29045dda1aa97c42dd3
MD5 9378673a65cb377bcbbf4370435f1fbd
BLAKE2b-256 843d8a69245c316f41a478df8056a16359696c2dbc044bf84d23a0b54a3b6d94

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page