Automated EDA, ML readiness scoring, and data quality checks — locally in your browser.

Athena — ML Diagnostics Platform

Upload a CSV. Get a full diagnostic report. Download a preprocessing script.

Athena is a machine learning readiness tool that analyzes tabular datasets and surfaces what matters before you start modeling — leakage risks, class imbalance, outliers, feature redundancy, NLP readiness, and more.


How It Works

  1. Upload a CSV — up to 200k rows, labeled or unlabeled
  2. Select Supervised or Unsupervised mode. In Supervised mode, specify the target column
  3. Choose an analysis profile: Standard, Finance, Healthcare, or NLP
  4. Athena runs a full diagnostic pass and returns a structured report
  5. Explore results across five tabs: Overview, Features, Quality, ML Diagnostics, and NLP
  6. Download a ready-to-run Python preprocessing script tailored to your dataset, or export the full report as HTML
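
Steps 1 and 2 amount to a small validation pass before any analysis runs. The sketch below is a minimal illustration of that idea, not Athena's actual code: the function name `load_csv`, the mode strings, and the error messages are all assumptions, and it uses only the standard library so it runs anywhere (the project's backend lists pandas).

```python
import csv
import io

MAX_ROWS = 200_000  # row cap stated in step 1


def load_csv(text, mode, target=None):
    """Parse CSV text and check it against the upload rules (hypothetical helper)."""
    if mode not in ("supervised", "unsupervised"):
        raise ValueError(f"unknown mode: {mode!r}")

    reader = csv.DictReader(io.StringIO(text))
    columns = list(reader.fieldnames or [])

    # Supervised mode needs a target column that exists in the header.
    if mode == "supervised" and target not in columns:
        raise ValueError(f"target column {target!r} not found in {columns}")

    rows = []
    for row in reader:
        if len(rows) >= MAX_ROWS:
            raise ValueError(f"dataset exceeds {MAX_ROWS} rows")
        rows.append(row)
    return columns, rows


sample = "age,income,churned\n34,52000,0\n51,87000,1\n"
columns, rows = load_csv(sample, mode="supervised", target="churned")
print(columns, len(rows))
```

Rejecting a missing target column up front keeps the later diagnostics from failing halfway through a run.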

Features

  • ML Readiness Score — composite score across data health, trainability, and leakage risk
  • Leakage Detection — flags identifier columns and suspiciously correlated features
  • Baseline Probe — 3-fold cross-validated LightGBM baseline with learning curves
  • Outlier and Skew Analysis — per-feature skewness, kurtosis, log-transform recommendations
  • NLP Readiness — detects free-text columns, analyzes vocabulary, and recommends embeddings
  • Drift Comparison — upload train and test splits to get per-column distribution drift scores
  • Feature Redundancy — correlation matrix analysis for highly redundant feature pairs
  • Analysis Profiles — Standard, Finance, Healthcare, NLP
  • Preprocessing Script — one-click export of a production-ready Python preprocessing script
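
Several of the checks listed above (identifier flags, suspicious target correlation, redundant feature pairs, skew-based log-transform hints) reduce to simple column statistics. The following is a rough sketch of how such checks might look, not Athena's implementation: the function names, the finding labels, and every threshold are illustrative assumptions, and the math is written in plain Python rather than pandas so the example is self-contained.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: correlation undefined, treated as 0
    return cov / (sx * sy)


def skewness(xs):
    """Population skewness (third standardized moment)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def diagnose(table, target, leak_corr=0.95, redundant_corr=0.9, skew_limit=1.0):
    """Return (check, column(s), value) findings. Thresholds are assumptions."""
    findings = []
    y = table[target]
    usable = []
    for col, xs in table.items():
        if col == target:
            continue
        # Identifier heuristic: every value is a distinct whole number.
        if len(set(xs)) == len(xs) and all(float(x).is_integer() for x in xs):
            findings.append(("identifier", col, None))
            continue
        usable.append(col)
        r = pearson(xs, y)
        if abs(r) >= leak_corr:
            findings.append(("leakage", col, round(r, 3)))
        s = skewness(xs)
        # A log transform only applies to strictly positive, right-skewed data.
        if s > skew_limit and min(xs) > 0:
            findings.append(("log_transform", col, round(s, 3)))
    for a, b in combinations(usable, 2):
        r = pearson(table[a], table[b])
        if abs(r) >= redundant_corr:
            findings.append(("redundant", (a, b), round(r, 3)))
    return findings


table = {
    "row_id": [1, 2, 3, 4, 5, 6],
    "spend": [1.0, 2.0, 2.5, 3.0, 4.0, 40.0],
    "spend_eur": [0.9, 1.8, 2.25, 2.7, 3.6, 36.0],
    "refunded": [0, 0, 0, 1, 1, 1],
    "churned": [0, 0, 0, 1, 1, 1],
}
findings = diagnose(table, target="churned")
for finding in findings:
    print(finding)
```

The identifier heuristic is deliberately crude: a genuine integer feature with all-distinct values would also be flagged, so a real tool would likely combine it with column names or monotonicity checks.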

Stack

  • Frontend — React, TypeScript, Vite, Recharts
  • Backend — FastAPI, Python, LightGBM, scikit-learn, pandas

Download files

Source Distribution

athena_eda-0.1.1.tar.gz (212.5 kB)

Uploaded Source

Built Distribution

athena_eda-0.1.1-py3-none-any.whl (215.6 kB)

Uploaded Python 3

File details

Details for the file athena_eda-0.1.1.tar.gz.

File metadata

  • Download URL: athena_eda-0.1.1.tar.gz
  • Upload date:
  • Size: 212.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.1.1.tar.gz
Algorithm Hash digest
SHA256 81a1e0c2ecffb89674e6b722e102d1e304e00a9b55a5ff2470f9ce6c482118a2
MD5 ee9d5adcf95f35290960649c890b0eb4
BLAKE2b-256 ffad0fab91782f7e6d05c907b37180093b5b7d9da43b4f10b8021d035b122b49

File details

Details for the file athena_eda-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: athena_eda-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 215.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for athena_eda-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f70d3f4831e336a582bd6871e0a80986624c46746acde04629044c3434b8cd3e
MD5 29db4bf5aba709d334cd8243bdde5f59
BLAKE2b-256 853fec307f85c03ddc2255ad1c542d46346908555eb71d9aa4772f0681f0657f
