Automates Exploratory Data Analysis (EDA) for any pandas DataFrame
smarteda
Automate your Exploratory Data Analysis in one line of code.
smarteda is a Python package that eliminates repetitive EDA code. Instead of writing dozens of pandas lines every time you get a new dataset, smarteda analyzes it instantly, gives smart suggestions, and generates a full HTML report.
Installation
pip install smarteda
Quick Start
import pandas as pd
import smarteda
df = pd.read_csv("your_data.csv")
# Run everything at once
smarteda.analyze(df)
# Or generate a full HTML report
smarteda.report(df, output_file="report.html")
Functions
| Function | Description |
|---|---|
| smarteda.basic_eda(df) | Head, tail, sample, shape, size, info, describe |
| smarteda.overview(df) | Shape, memory, data types, constant columns, wrong type detection |
| smarteda.missing(df) | Missing value counts, percentages, heatmap, fill suggestions |
| smarteda.duplicates(df) | Count and show duplicate rows |
| smarteda.duplicates(df, drop=True) | Drop duplicates and return clean DataFrame |
| smarteda.outliers(df) | IQR, Z-score, and Isolation Forest outlier detection |
| smarteda.distributions(df) | Skewness, kurtosis, transformation suggestions, histogram plots |
| smarteda.correlations(df) | Pearson/Spearman/Kendall correlation, multicollinearity warnings |
| smarteda.categorical(df) | Value counts, high cardinality detection, encoding suggestions |
| smarteda.timeseries(df) | Auto datetime detection, trends, seasonality, gap detection |
| smarteda.suggestions(df) | Smart recommendations + ML Readiness Score out of 100 |
| smarteda.clean(df) | Auto clean: returns a new cleaned DataFrame |
| smarteda.clean(df, inplace=True) | Auto clean: modifies original DataFrame directly |
| smarteda.visualize(df) | Auto charts for every column |
| smarteda.analyze(df) | Runs ALL functions above in one call |
| smarteda.report(df) | Generates a full standalone HTML report |
Examples
Basic EDA
smarteda.basic_eda(df) # default 5 rows
smarteda.basic_eda(df, n=10) # show 10 rows
Missing Values
smarteda.missing(df)
# Output:
# Count Percentage
# age 21 10.24
# salary 15 7.32
# Suggestion: age → Fill with mean | salary → Fill with median
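The count and percentage columns are simple ratios: the number of missing entries in a column divided by the number of rows. The sample figures above are consistent with a 205-row dataset (21 / 205 ≈ 10.24%, 15 / 205 ≈ 7.32%), although the row count is inferred here, not stated. The sketch below reproduces that arithmetic in plain Python so it runs without any dependencies. The helper names `is_missing` and `missing_summary` are illustrative and are not part of smarteda's API.

```python
import math


def is_missing(value):
    # Treat None and float NaN as missing, similar to how pandas treats NaN.
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_summary(rows, columns):
    """Return {column: (missing_count, missing_percent)} for a list of dict rows."""
    total = len(rows)
    summary = {}
    for col in columns:
        count = sum(1 for row in rows if is_missing(row.get(col)))
        percent = round(100 * count / total, 2) if total else 0.0
        summary[col] = (count, percent)
    return summary


# Tiny made-up dataset, for illustration only.
rows = [
    {"age": 31, "salary": 52000.0},
    {"age": None, "salary": 48000.0},
    {"age": 45, "salary": float("nan")},
    {"age": None, "salary": 61000.0},
]
summary = missing_summary(rows, ["age", "salary"])
print(summary)
```

On a real DataFrame the same numbers come from `df.isna().sum()` and `df.isna().mean() * 100`.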
Outlier Detection
smarteda.outliers(df)
# Output:
# salary → 8 outliers (3.9%) using IQR
# score → 1 outliers (0.49%) using Z-score
# Multi-dimensional (Isolation Forest) → 39 outliers (19.02%)
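For readers unfamiliar with the first two methods named in this output, here is a minimal plain-Python sketch of the IQR rule and the Z-score rule. It is not smarteda's implementation. The 1.5 × IQR fence and the |z| > 3 cutoff are the conventional defaults and are assumed here, and the function names and sample values are made up for illustration. Isolation Forest is left out because it needs scikit-learn.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # "inclusive" uses linear interpolation over the observed range.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # a constant column has no Z-score outliers
    return [v for v in values if abs(v - mean) / std > threshold]


# Made-up sample with one obvious extreme value.
salaries = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 95]
iqr_flagged = iqr_outliers(salaries)
z_flagged = zscore_outliers(salaries)
share = round(100 * len(iqr_flagged) / len(salaries), 2)
print(iqr_flagged, z_flagged, share)
```

The two rules can disagree on the same column: the Z-score depends on the mean and standard deviation, which the outliers themselves inflate, while the IQR fences depend only on the quartiles.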
Smart Suggestions + ML Score
smarteda.suggestions(df)
# Output:
# ⚠️ Column 'salary' is highly skewed → apply log transformation
# ⚠️ 'height' and 'weight' are 94% correlated → drop one
# ✅ No duplicates found
# 💡 ML Readiness Score: 87 / 100
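The first two warnings rest on two standard statistics: skewness and the Pearson correlation coefficient. The sketch below shows one way such checks could be written in plain Python. The thresholds (|skewness| > 1, |r| > 0.9), the function names, and the sample data are assumptions for illustration. This README does not document smarteda's actual thresholds or how the ML Readiness Score is computed, so the score is not reproduced here.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def suggest(columns, skew_limit=1.0, corr_limit=0.9):
    """Build warning strings for skewed columns and highly correlated pairs."""
    notes = []
    names = list(columns)
    for name in names:
        if abs(skewness(columns[name])) > skew_limit:
            notes.append(f"'{name}' is highly skewed: consider a log transformation")
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > corr_limit:
                notes.append(f"'{a}' and '{b}' are {abs(r):.0%} correlated: consider dropping one")
    return notes


# Made-up numeric columns of equal length.
columns = {
    "salary": [30, 32, 35, 31, 33, 34, 200],
    "height": [150, 155, 160, 165, 170, 175, 180],
    "weight": [52, 50, 61, 58, 70, 66, 80],
}
notes = suggest(columns)
for note in notes:
    print(note)
```

A log transformation only applies to strictly positive values, so a real check would need to confirm that before recommending it.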
Auto Clean
# Safe — keeps original df intact
clean_df = smarteda.clean(df)
# Modifies df directly
smarteda.clean(df, inplace=True)
HTML Report
smarteda.report(df, output_file="my_report.html")
# Opens in browser — no extra tools needed
What smarteda Detects Automatically
- ✅ Missing values with fill strategy per column
- ✅ Duplicate rows
- ✅ Outliers using 3 methods (IQR, Z-score, Isolation Forest)
- ✅ Skewed distributions with transformation suggestions
- ✅ Multicollinearity between features
- ✅ High cardinality categorical columns
- ✅ Wrong data types (numbers stored as strings, dates as objects)
- ✅ Constant columns (useless for ML)
- ✅ Time series trends, seasonality, and gaps
- ✅ ML Readiness Score out of 100
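Three of the checks in this list (constant columns, numbers stored as strings, and high cardinality) are simple enough to sketch in a few lines. The version below is a plain-Python illustration under stated assumptions, not smarteda's code: the function names, the 0.5 unique-ratio threshold, and the sample columns are all hypothetical.

```python
def non_missing(values):
    # Ignore None and empty strings when inspecting a column.
    return [v for v in values if v is not None and v != ""]


def is_constant(values):
    """True when the column holds at most one distinct non-missing value."""
    return len(set(non_missing(values))) <= 1


def numbers_stored_as_strings(values):
    """True when every non-missing value is a string that parses as a float."""
    present = non_missing(values)
    if not present or not all(isinstance(v, str) for v in present):
        return False
    try:
        for v in present:
            float(v)
    except ValueError:
        return False
    return True


def is_high_cardinality(values, max_unique_ratio=0.5):
    """True when the share of distinct values exceeds the (assumed) threshold."""
    present = non_missing(values)
    return bool(present) and len(set(present)) / len(present) > max_unique_ratio


def check_column(values):
    # Report the first issue found; the order of checks is a design choice.
    if is_constant(values):
        return "constant"
    if numbers_stored_as_strings(values):
        return "numeric stored as text"
    if is_high_cardinality(values):
        return "high cardinality"
    return "ok"


# Made-up columns, for illustration only.
columns = {
    "price": ["19.99", "5", "12.50", None, "7", "3.25"],
    "country": ["IN", "IN", "IN", "IN", "IN", "IN"],
    "user_id": ["u1", "u2", "u3", "u4", "u5", "u6"],
    "plan": ["free", "pro", "free", "free", "pro", "free"],
}
report = {name: check_column(values) for name, values in columns.items()}
print(report)
```

The numeric-text check runs before the cardinality check so that a text column full of distinct numbers is reported as a type problem instead of a high-cardinality categorical.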
Dependencies
- pandas
- numpy
- matplotlib
- seaborn
- scipy
- scikit-learn
- jinja2
- missingno
License
MIT License
File details
Details for the file smarteda-0.1.1.tar.gz.
File metadata
- Download URL: smarteda-0.1.1.tar.gz
- Upload date:
- Size: 17.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5b5ed904c6bfc4b45881a646dbdb737ed824f8e4757f56e08ca28653415ac16f |
| MD5 | c89509921e722ea95fd446622f0f85ba |
| BLAKE2b-256 | cc25d0716d86437dedc9726d78e2876dd5e6f053181556a5cc06d4f99d3efb57 |
File details
Details for the file smarteda-0.1.1-py3-none-any.whl.
File metadata
- Download URL: smarteda-0.1.1-py3-none-any.whl
- Upload date:
- Size: 20.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d26e258167399d3c0bd29019c70d564500e596cf87227bbb5cd1895bf7f15e7f |
| MD5 | e0aedb4b58a8446d217f987125f51023 |
| BLAKE2b-256 | 0cfcf94c23808d743f9a82e11cb4cad5f47c10f4fb7f176784fa8ae64034da8c |