
Ultra-lightweight micro EDA (exploratory data analysis) tool for small datasets


microeda

microeda is an ultra-lightweight Python library for Exploratory Data Analysis (EDA) on small datasets. It provides quick insights into your data with minimal setup: it detects column types, summarizes distributions, flags missing values and outliers, and explores pairwise relationships.


✨ Features

  • Detect column types: numeric, categorical, boolean, datetime, text, or IDs
  • Summarize numeric columns: mean, std, quartiles, missing values, outliers
  • Summarize categorical columns: top values, unique counts
  • Summarize datetime columns: min, max, missing values
  • Quick text analysis: token counts, most frequent words
  • Missing value patterns and pairwise missing correlations
  • Outlier detection: IQR and Z-score methods
  • Pairwise hints for correlations and associations (Pearson, Cramér's V, mutual information)
  • Command-line interface (CLI) for generating reports in Markdown or HTML
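
To make the two outlier methods named above concrete, here is a minimal sketch of IQR and Z-score flagging written with plain pandas. It is an illustration only and not microeda's own implementation: the function names `iqr_outliers` and `zscore_outliers`, the `k=1.5` fence multiplier, and the thresholds are assumptions chosen for the example.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)  # quantile() skips NaN
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def zscore_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Boolean mask of values whose absolute z-score exceeds threshold (hypothetical helper)."""
    std = series.std(ddof=0)  # population standard deviation
    if pd.isna(std) or std == 0:
        # Constant or empty column: no value can be an outlier.
        return pd.Series(False, index=series.index)
    z = (series - series.mean()) / std
    return z.abs() > threshold


values = pd.Series([10, 12, 11, 13, 12, 11, 95])
print(values[iqr_outliers(values)])
print(values[zscore_outliers(values, threshold=2.0)])
```

One caveat matters for small datasets in particular: with the population standard deviation, no value in a sample of n points can have an absolute z-score above sqrt(n - 1). In the seven-value example the extreme value 95 therefore cannot reach the common threshold of 3, which is why the sketch uses 2.0; the IQR rule does not have this limitation.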

📦 Installation

Install via PyPI:

pip install microeda

Or install from source:

git clone https://github.com/SaptarshiMondal123/microeda.git
cd microeda
pip install .

Usage

Python API

import pandas as pd
from microeda import analyze

df = pd.read_csv("your_data.csv")
report = analyze(df, name="My Dataset")

# View summaries
print(report["summaries"])
# View column types
print(report["column_types"])
# Missing values and pairwise hints
print(report["missingness"])
print(report["pairwise_hints"])
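
For readers who want a feel for what the missingness and pairwise entries describe, the sketch below computes two of the statistics listed under Features by hand: the correlation between missing-value indicators and an uncorrected Cramér's V. It uses only pandas and NumPy and does not call microeda; the helpers `missing_correlation` and `cramers_v` and the sample data are hypothetical, and the library's own results may be computed or structured differently.

```python
import numpy as np
import pandas as pd


def missing_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between per-column missing-value indicators."""
    indicators = df.isna().astype(float)
    # Columns that are never (or always) missing have a constant indicator,
    # so their correlation is undefined; drop them.
    varying = indicators.loc[:, indicators.nunique() > 1]
    return varying.corr()


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Uncorrected Cramér's V between two categorical columns (0 = none, 1 = perfect)."""
    table = pd.crosstab(x, y).to_numpy(dtype=float)
    n = table.sum()
    rows, cols = table.shape
    if n == 0 or min(rows, cols) < 2:
        return float("nan")  # association is undefined for a single category
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return float(np.sqrt(chi2 / (n * (min(rows, cols) - 1))))


# Made-up sample data: age and income are missing on the same rows,
# and city maps one-to-one onto plan.
df = pd.DataFrame({
    "age": [25, None, 31, None, 40, 22],
    "income": [50, None, 62, None, 80, 45],
    "city": ["A", "A", "B", "B", "A", "B"],
    "plan": ["x", "x", "y", "y", "x", "y"],
})

print(missing_correlation(df))
print(cramers_v(df["city"], df["plan"]))
```

A high indicator correlation suggests two columns tend to be missing together, which often points to a shared upstream cause. On very small tables Cramér's V is biased upward, so treat it as a hint rather than a test.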

CLI

Generate a Markdown report directly from the terminal:

microeda path/to/data.csv --style md --out report.md

Options:

  • --style: report format, either md (Markdown) or html (HTML)
  • --out: output file path

Contributing

Contributions are welcome! Feel free to submit issues or pull requests.

  • Fork the repo
  • Create a new branch (git checkout -b feature-name)
  • Make your changes
  • Run tests (pytest)
  • Submit a pull request

License

MIT License © 2025 Saptarshi Mondal

Links

GitHub: https://github.com/SaptarshiMondal123/microeda

PyPI: https://pypi.org/project/microeda/
