
Smart EDA and model recommendation helper

Vizion

Vizion is a lightweight, notebook-friendly helper for quick exploratory data analysis (EDA).
It bundles the repetitive parts of dataset inspection (summaries, missing-data checks, visualizations, outlier handling, and walkthrough tips) into a single class, so you can focus on insights.

Installation

pip install vizion

or, for local development:

pip install -e .

Quick Start

import pandas as pd
import vizion as vz

df = pd.read_csv("data.csv")

# Simple summary in one line
vz.quick_summary(df)

# Visual EDA helpers
vz.plot_numeric_eda(df, plot_types=["hist", "box"])
vz.plot_categorical_eda(df, target="SalePrice")

# Data hygiene helpers
df_clean, report = vz.handle_missing(df)
df_no_outliers = vz.handle_outliers(df_clean)

# Guided workflow
vz.help_steps()
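
If no data.csv is at hand, a small synthetic frame is enough to try the calls above. The sketch below is only an illustration: the helper name make_sample_frame and the columns LotArea and Neighborhood are hypothetical, and only SalePrice comes from the example above. It uses plain pandas and NumPy and does not call Vizion.

```python
import numpy as np
import pandas as pd


def make_sample_frame(n_rows: int = 50, seed: int = 0) -> pd.DataFrame:
    """Build a small synthetic frame (hypothetical helper, not part of Vizion)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(
        {
            # Numeric feature (assumed column name)
            "LotArea": rng.normal(9000, 1500, n_rows).round(),
            # Categorical feature (assumed column name)
            "Neighborhood": rng.choice(["North", "South", "East"], n_rows),
            # Target column used in the Quick Start
            "SalePrice": rng.normal(180_000, 30_000, n_rows).round(),
        }
    )
    df.loc[0, "LotArea"] = 100_000        # one obvious outlier
    df.loc[1:5, "Neighborhood"] = np.nan  # five missing categories (.loc is inclusive)
    df.loc[6:8, "SalePrice"] = np.nan     # three missing target values
    return df


df = make_sample_frame()
```

The resulting df can stand in for the pd.read_csv result, so each helper has numeric, categorical, missing, and outlying values to work with.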

API Reference

All functions live under the Vizion class but are also re-exported at module level (vizion.quick_summary, etc.).

  • quick_summary(df, show=True)
    Prints (or returns) dataset shape, numeric/categorical/datetime column counts, duplicated rows, and top missing columns.

  • missing_value_summary(df)
    Returns a DataFrame detailing missing counts, percentages, dtypes, and suggested remediation per column.

  • get_column_types(df)
    Identifies numeric columns, categorical columns, and columns with missing values, storing them in the globals num_col, cat_col, and mis_col for quick reuse.

  • plot_numeric_eda(df, columns=None, plot_types=None, ..., show_corr=True, show_pairplot=False)
    Generates grid plots (hist, kde, box, violin, scatter, line, area) for numeric columns, with optional correlation heatmap and pairplot.

  • plot_categorical_eda(df, columns=None, plot_types=None, ..., top_n=20)
    Creates count/pie plots for categorical features, and supports target-aware bar/box/violin visuals.

  • handle_outliers(df, cols=None, method="auto", threshold=1.5, visualize=True, ...)
    Detects numeric outliers via IQR or IsolationForest, then optionally caps or removes them and shows before/after boxplots.

  • handle_missing(df, drop_threshold=0.75, numeric_strategy="median", categorical_strategy="mode", datetime_strategy="ffill")
    Drops ultra-sparse columns and fills the rest according to strategy, returning the cleaned DataFrame plus a fill/drop report.

  • generate_doc(filename=None)
    Auto-builds Markdown docs for every Vizion method; prints to stdout or saves to a file.

  • help_steps()
    Prints a recommended, step-by-step EDA workflow that chains Vizion helpers in a sensible order.
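
To make the two data hygiene entries above more concrete, here is a minimal pandas-only sketch of the ideas they describe: IQR fences with a 1.5 multiplier for outlier capping, and a drop threshold followed by median / mode / forward-fill for missing values. This is not Vizion's implementation. The function names iqr_bounds, cap_outliers_iqr, and fill_missing are hypothetical, the report layout is invented for illustration, and whether Vizion drops a column at exactly the threshold or only above it is an assumption. The IsolationForest path is not covered.

```python
import pandas as pd


def iqr_bounds(s: pd.Series, threshold: float = 1.5) -> tuple[float, float]:
    """Return the fences Q1 - k*IQR and Q3 + k*IQR for a numeric series."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return q1 - threshold * iqr, q3 + threshold * iqr


def cap_outliers_iqr(df: pd.DataFrame, cols=None, threshold: float = 1.5) -> pd.DataFrame:
    """Clip numeric columns to their IQR fences (hypothetical helper)."""
    out = df.copy()
    cols = cols if cols is not None else out.select_dtypes("number").columns
    for col in cols:
        low, high = iqr_bounds(out[col].dropna(), threshold)
        out[col] = out[col].clip(lower=low, upper=high)
    return out


def fill_missing(df: pd.DataFrame, drop_threshold: float = 0.75):
    """Drop very sparse columns, fill the rest by dtype (hypothetical helper)."""
    out = df.copy()
    report = {"dropped": [], "filled": {}}  # assumed report layout
    for col in list(out.columns):
        frac = out[col].isna().mean()
        if frac > drop_threshold:  # assumption: strictly above the threshold
            out = out.drop(columns=col)
            report["dropped"].append(col)
        elif frac > 0:
            if pd.api.types.is_numeric_dtype(out[col]):
                out[col] = out[col].fillna(out[col].median())
                report["filled"][col] = "median"
            elif pd.api.types.is_datetime64_any_dtype(out[col]):
                out[col] = out[col].ffill()
                report["filled"][col] = "ffill"
            else:
                out[col] = out[col].fillna(out[col].mode().iloc[0])
                report["filled"][col] = "mode"
    return out, report


demo = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0, 100.0],       # 100.0 lies above the upper fence
        "c": ["a", None, "a", "b", "a"],        # one missing category
        "empty": [None, None, None, None, 1.0], # 80% missing, gets dropped
    }
)
clean, report = fill_missing(demo)
capped = cap_outliers_iqr(clean)
```

Filling before capping mirrors the order used in the Quick Start, where handle_missing runs first and handle_outliers receives its output.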

Contributing

Issues and PRs are welcome! See the open tasks in the GitHub repo or propose your own improvements (encoding, scaling, imbalance checks, etc.).

License

MIT © Milind Chaudhari
