Smart EDA and model recommendation helper

Project description

Vizion

Vizion is a lightweight, notebook-friendly helper for quick exploratory data analysis (EDA).
It bundles the repetitive parts of dataset inspection (summaries, missing-data checks, visualizations, outlier handling, and a guided walkthrough) into a single class so you can focus on insights.

Installation

pip install vizion

or, for local development:

pip install -e .

Quick Start

import pandas as pd
import vizion as vz

df = pd.read_csv("data.csv")

# Simple summary in one line
vz.quick_summary(df)

# Visual EDA helpers
vz.plot_numeric_eda(df, plot_types=["hist", "box"])
vz.plot_categorical_eda(df, target="SalePrice")

# Data hygiene helpers
df_clean, report = vz.handle_missing(df)
df_no_outliers = vz.handle_outliers(df_clean)

# Guided workflow
vz.help_steps()
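
If no data.csv is at hand, a small in-memory DataFrame can stand in for it. The sketch below rests on assumptions: the column names and values are invented for illustration, and the checks use plain pandas rather than any Vizion call, so they only approximate the kind of facts a summary helper reports (shape, missing values, duplicate rows).

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "LotArea": [8450, 9600, 11250, 9550, 215000, None],
        "Neighborhood": ["CollgCr", "Veenker", "CollgCr", None, "NoRidge", "Veenker"],
        "SalePrice": [208500, 181500, 223500, 140000, 250000, 143000],
    }
)

# Plain-pandas checks (not Vizion calls): shape, missing values, duplicates.
n_rows, n_cols = df.shape
missing_per_column = df.isna().sum()
n_duplicates = int(df.duplicated().sum())

print(f"rows={n_rows}, columns={n_cols}, duplicate rows={n_duplicates}")
print(missing_per_column)
```

A frame like this has one missing numeric value, one missing category, and one extreme LotArea value, which is enough to exercise the summary, missing-data, and outlier helpers shown in the Quick Start.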

API Reference

All functions are methods of the Vizion class and are also re-exported at module level (for example, vizion.quick_summary).

  • quick_summary(df, show=True)
    Prints (or returns) dataset shape, numeric/categorical/datetime column counts, duplicated rows, and top missing columns.

  • missing_value_summary(df)
    Returns a DataFrame detailing missing counts, percentages, dtypes, and suggested remediation per column.

  • get_column_types(df)
    Identifies numeric columns, categorical columns, and columns with missing values, storing them in globals (num_col, cat_col, mis_col) for quick reuse.

  • plot_numeric_eda(df, columns=None, plot_types=None, ..., show_corr=True, show_pairplot=False)
    Generates grid plots (hist, kde, box, violin, scatter, line, area) for numeric columns, with optional correlation heatmap and pairplot.

  • plot_categorical_eda(df, columns=None, plot_types=None, ..., top_n=20)
    Creates count/pie plots for categorical features, and supports target-aware bar/box/violin visuals.

  • handle_outliers(df, cols=None, method="auto", threshold=1.5, visualize=True, ...)
    Detects numeric outliers via IQR or IsolationForest, optionally caps/removes values and shows before/after boxplots.

  • handle_missing(df, drop_threshold=0.75, numeric_strategy="median", categorical_strategy="mode", datetime_strategy="ffill")
    Drops ultra-sparse columns and fills the rest according to strategy, returning the cleaned DataFrame plus a fill/drop report.

  • generate_doc(filename=None)
    Auto-builds Markdown docs for every Vizion method; prints to stdout or saves to a file.

  • help_steps()
    Prints a recommended, step-by-step EDA workflow that chains Vizion helpers in a sensible order.
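
The handle_missing and handle_outliers entries above describe two common techniques: dropping very sparse columns before filling the rest, and capping values that fall outside IQR-based fences. The following is a minimal plain-pandas sketch of those ideas, not Vizion's implementation. The function names drop_sparse_and_fill and iqr_cap are hypothetical, the choice to drop columns whose missing fraction is strictly greater than the threshold is an assumption, and datetime forward-fill and IsolationForest are left out for brevity.

```python
import pandas as pd


def drop_sparse_and_fill(df: pd.DataFrame, drop_threshold: float = 0.75) -> pd.DataFrame:
    """Hypothetical helper: drop columns whose missing fraction exceeds
    drop_threshold, then fill numeric columns with the median and all
    other columns with the most frequent value (mode)."""
    missing_fraction = df.isna().mean()
    kept = df.loc[:, missing_fraction <= drop_threshold].copy()
    for col in kept.columns:
        if not kept[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(kept[col]):
            kept[col] = kept[col].fillna(kept[col].median())
        else:
            kept[col] = kept[col].fillna(kept[col].mode().iloc[0])
    return kept


def iqr_cap(series: pd.Series, threshold: float = 1.5) -> pd.Series:
    """Hypothetical helper: clip values to [Q1 - t*IQR, Q3 + t*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - threshold * iqr, upper=q3 + threshold * iqr)


# Toy data, invented for illustration.
toy = pd.DataFrame(
    {
        "price": [10.0, 12.0, 11.0, 13.0, 200.0, None],
        "city": ["A", "B", "A", None, "A", "B"],
        "mostly_empty": [None, None, None, None, None, 1.0],
    }
)

cleaned = drop_sparse_and_fill(toy)           # drops "mostly_empty" (5/6 missing)
cleaned["price"] = iqr_cap(cleaned["price"])  # caps the extreme value at the upper fence
print(cleaned)
```

Capping keeps every row and only pulls extreme values back to the fence, whereas removing outliers shrinks the dataset; which is preferable depends on how much data you can afford to lose.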

Contributing

Issues and PRs are welcome! See the open tasks in the GitHub repo or propose your own improvements (encoding, scaling, imbalance checks, etc.).

License

MIT © Milind Chaudhari
