Smart EDA and model recommendation helper

Vizion

Vizion is a lightweight, notebook-friendly helper for quick exploratory data analysis (EDA).
It bundles the repetitive parts of dataset inspection (summaries, missing-data checks, visualizations, outlier handling, and walkthrough tips) into a single class so you can focus on insights.

Installation

pip install vizion

or, for local development:

pip install -e .

Quick Start

import pandas as pd
import vizion as vz

df = pd.read_csv("data.csv")

# Simple summary in one line
vz.quick_summary(df)

# Visual EDA helpers
vz.plot_numeric_eda(df, plot_types=["hist", "box"])
vz.plot_categorical_eda(df, target="SalePrice")

# Data hygiene helpers
df_clean, report = vz.handle_missing(df)
df_no_outliers = vz.handle_outliers(df_clean)

# Guided workflow
vz.help_steps()
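
To make the data hygiene step above more concrete, here is a minimal pandas-only sketch of a drop-then-fill pass. It is not Vizion's implementation: the function name fill_missing_sketch, the report format, and the strict "greater than" comparison against the threshold are all assumptions made for illustration. Only the default values (a 0.75 drop threshold, median for numeric columns, mode for categorical columns) come from the API reference; the datetime forward-fill strategy is left out to keep the sketch short.

```python
import pandas as pd


def fill_missing_sketch(df, drop_threshold=0.75):
    """Illustrative drop-then-fill pass; NOT Vizion's actual implementation."""
    out = df.copy()
    report = {}  # hypothetical report format: column name -> action taken

    # Drop columns whose missing fraction exceeds the threshold (assumed strict >).
    missing_frac = out.isna().mean()
    for col in missing_frac[missing_frac > drop_threshold].index:
        out = out.drop(columns=col)
        report[col] = "dropped"

    # Fill what remains: median for numeric columns, mode for everything else.
    for col in out.columns:
        if not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
            report[col] = "filled with median"
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
            report[col] = "filled with mode"
    return out, report


# Small made-up dataset: one sparse column, one numeric gap, one categorical gap.
df = pd.DataFrame({
    "price": [100.0, None, 300.0, 200.0],
    "city": ["Pune", "Pune", None, "Mumbai"],
    "notes": [None, None, None, None],
})
df_clean, report = fill_missing_sketch(df)
print(report)
```

In this toy dataset the all-empty notes column is dropped, the missing price becomes the median of the remaining prices, and the missing city becomes the most frequent city. Returning the report alongside the cleaned frame mirrors the two-value return shown in the Quick Start.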

API Reference

All functions live under the Vizion class but are also re-exported at module level (vizion.quick_summary, etc.).

  • quick_summary(df, show=True)
    Prints (or returns) dataset shape, numeric/categorical/datetime column counts, duplicated rows, and top missing columns.

  • missing_value_summary(df)
    Returns a DataFrame detailing missing counts, percentages, dtypes, and suggested remediation per column.

  • get_column_types(df)
    Identifies numeric, categorical, and missing columns, storing them in globals (num_col, cat_col, mis_col) for quick reuse.

  • plot_numeric_eda(df, columns=None, plot_types=None, ..., show_corr=True, show_pairplot=False)
    Generates grid plots (hist, kde, box, violin, scatter, line, area) for numeric columns, with optional correlation heatmap and pairplot.

  • plot_categorical_eda(df, columns=None, plot_types=None, ..., top_n=20)
    Creates count/pie plots for categorical features, and supports target-aware bar/box/violin visuals.

  • handle_outliers(df, cols=None, method="auto", threshold=1.5, visualize=True, ...)
    Detects numeric outliers via IQR or IsolationForest, optionally caps/removes values and shows before/after boxplots.

  • handle_missing(df, drop_threshold=0.75, numeric_strategy="median", categorical_strategy="mode", datetime_strategy="ffill")
    Drops ultra-sparse columns and fills the rest according to strategy, returning the cleaned DataFrame plus a fill/drop report.

  • generate_doc(filename=None)
    Auto-builds Markdown docs for every Vizion method; prints to stdout or saves to a file.

  • help_steps()
    Prints a recommended, step-by-step EDA workflow that chains Vizion helpers in a sensible order.
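
For readers unfamiliar with the IQR rule mentioned under handle_outliers, the following pandas-only sketch shows the standard technique with the same default threshold of 1.5. It is an illustration under stated assumptions, not Vizion's code: the helper names iqr_bounds and cap_outliers_sketch are hypothetical, and capping (rather than removing) is chosen here only because it keeps the row count unchanged.

```python
import pandas as pd


def iqr_bounds(series, threshold=1.5):
    """Return the (lower, upper) fences of the standard IQR rule."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - threshold * iqr, q3 + threshold * iqr


def cap_outliers_sketch(df, cols=None, threshold=1.5):
    """Clip numeric columns to their IQR fences; illustrative only."""
    out = df.copy()
    if cols is None:
        # Default to every numeric column, as a convenience for the sketch.
        cols = out.select_dtypes(include="number").columns
    for col in cols:
        lower, upper = iqr_bounds(out[col], threshold)
        out[col] = out[col].clip(lower=lower, upper=upper)
    return out


# Made-up data with one obvious outlier (95.0).
df = pd.DataFrame({"area": [10.0, 12.0, 11.0, 13.0, 12.0, 95.0]})
lower, upper = iqr_bounds(df["area"])
capped = cap_outliers_sketch(df)
print(lower, upper)
print(capped["area"].tolist())
```

With this data the quartiles are 11.25 and 12.75, so the fences sit at 9.0 and 15.0 and the value 95.0 is clipped to 15.0 while every other row is left untouched. A larger threshold widens the fences and flags fewer points.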

Contributing

Issues and PRs are welcome! See the open tasks in the GitHub repo or propose your own improvements (encoding, scaling, imbalance checks, etc.).

License

MIT © Milind Chaudhari

Download files

Source Distribution

vizion-0.1.7.tar.gz (12.7 kB)

Uploaded Source

Built Distribution

vizion-0.1.7-py3-none-any.whl (11.7 kB)

Uploaded Python 3

File details

Details for the file vizion-0.1.7.tar.gz.

File metadata

  • Download URL: vizion-0.1.7.tar.gz
  • Upload date:
  • Size: 12.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for vizion-0.1.7.tar.gz
Algorithm Hash digest
SHA256 83d4dada1c516bc9ad8e3b23a29719892bb5102f591fb36b15d84841673cb26f
MD5 b6bb910f615203ca7c238e2ed6a3928f
BLAKE2b-256 45c718ba441dd7664edc3bd60e8e4e6fc2e6ca0bccb4d83de6d526a611d9cbf8

File details

Details for the file vizion-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: vizion-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for vizion-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 25e76d5c1c57b1200de39269630c2139dd06fd5bfd29397440a24f3cdfb2e437
MD5 78c8747663b9960def7f683222e174d0
BLAKE2b-256 219899f943674ec2d6613998ebf4b09fd6d7756b74f847a01a8850f4d350ee19
