One-line Python EDA assistant: auto-summary, insights, and visual report generation

AutoView

⚡ Fast, Visual, and Insightful Exploratory Data Analysis in Python
Automatically generate EDA reports with smart suggestions, visualizations, and actionable insights

AutoView - Exploratory Data Analysis Assistant

AutoView is a powerful Python package designed to accelerate your Exploratory Data Analysis (EDA) workflow. It automatically generates insightful visualizations, statistical summaries, feature suggestions, and a full HTML report to help you understand your dataset deeply before modeling.


🚀 Features

📊 Dataset Summary

  • Dataset shape (rows × columns)
  • Column data types
  • Unique value count per column
  • Descriptive statistics for numeric columns
  • Detection of duplicate rows
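
The summary items above map closely onto standard pandas calls. The following is a minimal sketch of how such a summary could be assembled; `summarize` is a hypothetical helper written for illustration and is not AutoView's own code or API.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect basic summary items (hypothetical helper, not AutoView's API)."""
    return {
        "shape": df.shape,                               # (rows, columns)
        "dtypes": df.dtypes.astype(str).to_dict(),       # column name -> dtype name
        "n_unique": df.nunique().to_dict(),              # unique non-null values per column
        "numeric_stats": df.describe(),                  # count, mean, std, quartiles (numeric columns)
        "n_duplicate_rows": int(df.duplicated().sum()),  # rows identical to an earlier row
    }


# Small illustrative dataset; rows 1 and 2 are exact duplicates.
df = pd.DataFrame({
    "age": [22, 35, 35, 58],
    "city": ["Oslo", "Lima", "Lima", "Pune"],
})
summary = summarize(df)
print(summary["shape"], summary["n_duplicate_rows"])
```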

🧠 Smart Insights

  • Skewed features with transformation suggestions
  • Top highly correlated feature pairs (Pearson > 0.35)
  • Feature importance using Random Forest
  • Target imbalance detection
  • Text column profiling (length, mode, uniqueness)
  • Column role classification (ID, datetime, name, high cardinality, etc.)
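
As a rough illustration of the first two insights, skew detection and correlated pairs can be sketched with pandas alone. The function names, the skewness cutoff of 1.0, and the use of the absolute value of Pearson's r are assumptions made for this example; only the 0.35 correlation threshold comes from the list above, and AutoView's internal logic may differ.

```python
import pandas as pd


def skewed_features(df: pd.DataFrame, threshold: float = 1.0) -> pd.Series:
    """Numeric columns whose absolute skewness exceeds `threshold` (assumed cutoff)."""
    skew = df.select_dtypes(include="number").skew()
    return skew[skew.abs() > threshold].sort_values(ascending=False)


def correlated_pairs(df: pd.DataFrame, threshold: float = 0.35) -> list:
    """(col_a, col_b, r) tuples with |Pearson r| above `threshold`, strongest first."""
    corr = df.select_dtypes(include="number").corr(method="pearson")
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, so each pair appears once
            r = corr.iloc[i, j]
            if abs(r) > threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Illustrative data: y is almost a multiple of x, income has a long right tail.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "y": [2, 4, 6, 8, 10, 12, 14, 16.5],
    "income": [1, 1, 1, 2, 2, 3, 5, 40],
})
skewed = skewed_features(df)
pairs = correlated_pairs(df)
print(skewed)
print(pairs[0])
```

A right-skewed column such as `income` is a typical candidate for a log transform, which is one common form a transformation suggestion can take.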

📈 Visualizations

  • Histograms for all numeric features
  • Boxplots (outlier detection using IQR)
  • Correlation heatmap for numeric features
  • Missing data visualization using missingno:
    • Matrix plot
    • Bar plot
    • Heatmap
    • Dendrogram
  • Target distribution bar chart (if specified)
  • Feature importance horizontal bar chart
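
The IQR rule behind the boxplot outlier detection can be written in a few lines. This sketch assumes the conventional 1.5 × IQR whiskers; `iqr_outliers` is a hypothetical helper, and the multiplier AutoView uses is not stated here.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series[(series < lower) | (series > upper)]


# Illustrative values with one extreme entry.
fares = pd.Series([7.25, 8.05, 7.90, 13.0, 26.0, 8.46, 512.33])
outliers = iqr_outliers(fares)
print(outliers)
```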

📋 Suggested Actions

  • Encode categorical features
  • Handle skewed or correlated features
  • Remove ID or low variance columns
  • Drop or impute missing values
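
Acting on these suggestions is left to the user. A minimal pandas sketch of three of them follows; the column names and the choice of median imputation and one-hot encoding are assumptions made for illustration, not recommendations produced by AutoView.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [101, 102, 103, 104],          # identifier, carries no signal
    "constant": [1, 1, 1, 1],            # zero variance
    "age": [22.0, None, 35.0, 58.0],     # one missing value
    "city": ["Oslo", "Lima", "Lima", "Pune"],
})

# Remove the ID column and any column with a single unique value.
low_variance = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
cleaned = df.drop(columns=["id"] + low_variance)

# Impute missing numeric values with the column median.
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

# One-hot encode the remaining categorical column.
encoded = pd.get_dummies(cleaned, columns=["city"])
print(encoded.columns.tolist())
```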

📄 Auto-Generated HTML Report

  • Clean, professional UI with embedded plots
  • Sectioned layout: Data types, stats, roles, visuals, insights
  • No external image files needed (all plots are embedded using base64)
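
Embedding plots as base64 means the report is a single self-contained HTML file. The general technique looks roughly like the sketch below, which uses matplotlib and the standard library; `figure_to_img_tag` is a hypothetical name and this is not the code in AutoView's report.py.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt


def figure_to_img_tag(fig) -> str:
    """Render a matplotlib figure to PNG in memory and wrap it in an <img> data URI."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight")
    plt.close(fig)  # free the figure once it has been rendered
    encoded = base64.b64encode(buf.getvalue()).decode("ascii")
    return f'<img src="data:image/png;base64,{encoded}" alt="plot">'


fig, ax = plt.subplots()
ax.hist([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5)
ax.set_title("Example histogram")

img_tag = figure_to_img_tag(fig)
html = f"<html><body><h2>Histogram</h2>{img_tag}</body></html>"
```

Writing `html` to a file produces a page that displays the plot with no accompanying image files.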

📦 Installation

pip install autoview

Or clone locally:

git clone https://github.com/avinash-betha/autoview.git
cd autoview
pip install -r requirements.txt

🧪 Example Usage

from autoview import explore
import pandas as pd

# Load any dataset
df = pd.read_csv("your_dataset.csv")

# Explore the dataset
df_summary = explore(df, target="target_column", plots=True, show_missing=True)

⚙️ Function Parameters

explore(df: pd.DataFrame, plots=False, target=None, show_missing=True)

Parameters:

Parameter      Type          Description
df             DataFrame     The dataset to analyze.
plots          bool          If True, shows visualizations like histograms, boxplots, heatmaps, etc.
target         str / None    Optional target column name for classification/regression analysis.
show_missing   bool          If True, shows missing data visualizations using missingno.
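
When a target column is supplied, target-based analysis such as the imbalance check and Random Forest feature importance listed under Smart Insights becomes possible. The sketch below shows what that kind of analysis can look like using pandas and scikit-learn directly. It does not call AutoView, and the 30% imbalance cutoff, the model settings, and the column names are assumptions chosen for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Illustrative data: "signal" separates the classes, "noise" does not.
df = pd.DataFrame({
    "signal": [0.1, 0.2, 0.0, 0.3, 0.1, 0.2, 0.0, 0.3, 0.1, 2.1, 2.3, 2.2],
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "target": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
})
target = "target"

# Target imbalance: share of each class, flagged if the minority falls below a cutoff.
class_share = df[target].value_counts(normalize=True)
is_imbalanced = class_share.min() < 0.30  # cutoff is an assumption

# Feature importance from a Random Forest fitted on the remaining columns.
X = df.drop(columns=[target])
y = df[target]
model = RandomForestClassifier(n_estimators=50, max_features=None, random_state=0)
model.fit(X, y)
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)

print(class_share)
print(importances)
```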

📁 Project Structure

autoview/
├── explore.py          # Main entry point for the EDA pipeline
├── visualize.py        # Plotting logic for histograms, boxplots, etc.
├── report.py           # HTML report generation with embedded charts
├── examples/
│   └── titanic_demo.py
└── README.md

📋 Dependencies

  • pandas
  • numpy
  • seaborn
  • matplotlib
  • scikit-learn
  • missingno
  • tabulate

📜 License

MIT License

Copyright (c) 2024 Avinash Betha

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


🙌 Contributing

We welcome contributions! Please feel free to open issues or submit PRs for:

  • Adding more visualizations (e.g., violin plots, KDE plots)
  • Improving column role inference
  • Integrating data preprocessing suggestions

💡 Inspiration

AutoView aims to eliminate the tedious setup of basic EDA steps, freeing analysts and data scientists to focus on deeper insights and modeling.

If you're looking to plug EDA into your workflow with minimal effort and maximum insights, AutoView is your go-to!
