A dataset-agnostic data loader and cleaner with automated plotting and an interactive pairplot dashboard engine.

ezclean-data

A comprehensive, dataset-agnostic Python library for automated data ingestion, cleaning, preprocessing, and exploratory data analysis.

ezclean-data streamlines the repetitive tasks involved in preparing datasets by providing intelligent loading, automated sanitization, statistical summaries, advanced visualizations, and standalone interactive dashboards.

Overview

Data preparation often consumes a significant portion of the data science workflow. ezclean-data provides a unified interface that automatically loads structured datasets, cleans inconsistencies, handles missing values and outliers, standardizes column names, and generates insightful visualizations with minimal code.

The library is designed to work across a wide variety of dataset formats and structures, making it suitable for students, researchers, analysts, and machine learning practitioners.

Features

Smart Data Loading

Automatically detects the file format and selects the appropriate pandas engine.
Supports both local files and remote URLs.
Handles a broad range of structured data formats.

Automated Data Cleaning

Standardizes column names into consistent snake_case format.
Detects and replaces common invalid placeholder values.
Performs automatic type correction where appropriate.
Handles missing values using intelligent, data-type-aware strategies.
Detects outliers using the Interquartile Range (IQR) method and removes them.

Exploratory Data Analysis

Generates detailed column summaries and completeness statistics.
Provides automatic visualizations based on column data types.
Creates generalized pairplot-style relationship matrices for rapid exploration.

Interactive Dashboard Generation

Produces self-contained HTML dashboards.
Includes summary statistics and data quality metrics.
Provides interactive Plotly-based visualizations.
Works entirely offline once generated.

Installation

Install directly from PyPI. This release is published under the distribution name ezclean:

pip install ezclean

Quick Start

from ezclean import Smart_loader, Cleaner, colname, plot, plot_dashboard

# Load dataset
df = Smart_loader("tested.csv")

# Execute cleaning pipeline
df_cleaned = Cleaner(df)

# Display column statistics
colname(df_cleaned)

# Visualize a single column
plot(df_cleaned, "survived")

# Generate a relationship matrix
plot(df_cleaned)

# Create an interactive dashboard
plot_dashboard(
    df_cleaned,
    filename="my_dashboard.html"
)

API Reference

Smart_loader()

Smart_loader(file_path, **kwargs)

Automatically loads structured datasets from local storage or remote URLs.

Supported Formats

Category	Formats
Text Files	CSV, TSV, TXT
JSON Formats	JSON, JSONL, NDJSON
Spreadsheet Files	XLSX, XLS, ODS
Columnar Formats	Parquet, Feather, Arrow, ORC
Statistical Formats	SPSS, SAS, Stata
Other Formats	XML, HTML, Pickle, HDF
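
The format detection described above can be pictured as a lookup from file extension to a pandas reader. The sketch below is an illustration only, not the library's implementation: `READERS` and `guess_reader` are hypothetical names, the mapping covers just a few of the listed formats, and it returns the reader name instead of calling pandas so that it runs without any dependencies.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping: extension -> (pandas reader name, extra keyword arguments).
READERS = {
    ".csv": ("read_csv", {}),
    ".tsv": ("read_csv", {"sep": "\t"}),
    ".json": ("read_json", {}),
    ".jsonl": ("read_json", {"lines": True}),
    ".ndjson": ("read_json", {"lines": True}),
    ".xlsx": ("read_excel", {}),
    ".parquet": ("read_parquet", {}),
    ".feather": ("read_feather", {}),
    ".dta": ("read_stata", {}),
}


def guess_reader(source):
    """Return (reader_name, kwargs) for a local path or URL, based on its extension."""
    # For URLs, look only at the path component so query strings are ignored.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    name, kwargs = READERS[suffix]
    return name, dict(kwargs)
```

A loader built this way would then call `getattr(pandas, name)(source, **kwargs)`; whether Smart_loader() works like this internally is not stated here.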

Cleaner()

Cleaner(df, ...)

Executes a complete data-cleaning pipeline.

Included Operations

Column Name Standardization

Converts names to snake_case
Removes special characters
Eliminates duplicate separators
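
As a rough illustration of the three steps above, the following sketch converts a column name to snake_case with regular expressions. `to_snake_case` is a hypothetical helper written for this example; the exact rules Cleaner() applies may differ.

```python
import re


def to_snake_case(name):
    """Convert a column name to snake_case (illustrative rules only)."""
    s = name.strip()
    # Insert an underscore at camelCase boundaries: "firstName" -> "first_Name".
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    # Replace every run of special characters, spaces, or underscores with one underscore.
    s = re.sub(r"[^0-9A-Za-z]+", "_", s)
    # Drop leading/trailing separators and lowercase the result.
    return s.strip("_").lower()
```

Because each run of non-alphanumeric characters collapses into a single underscore, removing special characters and eliminating duplicate separators happen in the same pass.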

Data Sanitization

Replaces common placeholder values such as:

?
NULL
null
nil
N/A
NaN

with proper missing-value representations.
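
A minimal sketch of this replacement in plain Python, using `None` as the missing-value marker. The `PLACEHOLDERS` set copies the list above; `sanitize` is a hypothetical name, and trimming whitespace before the comparison is an assumption made for the example.

```python
# Placeholder strings taken from the list above.
PLACEHOLDERS = {"?", "NULL", "null", "nil", "N/A", "NaN"}


def sanitize(values):
    """Trim strings and replace placeholder strings with None."""
    cleaned = []
    for value in values:
        if isinstance(value, str):
            value = value.strip()
            if value in PLACEHOLDERS:
                value = None
        cleaned.append(value)
    return cleaned
```

Non-string values pass through untouched, so a numeric column that is already clean is not altered.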

Text Normalization

Trims whitespace
Standardizes string formatting

Automatic Type Detection

Converts columns to numeric types when appropriate
Preserves incompatible values
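
One possible reading of these two points is "convert a column only if every non-missing value parses as a number, otherwise leave it unchanged". The sketch below implements that reading; it is an assumption about the behaviour, and `to_numeric_if_possible` is a hypothetical name.

```python
def to_numeric_if_possible(values):
    """Return the column as floats if every non-missing value parses, else unchanged."""
    converted = []
    for value in values:
        if value is None:
            converted.append(None)  # keep missing values missing
            continue
        try:
            converted.append(float(value))
        except (TypeError, ValueError):
            # At least one incompatible value: keep the original column as-is.
            return list(values)
    return converted
```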

Missing Value Handling

Numerical columns → Median Imputation
Categorical columns → "Unknown" Replacement
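
The two strategies above can be sketched on a single column stored as a Python list, with `None` marking missing entries. `impute` is a hypothetical helper; treating a column with no numeric values at all as categorical is an assumption of this example.

```python
from statistics import median


def impute(values):
    """Fill missing entries with the median (numeric) or "Unknown" (categorical)."""
    present = [v for v in values if v is not None]
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    # The median is computed from observed values only.
    fill = median(present) if is_numeric else "Unknown"
    return [fill if v is None else v for v in values]
```

The median is a common choice over the mean here because a few extreme values do not shift it much.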

Outlier Treatment

Uses Interquartile Range (IQR) thresholds
Removes extreme observations automatically
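
For reference, a minimal sketch of IQR-based filtering on one numeric column. The 1.5 multiplier is the conventional choice and is an assumption here, since the multiplier used by Cleaner() is not stated; `iqr_fences` and `drop_outliers` are hypothetical names.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # method="inclusive" interpolates linearly between the observed data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if lower <= v <= upper]
```

For `[10, 12, 11, 13, 12, 11, 14, 100]` the quartiles are 11 and 13.25, so the fences are 7.625 and 16.625 and only the value 100 is dropped.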

colname()

colname(df)

Displays detailed metadata for each column, including:

Data type
Missing value count
Completeness percentage
Unique value count
Statistical summaries
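
To make the metrics concrete, here is a small sketch that computes the missing count, completeness percentage, and unique count for one column. `summarize` is a hypothetical helper, `None` stands in for a missing value, and the output format of colname() itself is not shown here.

```python
def summarize(values):
    """Compute basic completeness metrics for a single column."""
    total = len(values)
    present = [v for v in values if v is not None]
    missing = total - len(present)
    return {
        "missing": missing,
        # Share of non-missing entries, as a percentage of all rows.
        "completeness_pct": round(100 * len(present) / total, 2) if total else 0.0,
        # Missing entries are not counted as a distinct value.
        "unique": len(set(present)),
    }
```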

plot()

plot(df, target_column=None, columns=None)

Single-Column Visualization

When a target column is specified, the visualization is selected automatically based on data type.

Data Type	Visualization
Numeric	Histogram + Box Plot
Categorical	Bar Chart + Donut Chart
Datetime	Trend Line
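
The table above amounts to a dispatch on the column's data type. A sketch of that dispatch in plain Python follows; `choose_charts` is a hypothetical function that only returns chart names and does no plotting, and the type checks are assumptions for illustration.

```python
from datetime import date
from numbers import Number


def choose_charts(values):
    """Pick chart names for a column based on the type of its non-missing values."""
    present = [v for v in values if v is not None]
    # datetime is a subclass of date, so this check covers both.
    if present and all(isinstance(v, date) for v in present):
        return ["trend line"]
    # bool is technically a Number, but is treated as categorical here.
    if present and all(isinstance(v, Number) and not isinstance(v, bool) for v in present):
        return ["histogram", "box plot"]
    return ["bar chart", "donut chart"]
```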

Relationship Matrix

plot(df)

Generates a generalized pairplot matrix displaying:

Univariate distributions
Correlation patterns
Relationships between variables
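
The correlation patterns in such a matrix are typically pairwise Pearson coefficients. The sketch below computes them in plain Python for a dictionary of numeric columns; `pearson` and `correlation_matrix` are hypothetical names, and the source does not say which correlation measure plot() uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant column has zero spread and would raise ZeroDivisionError here.
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Map every (row_name, column_name) pair to its Pearson coefficient."""
    names = list(columns)
    return {(r, c): pearson(columns[r], columns[c]) for r in names for c in names}
```

In a pairplot layout, the diagonal cells (where both names are equal) would show the univariate distribution instead of the trivial correlation of 1.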

plot_dashboard()

plot_dashboard(
    df,
    filename="ezclean_dashboard.html",
    show=True
)

Creates a standalone interactive dashboard containing:

Dataset Summary

Dataset dimensions
Completeness metrics
Data quality statistics

Column Analysis

Data types
Missing values
Unique value counts

Interactive Visualization Builder

Users can dynamically select:

X-axis variables
Y-axis variables
Plot types

without writing additional code.

Relationship Matrix

Embedded Plotly-based exploratory visualization for multivariate analysis.
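
To show what "standalone" means in practice, the sketch below writes a single HTML file whose markup and styling are all inline, so it opens without network access. It covers only a plain summary table and leaves out the Plotly charts; `write_summary_page` is a hypothetical helper and does not reflect how plot_dashboard() builds its output.

```python
from html import escape
from pathlib import Path


def write_summary_page(summary, filename):
    """Write one HTML file with inline CSS and no external assets."""
    # Escape names and values so arbitrary column names cannot break the markup.
    rows = "\n".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in summary.items()
    )
    page = (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset='utf-8'><title>Dataset summary</title>\n"
        "<style>td { padding: 4px 12px; border-bottom: 1px solid #ccc; }</style>\n"
        "</head><body>\n"
        f"<table>\n{rows}\n</table>\n"
        "</body></html>\n"
    )
    path = Path(filename)
    path.write_text(page, encoding="utf-8")
    return path
```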

Example Workflow

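from ezclean import Smart_loader, Cleaner, colname, plot, plot_dashboard

# Load the raw dataset
df = Smart_loader("data.csv")

# Run the cleaning pipeline
df = Cleaner(df)

# Print per-column metadata
colname(df)

# Plot a single column (chart type is chosen from its data type)
plot(df, "age")

# Plot the relationship matrix for the whole dataset
plot(df)

# Write a standalone HTML dashboard
plot_dashboard(
    df,
    filename="dashboard.html"
)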

Use Cases

Data Science Projects
Machine Learning Preprocessing
Academic Research
Exploratory Data Analysis
Business Intelligence Reporting
Rapid Dataset Validation
Educational Applications

Why ezclean-data?

Most data analysis projects begin with repetitive preprocessing tasks such as loading files, cleaning columns, handling missing values, detecting outliers, and creating exploratory visualizations.

ezclean-data consolidates these operations into a simple and consistent workflow, allowing users to focus on analysis and model development rather than boilerplate data preparation code.

License

This project is licensed under the MIT License.

See the LICENSE file for complete licensing information.

Author

Developed and maintained by Thilac Ramesh.

Contributions, feature requests, and issue reports are welcome.

Project details

Version 0.1.0, released May 30, 2026.

Download files

Source Distribution

ezclean-0.1.0.tar.gz (23.1 kB), uploaded May 30, 2026

Built Distribution

ezclean-0.1.0-py3-none-any.whl (19.1 kB), uploaded May 30, 2026, Python 3

File details

Details for the file ezclean-0.1.0.tar.gz.

File metadata

Download URL: ezclean-0.1.0.tar.gz
Upload date: May 30, 2026
Size: 23.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for ezclean-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`d47e6ef57859487444305fcd79f4f8240d79de4cb273bcd2276d13c33006bedc`
MD5	`4b5a451baeed25d5807cdf5f0723c88f`
BLAKE2b-256	`f41bb3d7f70b5bdf5dc0f647b686c50c58c59643faebabb83790bbf3afe4e7d5`

File details

Details for the file ezclean-0.1.0-py3-none-any.whl.

File metadata

Download URL: ezclean-0.1.0-py3-none-any.whl
Upload date: May 30, 2026
Size: 19.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for ezclean-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6adc98e1a6b0530c42abc776db6e1f6d471a1b6a33feb86fa47ae4c90a66f39d`
MD5	`63f69391267a933ce6d9d1643b2810ac`
BLAKE2b-256	`e15d57e99638dc2c891f1e1dd9c9be6df0eafd317b50e911998d00f3b443a976`
