This module provides functions that make EDA and data cleaning easier.
Data Inspector
Data Inspector is an open-source Python library that provides more than 15 functions to make exploratory data analysis (EDA) and data cleaning easier.
Author: Kazi Amit Hasan
Project Description:
Data Inspector provides more than 15 essential automations for exploratory data analysis and data cleaning that help make a dataset understandable. It is a good tool for getting started with your data.
Installation:
pip install data-inspector
Package available at https://pypi.org/project/data-inspector/
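After installing, it can be useful to confirm which release is present in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of Data Inspector.

```python
from importlib import metadata


def installed_version(distribution="data-inspector"):
    """Return the installed version string, or None if the package is absent.

    `installed_version` is an illustrative helper, not a Data Inspector function.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version else "data-inspector is not installed")
```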
Available automation:
- Line plot:
line_plot(data, x_data, y_data, x_label="", y_label="", title="")
- Plot a skewed feature:
plot_skewed_feature(data, column)
- Show data distribution:
show_distribution(data, column)
- Scatter plot:
plot_scatter(data, x_data, y_data)
- Correlation plot:
plot_correlation(data)
- Create histogram:
histogram(data, column, x_label, y_label, title)
- Create bar plot:
plot_bar(data, column, xlabel, ylabel, title)
- Create boxplots of all features:
box_plot(data)
- Check the dataset's shape:
datasetShape(data)
- Get dataset's diagnostic plots:
diagnostic_plots(data, variable)
- Divide numerical and categorical features:
divideFeatures(data)
- Fill NaN values:
fillNan(data, column, value)
- Get Pearson's correlation between two variables:
get_correlation(column_1, column_2, data)
- Plot KDE plots:
plot_cont_kde(data, var)
- Automatically calculate missing values and their percentages, with a visualization:
calculating_missing_values(data)
- Regression plot with 95% CI:
plot_regplot(data, x_data, y_data)
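To give a feel for what the non-plotting automations above compute, here is a minimal sketch written directly against pandas. It is not Data Inspector's implementation: the helper names (`missing_value_summary`, `split_feature_types`, `fill_missing`, `pearson`) and the sample data are hypothetical, and the library's own functions may return different shapes or also draw plots.

```python
import pandas as pd


def missing_value_summary(data):
    """Count missing values per column and express them as a percentage of rows."""
    summary = pd.DataFrame({
        "missing": data.isna().sum(),
        "percent": data.isna().mean() * 100,
    })
    return summary.sort_values("missing", ascending=False)


def split_feature_types(data):
    """Return (numerical, categorical) column-name lists based on dtype."""
    numerical = data.select_dtypes(include="number").columns.tolist()
    categorical = data.select_dtypes(exclude="number").columns.tolist()
    return numerical, categorical


def fill_missing(data, column, value):
    """Return a copy of `data` with NaN in `column` replaced by `value`."""
    result = data.copy()
    result[column] = result[column].fillna(value)
    return result


def pearson(data, column_1, column_2):
    """Pearson correlation between two columns; rows with NaN are skipped."""
    return data[column_1].corr(data[column_2], method="pearson")


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "age": [25.0, 32.0, None, 41.0],
    "income": [31.0, 44.0, 50.0, 60.0],
    "city": ["Dhaka", None, "Khulna", "Dhaka"],
})

print(missing_value_summary(df))
print(split_feature_types(df))
print(fill_missing(df, "age", df["age"].mean()))
print(pearson(df, "age", "income"))
```

Each helper returns a new object rather than modifying the input, which keeps the original DataFrame available for comparison while exploring.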
Tutorial:
Link: https://github.com/AmitHasanShuvo/data-inspector/blob/main/notebook/example%20notebook.ipynb
Colab link: https://colab.research.google.com/drive/1mj9gz2XyQprSYdKMUKlKkJ9Qi8XmleHW?usp=sharing
Some visualizations:
Available at: https://github.com/AmitHasanShuvo/data-inspector
How to cite:
@online{data-inspector,
title={data-inspector},
url={https://pypi.org/project/data-inspector/},
urldate = {2021-08-21},
publisher={Kazi Amit Hasan}
}
Future Work:
- Add some automations for time series data.
How to contribute:
Contributions are highly appreciated. Please read the contributing guidelines on GitHub.
Change Log
0.0.1 (20/08/2021)
- First Release
0.0.2 (20/08/2021)
- Minor updates
0.0.3 (20/08/2021)
- Minor updates
0.0.4 (20/08/2021)
- Minor updates
0.0.5 (20/08/2021)
- Minor updates
0.0.6 (20/08/2021)
- Minor updates
0.0.8 (21/08/2021)
- Minor updates
1.1 (21/08/2021)
- Minor updates
1.5.4 (29/08/2021)
- Regression plot functions added
Hashes for data_inspector-1.5.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a10b487fc5b05541fdf28fcd329ce36b84a610699dd5297a78bf001ae332e684
MD5 | 08a12dc3332940fdc42cdf45a42bbe9c
BLAKE2b-256 | 687aba7389a6386f439773e26cbd9c98f5b525ad82fba5626695a01df6410f10