This module provides functions that make exploratory data analysis (EDA) and data cleaning easier.
Data Inspector
Data Inspector is an open-source Python library that provides more than 15 functions to make EDA and data cleaning easier.
Author: Kazi Amit Hasan
Project Description:
Data Inspector brings more than 15 essential exploratory data analysis and data cleaning automations that make a dataset easier to understand. It is a convenient tool for getting started with your data.
Installation:
pip install data-inspector
Package available at https://pypi.org/project/data-inspector/
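After running the install command, one way to confirm that the package is present is to query the installed distribution metadata. This is a minimal sketch using only the Python standard library (Python 3.8+); the helper name installed_version is hypothetical and is not part of data-inspector.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="data-inspector"):
    """Return the installed version string, or None if the distribution is absent.

    Hypothetical helper for illustration; not part of data-inspector.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version())
```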
Available automations:
- Line plot:
line_plot(data, x_data, y_data, x_label="", y_label="", title="")
- Plot a skewed feature:
plot_skewed_feature(data, column)
- Show a column's data distribution:
show_distribution(data, column)
- Scatter plot:
plot_scatter(data, x_data, y_data)
- Correlation plot:
plot_correlation(data)
- Histogram:
histogram(data, column, x_label, y_label, title)
- Bar plot:
plot_bar(data, column, xlabel, ylabel, title)
- Box plots of all features:
box_plot(data)
- Check the dataset's shape:
datasetShape(data)
- Get the dataset's diagnostic plots:
diagnostic_plots(data, variable)
- Divide features into numerical and categorical:
divideFeatures(data)
- Fill NaN values:
fillNan(data, column, value)
- Get Pearson's correlation between two variables:
get_correlation(column_1, column_2, data)
- KDE plot:
plot_cont_kde(data, var)
- Calculate missing values and their percentages, with a visualization:
calculating_missing_values(data)
- Regression plot with a 95% confidence interval:
plot_regplot(data, x_data, y_data)
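To make concrete what a few of the data-cleaning helpers listed above compute, here is a minimal sketch in plain Python. It is not the library's implementation and does not call data-inspector: the function names (missing_value_report, divide_features, fill_missing, pearson) and the toy dataset are hypothetical, and the data is held in a dict of lists rather than in whatever table type the library expects.

```python
import math
from statistics import mean

# Hypothetical toy dataset: column name -> list of values (None marks a missing entry).
data = {
    "age": [25, 32, None, 41, 38],
    "income": [30.0, 42.0, 39.0, None, 55.0],
    "city": ["Dhaka", "Khulna", None, "Dhaka", "Sylhet"],
}


def missing_value_report(table):
    """Return {column: (missing_count, missing_percent)} for every column."""
    report = {}
    for name, values in table.items():
        missing = sum(1 for v in values if v is None)
        report[name] = (missing, 100.0 * missing / len(values))
    return report


def divide_features(table):
    """Split column names into (numerical, categorical) lists."""
    numerical, categorical = [], []
    for name, values in table.items():
        present = [v for v in values if v is not None]
        is_numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        (numerical if is_numeric else categorical).append(name)
    return numerical, categorical


def fill_missing(values, value):
    """Return a copy of the column with missing entries replaced by `value`."""
    return [value if v is None else v for v in values]


def pearson(xs, ys):
    """Pearson's r over the rows where both values are present.

    Assumes at least two complete rows and that neither column is constant.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x, _ in pairs)
    var_y = sum((y - my) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


print(missing_value_report(data))
print(divide_features(data))
print(fill_missing(data["age"], 0))
print(pearson(data["age"], data["income"]))
```

The sketch skips rows with a missing value when computing the correlation; whether get_correlation does the same is not stated in this description.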
Tutorial:
Link: https://github.com/AmitHasanShuvo/data-inspector/blob/main/notebook/example%20notebook.ipynb
Colab link: https://colab.research.google.com/drive/1mj9gz2XyQprSYdKMUKlKkJ9Qi8XmleHW?usp=sharing
Some visualizations:
Available at: https://github.com/AmitHasanShuvo/data-inspector
How to cite:
@online{data-inspector,
  title = {data-inspector},
  url = {https://pypi.org/project/data-inspector/},
  urldate = {2021-08-21},
  publisher = {Kazi Amit Hasan}
}
Future work:
- Add automations for time series data.
How to contribute:
Contributions are highly appreciated. Please read the contributing guidelines on GitHub first.
Change Log
0.0.1 (20/08/2021)
- First Release
0.0.2 (20/08/2021)
- Minor updates
0.0.3 (20/08/2021)
- Minor updates
0.0.4 (20/08/2021)
- Minor updates
0.0.5 (20/08/2021)
- Minor updates
0.0.6 (20/08/2021)
- Minor updates
0.0.8 (21/08/2021)
- Minor updates
1.1 (21/08/2021)
- Minor updates
1.5.4 (29/08/2021)
- Regression plot functions added
File details
Details for the file data_inspector-1.5.5.tar.gz.
File metadata
- Download URL: data_inspector-1.5.5.tar.gz
- Upload date:
- Size: 6.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 522c5e91f809fdecd30b5d2ef95b0f8023dbfc2673232879600613ff7fe54096
MD5 | 4b3e9eb874b514e4d0b45806764418ae
BLAKE2b-256 | bf37027555c3378b0701fbd88ca436a6a7b69d7ed0e3006f41d636cee8e54046
File details
Details for the file data_inspector-1.5.5-py3-none-any.whl.
File metadata
- Download URL: data_inspector-1.5.5-py3-none-any.whl
- Upload date:
- Size: 6.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | a10b487fc5b05541fdf28fcd329ce36b84a610699dd5297a78bf001ae332e684
MD5 | 08a12dc3332940fdc42cdf45a42bbe9c
BLAKE2b-256 | 687aba7389a6386f439773e26cbd9c98f5b525ad82fba5626695a01df6410f10
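The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch with the standard-library hashlib module; the helper names sha256_of_file and matches_digest are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest of data_inspector-1.5.5.tar.gz, copied from the source-distribution table.
EXPECTED_SDIST_SHA256 = "522c5e91f809fdecd30b5d2ef95b0f8023dbfc2673232879600613ff7fe54096"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

With the source distribution saved in the current directory, matches_digest("data_inspector-1.5.5.tar.gz", EXPECTED_SDIST_SHA256) should return True for an unmodified download.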