Exploratory Data Analysis
Project description
exploratory
Description
The exploratory package performs Exploratory Data Analysis (EDA) on any structured dataset with categorical or numerical columns. It takes a pandas DataFrame and produces summary statistics plus one plot per variable: category counts for categorical variables, and PDFs and CDFs annotated with the mean, median, and mode for numerical variables. The results are stored as a PDF file and a CSV file in your current directory/path.
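To make the per-variable quantities above concrete, here is a minimal sketch of how they could be computed with pandas and NumPy. It is not the package's own implementation: the helper names `numeric_profile` and `categorical_counts` and the sample data are hypothetical, and the sketch assumes each numeric column has at least one non-null value.

```python
import numpy as np
import pandas as pd


def numeric_profile(series: pd.Series) -> dict:
    """Mean, median, mode and empirical CDF points for a numeric column.

    Hypothetical helper for illustration; not part of the exploratory package.
    """
    values = series.dropna()
    sorted_vals = np.sort(values.to_numpy())
    # Empirical CDF step heights: i/n for the i-th sorted observation.
    cdf = np.arange(1, len(sorted_vals) + 1) / len(sorted_vals)
    return {
        "mean": values.mean(),
        "median": values.median(),
        "mode": values.mode().iloc[0],  # smallest value if several modes tie
        "cdf_x": sorted_vals,
        "cdf_y": cdf,
    }


def categorical_counts(series: pd.Series) -> pd.Series:
    """Count of records per level of a categorical column (nulls excluded)."""
    return series.value_counts()


# Made-up sample data with one numeric and one categorical column.
df = pd.DataFrame(
    {
        "age": [23, 31, 31, 45, None],
        "city": ["NY", "LA", "NY", "NY", "SF"],
    }
)
profile = numeric_profile(df["age"])
counts = categorical_counts(df["city"])
```

Plotting `cdf_x` against `cdf_y` as a step curve gives the CDF, and `counts` is the data behind a categorical count plot.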
Installation:
Use the package manager pip to install exploratory:
pip install exploratory
Usage:
from exploratory import EDA
EDA(df)
# df --> pandas DataFrame
# Please input the DPI value; runtime increases as the DPI value increases. Default DPI value: 150
Expected Outputs:
- CSV file holding a DataFrame with the following columns:
Column | Description |
---|---|
Variable | Variable Name in the dataset provided |
Cardinality | Number of levels/classes in each variable |
total_count | Count of total records (non null) |
unique_rate | Cardinality / total_count; a unique rate of 1 indicates an ID variable |
percent_missing | Percentage of missing values across each column |
mean | Average of column (Ignores Object/String variables) |
std | Standard deviation of column (Ignores Object/String variables) |
min | Minimum of column (Ignores Object/String variables) |
25% | 25th percentile value of column (Ignores Object/String variables) |
median | 50th percentile value of column (Ignores Object/String variables) |
75% | 75th percentile value of column (Ignores Object/String variables) |
max | Maximum of column (Ignores Object/String variables) |
data_types | Data type of column (Int / Float / Object etc) |
range | Max Value - Min Value (Ignores Object/String variables) |
- PDF with the statistical summary and variable distribution graphs (categorical and continuous)
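As a rough illustration of how the CSV columns listed above relate to a DataFrame, the following sketch rebuilds a similar table with plain pandas. It is an assumption about how such a summary could be derived, not the package's actual code; the function name `summarize` and the sample data are hypothetical.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Build a per-variable summary using the column names documented above.

    Hypothetical helper for illustration; not part of the exploratory package.
    """
    rows = []
    for name in df.columns:
        col = df[name]
        total = int(col.count())          # non-null records
        cardinality = int(col.nunique())  # distinct non-null levels
        row = {
            "Variable": name,
            "Cardinality": cardinality,
            "total_count": total,
            "unique_rate": cardinality / total if total else float("nan"),
            "percent_missing": 100.0 * col.isna().mean(),
            "data_types": str(col.dtype),
        }
        # Numeric statistics are left missing (NaN) for object/string columns.
        is_numeric = pd.api.types.is_numeric_dtype(col)
        if is_numeric and not pd.api.types.is_bool_dtype(col):
            row.update(
                {
                    "mean": col.mean(),
                    "std": col.std(),
                    "min": col.min(),
                    "25%": col.quantile(0.25),
                    "median": col.median(),
                    "75%": col.quantile(0.75),
                    "max": col.max(),
                    "range": col.max() - col.min(),
                }
            )
        rows.append(row)
    return pd.DataFrame(rows).set_index("Variable")


# Made-up sample data: an ID column, a numeric column and a string column.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "score": [10.0, 20.0, None, 40.0],
        "city": ["NY", "LA", "NY", None],
    }
)
summary = summarize(df)
```

In this sample the `id` column has a unique_rate of 1, which is the signal the table describes for an ID variable. Note that a small numeric column with no repeated values, such as `score` here, also reaches 1, so the rate is a hint rather than proof.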
Contributing
Pull requests are welcome at https://github.com/Abhilash-MS/exploratory. For suggestions or issues, please feel free to contact the authors: Ram (kakarlaramcharan@gmail.com) and Abhilash (abhilashmaspalli1996@gmail.com).
Source Distribution
File details
Details for the file exploratory-3.4.12.tar.gz.
File metadata
- Download URL: exploratory-3.4.12.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.12
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | bb088cc7fbe3ab79aab5da669c85c7bd9a188aa89c5e9a7e73168ac8f05360a9 |
MD5 | c32d83b06c74948f294479fc77d5a6e3 |
BLAKE2b-256 | 14b2cb472ba23229218e7878a9b4841b12f3e858a632b25373865bc6cefb1ce1 |