Exploratory Data Analysis

exploratory

Description

This project, exploratory, was created to perform Exploratory Data Analysis on any structured dataset. The dataset can contain categorical or numerical data types. The project takes a pandas DataFrame and produces summary statistics and individual plots: category counts for categorical variables, and PDFs and CDFs (probability density and cumulative distribution functions) with the mean, median, and mode for numerical variables. Both results are stored in a PDF file and a CSV file in your current directory/path.
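To make the numerical-variable summary concrete, here is a minimal sketch of the quantities named above (mean, median, mode, and an empirical CDF) using only the Python standard library. The sample values and the helper `empirical_cdf` are hypothetical and are not part of the package; how exploratory computes or plots these internally is not documented here.

```python
from bisect import bisect_right
from statistics import mean, median, mode


def empirical_cdf(values):
    """Pair each distinct value with the fraction of observations <= that value."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, bisect_right(ordered, x) / n) for x in sorted(set(ordered))]


# Hypothetical numerical column.
values = [2, 4, 4, 5, 10]

center = {"mean": mean(values), "median": median(values), "mode": mode(values)}
cdf = empirical_cdf(values)

print(center)
for x, p in cdf:
    print(f"P(X <= {x}) = {p:.2f}")
```

A CDF plot draws these (value, cumulative fraction) pairs as a step curve, with the mean, median, and mode typically marked as vertical lines.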

Installation:

Use the package manager pip to install exploratory:

pip install exploratory

Usage:

from exploratory import EDA
EDA(df)
# df --> pandas DataFrame
# Please input the DPI value. As the DPI value increases, runtime increases. Default DPI value: 150
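The snippet above assumes `df` already exists. As a rough sketch, the following builds a small DataFrame with mixed types and previews which columns are numeric and which are not. The sample data and column names are hypothetical, and splitting by dtype is an assumption about how the categorical/numerical distinction is made, not a confirmed detail of the package.

```python
import pandas as pd

# Hypothetical sample data; any structured dataset loaded into a DataFrame would do.
df = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104],
        "city": ["Pune", "Delhi", "Pune", None],
        "age": [34, 29, None, 41],
    }
)

# Preview the split by dtype: numeric columns versus everything else.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_cols)
print("categorical:", categorical_cols)

# With the package installed, the call from the usage snippet would follow:
# from exploratory import EDA
# EDA(df)
```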

Example Run:

(Image: Exploratory Run)

Expected Outputs:

  • CSV file containing a DataFrame with the following columns:

| Column | Description |
| --- | --- |
| Variable | Variable name in the dataset provided |
| Cardinality | Number of levels/classes in each variable |
| total_count | Count of total records (non-null) |
| unique_rate | Cardinality / total_count; a unique rate of 1 indicates an ID variable |
| percent_missing | Percentage of missing values in each column |
| mean | Average of column (ignores Object/String variables) |
| std | Standard deviation of column (ignores Object/String variables) |
| min | Minimum of column (ignores Object/String variables) |
| 25% | 25th percentile value of column (ignores Object/String variables) |
| median | 50th percentile value of column (ignores Object/String variables) |
| 75% | 75th percentile value of column (ignores Object/String variables) |
| max | Maximum of column (ignores Object/String variables) |
| data_types | Data type of column (Int / Float / Object, etc.) |
| range | Max value - min value (ignores Object/String variables) |

  • PDF with statistical summary and variable distribution graphs (categorical & continuous)
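As an illustration of how several of the CSV columns listed above relate to each other, here is a minimal pandas sketch. The function `summarize` and the sample data are hypothetical; this is not the package's implementation, and details such as whether percent_missing is reported on a 0-100 scale are assumptions.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Build a per-column summary with a subset of the columns described above."""
    rows = []
    for name in df.columns:
        col = df[name]
        total_count = int(col.count())    # non-null records
        cardinality = int(col.nunique())  # distinct non-null values
        row = {
            "Variable": name,
            "Cardinality": cardinality,
            "total_count": total_count,
            "unique_rate": cardinality / total_count if total_count else float("nan"),
            "percent_missing": float(col.isna().mean() * 100),  # assumed 0-100 scale
            "data_types": str(col.dtype),
        }
        # range is only meaningful for numeric columns; others are left as NaN.
        if pd.api.types.is_numeric_dtype(col):
            row["range"] = float(col.max() - col.min())
        rows.append(row)
    return pd.DataFrame(rows)


# Hypothetical sample data.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "city": ["Pune", "Delhi", "Pune", None],
        "age": [34.0, 29.0, None, 41.0],
    }
)

summary = summarize(df).set_index("Variable")
print(summary)
```

In this sample, `id` has a unique rate of 1, which matches the table's note that such a column behaves like an ID variable. A numeric column with no repeated values, such as `age` here, also reaches 1, so the rate is a hint rather than proof.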

(Image: Exported PDF)

Contributing

Pull requests are welcome at https://github.com/Abhilash-MS/exploratory. Please feel free to contact the authors with any suggestions or issues: Ram (kakarlaramcharan@gmail.com), Abhilash (abhilashmaspalli1996@gmail.com).

Download files

Source Distribution

exploratory-3.4.12.tar.gz (7.0 kB)

File metadata

  • Download URL: exploratory-3.4.12.tar.gz
  • Size: 7.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for exploratory-3.4.12.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | bb088cc7fbe3ab79aab5da669c85c7bd9a188aa89c5e9a7e73168ac8f05360a9 |
| MD5 | c32d83b06c74948f294479fc77d5a6e3 |
| BLAKE2b-256 | 14b2cb472ba23229218e7878a9b4841b12f3e858a632b25373865bc6cefb1ce1 |
