edatk: python exploratory data analysis toolkit
Project description
edatk: Python Exploratory Data Analysis Toolkit
edatk is an open-source project for exploratory data analysis (EDA) in Python. The project is new and its features are still simple; the goal is to automate and organize as much of the traditional EDA workflow as possible.
Installation
pip install edatk
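After installing, it can help to confirm that the package is visible to the interpreter you plan to use. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of edatk.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Look up edatk in the current environment and report what was found.
version = installed_version("edatk")
if version is None:
    print("edatk is not installed in this environment")
else:
    print(f"edatk {version} is installed")
```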
Examples and Getting Started
# Import library
import edatk as eda
# Load in your dataframe (using seaborn below as an example)
import seaborn as sns
df = sns.load_dataset('iris')
# Run auto eda, optionally pass in path for saving html report and target column
eda.auto_eda(df, save_path='C:\\Users\\username\\Documents\\edatk', target_column='species')
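The save path above is hardcoded for one Windows user. A portable alternative, sketched here under the assumption that `save_path` accepts any folder path given as a string, is to build the path with `pathlib`. The helper `default_report_dir` is hypothetical and not part of edatk; the sketch does not create the folder or check whether edatk requires it to exist.

```python
from pathlib import Path


def default_report_dir(folder_name="edatk"):
    """Build <home>/Documents/<folder_name> without hardcoding a drive or user."""
    return Path.home() / "Documents" / folder_name


report_dir = default_report_dir()
print(report_dir)

# Hypothetical usage with the call shown above (not executed here):
# eda.auto_eda(df, save_path=str(report_dir), target_column='species')
```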
Feature Overview
Feature [status]
- Tabular data [partial]
    - Column by column analysis [partial]
        - Basic descriptive statistics (mean, median, min, max, etc.) [completed]
        - Distribution charts (numeric) and most frequent values (categorical) [completed]
        - Normality Tests [planned]
    - Relationships between columns [completed]
    - t-SNE [planned]
    - Basic feature -> target analysis and feature importance [planned]
    - Autofind interesting relationships and features [planned]
    - Basic exploratory NLP for text columns [planned]
- Exploring Predicted vs. True Results [planned]
    - Classification Results Plots
        - True vs. Predicted Heatmap by Class
        - Mosaic Plot
- Time Series [planned]
- Performance Improvements [planned]
    - Operation timeouts
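To make the column-by-column and relationship items concrete, here is a minimal sketch of the same kind of summary written directly in pandas. It is not edatk's implementation; the helper names `summarize_columns` and `numeric_correlations` are hypothetical, and the small table is made-up toy data used only for illustration.

```python
import pandas as pd


def summarize_columns(df, top_n=3):
    """Per-column summary: basic stats for numeric columns, top values otherwise."""
    summary = {}
    for name in df.columns:
        col = df[name]
        if pd.api.types.is_numeric_dtype(col):
            # Basic descriptive statistics for numeric columns.
            summary[name] = {
                "mean": col.mean(),
                "median": col.median(),
                "min": col.min(),
                "max": col.max(),
            }
        else:
            # Most frequent values for categorical or text columns.
            counts = col.value_counts().head(top_n)
            summary[name] = {"most_frequent": counts.to_dict()}
    return summary


def numeric_correlations(df):
    """Pairwise Pearson correlations between the numeric columns only."""
    return df.select_dtypes(include="number").corr()


# Toy data, invented for this example.
df = pd.DataFrame({
    "length": [1.0, 2.0, 3.0, 4.0],
    "width": [2.0, 4.0, 6.0, 8.0],
    "label": ["a", "a", "b", "a"],
})

summary = summarize_columns(df)
corr = numeric_correlations(df)
print(summary)
print(corr)
```

Splitting on numeric versus non-numeric dtype mirrors the numeric and categorical distinction in the list above; a fuller tool would also need to handle dates, missing values, and high-cardinality columns.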
Contributing
If you are interested in contributing, please see the contributing documentation.
Stability
This library is not yet ready for production use. Treat it with caution and use it only for non-production purposes, as an aid to deeper, more formal data analysis.
Author
- Barrett Studdard - @bstuddard
Download files
Source Distribution
edatk-0.0.8.tar.gz (19.8 kB)
Built Distribution
edatk-0.0.8-py3-none-any.whl (34.3 kB)