edatk: Python Exploratory Data Analysis Toolkit

edatk is an open-source project for exploratory data analysis (EDA) in Python. The project is new and its features are still simple; the goal is to automate and organize as much of the traditional EDA workflow as possible.

Installation

pip install edatk
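
To confirm the install worked, one option is to ask the standard library for the installed version. This is a minimal sketch; `installed_version` is a hypothetical helper written for this example and is not part of edatk.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="edatk"):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; not provided by edatk.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if edatk is installed, otherwise None.
print(installed_version())
```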

Examples and Getting Started

# Import library
import edatk as eda

# Load in your dataframe (using seaborn below as an example)
import seaborn as sns
df = sns.load_dataset('iris')

# Run auto eda, optionally pass in path for saving html report and target column
eda.auto_eda(df, save_path='C:\\Users\\username\\Documents\\edatk', target_column='species')
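
The example above hard-codes a Windows path. A portable variant might build the path with `pathlib` instead. The sketch below assumes `save_path` accepts a plain string and that the parent directory should already exist; the source does not say whether edatk creates missing directories. `report_path` and `run_report` are hypothetical helper names, not part of edatk.

```python
from pathlib import Path


def report_path(base=None, name="edatk"):
    """Build a platform-independent save path and make sure its parent exists.

    Hypothetical helper; defaults to <home>/Documents/<name>.
    """
    base = Path(base) if base is not None else Path.home() / "Documents"
    base.mkdir(parents=True, exist_ok=True)
    return base / name


def run_report(df, target_column=None, base=None):
    """Run auto_eda with a path built by report_path (assumed usage)."""
    # Imported lazily so this module still loads where edatk is not installed.
    import edatk as eda

    # save_path and target_column are the keyword arguments shown in the example above.
    eda.auto_eda(df, save_path=str(report_path(base)), target_column=target_column)
```

With the iris dataframe from the example, this would be called as `run_report(df, target_column='species')`.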

Feature Overview

Feature [status]

  • Tabular data [partial]
    • Column by column analysis [partial]
      • Basic descriptive statistics (mean, median, min, max, etc.) [completed]
      • Distribution charts (numeric) and most frequent values (categorical) [completed]
      • Normality Tests [planned]
    • Relationships between columns [completed]
    • TSNE [planned]
    • Basic feature -> target analysis and feature importance [planned]
    • Autofind interesting relationships and features [planned]
    • Basic exploratory NLP for text columns [planned]
  • Exploring Predicted vs. True Results [planned]
    • Classification Results Plots
      • True vs. Predicted Heatmap by Class
      • Mosaic Plot
  • Time Series [planned]
  • Performance Improvements [planned]
    • Operation timeouts

Contributing

If you are interested in contributing, please see the contributing documentation.

Stability

This library is not yet ready for production use. Treat it with caution and use it only for non-production purposes, as an aid to deeper, more formal data analysis.

Download files

Source Distribution

edatk-0.0.8.tar.gz (19.8 kB)

Uploaded Source

Built Distribution

edatk-0.0.8-py3-none-any.whl (34.3 kB)

Uploaded Python 3

File details

Details for the file edatk-0.0.8.tar.gz.

File metadata

  • Download URL: edatk-0.0.8.tar.gz
  • Upload date:
  • Size: 19.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.5

File hashes

Hashes for edatk-0.0.8.tar.gz
Algorithm Hash digest
SHA256 470e0d17d879f2bdd8887889854f44674e614b905790150974c4988ee0d866be
MD5 705e4b121b4d93c4b258a6bdc2791491
BLAKE2b-256 aec43344cbad4976e5f4e13f7101b29c3356a29c34ff6256fcb08f0db502cf6d

File details

Details for the file edatk-0.0.8-py3-none-any.whl.

File metadata

  • Download URL: edatk-0.0.8-py3-none-any.whl
  • Upload date:
  • Size: 34.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.9.5

File hashes

Hashes for edatk-0.0.8-py3-none-any.whl
Algorithm Hash digest
SHA256 c8523250c188e7bb803ee67c43243974cad9871c2bba457c1197edb2bf414d9a
MD5 1f37497cb6abe4ec2e73ddcc6e0eb9bf
BLAKE2b-256 840047d06f97acae9347d3e0c07affadfe9c8e64b84e51499000e99fcecc4742
