Kesh Utils for Data science/EDA/Data preparation

Project description

Chart + Util = Chartil

During the EDA/data preparation stage, I use a few fixed chart types to analyse the relationships among features. Some are simple, such as univariate charts; others are more complex, such as 3D plots or plots of more than three features.

Over time it became hard to maintain all the relevant code without repeating it. Instead, I developed a simple, single API for plotting the various types of relationships; it hides the technical and code details from the data science task and approach.

With this approach I need just one API:

from KUtils.eda import chartil

chartil.plot(dataframe, [list of columns]) or
chartil.plot(dataframe, [list of columns], {optional_settings})
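
The idea of one entry point that picks a chart from the mix of column types can be sketched as a lookup table. This is only an illustration, not chartil's actual implementation: `DEFAULT_CHARTS` and `choose_chart` are hypothetical names, and the table lists only the defaults that the demo below names explicitly. The heatmap (used when all columns are sent) and combinations without a stated default are left out.

```python
# Hypothetical sketch: NOT chartil's actual implementation.
# Keys are (number of continuous columns, number of categorical columns);
# values are the default charts named in the demo on this page.
DEFAULT_CHARTS = {
    (0, 1): "barchart (count plot)",
    (1, 0): "boxplot",
    (2, 0): "scatter plot",
    (1, 1): "grouped boxplot",
    (3, 0): "colored 3D scatter plot",
    (0, 3): "paired barchart",
    (2, 1): "scatter plot with colored groups",
    (1, 2): "grouped boxplot",
    (3, 1): "group-colored 3D plot",
    (2, 3): "paired scatter plot",
}


def choose_chart(column_kinds):
    """Pick a default chart name from a list of 'continuous'/'categorical' tags.

    Returns None when the demo does not name a default for that combination.
    """
    n_continuous = column_kinds.count("continuous")
    n_categorical = column_kinds.count("categorical")
    return DEFAULT_CHARTS.get((n_continuous, n_categorical))


single = choose_chart(["continuous"])
mixed = choose_chart(["continuous", "categorical", "categorical"])
unlisted = choose_chart(["categorical", "categorical"])
```

The caller only supplies column names; deciding which plot fits is the job of the dispatch step.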

Demo code:

Load the UCI heart disease dataset (heart.csv):

import pandas as pd

heart_disease_df = pd.read_csv('../input/uci/heart.csv')

heart_disease_df['age_bin'] = pd.cut(heart_disease_df['age'], [0, 32, 40, 50, 60, 70, 100], labels=['<32', '33-40','41-50','51-60','61-70', '71+'])
heart_disease_df['sex'] = heart_disease_df['sex'].map({1:'Male', 0:'Female'})
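
To see which label a given age receives, here is a minimal pure-Python sketch of the same binning rule. By default `pd.cut` uses right-closed intervals, so an age of exactly 32 falls in the bin labelled '<32', and a value at or below the first edge gets no bin. `EDGES`, `LABELS` and `age_bin` are illustrative names, not part of KUtils or pandas.

```python
from bisect import bisect_left

# Same edges and labels as the pd.cut call above.
EDGES = [0, 32, 40, 50, 60, 70, 100]
LABELS = ['<32', '33-40', '41-50', '51-60', '61-70', '71+']


def age_bin(age):
    """Return the label of the right-closed interval (lo, hi] that contains age."""
    # bisect_left gives the index of the first edge >= age,
    # so the interval index is one less than that.
    i = bisect_left(EDGES, age) - 1
    if 0 <= i < len(LABELS):
        return LABELS[i]
    return None  # outside (0, 100]; pd.cut would produce NaN here


print(age_bin(32))   # '<32'   (upper edge is inclusive)
print(age_bin(33))   # '33-40'
print(age_bin(100))  # '71+'
print(age_bin(0))    # None    (lower edge is exclusive)
```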

Heatmap

chartil.plot(heart_disease_df, heart_disease_df.columns) # Send all column names
![Heatmap Numerical](https://raw.githubusercontent.com/KeshavShetty/ds/master/Roughbook/misc_resources/heatmap1.png)

chartil.plot(heart_disease_df, heart_disease_df.columns, optional_settings={'include_categorical':True})
![Heatmap With categorical](https://raw.githubusercontent.com/KeshavShetty/ds/master/Roughbook/misc_resources/heatmap2.png)

chartil.plot(heart_disease_df, heart_disease_df.columns, optional_settings={'include_categorical':True, 'sort_by_column':'trestbps'})
![Heatmap With categorical and ordered by a column](https://raw.githubusercontent.com/KeshavShetty/ds/master/Roughbook/misc_resources/heatmap3.png)

Uni-categorical

chartil.plot(heart_disease_df, ['target']) # Barchart as count plot

Uni-Continuous

chartil.plot(heart_disease_df, ['age']) # Boxplot

chartil.plot(heart_disease_df, ['age'], chart_type='barchart') # Force a barchart on a continuous column by auto-creating 10 equal bins

chartil.plot(heart_disease_df, ['age'], chart_type='barchart', optional_settings={'no_of_bins':5}) # Use a custom number of bins
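
As a rough picture of what binning a continuous column for a barchart involves, here is a small sketch that counts values in equal-width bins. It assumes "equal bins" means equal width, which this page does not confirm; `equal_width_counts` is a hypothetical helper, not a KUtils function, and only the parameter name `no_of_bins` is borrowed from the setting above.

```python
def equal_width_counts(values, no_of_bins=10):
    """Count how many values fall into each of no_of_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / no_of_bins
    counts = [0] * no_of_bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: put everything in the first bin
        else:
            # Clamp so that the maximum value lands in the last bin.
            idx = min(int((v - lo) / width), no_of_bins - 1)
        counts[idx] += 1
    return counts


# Toy input, not taken from heart.csv.
counts = equal_width_counts(list(range(10)), no_of_bins=5)
print(counts)  # [2, 2, 2, 2, 2]
```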

chartil.plot(heart_disease_df, ['age'], chart_type='distplot') # Distplot

Uni-categorical with optional_settings

chartil.plot(heart_disease_df, ['age_bin']) # Barchart as count plot
chartil.plot(heart_disease_df, ['age_bin'], optional_settings={'sort_by_value':True})
chartil.plot(heart_disease_df, ['age_bin'], optional_settings={'sort_by_value':True, 'limit_bars_count_to':5})
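
The two settings above can be pictured with a short sketch built on `collections.Counter`. It assumes that `sort_by_value` orders bars by descending count and that `limit_bars_count_to` keeps the first N bars after sorting; neither detail is stated on this page, and `bar_counts` is a hypothetical helper rather than chartil code.

```python
from collections import Counter


def bar_counts(values, sort_by_value=False, limit_bars_count_to=None):
    """Return (category, count) pairs as they might be drawn in a count plot."""
    items = list(Counter(values).items())  # first-seen order
    if sort_by_value:
        # Assumed meaning: largest bar first.
        items.sort(key=lambda pair: pair[1], reverse=True)
    if limit_bars_count_to is not None:
        items = items[:limit_bars_count_to]
    return items


# Toy labels, not taken from heart.csv.
labels = ['41-50', '51-60', '51-60', '61-70', '61-70', '61-70']
top_two = bar_counts(labels, sort_by_value=True, limit_bars_count_to=2)
print(top_two)  # [('61-70', 3), ('51-60', 2)]
```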

Bi Category vs Category (& Univariate Segmented)

chartil.plot(heart_disease_df, ['sex', 'target'])
chartil.plot(heart_disease_df, ['sex', 'target'], chart_type='crosstab')
chartil.plot(heart_disease_df, ['sex', 'target'], chart_type='stacked_barchart')
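
For readers unfamiliar with crosstabs, the sketch below shows the table of pair counts that a category-vs-category chart summarises. It is a plain-Python illustration with made-up rows; `crosstab` here is a hypothetical helper and says nothing about how chartil computes or draws its crosstab.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count co-occurrences of two categorical fields as a nested dict."""
    pair_counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_values = sorted({rv for rv, _ in pair_counts})
    col_values = sorted({cv for _, cv in pair_counts})
    return {
        rv: {cv: pair_counts.get((rv, cv), 0) for cv in col_values}
        for rv in row_values
    }


# Made-up rows for illustration, not taken from heart.csv.
rows = [
    {'sex': 'Male', 'target': 1},
    {'sex': 'Male', 'target': 0},
    {'sex': 'Female', 'target': 1},
    {'sex': 'Male', 'target': 1},
]
table = crosstab(rows, 'sex', 'target')
print(table)  # {'Female': {0: 0, 1: 1}, 'Male': {0: 1, 1: 2}}
```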

Bi Continuous vs Continuous

chartil.plot(heart_disease_df, ['chol', 'thalach']) # Scatter plot

Bi Continuous vs Category

chartil.plot(heart_disease_df, ['thalach', 'sex']) # Grouped box plot (Segmented univariate)
chartil.plot(heart_disease_df, ['thalach', 'sex'], chart_type='distplot') # Distplot
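
A grouped box plot compares the distribution of a continuous column across the levels of a categorical one. The sketch below computes part of that summary (minimum, median, maximum) per group in plain Python. The rows are made up, and `grouped_summary` is a hypothetical helper, not something KUtils provides.

```python
from collections import defaultdict
from statistics import median


def grouped_summary(rows, value_key, group_key):
    """Return {group: (min, median, max)} for a continuous field split by a category."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_key]].append(r[value_key])
    return {g: (min(vals), median(vals), max(vals)) for g, vals in groups.items()}


# Made-up rows for illustration, not taken from heart.csv.
rows = [
    {'thalach': 150, 'sex': 'Male'},
    {'thalach': 170, 'sex': 'Male'},
    {'thalach': 160, 'sex': 'Male'},
    {'thalach': 140, 'sex': 'Female'},
    {'thalach': 180, 'sex': 'Female'},
]
summary = grouped_summary(rows, 'thalach', 'sex')
```

A real box plot also shows quartiles and outliers; the three numbers here are only the simplest part of that picture.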

Multi 3 Continuous

chartil.plot(heart_disease_df, ['chol', 'thalach', 'trestbps']) # Colored 3D scatter plot

Multi 3 Categorical

chartil.plot(heart_disease_df, ['age_bin', 'sex', 'target']) # Paired barchart

Multi 2 Continuous, 1 Category

chartil.plot(heart_disease_df, ['chol', 'thalach', 'target']) # Scatter plot with colored groups

Multi 1 Continuous, 2 Category

chartil.plot(heart_disease_df, ['thalach', 'sex', 'target']) # Grouped boxplot
chartil.plot(heart_disease_df, ['thalach', 'sex', 'target'], chart_type='violinplot') # Grouped violin plot

Multi 3 Continuous, 1 Category

chartil.plot(heart_disease_df, ['chol', 'thalach', 'trestbps', 'target']) # Group Color highlighted 3D plot

Multi 2 Continuous, 3 Category

chartil.plot(heart_disease_df, ['sex','cp','target','thalach','trestbps']) # Paired scatter plot

Download files

Source Distribution: kesh-utils-0.1.8.tar.gz (12.2 kB)

Built Distribution: kesh_utils-0.1.8-py3-none-any.whl (13.0 kB, Python 3)
