
Interactive classification diagnostic plots


classgraphic


Interactive classification diagnostic plots for scikit-learn.

coin sorting machine

We classify things for the purpose of doing something to them. Any classification which does not assist manipulation is worse than useless. - Randolph S. Bourne, "Education and Living", The Century Co (April 1917)

Major features:

Plotly-based tables:

  • class_imbalance_table
  • classification_table
  • confusion_matrix_table
  • describe (dataframe stats)
  • prediction_table
  • table
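To make concrete what a confusion matrix table summarizes, here is a minimal pure-Python sketch of the underlying counts. It is not classgraphic's implementation: the helper name confusion_counts and the toy labels are made up for illustration, and in practice scikit-learn's sklearn.metrics.confusion_matrix computes the same kind of matrix.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) pairs and lay them out as a square matrix."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))
    # Rows are actual classes, columns are predicted classes.
    matrix = [[pairs[(actual, predicted)] for predicted in labels] for actual in labels]
    return labels, matrix


# Toy labels, made up for illustration only.
y_true = ["setosa", "setosa", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "versicolor", "versicolor", "virginica", "versicolor"]
labels, matrix = confusion_counts(y_true, y_pred)
for label, row in zip(labels, matrix):
    print(f"{label:>10}", row)
```

The diagonal holds the correct predictions; everything off the diagonal is a misclassification.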

And the following charts:

  • class_imbalance
  • class_error
  • det
  • feature_importance
  • missing
  • precision_recall
  • roc
  • prediction_histogram
  • threshold
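Charts such as precision_recall and threshold show how metrics change as the decision cut-off moves. The sketch below shows that idea in plain Python for a binary problem. It is an illustration under stated assumptions, not the library's code: precision_recall_at is a hypothetical helper, the scores are made up, and reporting a precision of 1.0 when nothing is predicted positive is a convention chosen here.

```python
def precision_recall_at(y_true, y_score, threshold):
    """Precision and recall for binary labels (1 = positive) at one cut-off."""
    predicted = [score >= threshold for score in y_score]
    tp = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p)
    fp = sum(1 for t, p in zip(y_true, predicted) if t == 0 and p)
    fn = sum(1 for t, p in zip(y_true, predicted) if t == 1 and not p)
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels and scores, made up for illustration only.
y_true = [0, 0, 1, 1, 1]
y_score = [0.1, 0.6, 0.4, 0.7, 0.9]
sweep = {t: precision_recall_at(y_true, y_score, t) for t in (0.3, 0.5, 0.8)}
for threshold, (precision, recall) in sweep.items():
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

Raising the threshold in this toy data trades recall for precision, which is the trade-off these charts are meant to make visible.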

For clustering:

  • Delaunay triangulations
  • Voronoi tessellations
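For readers new to the term, a Voronoi tessellation assigns every point of the plane to its nearest seed point (for example, a cluster center), and the Delaunay triangulation is its geometric dual. The following is a rough pure-Python sketch of the nearest-seed idea on a small grid. The seeds and function names are invented for illustration and say nothing about how classgraphic draws its charts; for real work, scipy.spatial provides Voronoi and Delaunay classes.

```python
import math


def nearest_seed(point, seeds):
    """Index of the seed closest to point (Euclidean distance)."""
    return min(range(len(seeds)), key=lambda i: math.dist(point, seeds[i]))


def voronoi_grid(seeds, width, height):
    """Label every integer grid cell with the index of its nearest seed."""
    return [[nearest_seed((x, y), seeds) for x in range(width)] for y in range(height)]


# Seed points made up for illustration only.
seeds = [(1, 1), (6, 1), (3, 4)]
grid = voronoi_grid(seeds, width=8, height=6)
for row in grid:
    print("".join(str(cell) for cell in row))
```

Each printed digit is the index of the seed that owns that cell, so the blocks of equal digits approximate the Voronoi cells.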

Try it


Trying it on Binder shows all the details and interactivity. The quickstart below uses static images, but if you run these commands in a Jupyter notebook, IPython or an IDE, you will be able to interact with the output.

Quickstart

from classgraphic.essential import *

# loading the data
df = px.data.iris()

# let's see what kind of data we have
describe(df, transpose=True).show()

dataframe describe table

# any missing?
missing(df)


# features
X = df.drop(columns=["species", "species_id"])

# target
y = df["species"]

# Let's check the classes we will be training on and predicting
class_imbalance_table(y, condition="all")


# train / test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=random_state
)

# we want the total count for each class; bars are stacked by default, which gives us that
# alternatively, pass barmode="overlay" to class_imbalance if we prefer
class_imbalance(y_train, y_test, condition="train,test")


# model
model = LogisticRegression(max_iter=max_iter, random_state=random_state)
model.fit(X_train, y_train)

# predictions
y_score = model.predict_proba(X_test)
y_pred = model.predict(X_test)

confusion_matrix_table(model, y_test, y_pred).show()
classification_table(model, y_test, y_pred)


feature_importance(model, y, transpose=True)


This concludes the quickstart. There are many more visualizations and tables to explore.

See the notebooks and docs folders on GitHub and the documentation website for more information.

Requirements

  • Python 3.8 or later
  • numpy
  • pandas
  • plotly>=5.0
  • scikit-learn
  • nbformat
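A quick way to confirm that an environment satisfies this list is sketched below, using only the standard library. The helper missing_modules is hypothetical, and the import names are an assumption based on how these packages are normally imported (scikit-learn is imported as sklearn). The sketch does not check the plotly>=5.0 version constraint.

```python
import importlib.util
import sys

# Import names for the requirements listed above; note that scikit-learn
# is imported as "sklearn".
REQUIRED_MODULES = ["numpy", "pandas", "plotly", "sklearn", "nbformat"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


python_ok = sys.version_info >= (3, 8)
missing = missing_modules(REQUIRED_MODULES)
print("Python 3.8 or later:", python_ok)
print("Missing modules:", missing or "none")
```

Using find_spec looks a module up without importing it, so the check stays fast even for heavy packages.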

Install

If you use conda, create an environment named classgraphic, then activate it:

  • In Linux: source activate classgraphic

  • In Windows: conda activate classgraphic

If you use another environment manager, create and activate your environment using its normal steps.

Then execute:

python setup.py install

or for installing in development mode:

python -m pip install -e . --no-build-isolation

or alternatively

python setup.py develop

To install from GitHub instead:

pip install git+https://github.com/dionresearch/classgraphic

See also

  • stemgraphic: Python package for visualization of data and text
  • Hotelling: one- and two-sample Hotelling T2 tests, T2 and F statistics, univariate and multivariate control charts, and anomaly detection

History

0.3.1 (2023-09-20)

  • bugfix for describe with pandas 2.x

0.3.0 (2023-05-01)

  • added 2D clustering visualization
  • defaults to Voronoi tessellation
  • optional Delaunay triangulation

0.2.1 (2022-09-20)

  • fixed image not showing on pypi
  • fixed feature importance error
  • fixed warning = False not preventing the warning from showing

0.2.1 (2022-09-19)

  • added binary classification notebook example
  • fixed issue with non dataframe binary classification

0.2.0 (2022-09-18)

The previous version was a first step toward a public release. This release:

  • added documentation
  • updated the code to be in line with plotly 5.x

It was released to GitHub and PyPI.

0.1.0 (2019-10-27)

  • First private release

Origins

Inspired by Dion Research LLC's internal EDA/anomaly and end-to-end data science platform. A dozen charts and tables were initially designed to provide better diagnostic reporting. Some can also be used for exploratory or explanatory purposes.

See: https://blog.dionresearch.com/2019/10/visualizations-explanatory-exploratory.html

