Interactive classification diagnostic plots
classgraphic
Interactive classification diagnostic plots for scikit-learn.
We classify things for the purpose of doing something to them. Any classification which does not assist manipulation is worse than useless. - Randolph S. Bourne, "Education and Living", The Century Co (April 1917)
Major features:
Plotly-based tables for:
- class_imbalance_table
- classification_table
- confusion_matrix_table
- describe (dataframe stats)
- prediction_table
- table
And the following charts:
- class_imbalance
- class_error
- det
- feature_importance
- missing
- precision_recall
- roc
- prediction_histogram
- threshold
For clustering:
- Delaunay triangulations
- Voronoi tessellations
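For readers new to the two clustering views: a Voronoi tessellation splits the plane into one cell per seed point, where each cell holds every location closer to that seed than to any other, and the Delaunay triangulation is its dual. The sketch below is not classgraphic code. It is a minimal standard-library illustration of Voronoi cell membership, and the names `voronoi_cell`, `seeds` and `samples` are hypothetical.

```python
from math import dist


def voronoi_cell(point, seeds):
    """Return the index of the seed whose Voronoi cell contains `point`."""
    # A point belongs to the cell of its nearest seed.
    # On an exact tie, the seed with the lowest index wins.
    return min(range(len(seeds)), key=lambda i: dist(point, seeds[i]))


# Hypothetical 2D cluster centres and sample points
seeds = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
samples = [(1.0, 0.5), (3.5, 1.0), (0.5, 3.0)]

cells = [voronoi_cell(p, seeds) for p in samples]
print(cells)
```

Each sample lands in the cell of the seed it sits closest to, which is the partition a Voronoi plot draws as polygon boundaries.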
Try it
Trying it on Binder shows all the details and interactivity. The quickstart below has static images, but if you run these commands in a Jupyter notebook, IPython, or an IDE, you will be able to interact with the plots.
Quickstart
from classgraphic.essential import *
# loading the data
df = px.data.iris()
# let's see what kind of data we have
describe(df, transpose=True).show()
# any missing?
missing(df)
# features
X = df.drop(columns=["species", "species_id"])
# target
y = df["species"]
# Let's check our classes we will be training on and predicting
class_imbalance_table(y, condition="all")
# train / test split
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.5, random_state=random_state
)
# we want to see total count for each, default for bars is to be stacked, so that works
# we could also pass to class_imbalance barmode="overlay" if we prefer
class_imbalance(y_train, y_test, condition="train,test")
# model
model = LogisticRegression(max_iter=max_iter, random_state=random_state)
model.fit(X_train, y_train)
# predictions
y_score = model.predict_proba(X_test)
y_pred = model.predict(X_test)
confusion_matrix_table(model, y_test, y_pred).show()
classification_table(model, y_test, y_pred)
feature_importance(model, y, transpose=True)
This concludes the quickstart. There are many more visualizations and tables to explore.
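As a rough guide to the numbers a confusion matrix table and a classification table summarize, the sketch below tallies (actual, predicted) label pairs and derives per-class precision and recall with the standard library only. It is not how classgraphic computes its tables; the function names and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))


def precision_recall(counts, label):
    """Per-class precision and recall derived from the pair counts."""
    true_positive = counts[(label, label)]
    predicted = sum(n for (_, p), n in counts.items() if p == label)
    actual = sum(n for (a, _), n in counts.items() if a == label)
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall


# Toy labels for illustration, not real model output
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

counts = confusion_counts(y_true, y_pred)
for label in sorted(set(y_true)):
    print(label, precision_recall(counts, label))
```

In this toy data one versicolor sample is predicted as virginica, so versicolor loses recall and virginica loses precision, which is the kind of pattern the diagnostic tables are meant to surface.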
See the notebooks and docs folders on GitHub and the documentation web site for more information.
Requirements
- Python 3.8 or later
- numpy
- pandas
- plotly>=5.0
- scikit-learn
- nbformat
Install
If you use conda, create an environment named classgraphic, then activate it:
- In Linux:
source activate classgraphic
- In Windows:
conda activate classgraphic
If you use another environment manager, create and activate your environment using the normal steps.
Then execute:
python setup.py install
or for installing in development mode:
python -m pip install -e . --no-build-isolation
or alternatively
python setup.py develop
To install from github instead:
pip install git+https://github.com/dionresearch/classgraphic
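After installing, one way to confirm that the package and the requirements listed above are present is to query the installed distributions. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution names are assumed to match the names in the Requirements list.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names assumed from the Requirements list
required = ["numpy", "pandas", "plotly", "scikit-learn", "nbformat", "classgraphic"]
report = {name: installed_version(name) for name in required}

print("Python >= 3.8:", sys.version_info >= (3, 8))
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

The sketch only reports versions; it does not parse them, so checking that plotly is 5.0 or later is left to the reader.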
See also
- stemgraphic: Python package for visualization of data and text
- Hotelling: one- and two-sample Hotelling T2 tests, T2 and f statistics, univariate and multivariate control charts, and anomaly detection
History
0.3.1 (2023-09-20)
- bugfix for describe with pandas 2.x
0.3.0 (2023-05-01)
- added 2D clustering visualization
- defaults to Voronoi tessellation
- optional Delaunay triangulation
0.2.1 (2022-09-20)
- fixed image not showing on pypi
- fixed feature importance error
- fixed warning=False not preventing the warning from showing
0.2.1 (2022-09-19)
- added binary classification notebook example
- fixed issue with non dataframe binary classification
0.2.0 (2022-09-18)
The previous version was a first step toward a public release. This release:
- added documentation
- updated the code to be in line with plotly 5.x
It was released to GitHub and PyPI.
0.1.0 (2019-10-27)
- First private release
Origins
Inspired by Dion Research LLC's internal EDA/anomaly and end-to-end data science platform. A dozen charts and tables were initially designed to provide better diagnostic reporting. Some can also be used for exploratory or explanatory purposes.
See: https://blog.dionresearch.com/2019/10/visualizations-explanatory-exploratory.html
File details
Details for the file classgraphic-0.3.1.tar.gz.
File metadata
- Download URL: classgraphic-0.3.1.tar.gz
- Upload date:
- Size: 546.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 75803f1e09d4990661d03e08c0ce73278782e08835da8970f24ea07012f029d5
MD5 | 6694ffabda467806d6bd9880689bd1df
BLAKE2b-256 | 8d953cd340432d090a24c87c3c8429534fd4485419f4dab0f368a6331b973247
File details
Details for the file classgraphic-0.3.1-py3-none-any.whl.
File metadata
- Download URL: classgraphic-0.3.1-py3-none-any.whl
- Upload date:
- Size: 20.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | d0e5a9cd816845f583d156a11b4b8f1782b694ba85b9b1c08941db3c777a127b
MD5 | 6c9ee27529b57e210599291b2c69dc6b
BLAKE2b-256 | e11ee16382d15f79f855f6f9c75263dc93535daccb9c5cdea71f549001a4c25c