Python package that implements the SS3 text classifier (with visualization tools for XAI)

:sparkles: A python package implementing a novel text classifier with visualization tools for Explainable AI :sparkles:

The SS3 text classifier is a novel supervised machine learning model for text classification. SS3 was originally introduced in Section 3 of the paper "A text classification framework for simple and effective early depression detection over social media streams" (preprint available here).

Some virtues of SS3:

- It has the ability to naturally explain its rationale.
- It is robust to the class-imbalance problem, since it learns a (special kind of) language model for each class (making the relative difference in the number of documents among classes irrelevant).
- It naturally supports both multinomial and multi-label classification.
- It naturally supports incremental (online) learning and incremental classification.
- It is well suited for classification over text streams.
- It is not an "obscure" model, since it has only 3 semantically well-defined hyperparameters, which are easy to understand.
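To make the ideas of "one model per class" and incremental learning more concrete, here is a toy sketch in plain Python. It is not SS3 and does not use PySS3: the class `ToyPerClassModel`, its methods, and the scoring rule (word frequency relative to the most frequent word of the same class) are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


class ToyPerClassModel:
    """Toy per-class word-frequency model (illustration only, not SS3)."""

    def __init__(self):
        # One independent word counter per class label.
        self.counts = defaultdict(Counter)

    def learn(self, text, label):
        # Incremental update: only the counter of `label` changes,
        # so new documents can be learned without retraining from scratch.
        self.counts[label].update(text.lower().split())

    def word_score(self, word, label):
        # Frequency relative to the most frequent word of the same class,
        # so the score does not depend on how large the other classes are.
        counter = self.counts[label]
        if not counter:
            return 0.0
        return counter[word] / max(counter.values())

    def classify(self, text):
        # Sum the per-class word scores and pick the highest-scoring class.
        words = text.lower().split()
        totals = {
            label: sum(self.word_score(w, label) for w in words)
            for label in self.counts
        }
        return max(totals, key=totals.get)


model = ToyPerClassModel()
model.learn("the match ended with a late goal", "sports")
model.learn("the market closed higher on strong earnings", "business")
# A document that arrives later is simply added to its class counter.
model.learn("the striker scored another goal", "sports")

print(model.classify("a late goal"))  # sports
```

Because each class keeps its own counts and scores are normalized within the class, adding many more documents to one class does not change the scores of the others. The real SS3 model is considerably more elaborate, and its three hyperparameters have no counterpart in this sketch.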

Note: this package also incorporates different variations of the SS3 classifier, such as the one introduced in "t-SS3: a text classifier with dynamic n-grams for early risk detection over text streams" (recently submitted to Pattern Recognition Letters, preprint available here), which allows SS3 to recognize important word n-grams "on the fly".

What is PySS3?

PySS3 is a Python package that allows you to work with SS3 in a very straightforward, interactive and visual way. In addition to the implementation of the SS3 classifier, PySS3 comes with a set of tools to help you develop your machine learning models in a clearer and faster way. These tools let you analyze, monitor and understand your models by allowing you to see what they have actually learned and why. To achieve this, PySS3 provides you with 3 main components: the SS3 class, the Live_Test class, and the Evaluation class, as described below.

:point_right: The `SS3` class

which implements the classifier using a clear API (very similar to that of sklearn's models):

    from pyss3 import SS3
    clf = SS3()
    ...
    clf.fit(x_train, y_train)
    y_pred = clf.predict(x_test)

This class also provides a handful of other useful methods, such as classify_multilabel to provide multi-label classification support:

    doc = "Liverpool CEO Peter Moore on Building a Global Fanbase"

    # standard "single-label" classification
    label = clf.classify_label(doc) # 'business'

    # multi-label classification
    labels = clf.classify_multilabel(doc)  # ['business', 'sports']

or extract_insight to allow you to extract the text fragments involved in the classification decision.

:point_right: The `Live_Test` class

which allows you to interactively test your model and visually see the reasons behind classification decisions, with just one line of code:

    from pyss3.server import Live_Test
    from pyss3 import SS3

    clf = SS3(name="my_model")
    ...
    clf.fit(x_train, y_train)
    Live_Test.run(clf, x_test, y_test) # <- this one! cool uh? :)

This will open up, locally, an interactive tool in your browser which you can use to (live) test your models with the documents given in x_test (or by typing in your own!). This will allow you to visualize and understand what your model is actually learning.

For example, we have uploaded two of these live tests online for you to try out: "Movie Review (Sentiment Analysis)" and "Topic Categorization"; both were obtained by following the tutorials.

:point_right: And last but not least, the `Evaluation` class

This is probably one of the most useful components of PySS3. As the name suggests, this class provides the user with easy-to-use methods for model evaluation and hyperparameter optimization, such as the test, kfold_cross_validation, grid_search, and plot methods, which perform tests, stratified k-fold cross-validations, and grid searches for hyperparameter optimization, and visualize evaluation results using an interactive 3D plot, respectively. Probably one of its most important features is the ability to automatically (and permanently) record the history of the evaluations that you've performed. This will save you a lot of time and will allow you to interactively visualize and analyze your classifier's performance in terms of its different hyperparameter values (and select the best model according to your needs). For instance, let's perform a grid search with 4-fold cross-validation on the three hyperparameters, smoothness (s), significance (l), and sanction (p):

    from pyss3.util import Evaluation
    ...
    best_s, best_l, best_p, _ = Evaluation.grid_search(
        clf, x_train, y_train,
        s=[0.2, 0.32, 0.44, 0.56, 0.68, 0.8],
        l=[0.1, 0.48, 0.86, 1.24, 1.62, 2],
        p=[0.5, 0.8, 1.1, 1.4, 1.7, 2],
        k_fold=4
    )

In this illustrative example, s, l, and p will take those 6 different values each, and once the search is over, this function will return (by default) the hyperparameter values that obtained the best accuracy. Now, we could also use the plot function to analyze the results obtained in our grid search using the interactive 3D evaluation plot:
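To get a feel for the size of this search, the following standalone sketch enumerates the same value lists with `itertools.product`. It only counts hyperparameter combinations and fold-level evaluations; it does not call PySS3 and makes no claim about how `grid_search` schedules the work internally.

```python
from itertools import product

# Same candidate values as in the grid search above.
s_values = [0.2, 0.32, 0.44, 0.56, 0.68, 0.8]
l_values = [0.1, 0.48, 0.86, 1.24, 1.62, 2]
p_values = [0.5, 0.8, 1.1, 1.4, 1.7, 2]
k_fold = 4

# Every (s, l, p) combination the search has to score.
grid = list(product(s_values, l_values, p_values))

print(len(grid))           # 216 combinations (6 * 6 * 6)
print(len(grid) * k_fold)  # 864 fold-level evaluations
print(grid[0])             # (0.2, 0.1, 0.5)
```

Since the number of combinations grows multiplicatively with each list, adding a seventh value to every hyperparameter would raise the count from 216 to 343.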

    Evaluation.plot()

In this 3D plot, each point represents an experiment/evaluation performed using that particular combination of values (s, l, and p). The points are colored in proportion to how good the performance was with that configuration of the model. Researchers can interactively change the evaluation metric being used (accuracy, precision, recall, f1, etc.) and the plots will update "on the fly". Additionally, when the cursor is moved over a data point, useful information is shown (including a "compact" representation of the confusion matrix obtained in that experiment). Finally, it is worth mentioning that, before showing the 3D plots, PySS3 creates a single, portable HTML file in your project folder containing the interactive plots. This allows researchers to store, send, or upload the plots to another place using this single HTML file (or even to link to this file in their own papers, which would be nicer for readers and would increase the transparency of the experiments). For example, we have uploaded two of these files for you to see: "Movie Review (Sentiment Analysis)" and "Topic Categorization"; both evaluation plots were also obtained by following the tutorials.
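As a reminder of how the metrics mentioned above relate to a confusion matrix, here is a small standalone sketch. The function `metrics_from_confusion` and the example counts are hypothetical and are not part of PySS3; the matrix layout (rows are true classes, columns are predicted classes) is an assumption of this example.

```python
def metrics_from_confusion(matrix, labels):
    """Compute accuracy and per-class precision, recall and F1.

    matrix[i][j] is the number of documents whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {"accuracy": correct / total}

    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


labels = ["pos", "neg"]
matrix = [[40, 10],   # true "pos": 40 predicted pos, 10 predicted neg
          [20, 30]]   # true "neg": 20 predicted pos, 30 predicted neg

report = metrics_from_confusion(matrix, labels)
print(report["accuracy"])         # 0.7
print(report["neg"]["precision"]) # 0.75
```

Two configurations with the same accuracy can differ noticeably in per-class precision and recall, which is one reason to switch between metrics when comparing points in the plot.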

The PySS3 Workflow :computer:

PySS3 provides two main types of workflow: classic and "command-line". Both workflows are briefly described below.

Classic

As usual, the user imports the needed classes and functions from the package and writes a Python script to train and test the classifiers. In this workflow, the user can use the PySS3 Command Line tool to perform model selection (through hyperparameter optimization).

Command-Line

When you install the package (for instance, by using `pip install pyss3`), a new command, `pyss3`, is automatically added to your environment's command line. This command gives you access to the PySS3 Command Line, an interactive command-line query tool. This workflow consists of using this tool to carry out the whole machine learning pipeline (model selection, training, testing, etc.), which provides a faster way to run experiments since the user doesn't have to write any Python script. Plus, this Command Line tool allows the user to actively interact "on the fly" with the models being developed.

Note: tutorials are presented in two versions, one for each workflow type, so that readers can choose the workflow that best suits their needs.

Want to give PySS3 a try? :eyeglasses: :coffee:

Just go to the Getting Started page :D

Installation

Simply use:

    pip install pyss3

Or, if you already have an old version installed, update it with:

    pip install --upgrade pyss3
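After installing or upgrading, one way to check which version is present is to query the package metadata from Python. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if pyss3 is installed, otherwise None.
print(installed_version("pyss3"))
```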

Want to contribute to this Open Source project? :sparkles::octocat::sparkles:

Thanks for your interest in the project, you're awesome!! Any kind of help is very welcome (code, bug reports, content, data, documentation, design, examples, ideas, feedback, etc.). Issues and/or Pull Requests are welcome for any level of improvement, from a small typo to new features. Help us make PySS3 better :+1:

Remember that you can use the "Edit" button ('pencil' icon) at the top to edit any file of this repo directly on GitHub.

Also, if you star this repo (:star2:), you will be helping PySS3 gain more visibility and reach the hands of people who may find it useful, since repository lists and search results are usually ordered by the total number of stars.

Finally, if you're planning to create a new Pull Request, note that for commits to this repo we follow the "seven rules of a great Git commit message" from "How to Write a Git Commit Message", so make sure your commits follow them as well.

(Please do not hesitate to email me at sergio.burdisso@gmail.com about anything.)

Release history

Version	Release date
0.6.4	Jan 30, 2021
0.6.3	Jul 17, 2020
0.6.2	Jun 19, 2020
0.6.1	May 27, 2020
0.6.0	May 24, 2020
0.5.9	May 8, 2020
0.5.8	May 5, 2020
0.5.7	May 5, 2020
0.5.6	Mar 30, 2020
0.5.5	Mar 2, 2020
0.5.4	Feb 27, 2020
0.5.3	Feb 27, 2020
0.5.2	Feb 26, 2020
0.5.1	Feb 25, 2020
0.5.0 (this version)	Feb 25, 2020
0.4.1	Feb 16, 2020
0.4.0	Feb 12, 2020
0.3.9.2	Jan 20, 2020
0.3.9.1	Jan 20, 2020
0.3.9	Nov 27, 2019
0.3.8	Nov 25, 2019
0.3.7	Nov 22, 2019
0.3.6	Nov 15, 2019
0.3.5	Nov 12, 2019
0.3.4	Nov 12, 2019
0.3.3	Nov 12, 2019
0.3.2	Nov 12, 2019
0.3.0	Nov 11, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyss3-0.5.0.tar.gz (116.2 kB)

Uploaded Feb 25, 2020 Source

Built Distribution

pyss3-0.5.0-py3-none-any.whl (2.0 MB)

Uploaded Feb 25, 2020 Python 3

Hashes for pyss3-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`c4a30f976b98a539e1a0836e3559873e458e893ede5c341bdedec410f06ce3ef`
MD5	`c950c55a93535200ff3044e701fa0e64`
BLAKE2b-256	`c42e3031b00d0b0d42736e37a4f5379cddb082d742b70af69d06c290b80ca184`

Hashes for pyss3-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b70a572dbc7da0540063b9cfef9a5bed9542d0967ad103075580a89b123e5ac0`
MD5	`9fff10043b846c0bd10fed1093f750ec`
BLAKE2b-256	`7bc9ca28cd36d094a568ff532d7005bf68be37fe51ca5d52509a4f00bfb1e948`
