Utility Belt for Machine Learning

Utility Library Module for Advanced Machine Learning

Install

pip install wolvr

Example 1: Naive Bayes

from wolvr import naive_bayes as nb
nb.demo(dataset="SMS_spam")

Selected Output:

This demo uses a public dataset of SMS spam, which has a total of 5574 messages = 747 spam and 4827 ham (legitimate).
The goal is to use 'term frequency in message' to predict whether the message is ham (class=0) or spam (class=1).

Using a grid search and a multinomial naive bayes classifier, the best hyperparameters were found as following:
   Step1: Tokenizing text: CountVectorizer(analyzer = 'word', ngram_range = (1, 1));
   Step2: Transforming from occurrences to frequency: TfidfTransformer(use_idf = True).

The top 2 terms with highest probability of a message being a spam (the classification is either spam or ham):
   "claim": 80.73%
   "prize": 80.06%

Application example:
   - Message: "URGENT! We are trying to contact U. Todays draw shows that you have won a 2000 prize GUARANTEED. Call 090 5809 4507 from a landline. Claim 3030. Valid 12hrs only."
   - Probability of class=1 (spam): 98.32%
   - Classification: spam
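
The output above names a multinomial naive Bayes classifier over word counts. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not wolvr's implementation: the function names, the whitespace tokenizer, and the four toy messages are all hypothetical, and it uses raw term counts with Laplace smoothing rather than the tf-idf weights and grid search mentioned in the output.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return [w for w in words if w]


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log term likelihoods per class."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)  # label -> term -> count
    for text, label in zip(messages, labels):
        term_counts[label].update(tokenize(text))
    vocab = {t for counts in term_counts.values() for t in counts}

    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for label, n in class_counts.items():
        model["log_prior"][label] = math.log(n / len(labels))
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            t: math.log((term_counts[label][t] + alpha) / total) for t in vocab
        }
    return model


def predict_proba(model, text):
    """Return {label: posterior probability} for a single message."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        for term in tokenize(text):
            if term in model["vocab"]:  # terms unseen in training are ignored
                score += model["log_lik"][label][term]
        scores[label] = score
    # Normalize in log space to avoid underflow.
    top = max(scores.values())
    exp = {k: math.exp(v - top) for k, v in scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


# Toy data for illustration only (1 = spam, 0 = ham); not the SMS spam dataset.
toy_messages = [
    "win a prize now",
    "claim your prize",
    "are we meeting today",
    "see you at lunch today",
]
toy_labels = [1, 1, 0, 0]
toy_model = train_multinomial_nb(toy_messages, toy_labels)
print(predict_proba(toy_model, "claim prize"))
```

With so few messages the probabilities are not meaningful in themselves; the point is only that terms seen more often in spam push the posterior toward class 1.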

image_naive_bayes_confusion_matrix

image_naive_bayes_ROC_curve

image_naive_bayes_PR_curve


Example 2: k-Nearest Neighbors

from wolvr import kNN
kNN.demo("Social_Network_Ads")

Selected Output:

This demo uses a public dataset of Social Network Ads, which is used to determine what audience a car company should target in its ads in order to sell a SUV on a social network website.

Using a grid search and a kNN classifier, the best hyperparameters were found as following:
   Step1: scaler: StandardScaler(with_mean=True, with_std=True);
   Step2: classifier: kNN_classifier(n_neighbors=8, weights='uniform', p=1.189207115002721, metric='minkowski').
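
The two steps reported above (standardize the features, then take a uniform-weight vote among the nearest neighbors under a Minkowski distance) can be sketched in plain Python as follows. This is an illustrative sketch, not wolvr's code: the function names and the toy points are hypothetical. The reported `p=1.189207115002721` equals `2 ** 0.25`, which lies between Manhattan (`p=1`) and Euclidean (`p=2`) distance; that it came from a log-spaced search grid is only a guess.

```python
import math
from collections import Counter


def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    scaled = [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=3, p=2.0):
    """Majority vote among the k nearest training points (uniform weights)."""
    ranked = sorted(
        zip(train_X, train_y), key=lambda pair: minkowski(pair[0], query, p)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature toy data; not the Social Network Ads dataset.
X = [[1.0, 10.0], [2.0, 12.0], [1.5, 11.0], [8.0, 50.0], [9.0, 52.0], [8.5, 49.0]]
y = [0, 0, 0, 1, 1, 1]

X_scaled, means, stds = standardize(X)
query = [(v - m) / s for v, m, s in zip([8.2, 51.0], means, stds)]
print(knn_predict(X_scaled, y, query, k=3, p=2 ** 0.25))
```

Note that the query point is scaled with the training means and standard deviations, so the distances are computed in the same feature space as the training set.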

image_kNN_confusion_matrix

image_kNN_decision_boundary_testing_set

image_kNN_ROC_curve

image_kNN_PR_curve


Example 3: Decision Boundary Comparison

from wolvr import kNN
kNN.demo("Social_Network_Ads")

from wolvr import naive_bayes as nb
nb.demo("Social_Network_Ads")

from wolvr import SVM
SVM.demo("Social_Network_Ads")

image_kNN_decision_boundary_testing_set

image_Gaussian_NB_decision_boundary_testing_set

image_SVM_decision_boundary_testing_set


module: model_evaluation

function                  description
plot_confusion_matrix()   plots the confusion matrix, along with key statistics, and returns accuracy
plot_ROC_curve()          plots the ROC (Receiver Operating Characteristic) curve, along with statistics
plot_PR_curve()           plots the precision-recall curve, along with statistics
plot_ROC_and_PR_curves()  plots both the ROC and the precision-recall curves, along with statistics
demo()                    provides a demo of the major functions in this module
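
For readers unfamiliar with the quantities behind these plots, the following is a small plain-Python sketch of how confusion-matrix statistics and ROC points can be computed for a binary classifier. It does not call or reproduce the wolvr functions listed above, whose signatures are not documented here; all names in the block are hypothetical, and it assumes labels are 0/1 with at least one example of each class.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def summary_stats(y_true, y_pred):
    """Accuracy, precision, and recall derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_points(y_true, scores):
    """Sweep the threshold over distinct scores (high to low) to get (FPR, TPR)."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy labels and scores for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.4 else 0 for s in scores]

print(summary_stats(y_true, y_pred))
print(auc(roc_points(y_true, scores)))
```

A precision-recall curve can be built from the same threshold sweep by recording (recall, precision) instead of (FPR, TPR) at each threshold.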


module: naive_bayes

function                   description
naive_bayes_Bernoulli()    when X are independent binary variables (e.g., whether a word occurs in a document or not)
naive_bayes_multinomial()  when X are independent discrete variables with 3+ levels (e.g., term frequency in the document)
naive_bayes_Gaussian()     when X are continuous variables
demo()                     provides a demo of selected functions in this module
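
To make the continuous-feature case concrete, here is a minimal plain-Python sketch of a Gaussian naive Bayes classifier: each feature is modeled with a per-class normal distribution, and the class with the highest log posterior wins. This is an illustration under stated assumptions, not the body of `naive_bayes_Gaussian()`; the function names, the variance floor, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small floor keeps the variance strictly positive.
        variances = [
            max(sum((x - m) ** 2 for x in c) / len(c), 1e-9)
            for c, m in zip(cols, means)
        ]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for x, m, v in zip(row, means, variances):
            # Log density of a normal distribution with mean m and variance v.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data with two continuous features.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.1]), predict(model, [5.1, 5.9]))
```

The Bernoulli and multinomial variants follow the same structure and differ only in the per-feature likelihood that replaces the normal density.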


module: kNN

function  description
demo()    provides a demo of selected functions in this module


module: neural_network

function  description
rnn()     Recurrent neural network
demo()    provides a demo of selected functions in this module


module: decision_tree

function  description
boost()   Boosting
demo()    provides a demo of selected functions in this module

