
Utility Belt for Machine Learning


Utility Library Module for Advanced Machine Learning

Install

pip install wolvr

Example 1: Naive Bayes

from wolvr import naive_bayes as nb
nb.demo(dataset="SMS_spam")

Selected Output:

This demo uses a public dataset of SMS spam containing 5574 messages in total: 747 spam and 4827 ham (legitimate).
The goal is to use 'term frequency in message' to predict whether the message is ham (class=0) or spam (class=1).

Using a grid search and a multinomial naive Bayes classifier, the best hyperparameters were found to be as follows:
   Step1: Tokenizing text: CountVectorizer(analyzer = 'word', ngram_range = (1, 1));
   Step2: Transforming from occurrences to frequency: TfidfTransformer(use_idf = True).

The top 2 terms with the highest probability of a message being spam (each message is classified as either spam or ham):
   "claim": 80.73%
   "prize": 80.06%

Application example:
   - Message: "URGENT! We are trying to contact U. Todays draw shows that you have won a 2000 prize GUARANTEED. Call 090 5809 4507 from a landline. Claim 3030. Valid 12hrs only."
   - Probability of class=1 (spam): 98.32%
   - Classification: spam

image_naive_bayes_confusion_matrix

image_naive_bayes_ROC_curve

image_naive_bayes_PR_curve
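
To make the spam probability in the output above more concrete, here is a minimal sketch of a multinomial naive Bayes classifier written with only the Python standard library. It is not wolvr's implementation: it uses raw term counts with Laplace smoothing instead of the TF-IDF pipeline reported by the demo, and the function names (`train_multinomial_nb`, `predict_proba`) and the four-message corpus are made up for illustration.

```python
import math
from collections import Counter

def train_multinomial_nb(messages, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log term likelihoods per class."""
    vocab = {tok for msg in messages for tok in msg}
    classes = sorted(set(labels))
    term_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for msg, y in zip(messages, labels):
        term_counts[y].update(msg)
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in classes:
        total = sum(term_counts[c].values())
        model["log_prior"][c] = math.log(doc_counts[c] / len(labels))
        model["log_lik"][c] = {
            t: math.log((term_counts[c][t] + alpha) / (total + alpha * len(vocab)))
            for t in vocab
        }
    return model

def predict_proba(model, message):
    """Return P(class | message); tokens outside the vocabulary are ignored."""
    scores = {}
    for c, prior in model["log_prior"].items():
        known = [t for t in message if t in model["vocab"]]
        scores[c] = prior + sum(model["log_lik"][c][t] for t in known)
    # Normalize in log space to avoid underflow on long messages.
    top = max(scores.values())
    norm = sum(math.exp(s - top) for s in scores.values())
    return {c: math.exp(s - top) / norm for c, s in scores.items()}

# Hypothetical toy corpus: 1 = spam, 0 = ham.
train = [
    ("claim your prize now", 1),
    ("urgent prize claim call now", 1),
    ("are we still meeting for lunch", 0),
    ("call me when you are home", 0),
]
messages = [text.split() for text, _ in train]
labels = [y for _, y in train]

model = train_multinomial_nb(messages, labels)
proba = predict_proba(model, "you have won a prize claim now".split())
print(f"P(spam) = {proba[1]:.2%}")
```

On this toy corpus the terms "claim" and "prize" occur only in spam, so they pull the posterior strongly toward class 1, which mirrors the pattern in the demo output.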


Example 2: k-Nearest Neighbors

from wolvr import kNN
kNN.demo("Social_Network_Ads")

Selected Output:

This demo uses a public dataset of Social Network Ads, which is used to determine which audience a car company should target with its ads in order to sell an SUV on a social network website.

Using a grid search and a kNN classifier, the best hyperparameters were found to be as follows:
   Step1: scaler: StandardScaler(with_mean=True, with_std=True);
   Step2: classifier: kNN_classifier(n_neighbors=8, weights='uniform', p=1.189207115002721, metric='minkowski').

image_kNN_confusion_matrix

image_kNN_decision_boundary_testing_set

image_kNN_ROC_curve

image_kNN_PR_curve
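
The two steps reported above (standardize the features, then vote among the nearest neighbors under a Minkowski metric) can be sketched in plain Python. This is an illustrative sketch, not wolvr's code: the helper names, the two-feature rows, and the labels are all hypothetical. The exponent `p` sits between 1 (Manhattan distance) and 2 (Euclidean distance); the value found by the grid search is numerically very close to `2 ** 0.25`.

```python
import math
from collections import Counter

def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # guard constant columns
        for c, m in zip(cols, means)
    ]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_X, train_y, query, k, p):
    """Majority vote (uniform weights) among the k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical rows of two numeric features with a binary label.
X = [[22, 20000], [25, 30000], [30, 35000], [45, 90000], [50, 110000], [55, 120000]]
y = [0, 0, 0, 1, 1, 1]

X_scaled, means, stds = standardize(X)

def scale(row):
    """Apply the training-set scaling to a new row."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

p = 2 ** 0.25
high = knn_predict(X_scaled, y, scale([48, 100000]), k=3, p=p)
low = knn_predict(X_scaled, y, scale([24, 25000]), k=3, p=p)
print(high, low)
```

Scaling matters here because the second feature is several orders of magnitude larger than the first; without it, the distance would be dominated by that column.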


Example 3: Decision Boundary Comparison

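# Each demo fits a different classifier to the same dataset, so the
# decision-boundary plots below can be compared side by side.
from wolvr import kNN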
kNN.demo("Social_Network_Ads")

from wolvr import naive_bayes as nb
nb.demo("Social_Network_Ads")

from wolvr import SVM
SVM.demo("Social_Network_Ads")

image_kNN_decision_boundary_testing_set

image_Gaussian_NB_decision_boundary_testing_set

image_SVM_decision_boundary_testing_set
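
A decision-boundary plot is typically produced by classifying every point on a grid that covers the feature space and coloring each point by its predicted class. The sketch below shows that idea in plain Python with two simple stand-in classifiers (1-nearest-neighbor and nearest centroid) and a text grid instead of an image. The classifiers, the function names, and the data are assumptions chosen for illustration; they are not the models or the plotting code that wolvr uses.

```python
def nearest_neighbor(train, point):
    """1-NN: label of the closest training point (squared Euclidean distance)."""
    closest = min(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )
    return closest[1]

def nearest_centroid(train, point):
    """Label of the class whose mean point is closest to the query."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    centroids = [((sx / n, sy / n), label) for label, (sx, sy, n) in sums.items()]
    return nearest_neighbor(centroids, point)

def boundary_map(classify, train, steps=5, lo=0.0, hi=1.0):
    """Classify each node of a steps x steps grid; rows follow y, columns follow x."""
    ticks = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return [[classify(train, (x, y)) for x in ticks] for y in ticks]

# Hypothetical 2-D training points with binary labels.
train = [
    ((0.1, 0.1), 0), ((0.2, 0.3), 0), ((0.3, 0.2), 0),
    ((0.8, 0.9), 1), ((0.9, 0.7), 1), ((0.7, 0.8), 1),
]

knn_map = boundary_map(nearest_neighbor, train)
centroid_map = boundary_map(nearest_centroid, train)

for name, grid in (("1-NN", knn_map), ("nearest centroid", centroid_map)):
    print(name)
    for row in reversed(grid):  # print the largest y first, like a plot
        print("".join(str(label) for label in row))
```

Comparing the two printed grids cell by cell shows where the classifiers disagree, which is the same comparison the three images above support visually.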


module: model_evaluation

| function | description |
| --- | --- |
| plot_confusion_matrix() | plots the confusion matrix, along with key statistics, and returns accuracy |
| plot_ROC_curve() | plots the ROC (Receiver Operating Characteristic) curve, along with statistics |
| plot_PR_curve() | plots the precision-recall curve, along with statistics |
| plot_ROC_and_PR_curves() | plots both the ROC and the precision-recall curves, along with statistics |
| demo() | provides a demo of the major functions in this module |
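
The statistics that usually accompany these plots can be computed by hand. The following is a minimal sketch, assuming binary labels and one score per sample; it does not call wolvr, and the helper names and the eight-sample data set are hypothetical. It derives accuracy, precision, and recall from confusion-matrix counts, then builds ROC points by sweeping the threshold and integrates them with the trapezoidal rule.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def summary(y_true, y_pred):
    """Accuracy, precision, and recall derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by lowering the threshold one score at a time."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical labels and classifier scores.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

stats = summary(y_true, y_pred)
roc = roc_points(y_true, scores)
roc_auc = auc(roc)
print(stats, roc_auc)
```

The same sweep yields a precision-recall curve if each step records `tp / (tp + fp)` against `tp / pos` instead of the false and true positive rates.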


module: naive_bayes

| function | description |
| --- | --- |
| naive_bayes_Bernoulli() | when X are independent binary variables (e.g., whether a word occurs in a document or not) |
| naive_bayes_multinomial() | when X are independent discrete variables with 3+ levels (e.g., term frequency in the document) |
| naive_bayes_Gaussian() | when X are continuous variables |
| demo() | provides a demo of selected functions in this module |
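
The three variants differ in how a feature vector X is represented and in the per-feature likelihood applied to it. The sketch below illustrates that difference with the standard library only. The vocabulary, the message, and every parameter value are made up, and the helper functions are illustrative rather than part of wolvr.

```python
import math
from collections import Counter

vocab = ["call", "claim", "prize"]          # hypothetical vocabulary
tokens = "claim prize claim now".split()    # hypothetical message

counts = Counter(tokens)
multinomial_x = [counts[t] for t in vocab]           # term frequencies
bernoulli_x = [int(counts[t] > 0) for t in vocab]    # presence / absence

def bernoulli_log_lik(x, p):
    """Bernoulli model: absent terms also contribute, through log(1 - p_i)."""
    return sum(
        xi * math.log(pi) + (1 - xi) * math.log(1 - pi) for xi, pi in zip(x, p)
    )

def multinomial_log_lik(x, theta):
    """Multinomial model (up to a constant): each occurrence adds log(theta_i)."""
    return sum(xi * math.log(ti) for xi, ti in zip(x, theta))

def gaussian_log_pdf(x, mean, var):
    """Gaussian model: log density of one continuous feature value."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

# Hypothetical per-class parameters for a single class.
bern_ll = bernoulli_log_lik(bernoulli_x, [0.5, 0.5, 0.5])
multi_ll = multinomial_log_lik(multinomial_x, [0.2, 0.5, 0.3])
gauss_ll = gaussian_log_pdf(0.0, mean=0.0, var=1.0)

print(multinomial_x, bernoulli_x)
print(bern_ll, multi_ll, gauss_ll)
```

In each variant, the class score is the log prior plus the sum of these per-feature terms, and the class with the highest score is predicted.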


module: kNN

| function | description |
| --- | --- |
| demo() | provides a demo of selected functions in this module |


module: neural_network

| function | description |
| --- | --- |
| rnn() | Recurrent neural network |
| demo() | provides a demo of selected functions in this module |


module: decision_tree

| function | description |
| --- | --- |
| boost() | Boosting |
| demo() | provides a demo of selected functions in this module |
