This package provides some Python helper functions that are useful in machine learning.
helperfns
🎀 This is a Python package that contains some helper functions for machine learning.
Table of Contents
- Getting started
- Usage
- tables
- text
- utils
- visualization
- Contributing to helperfns
- Documentation
- License
Getting started
To start using helperfns in your project, run the following command:
pip install helperfns
Or, if you want to install it in a notebook such as a Jupyter notebook, run a code cell with the following code:
!pip install helperfns
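To confirm that the installation succeeded, one option is to query the installed version through the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper written for this example, not part of helperfns.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if helperfns is installed, otherwise None.
print(installed_version("helperfns"))
```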
Usage
The helperfns package is made up of the following sub packages:
- tables
- text
- utils
- visualization
tables
The tables sub package lets you print your data in tabular form, for example:
from helperfns.tables import tabulate_data
column_names = ["SUBSET", "EXAMPLE(s)", "Hello"]
row_data = [["training", 5, 4],['validation', 4, 4],['test', 3, '']]
tabulate_data(column_names, row_data)
Output:
Table
+------------+------------+-------+
| SUBSET     | EXAMPLE(s) | Hello |
+------------+------------+-------+
| training   | 5          | 4     |
| validation | 4          | 4     |
| test       | 3          |       |
+------------+------------+-------+
The tabulate_data helper function takes the following arguments:
Argument | Description | Type |
---|---|---|
column_names | List of column names | list |
data | Data to be tabulated | list |
title | Title of the table | str |
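For readers curious how a bordered table like the one in the Output above is laid out, here is a minimal standard-library sketch. It is not the package's implementation: `render_table` is a hypothetical name used only for illustration, and left-aligned cells are an assumption.

```python
def render_table(column_names, data):
    """Render rows as a bordered text table (illustrative sketch only)."""
    header = [str(name) for name in column_names]
    rows = [[str(cell) for cell in row] for row in data]
    # Each column is as wide as its longest cell, header included.
    widths = [
        max([len(header[i])] + [len(row[i]) for row in rows])
        for i in range(len(header))
    ]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(cells):
        # Pad every cell to its column width, with one space on each side.
        return "|" + "|".join(f" {c:<{w}} " for c, w in zip(cells, widths)) + "|"

    lines = [border, fmt(header), border]
    lines += [fmt(row) for row in rows]
    lines.append(border)
    return "\n".join(lines)


column_names = ["SUBSET", "EXAMPLE(s)", "Hello"]
row_data = [["training", 5, 4], ["validation", 4, 4], ["test", 3, ""]]
print(render_table(column_names, row_data))
```

In practice tabulate_data does this work for you; the sketch only shows why every cell is converted to a string and padded to the widest entry in its column.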
text
The text sub package offers four main functions: clean_sentence, de_contract, generate_ngrams and generate_bigrams.
from helperfns.text import *
# cleans the sentence
print(clean_sentence("text 1 # https://url.com/bla1/blah1/"))
Here is the table of arguments for the clean_sentence helper function:
Argument | Description | Type |
---|---|---|
sent | Input sentence | str |
lower | Flag to convert to lower case (default: True) | bool |
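The exact cleaning rules applied by clean_sentence are not listed here. As a rough illustration of what sentence cleaning usually involves, the sketch below assumes that URLs, digits and punctuation are removed and that whitespace is collapsed. `clean_sentence_sketch` is a hypothetical stand-in, not the package's function.

```python
import re


def clean_sentence_sketch(sent, lower=True):
    """Illustrative cleaner: strips URLs, digits and punctuation."""
    sent = re.sub(r"https?://\S+", " ", sent)  # drop URLs
    sent = re.sub(r"[^A-Za-z\s]", " ", sent)   # keep letters and whitespace only
    sent = re.sub(r"\s+", " ", sent).strip()   # collapse repeated whitespace
    return sent.lower() if lower else sent


print(clean_sentence_sketch("text 1 # https://url.com/bla1/blah1/"))  # text
```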
You can get the list of English words as follows:
# list of all english words
print(english_words)
You can use the de_contract function to expand contractions in strings as follows:
# converts strings like `I'm` to 'I am'
print(de_contract("I'm"))
Here is the table of arguments for the de_contract function:
Argument | Description | Type |
---|---|---|
word | Word to de-contract | str |
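Expanding a contraction is typically a dictionary lookup. The sketch below shows the idea with a tiny, made-up mapping; the `CONTRACTIONS` table and `de_contract_sketch` are assumptions for illustration, and the mapping shipped with helperfns may differ.

```python
# Hypothetical lookup table; the package's own mapping may differ.
CONTRACTIONS = {
    "i'm": "I am",
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",
}


def de_contract_sketch(word):
    """Return the expanded form of a known contraction, else the word unchanged."""
    return CONTRACTIONS.get(word.lower(), word)


print(de_contract_sketch("I'm"))   # I am
print(de_contract_sketch("film"))  # film
```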
The generate_bigrams function generates bigrams from a list of words. Here is how you can use it:
# generate bigrams from a list of words
print(generate_bigrams(['This', 'film', 'is', 'terrible']))
Here is the table of arguments for the generate_bigrams function:
Argument | Description | Type |
---|---|---|
x | List of input elements | list |
The generate_ngrams function generates n-grams from a list of words. Here is an example of how you can use it:
# generate n-grams from a list of words
print(generate_ngrams(['This', 'film', 'is', 'terrible']))
Here is the table of arguments for the generate_ngrams function:
Argument | Description | Type |
---|---|---|
x | List of input elements | list |
grams | Number of grams for generating n-grams (default: 3) | int |
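An n-gram is a run of n consecutive words, so generating n-grams amounts to sliding a window over the list. The sketch below illustrates that idea; `ngrams_sketch` is a hypothetical name, and the return shape used by helperfns (for example, whether the n-grams are appended to the input list) is not documented here and may differ.

```python
def ngrams_sketch(words, grams=3):
    """Slide a window of `grams` words over the list and join each window."""
    if grams < 1:
        raise ValueError("grams must be a positive integer")
    return [
        " ".join(words[i:i + grams])
        for i in range(len(words) - grams + 1)
    ]


words = ["This", "film", "is", "terrible"]
print(ngrams_sketch(words))           # ['This film is', 'film is terrible']
print(ngrams_sketch(words, grams=2))  # ['This film', 'film is', 'is terrible']
```

With grams=2 the same window produces bigrams, which is the special case that generate_bigrams covers.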
utils
The utils sub package comes with a simple helper function, hms_string, that converts a duration in seconds to an hours:minutes:seconds string.
Example:
import time
from helperfns.utils import hms_string
# time a dummy loop and print the elapsed time
start = time.time()
for i in range(100000):
    pass
end = time.time()
print(hms_string(end - start))
Output:
'0:00:00.01'
The hms_string function takes the following arguments:
Argument | Description | Type |
---|---|---|
sec_elapsed | Time in seconds to be converted | Any |
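The conversion itself is simple arithmetic: integer-divide by 3600 for hours, take the remainder for minutes, and keep the fractional seconds. The sketch below reproduces the output format shown above; `hms_string_sketch` is a hypothetical re-implementation for illustration, not the package's source.

```python
def hms_string_sketch(sec_elapsed):
    """Format a duration in seconds as H:MM:SS.ss."""
    hours = int(sec_elapsed // 3600)
    minutes = int((sec_elapsed % 3600) // 60)
    seconds = sec_elapsed % 60
    # Minutes are zero-padded to 2 digits, seconds to 5 characters (SS.ss).
    return f"{hours}:{minutes:02d}:{seconds:05.2f}"


print(hms_string_sketch(0.01))    # 0:00:00.01
print(hms_string_sketch(3661.5))  # 1:01:01.50
```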
visualization
This sub package provides different helper functions for visualizing data using plots.
Examples:
The following code cell will plot a classification report of true labels versus predicted labels.
from helperfns.visualization import (
    plot_classification_report,
    plot_complicated_confusion_matrix,
    plot_images,
    plot_images_predictions,
    plot_simple_confusion_matrix,
    plot_wordcloud,
)
# plot the classification report (labels, preds and classes must already be defined)
fig, ax = plot_classification_report(labels, preds,
    title='Classification Report',
    figsize=(10, 5), dpi=70,
    target_names=classes)
The plot_classification_report function takes the following arguments:
Argument | Description | Type |
---|---|---|
y_true | True labels | list |
y_pred | Predicted labels | list |
title | Title of the plot (default: "Classification Report") | str |
figsize | Size of the figure (default: (10, 5)) | tuple |
dpi | Resolution of the figure (default: 70) | int |
save_fig_path | Path to save the figure (default: None) | Any or None |
**kwargs | Additional keyword arguments | Any |
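A classification report conventionally summarizes per-class precision, recall and F1 score. Which metrics plot_classification_report actually draws is not stated here, so the sketch below rests on that assumption. It computes the three metrics in plain Python; `per_class_metrics` is a hypothetical helper, not part of helperfns.

```python
def per_class_metrics(y_true, y_pred):
    """Compute precision, recall and F1 for every label (illustrative sketch)."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Made-up labels for illustration only.
labels = [0, 0, 1, 1]
preds = [0, 1, 1, 1]
for label, scores in per_class_metrics(labels, preds).items():
    print(label, scores)
```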
The plot_images_predictions function plots images together with their true and predicted labels. It is very useful when you are doing image classification.
# plot predicted image labels with the images
plot_images_predictions(images, true_labels, preds, classes=["dog", "cat"], cols=8)
Here is the table of arguments for the plot_images_predictions function:
Argument | Description | Type |
---|---|---|
images | List of images to plot | list |
labels_true | True labels | list |
labels_pred | Predicted labels | list |
classes | List of class labels (default: []) | list |
cols | Number of columns in the plot (default: 5) | int |
rows | Number of rows in the plot (default: 3) | int |
fontsize | Font size for labels (default: 16) | int |
The plot_images function is used to visualize images.
# plot the images with their labels
plot_images(images[:24], true_labels[:24], cols=8)
The plot_images function takes the following arguments:
Argument | Description | Type |
---|---|---|
images | List of images to plot | list |
labels | List of labels corresponding to images | list |
cols | Number of columns in the plot (default: 5) | int |
rows | Number of rows in the plot (default: 3) | int |
fontsize | Font size for labels (default: 16) | int |
The plot_simple_confusion_matrix function is used to plot a less verbose confusion matrix of real labels against predicted labels.
# plot a simple confusion matrix
import random
y_true = [random.randint(0, 1) for _ in range(100)]
y_pred = [random.randint(0, 1) for _ in range(100)]
classes = ["dog", "cat"]
plot_simple_confusion_matrix(y_true, y_pred, classes)
This function takes the following arguments:
Argument | Description | Type |
---|---|---|
y_true | True labels | list |
y_pred | Predicted labels | list |
classes | List of class labels (default: []) | list |
figsize | Size of the figure (default: (10, 10)) | tuple |
fontsize | Font size for labels (default: 15) | int |
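A confusion matrix counts how often each true label was predicted as each class. The sketch below builds those counts in plain Python so the numbers behind such a plot are easy to inspect. `confusion_counts` is a hypothetical helper, and it assumes integer labels in the range 0 to n_classes - 1.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Build a count matrix: rows index the true label, columns the prediction."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Made-up labels for illustration only.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
classes = ["dog", "cat"]
for name, row in zip(classes, confusion_counts(y_true, y_pred, len(classes))):
    print(name, row)
```

The diagonal entries are the correct predictions; everything off the diagonal is a confusion between two classes.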
The plot_complicated_confusion_matrix function is used to plot a more verbose confusion matrix of real labels against predicted labels.
# plot a confusion matrix with percentage values of confusion
y_true = [random.randint(0, 1) for _ in range(100)]
y_pred = [random.randint(0, 1) for _ in range(100)]
classes = ["dog", "cat"]
plot_complicated_confusion_matrix(y_true, y_pred, classes)
This function takes the following arguments:
Argument | Description | Type |
---|---|---|
y_true | True labels | list |
y_pred | Predicted labels | list |
classes | List of class labels (default: []) | list |
figsize | Size of the figure (default: (5, 5)) | tuple |
fontsize | Font size for labels (default: 20) | int |
title | Title of the plot (default: "Confusion Matrix") | str |
xlabel | Label for x-axis (default: "Predicted label") | str |
ylabel | Label for y-axis (default: "True label") | str |
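The percentage values in a verbose confusion matrix are derived from raw counts. How helperfns normalizes them is not stated here; the sketch below assumes each row (true label) is normalized to sum to 100%, which is one common convention. `row_percentages` is a hypothetical helper.

```python
def row_percentages(matrix):
    """Convert each row of counts to percentages of that row's total."""
    result = []
    for row in matrix:
        total = sum(row)
        # Guard against empty rows (a class with no true examples).
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


# Made-up counts: rows are true labels, columns are predictions.
counts = [[8, 2], [5, 5]]
print(row_percentages(counts))  # [[80.0, 20.0], [50.0, 50.0]]
```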
The plot_wordcloud function generates and plots a word cloud based on the provided corpus.
# Generate a word cloud from a sample text
corpus = "This is a sample text for generating word clouds"
plot_wordcloud(corpus, max_words=500, mask="wine")
This function takes the following arguments:
Argument | Description | Type |
---|---|---|
corpus | The text or dictionary of word frequencies to generate the word cloud from. | str or dict |
max_words | Maximum number of words to include in the word cloud, default is 1,000. | int |
title | Title of the plot, default is "Word Cloud". | str |
mask | The shape mask for the word cloud. Options are "head", "chicken", "wine", "apple", "tree" or None, default is "tree". | Union[Literal["head", "chicken", "wine", "apple", "tree"], None] |
background_color | The background color of the word cloud, default is "#E4E0E1". | str |
contour_width | Width of the contour around the word cloud, default is 1. | int |
contour_color | Color of the contour around the word cloud, default is "#D6C0B3". | str |
figsize | The figure size of the word cloud plot, default is (10, 10). | tuple |
fontsize | Font size for the plot title, default is 15. | int |
save_path | The path to save the plotted figure (default: None). | str or None |
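Because corpus may also be a dictionary of word frequencies, it can be useful to build that dictionary yourself, for example to filter words before plotting. The sketch below counts words with the standard library. `word_frequencies` is a hypothetical helper, and the word-to-count dictionary shape is an assumption about what plot_wordcloud expects.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word occurrences in a string."""
    return dict(Counter(re.findall(r"[a-z']+", text.lower())))


corpus = "This is a sample text for generating word clouds"
frequencies = word_frequencies(corpus + " word clouds")
print(frequencies["word"])  # 2
```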
Contributing to helperfns
To contribute to helperfns, read the CONTRIBUTION.md file.
Documentation
You can read the full documentation here.
License
This project is licensed under the MIT License - see the LICENSE file for details.