This package provides some Python helper functions that are useful in machine learning.
helperfns
🎀 This is a Python package that contains some helper functions for machine learning.
Table of Contents
- Getting started
- Usage
  - tables
  - text
  - utils
  - visualization
- Contributing to helperfns
- Documentation
- License
Getting started
To start using helperfns in your project, run the following command:
pip install helperfns
If you want to install it from a notebook such as a Jupyter notebook, run a code cell containing:
!pip install helperfns
Usage
The helperfns package is made up of different sub packages such as:
- tables
- text
- utils
- visualization
tables
The tables sub package lets you print your data in tabular form. For example:
from helperfns.tables import tabulate_data
column_names = ["SUBSET", "EXAMPLE(s)", "Hello"]
row_data = [["training", 5, 4],['validation', 4, 4],['test', 3, '']]
tabulate_data(column_names, row_data)
Output:
Table
+------------+------------+-------+
| SUBSET | EXAMPLE(s) | Hello |
+------------+------------+-------+
| training | 5 | 4 |
| validation | 4 | 4 |
| test | 3 | |
+------------+------------+-------+
The following is the table of arguments for the tabulate_data helper function
| Argument | Description | Type |
|---|---|---|
| column_names | List of column names | list |
| data | Data to be tabulated | list |
| title | Title of the table | str |
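To make the layout above concrete, here is a minimal pure-Python sketch of how such a table can be built from column names and rows. The render_table name is hypothetical and this is not the implementation used by tabulate_data; it only reproduces the shape of the output shown earlier.

```python
def render_table(column_names, rows, title=None):
    """Build an ASCII table as a string (illustrative only, not part of helperfns)."""
    headers = [str(name) for name in column_names]
    body = [[str(cell) for cell in row] for row in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [
        max([len(header)] + [len(row[i]) for row in body])
        for i, header in enumerate(headers)
    ]
    border = "+" + "+".join("-" * (width + 2) for width in widths) + "+"

    def format_row(cells):
        # Left-align every cell and pad it to its column width.
        padded = [cell.ljust(width) for cell, width in zip(cells, widths)]
        return "| " + " | ".join(padded) + " |"

    lines = [title] if title is not None else []
    lines += [border, format_row(headers), border]
    lines += [format_row(row) for row in body]
    lines.append(border)
    return "\n".join(lines)


table = render_table(
    ["SUBSET", "EXAMPLE(s)", "Hello"],
    [["training", 5, 4], ["validation", 4, 4], ["test", 3, ""]],
    title="Table",
)
print(table)
```

Returning a string rather than printing directly keeps the sketch easy to test; the real helper prints the table for you.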
text
The text sub package offers four main functions: clean_sentence, de_contract, generate_ngrams and generate_bigrams.
from helperfns.text import *
# cleans the sentence
print(clean_sentence("text 1 # https://url.com/bla1/blah1/"))
Here is the table of arguments for the clean_sentence helper function.
| Argument | Description | Type |
|---|---|---|
| sent | Input sentence | str |
| lower | Flag to convert to lower case (default: True) | bool |
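The exact cleaning rules applied by clean_sentence are not listed here, so the following is only a sketch of one plausible pipeline: drop URLs, drop anything that is not a letter, collapse whitespace and optionally lower-case. The basic_clean name and every rule in it are assumptions, not the package's behaviour.

```python
import re


def basic_clean(sent, lower=True):
    """Hypothetical cleaner; the rules used by helperfns may differ."""
    sent = re.sub(r"https?://\S+", " ", sent)  # drop URLs
    sent = re.sub(r"[^A-Za-z\s]", " ", sent)   # drop digits and punctuation
    sent = re.sub(r"\s+", " ", sent).strip()   # collapse runs of whitespace
    return sent.lower() if lower else sent


print(basic_clean("text 1 # https://url.com/bla1/blah1/"))
print(basic_clean("Hello World!", lower=False))
```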
You can get the list of English words as follows:
# list of all English words
print(english_words)
You can use de_contract to expand contractions in strings as follows:
# converts strings like `I'm` to 'I am'
print(de_contract("I'm"))
Here is the table of arguments for the de_contract function.
| Argument | Description | Type |
|---|---|---|
| word | Word to de-contract | str |
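De-contraction is commonly done with a lookup table. The sketch below shows that idea with a tiny, hypothetical mapping; the expand_contraction name and the CONTRACTIONS table are illustrative, and the mapping that de_contract actually uses may be larger or handle cases differently.

```python
# Hypothetical lookup table, for illustration only.
CONTRACTIONS = {
    "i'm": "I am",
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",
}


def expand_contraction(word):
    """Return the expanded form of a known contraction, else the word unchanged."""
    return CONTRACTIONS.get(word.lower(), word)


print(expand_contraction("I'm"))   # I am
print(expand_contraction("film"))  # film
```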
The generate_bigrams function generates bigrams from a list of words. Here is how you can use it:
# generate bigrams from a list of words
print(generate_bigrams(['This', 'film', 'is', 'terrible']))
Here is the table of arguments for the generate_bigrams function:
| Argument | Description | Type |
|---|---|---|
| x | List of input elements | list |
The generate_ngrams function generates n-grams from a list of words. Here is an example of how you can use it:
# generates n-grams from a list of words
print(generate_ngrams(['This', 'film', 'is', 'terrible']))
Here is the table of arguments for the generate_ngrams function:
| Argument | Description | Type |
|---|---|---|
| x | List of input elements | list |
| grams | Number of grams for generating n-grams (default: 3) | int |
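The sliding-window idea behind n-grams can be sketched in plain Python. The sliding_ngrams name is hypothetical and this is not the package's implementation; generate_ngrams and generate_bigrams may return their results in a different shape (for example, appended to the input tokens).

```python
def sliding_ngrams(words, n=3):
    """Return every run of n consecutive words, joined by spaces."""
    # A list of k words contains k - n + 1 windows of length n (none if n > k).
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


tokens = ["This", "film", "is", "terrible"]
bigrams = sliding_ngrams(tokens, n=2)
trigrams = sliding_ngrams(tokens, n=3)
print(bigrams)   # ['This film', 'film is', 'is terrible']
print(trigrams)  # ['This film is', 'film is terrible']
```

Bigrams are simply the n = 2 case of the same window.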
utils
The utils sub package comes with a simple helper function, hms_string, for converting a duration in seconds into hours, minutes and seconds.
Example:
import time

from helperfns.utils import hms_string

# time a short loop and format the elapsed seconds
start = time.time()
for i in range(100000):
    pass
end = time.time()
print(hms_string(end - start))
Output:
'0:00:00.01'
The hms_string function takes the following argument:
| Argument | Description | Type |
|---|---|---|
| sec_elapsed | Time in seconds to be converted | Any |
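The conversion itself is plain arithmetic: split the seconds into whole hours, then whole minutes, and keep the remainder as fractional seconds. The sketch below uses a hypothetical format_hms function to show this; it is assumed to match the output format shown above but is not the package's code.

```python
def format_hms(seconds):
    """Format a duration in seconds as h:mm:ss.ss (illustrative only)."""
    hours, remainder = divmod(seconds, 3600)  # 3600 seconds per hour
    minutes, secs = divmod(remainder, 60)     # 60 seconds per minute
    return f"{int(hours)}:{int(minutes):02d}:{secs:05.2f}"


print(format_hms(0.01))    # 0:00:00.01
print(format_hms(3725.5))  # 1:02:05.50
```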
visualization
This sub package provides different helper functions for visualizing data using plots.
Examples:
The following code cell will plot a classification report of true labels versus predicted labels.
from helperfns.visualization import (
    plot_classification_report, plot_complicated_confusion_matrix,
    plot_images, plot_images_predictions, plot_simple_confusion_matrix,
)

# plotting classification report (labels, preds and classes must already be defined)
fig, ax = plot_classification_report(labels, preds,
                                     title='Classification Report',
                                     figsize=(10, 5), dpi=70,
                                     target_names=classes)
The plot_classification_report takes the following arguments:
| Argument | Description | Type |
|---|---|---|
| y_true | True labels | list |
| y_pred | Predicted labels | list |
| title | Title of the plot (default: "Classification Report") | str |
| figsize | Size of the figure (default: (10, 5)) | tuple |
| dpi | Resolution of the figure (default: 70) | int |
| save_fig_path | Path to save the figure (default: None) | Any or None |
| **kwargs | Additional keyword arguments | Any |
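A classification report usually summarises precision, recall and F1 score per class. Assuming that is what the plot shows, the sketch below computes those three numbers by hand for one class; per_class_metrics is a hypothetical helper and not part of helperfns.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
for label in (0, 1):
    print(label, per_class_metrics(y_true, y_pred, label))
```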
The plot_images_predictions function plots images together with their predicted labels. It is very useful when you are doing image classification.
# plot predicted image labels with the images
plot_images_predictions(images, true_labels, preds, classes=["dog", "cat"], cols=8)
Here is the table of arguments for the plot_images_predictions.
| Argument | Description | Type |
|---|---|---|
| images | List of images to plot | list |
| labels_true | True labels | list |
| labels_pred | Predicted labels | list |
| classes | List of class labels (default: []) | list |
| cols | Number of columns in the plot (default: 5) | int |
| rows | Number of rows in the plot (default: 3) | int |
| fontsize | Font size for labels (default: 16) | int |
The plot_images function is used to visualize images.
# plot the images with their labels
plot_images(images[:24], true_labels[:24], cols=8)
The plot_images takes the following as arguments:
| Argument | Description | Type |
|---|---|---|
| images | List of images to plot | list |
| labels | List of labels corresponding to images | list |
| cols | Number of columns in the plot (default: 5) | int |
| rows | Number of rows in the plot (default: 3) | int |
| fontsize | Font size for labels (default: 16) | int |
The plot_simple_confusion_matrix function plots a less verbose confusion matrix of real labels against predicted labels.
import random

# plot a simple confusion matrix
y_true = [random.randint(0, 1) for _ in range(100)]
y_pred = [random.randint(0, 1) for _ in range(100)]
classes = ["dog", "cat"]
plot_simple_confusion_matrix(y_true, y_pred, classes)
This function takes the following arguments:
| Argument | Description | Type |
|---|---|---|
| y_true | True labels | list |
| y_pred | Predicted labels | list |
| classes | List of class labels (default: []) | list |
| figsize | Size of the figure (default: (10, 10)) | tuple |
| fontsize | Font size for labels (default: 15) | int |
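For readers new to confusion matrices, the numbers behind such a plot are simple counts: cell (i, j) holds how many samples of true class i were predicted as class j. A minimal sketch, using a hypothetical confusion_counts helper that is not part of helperfns:

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Count (true, predicted) pairs; rows are true classes, columns predicted."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
matrix = confusion_counts(y_true, y_pred, n_classes=2)
print(matrix)  # [[1, 1], [1, 2]]
```

The diagonal holds the correct predictions, so a good classifier concentrates its counts there.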
The plot_complicated_confusion_matrix is used to plot a more verbose confusion matrix of real labels against predicted labels.
import random

# plot a confusion matrix with percentage value of confusion
y_true = [random.randint(0, 1) for _ in range(100)]
y_pred = [random.randint(0, 1) for _ in range(100)]
classes = ["dog", "cat"]
plot_complicated_confusion_matrix(y_true, y_pred, classes)
This function takes in the following as arguments.
| Argument | Description | Type |
|---|---|---|
| y_true | True labels | list |
| y_pred | Predicted labels | list |
| classes | List of class labels (default: []) | list |
| figsize | Size of the figure (default: (5, 5)) | tuple |
| fontsize | Font size for labels (default: 20) | int |
| title | Title of the plot (default: "Confusion Matrix") | str |
| xlabel | Label for x-axis (default: "Predicted label") | str |
| ylabel | Label for y-axis (default: "True label") | str |
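The percentage values in a verbose confusion matrix are typically derived from the raw counts. The sketch below normalises each row (each true class) to percentages; whether the package normalises by row or by the overall total is an assumption here, and row_percentages is a hypothetical helper.

```python
def row_percentages(matrix):
    """Convert each row of a count matrix into percentages of that row's total."""
    result = []
    for row in matrix:
        total = sum(row)
        # Guard against empty rows (a class with no true samples).
        result.append([100.0 * count / total if total else 0.0 for count in row])
    return result


counts = [[1, 1], [1, 2]]  # rows: true class, columns: predicted class
percentages = row_percentages(counts)
for row in percentages:
    print(["{:.1f}%".format(value) for value in row])
```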
The plot_wordcloud function generates and plots a word cloud based on the provided corpus.
# Generate a word cloud from a sample text
corpus = "This is a sample text for generating word clouds"
plot_wordcloud(corpus, max_words=500, mask="wine")
This function takes in the following as arguments.
| Argument | Description | Type |
|---|---|---|
| corpus | The text or dictionary of word frequencies to generate the word cloud from. | str or dict |
| max_words | Maximum number of words to include in the word cloud, default is 1,000. | int |
| title | Title of the plot, default is "Word Cloud". | str |
| mask | The shape mask for the word cloud. Options are "head", "chicken", "wine", "apple", "tree" or None, default is "tree". | Union[Literal["head", "chicken", "wine", "apple", "tree"], None] |
| background_color | The background color of the word cloud, default is "#E4E0E1". | str |
| contour_width | Width of the contour around the word cloud, default is 1. | int |
| contour_color | Color of the contour around the word cloud, default is "#D6C0B3". | str |
| figsize | The figure size of the word cloud plot, default is (10, 10). | tuple |
| fontsize | Font size for the plot title, default is 15. | int |
| save_path | The path to save the plotted figure (default: None). | str or None |
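Since corpus may also be a dictionary of word frequencies, it can help to see how such a dictionary might be built. This sketch uses only the standard library; the sample text and the simple whitespace tokenisation are illustrative assumptions.

```python
from collections import Counter

text = "good film good cast bad ending good music"
# Lower-case, split on whitespace, then count each word.
frequencies = dict(Counter(text.lower().split()))
print(frequencies)
```

A dictionary like this could then be passed as the corpus argument in place of raw text, since the table above lists str or dict as accepted types.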
Contributing to helperfns
To contribute to helperfns read the CONTRIBUTION.md file.
Documentation
You can read the full documentation here.
License
This project is licensed under the MIT License - see the LICENSE file for details.