A package to make NLP easy, fast, and fun
Subclip 0.0.2
A package to make NLP fast and easy for beginners.
- Efficient text prediction
- Text pairing, equivalent to NLTK's n-grams
- Syllable Identification
- Find frequencies of words in given text
- Find matching words in two arrays
I still have a lot of plans for this package, so expect frequent updates in the near future. The updates will include optimizations and more functions, so stay tuned.
Install
pip install subclip
Usage
First, import the package using:
import subclip
Predict
A function that predicts the next n words that follow a given phrase in a given string.
Parameters
The function's parameters are:
predict(string, phrase, n=0, case_insensitive=False)
- string: The main text.
- phrase: The key phrase (prompt). The function tries to predict what comes after the given phrase.
- n: The number of words it returns. It's automatically set to 0, which returns all predictions regardless of their corresponding word counts.
- case_insensitive: Set this to True if you want matching to ignore case.
Actual usage
So, let's try to use this.
string="I am a string. I am also a human being, but most importantly, I am a string."
print(subclip.predict(string, "I am", n=1))
This would output
{'a': 2, 'also': 1}
But if you change the n value to 2,
print(subclip.predict(string, "I am", n=2))
It would output
{'a string.': 2, 'also a': 1}
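To make the behaviour above easier to picture, here is a minimal pure-Python sketch of phrase-continuation counting. It is not subclip's actual implementation: the name predict_sketch is made up for illustration, it assumes words are split on whitespace, and it does not reproduce the n=0 mode.

```python
from collections import Counter

def predict_sketch(text, phrase, n=1, case_insensitive=False):
    """Count the n-word continuations that follow `phrase` in `text`.

    Hypothetical helper for illustration only, not part of subclip.
    """
    if n < 1:
        raise ValueError("this sketch only handles n >= 1")
    if case_insensitive:
        text, phrase = text.lower(), phrase.lower()
    words = text.split()   # assumption: whitespace tokenization
    key = phrase.split()
    counts = Counter()
    for i in range(len(words) - len(key) + 1):
        if words[i:i + len(key)] == key:
            following = words[i + len(key):i + len(key) + n]
            # Skip matches too close to the end to have n following words.
            if len(following) == n:
                counts[" ".join(following)] += 1
    return dict(counts)

text = "I am a string. I am also a human being, but most importantly, I am a string."
print(predict_sketch(text, "I am", n=1))  # {'a': 2, 'also': 1}
print(predict_sketch(text, "I am", n=2))  # {'a string.': 2, 'also a': 1}
```

Because punctuation stays attached to the words, 'string.' and 'string' would count as different continuations under this sketch.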
Pair
This function splits a string into pairs of strings.
Parameters
pair(string, n)
- string is the string you're trying to split into pairs
- n stands for the number of words in each pair (equivalent to the n value in an n-gram)
Usage
Let's set our string to:
string="Sometimes, I just go out and eat sand. I don't know why"
Don't ask. Let's turn this into pairs of 2:
print(subclip.pair(string, 2))
Which outputs
[['Sometimes,', 'I'], ['I', 'just'], ['just', 'go'], ['go', 'out'], ['out', 'and'], ['and', 'eat'], ['eat', 'sand.'], ['sand.', 'I'], ['I', "don't"], ["don't", 'know'], ['know', 'why']]
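For readers who want to see how this kind of pairing can be built, the sliding window below is a rough sketch. It is not taken from subclip's source: pair_sketch is a hypothetical name, and it assumes whitespace splitting and n of at least 1.

```python
def pair_sketch(text, n):
    """Return every run of n consecutive whitespace-separated words.

    Hypothetical helper for illustration only, not part of subclip.
    """
    words = text.split()  # assumption: whitespace tokenization
    # Slide a window of width n across the word list, one step at a time.
    return [words[i:i + n] for i in range(len(words) - n + 1)]

text = "Sometimes, I just go out and eat sand. I don't know why"
pairs = pair_sketch(text, 2)
print(pairs[:3])   # [['Sometimes,', 'I'], ['I', 'just'], ['just', 'go']]
print(len(pairs))  # 11
```

A string with 12 words gives 11 pairs for n=2, and an n larger than the word count gives an empty list.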
Identify Syllables
subclip.syllables("carbonmonoxide")
This outputs:
car-bon-mon-ox-ide
But take note that this only works with lowercase strings.
Countwords
Parameters
The function's parameters are:
countwords(string, case_insensitive=False)
Set case_insensitive to True if you want the count to be case-insensitive.
Actual usage
Get yourself a nice string
string = "Sometimes I wonder, 'Am I stupid?' then I realize, yeah. yeah, I am stupid."
Then put it in the function:
x = subclip.countwords(string)
print(x)
It should print:
{'I': 4, 'Sometimes': 1, 'wonder,': 1, "'Am": 1, "stupid?'": 1, 'then': 1, 'realize,': 1, 'yeah.': 1, 'yeah,': 1, 'am': 1, 'stupid.': 1}
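The same kind of count can be sketched with the standard library's collections.Counter. This is an illustration under assumptions (whitespace splitting, results ordered by descending count), not a copy of how subclip does it, and countwords_sketch is a hypothetical name.

```python
from collections import Counter

def countwords_sketch(text, case_insensitive=False):
    """Count whitespace-separated words, most frequent first.

    Hypothetical helper for illustration only, not part of subclip.
    """
    if case_insensitive:
        text = text.lower()
    # most_common() sorts by count; ties keep first-seen order.
    return dict(Counter(text.split()).most_common())

text = "Sometimes I wonder, 'Am I stupid?' then I realize, yeah. yeah, I am stupid."
print(countwords_sketch(text)["I"])                             # 4
print(countwords_sketch("The the THE cat"))                     # {'The': 1, 'the': 1, 'THE': 1, 'cat': 1}
print(countwords_sketch("The the THE cat", case_insensitive=True))  # {'the': 3, 'cat': 1}
```

Punctuation is not stripped here, which is why 'yeah.' and 'yeah,' end up as separate entries, as they do in the output above.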
Matchingwords
A function that finds & counts matching words in two strings
Actual usage
So in this case, our strings are:
string1, string2 = "God, I love drawing, drawing is my favourite thing to do", "God, I hate drawing, drawing is my least favourite thing to do"
If we run this through matchingwords, we would get:
{'God,': 1, 'I': 1, 'drawing,': 1, 'drawing': 1, 'is': 1, 'my': 1, 'favourite': 1, 'thing': 1, 'to': 1, 'do': 1}
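One way to get a result like this is a multiset intersection of the two word lists. The sketch below assumes that a word's count is the smaller of its counts in the two strings, which is only one plausible reading of the output above. matchingwords_sketch is a hypothetical name and is not subclip's code.

```python
from collections import Counter

def matchingwords_sketch(text1, text2):
    """Return words found in both strings with the smaller of their two counts.

    Hypothetical helper for illustration only, not part of subclip.
    """
    # Counter & Counter keeps each shared key with the minimum of its counts.
    shared = Counter(text1.split()) & Counter(text2.split())
    return dict(shared)

string1 = "God, I love drawing, drawing is my favourite thing to do"
string2 = "God, I hate drawing, drawing is my least favourite thing to do"
print(matchingwords_sketch(string1, string2))
```

Words that appear in only one string, such as 'love', 'hate' and 'least', drop out of the result.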