word predictor
wordpredict
wordpredict is a library that predicts words from ambiguous input, where each keystroke may stand for any one of several candidate letters.
Installation
pip install wordpredict
How to use
code
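import pandas as pd
from wordpredict import WordPredict

# Load the word-frequency list; keep_default_na=False keeps words such as "null" as strings.
corpus = pd.read_csv(
    "./unigram_freq.csv",
    header=0,
    keep_default_na=False,
).values

# Column 0 holds the words, column 1 their frequency counts.
wp = WordPredict(corpus[:, 0], corpus[:, 1])

print("start user input")
# Each update() call passes the candidate letters for one ambiguous keystroke.
keys = ["e", "f", "g", "h"]
print(wp.update(keys))
keys = ["e", "f", "g", "h"]
print(wp.update(keys))
keys = ["i", "j", "k", "l"]
print(wp.update(keys))

print("reset user input")
# reset() discards the keystrokes entered so far and starts a new word.
wp.reset()
keys = ["e", "f", "g", "h"]
print(wp.update(keys))
keys = ["m", "n", "o", "p"]
print(wp.update(keys))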
output
start user input
['for', 'e', 'from', 'he', 'has', 'have']
['he', 'get', 'here', 'her', 'help', 'few']
['help', 'held', 'felt', 'hell', 'hello', 'helps']
reset user input
['for', 'e', 'from', 'he', 'has', 'have']
['for', 'home', 'go', 'how', 'good', 'end']
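To make the idea of ambiguous input concrete, here is a minimal sketch of a prefix filter with the same update/reset flow. It is not the wordpredict implementation: the class name `PrefixPredictor`, the toy words, and their counts are all made up for illustration, and the ranking is plain frequency order, which will not reproduce the library's ordering shown above.

```python
class PrefixPredictor:
    """Toy stand-in for illustration only, NOT the wordpredict implementation."""

    def __init__(self, words, freqs, max_results=6):
        # Rank once by descending frequency (ties broken alphabetically).
        pairs = sorted(zip(words, freqs), key=lambda p: (-p[1], p[0]))
        self._ranked = [word for word, _ in pairs]
        self._max_results = max_results
        self.reset()

    def update(self, letters):
        """Consume one ambiguous keystroke and return the top remaining words."""
        allowed = set(letters)
        pos = self._position
        # Keep only words whose letter at this position is one of the candidates.
        self._candidates = [
            w for w in self._candidates if len(w) > pos and w[pos] in allowed
        ]
        self._position += 1
        return self._candidates[: self._max_results]

    def reset(self):
        """Forget all keystrokes and start a new word."""
        self._candidates = list(self._ranked)
        self._position = 0


# Hypothetical toy corpus; the counts are invented.
toy_words = ["the", "for", "from", "he", "home", "go", "help"]
toy_freqs = [100, 70, 50, 40, 30, 25, 20]

predictor = PrefixPredictor(toy_words, toy_freqs)
print(predictor.update(["e", "f", "g", "h"]))
print(predictor.update(["m", "n", "o", "p"]))
predictor.reset()
print(predictor.update(["e", "f", "g", "h"]))
```

Each call narrows the candidate set by one letter position, so the cost of an update shrinks as the word gets longer. A real predictor would likely also weigh word length or completion cost, which this sketch ignores.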
corpus
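Any word-frequency list can serve as the corpus, e.g., https://www.kaggle.com/datasets/rtatman/english-word-frequency, which provides the unigram_freq.csv file used in the example.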
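For readers who want to check a corpus file before handing it to the library, the sketch below parses the assumed layout (a header row followed by word,count rows) with the standard csv module. The helper name `load_corpus` and the sample rows are hypothetical. Whether WordPredict accepts plain lists instead of the array columns used in the example is not confirmed here.

```python
import csv
import io

# Made-up sample in the assumed layout: header row, then word,count.
SAMPLE = "word,count\nthe,100\nnull,7\nfor,70\n"


def load_corpus(handle):
    """Read word,count rows from an open text handle (hypothetical helper)."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    words, counts = [], []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        words.append(row[0])
        counts.append(int(row[1]))
    return words, counts


words, counts = load_corpus(io.StringIO(SAMPLE))
print(words)
print(counts)
```

The csv module keeps every field as text, so a word like "null" survives as a string. With pandas, such strings are parsed as missing values by default, which is presumably why the example passes keep_default_na=False.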
execution time
%%timeit
import pandas as pd
from wordpredict import WordPredict
corpus = pd.read_csv(
    "./unigram_freq.csv",
    header=0,
    keep_default_na=False,
).values
wp = WordPredict(corpus[:, 0], corpus[:, 1])
1.42 s ± 83.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
keys = ["e", "f", "g", "h"]
wp.update(keys)
keys = ["e", "f", "g", "h"]
wp.update(keys)
keys = ["i", "j", "k", "l"]
wp.update(keys)
8.34 ms ± 315 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
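The %%timeit cells above only run in IPython or Jupyter. A rough equivalent for a plain script can be sketched with the standard timeit module. The helper `time_call` and the stand-in `workload` function are hypothetical; in practice the workload would wrap the wp.update(...) calls being measured.

```python
import statistics
import timeit


def time_call(func, repeat=7, number=100):
    """Return (mean, std dev) of seconds per call over `repeat` runs."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.pstdev(per_call)


def workload():
    # Stand-in for the code under test, e.g. a sequence of wp.update(...) calls.
    sorted(range(1000), reverse=True)


mean_s, std_s = time_call(workload)
print(f"{mean_s * 1e3:.3f} ms ± {std_s * 1e3:.3f} ms per loop "
      f"(mean ± std. dev. of 7 runs, 100 loops each)")
```

The figures will differ from the ones reported above, since they depend on the machine and on what the workload does.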
note
Autocomplete was implemented with reference to https://doi.org/10.1145/3173574.3173755