word predictor
wordpredict
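This library predicts words from ambiguous input, where each keystroke is given as a group of candidate letters (for example "e", "f", "g", "h") rather than a single letter.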
Installation
pip install wordpredict
How to use
code
import pandas as pd
from wordpredict import WordPredict

# Load the word-frequency list as a 2-D array.
corpus = pd.read_csv(
    "./unigram_freq.csv",
    header=0,
    keep_default_na=False,
).values

# First column: words; second column: their counts.
wp = WordPredict(corpus[:, 0], corpus[:, 1])

print("start user input")
# Each update() call passes the candidate letters for one keystroke.
keys = ["e", "f", "g", "h"]
print(wp.update(keys))
keys = ["e", "f", "g", "h"]
print(wp.update(keys))
keys = ["i", "j", "k", "l"]
print(wp.update(keys))

print("reset user input")
# reset() discards the keystrokes entered so far.
wp.reset()
keys = ["e", "f", "g", "h"]
print(wp.update(keys))
keys = ["m", "n", "o", "p"]
print(wp.update(keys))
output
start user input
['for', 'e', 'from', 'he', 'has', 'have']
['he', 'get', 'here', 'her', 'help', 'few']
['help', 'held', 'felt', 'hell', 'hello', 'helps']
reset user input
['for', 'e', 'from', 'he', 'has', 'have']
['for', 'home', 'go', 'how', 'good', 'end']
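The output above is consistent with prefix matching over key groups: after n keystrokes, every suggested word has its i-th letter in the i-th group. The following is a minimal sketch of that idea, not the wordpredict implementation. The class name NaivePredictor, the limit parameter, the tiny corpus, and the ranking by raw frequency are all assumptions made for illustration; the library's actual ranking may differ.

```python
from typing import Iterable, List, Sequence


class NaivePredictor:
    """Hypothetical stand-in for illustration; not the wordpredict implementation."""

    def __init__(self, words: Sequence[str], freqs: Sequence[float]) -> None:
        # Rank words once by descending frequency (assumed ranking rule).
        pairs = sorted(zip(words, freqs), key=lambda pair: -pair[1])
        self._ranked = [word for word, _ in pairs]
        self._groups: List[frozenset] = []

    def update(self, group: Iterable[str], limit: int = 6) -> List[str]:
        """Record one ambiguous keystroke and return the top matches so far."""
        self._groups.append(frozenset(group))
        matches = []
        for word in self._ranked:
            if self._fits(word):
                matches.append(word)
                if len(matches) == limit:
                    break
        return matches

    def reset(self) -> None:
        """Forget all keystrokes entered so far."""
        self._groups.clear()

    def _fits(self, word: str) -> bool:
        # The i-th letter of the word must belong to the i-th key group.
        if len(word) < len(self._groups):
            return False
        return all(word[i] in g for i, g in enumerate(self._groups))


# Tiny made-up corpus; the frequencies are illustrative, not real counts.
demo = NaivePredictor(
    ["for", "he", "get", "home", "help", "hello"],
    [100, 90, 80, 70, 50, 40],
)
first = demo.update(["e", "f", "g", "h"])
second = demo.update(["e", "f", "g", "h"])
third = demo.update(["i", "j", "k", "l"])
print(first)
print(second)
print(third)
```

This linear scan is only meant to make the matching rule concrete; a real implementation would likely use an index such as a trie to avoid rescanning the whole corpus on every keystroke.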
corpus
A word-frequency list is required, e.g., https://www.kaggle.com/datasets/rtatman/english-word-frequency (the source of unigram_freq.csv used above).
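The keep_default_na=False argument in the loading code matters for a word list, because pandas otherwise treats strings such as "null" and "nan" as missing values. The sketch below shows the difference on a made-up three-row CSV; the word,count header is an assumed layout and the rows are not taken from the real dataset.

```python
import io

import pandas as pd

# Made-up three-row corpus; the word,count layout is an assumption.
csv_text = "word,count\nthe,100\nnull,5\nnan,3\n"

default = pd.read_csv(io.StringIO(csv_text), header=0)
kept = pd.read_csv(io.StringIO(csv_text), header=0, keep_default_na=False)

# With the defaults, "null" and "nan" are parsed as missing values.
missing_by_default = int(default["word"].isna().sum())

# With keep_default_na=False they survive as ordinary strings.
words = kept["word"].tolist()
counts = kept["count"].tolist()

print(missing_by_default)
print(words)
```

Without the flag, such entries would reach the predictor as NaN floats rather than as words.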
execution time
%%timeit
import pandas as pd
from wordpredict import WordPredict
corpus = pd.read_csv(
    "./unigram_freq.csv",
    header=0,
    keep_default_na=False,
).values
wp = WordPredict(corpus[:, 0], corpus[:, 1])
1.42 s ± 83.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
keys = ["e", "f", "g", "h"]
wp.update(keys)
keys = ["e", "f", "g", "h"]
wp.update(keys)
keys = ["i", "j", "k", "l"]
wp.update(keys)
8.34 ms ± 315 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
note
Autocompletion was implemented with reference to https://doi.org/10.1145/3173574.3173755