Word frequency checker based on Wikipedia corpus written in Rust
About The Project
A Rust library, exposed to Python, for checking words against a Wikipedia word-frequency corpus. The library is fast, memory efficient, and safe. Exact lookups use a hash map. Partial-frequency searches use a suffix array to perform quick sub-pattern lookups over the dictionary.
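The two lookup paths described above can be sketched in pure Python. This is a toy model under stated assumptions, not pywordfreq's Rust implementation: the class name `WordFrequencySketch`, the `matching_words` helper, the sample counts, and the choice to define partial frequency as the sum of the counts of every word containing the pattern are all illustrative. The suffix array is built naively by sorting suffixes, which is fine for a demo but far too slow for a full corpus.

```python
import bisect


class WordFrequencySketch:
    """Toy model: dict for exact lookups, suffix array for sub-pattern lookups."""

    def __init__(self, frequencies):
        # Exact lookups: a plain hash map from word to count.
        self.frequencies = dict(frequencies)

        # Join all words with "\n" so a match cannot span two words
        # (assumes neither words nor patterns contain "\n").
        self.words = sorted(self.frequencies)
        self.text = "\n".join(self.words)

        # word_starts[i] is the offset of words[i] inside self.text.
        self.word_starts = []
        offset = 0
        for word in self.words:
            self.word_starts.append(offset)
            offset += len(word) + 1

        # Naive suffix array: suffix offsets sorted by the suffix they start.
        self.suffix_array = sorted(
            range(len(self.text)), key=lambda i: self.text[i:]
        )

    def word_frequency(self, word):
        # Returns None for unknown words (an assumption of this sketch).
        return self.frequencies.get(word)

    def matching_words(self, pattern):
        n = len(pattern)
        # Binary search for the first suffix whose n-char prefix is >= pattern.
        lo, hi = 0, len(self.suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            start = self.suffix_array[mid]
            if self.text[start:start + n] < pattern:
                lo = mid + 1
            else:
                hi = mid
        # All suffixes starting with the pattern are contiguous from here.
        matches = set()
        while lo < len(self.suffix_array):
            start = self.suffix_array[lo]
            if self.text[start:start + n] != pattern:
                break
            # Map the match offset back to the word that contains it.
            index = bisect.bisect_right(self.word_starts, start) - 1
            matches.add(self.words[index])
            lo += 1
        return matches

    def word_partial_frequency(self, pattern):
        # Assumed definition: total count of all words containing the pattern.
        return sum(self.frequencies[w] for w in self.matching_words(pattern))


# Made-up counts, for illustration only.
db = WordFrequencySketch(
    {"the": 100, "internet": 7, "winter": 5, "splinter": 2, "into": 9}
)
print(db.word_frequency("the"))
print(sorted(db.matching_words("inter")))
print(db.word_partial_frequency("inter"))
```

Because every occurrence of a pattern is the prefix of some suffix, all matches sit in one contiguous run of the sorted suffix array, so a binary search finds them without scanning the whole dictionary.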
Installation
pip3 install pywordfreq
Usage
import pywordfreq
# Initialize the engine, which loads the dictionary.
# Note that the engine has a significant memory overhead,
# but multiple instances of the class share the same
# dictionary in memory.
word_freq_db = pywordfreq.WordFrequency()
# Check the frequency of the word "the" in the corpus.
word_freq_db.word_frequency(
    word="the",
)
# Check the frequency of "inter" as a pattern
# inside other words of the dictionary.
word_freq_db.word_partial_frequency(
    word="inter",
)
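The usage comments note that multiple instances share one dictionary in memory. A minimal pure-Python sketch of that idea follows. It is only an illustration of the sharing pattern, not how pywordfreq implements it in Rust; the class `SharedDictionaryEngine`, the `_load_dictionary` helper, and the placeholder counts are hypothetical.

```python
class SharedDictionaryEngine:
    """Toy model: every instance reuses one dictionary loaded on first use."""

    # Class attribute, shared by all instances; filled on first construction.
    _shared_frequencies = None

    def __init__(self):
        if SharedDictionaryEngine._shared_frequencies is None:
            SharedDictionaryEngine._shared_frequencies = self._load_dictionary()
        # Each instance holds a reference to the same dict, not a copy.
        self.frequencies = SharedDictionaryEngine._shared_frequencies

    @staticmethod
    def _load_dictionary():
        # Placeholder data; a real engine would load the corpus here.
        return {"the": 3, "of": 2}


first = SharedDictionaryEngine()
second = SharedDictionaryEngine()
print(first.frequencies is second.frequencies)
```

With this pattern the loading cost is paid once, and each additional instance adds only a reference.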
License
Distributed under the MIT License. See LICENSE for more information.
Contact
Gal Ben David - gal@intsights.com
Project Link: https://github.com/intsights/pywordfreq