
Static Hash-Based Lookup for Google Ngram Frequencies


gngram-lookup

Requires Python 3.11+.

Word frequency from 500 years of books. O(1) lookup. 5 million words.

Install

pip install gngram-lookup
python -m gngram_lookup.download_data

Python

import gngram_lookup as ng

ng.exists('computer')       # True
ng.exists('xyznotaword')    # False

ng.frequency('computer')
# {'peak_tf': 2000, 'peak_df': 2000, 'sum_tf': 892451, 'sum_df': 312876}
# peak_tf / peak_df are decades (the CLI reports them as peak_tf_decade / peak_df_decade)

ng.batch_frequency(['the', 'algorithm', 'xyznotaword'])
# {'the': {...}, 'algorithm': {...}, 'xyznotaword': None}
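A result shaped like the `batch_frequency` output above can be post-processed with plain Python. The sketch below assumes only that shape (a dict mapping each word to a record or `None`). The helpers `rank_by_sum_tf` and `unknown_words` are hypothetical and not part of the package. The `placeholder` entry holds invented numbers for illustration.

```python
from typing import Mapping, Optional

FreqRecord = Mapping[str, int]


def rank_by_sum_tf(results: Mapping[str, Optional[FreqRecord]]) -> list[tuple[str, int]]:
    """Return (word, sum_tf) pairs for known words, most frequent first."""
    known = [(word, rec["sum_tf"]) for word, rec in results.items() if rec is not None]
    return sorted(known, key=lambda pair: pair[1], reverse=True)


def unknown_words(results: Mapping[str, Optional[FreqRecord]]) -> list[str]:
    """Return the words that had no record (value None)."""
    return [word for word, rec in results.items() if rec is None]


# Stand-in for a batch_frequency result. The 'computer' record is taken from
# the example above; the 'placeholder' record is made up for illustration.
sample = {
    "placeholder": {"peak_tf": 1900, "peak_df": 1900, "sum_tf": 10, "sum_df": 5},
    "computer": {"peak_tf": 2000, "peak_df": 2000, "sum_tf": 892451, "sum_df": 312876},
    "xyznotaword": None,
}

print(rank_by_sum_tf(sample))  # [('computer', 892451), ('placeholder', 10)]
print(unknown_words(sample))   # ['xyznotaword']
```

With the package installed and its data downloaded, `sample` could be replaced by the return value of `ng.batch_frequency([...])`.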

CLI

gngram-exists computer    # True, exit 0
gngram-exists xyznotaword # False, exit 1

gngram-freq computer
# peak_tf_decade: 2000
# peak_df_decade: 2000
# sum_tf: 892451
# sum_df: 312876
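Since `gngram-exists` signals its answer through the exit code (0 for a known word, 1 for an unknown one), it can be driven from a script. The wrapper below is a minimal sketch. `cli_exists` and its `command` parameter are hypothetical names introduced here, and treating any other exit code as an error is an assumption.

```python
import subprocess
from typing import Sequence


def cli_exists(word: str, command: Sequence[str] = ("gngram-exists",)) -> bool:
    """Run the existence check as a subprocess and map its exit code to a bool."""
    completed = subprocess.run([*command, word], capture_output=True, text=True)
    if completed.returncode == 0:
        return True
    if completed.returncode == 1:
        return False
    # Assumption: exit codes other than 0 and 1 indicate a failure.
    raise RuntimeError(
        f"unexpected exit code {completed.returncode}: {completed.stderr.strip()}"
    )
```

Calling `cli_exists('computer')` requires the `gngram-exists` command on `PATH`; if it is missing, `subprocess.run` raises `FileNotFoundError`.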

Docs

Attribution

Data derived from the Google Books Ngram dataset.

License

Proprietary. See LICENSE.


