Python SymSpell

symspellpy

symspellpy is a Python port of SymSpell v6.3, which provides much higher speed and lower memory consumption. Unit tests from the original project are implemented to ensure the accuracy of the port.

Please note that the port has not been optimized for speed.

Usage

Adding the symspellpy module to your project

Copy the inner symspellpy directory to your project directory so you end up with the following layout:

project_dir
  +-symspellpy
  |   +-__init__.py
  |   +-editdistance.py
  |   +-frequency_dictionary_en_82_765.txt
  |   +-helpers.py
  |   \-symspellpy.py
  \-project.py
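
The frequency dictionary in this layout is a plain-text file that the sample usage reads by column index (term_index = 0, count_index = 1). As a rough illustration of what those indices select, here is a minimal sketch that assumes each line holds a term and its count separated by whitespace. That file format is an assumption, and parse_frequency_lines is a hypothetical helper, not part of symspellpy.

```python
def parse_frequency_lines(lines, term_index=0, count_index=1):
    """Hypothetical helper: map each term to its integer count.

    Assumes whitespace-separated columns; not the symspellpy loader.
    """
    counts = {}
    for line in lines:
        parts = line.split()
        # Skip blank or short lines that lack one of the two columns.
        if len(parts) <= max(term_index, count_index):
            continue
        try:
            count = int(parts[count_index])
        except ValueError:
            continue  # skip lines whose count column is not an integer
        counts[parts[term_index]] = count
    return counts


# "members" and its count are taken from the expected output in this README.
sample = ["members 226656153", "", "malformed"]
print(parse_frequency_lines(sample))  # {'members': 226656153}
```

Swapping term_index and count_index in this sketch would read a file that lists the count first.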

Sample usage

Contents of project.py (the code is more verbose than necessary so that each method argument can be explained):

import os

from symspellpy.symspellpy import SymSpell, Verbosity  # import the module

def main():
    # create object
    initial_capacity = 83000
    # maximum edit distance per dictionary precalculation
    max_edit_distance_dictionary = 2
    prefix_length = 7
    sym_spell = SymSpell(initial_capacity, max_edit_distance_dictionary,
                         prefix_length)
    # load dictionary
    dictionary_path = os.path.join(os.path.dirname(__file__), "symspellpy",
                                   "frequency_dictionary_en_82_765.txt")
    term_index = 0  # column of the term in the dictionary text file
    count_index = 1  # column of the term frequency in the dictionary text file
    if not sym_spell.load_dictionary(dictionary_path, term_index, count_index):
        print("Dictionary file not found")
        return

    # lookup suggestions for single-word input strings
    input_term = "memebers"  # misspelling of "members"
    # max edit distance per lookup
    # (max_edit_distance_lookup <= max_edit_distance_dictionary)
    max_edit_distance_lookup = 2
    suggestion_verbosity = Verbosity.CLOSEST  # TOP, CLOSEST, ALL
    suggestions = sym_spell.lookup(input_term, suggestion_verbosity,
                                   max_edit_distance_lookup)
    # display suggestion term, term frequency, and edit distance
    for suggestion in suggestions:
        print("{}, {}, {}".format(suggestion.term, suggestion.count,
                                  suggestion.distance))

    # lookup suggestions for multi-word input strings (supports compound
    # splitting & merging)
    input_term = ("whereis th elove hehad dated forImuch of thepast who "
                  "couqdn'tread in sixtgrade and ins pired him")
    # max edit distance per lookup (per single word, not per whole input string)
    max_edit_distance_lookup = 2
    suggestions = sym_spell.lookup_compound(input_term,
                                            max_edit_distance_lookup)
    # display suggestion term, term frequency, and edit distance
    for suggestion in suggestions:
        print("{}, {}, {}".format(suggestion.term, suggestion.count,
                                  suggestion.distance))

if __name__ == "__main__":
    main()
Expected output:

members, 226656153, 1

where is the love he had dated for much of the past who couldn't read in six grade and inspired him, 300000, 10
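
The last value on each output line is an edit distance. To show why "memebers" is at distance 1 from "members", here is a minimal sketch of the restricted Damerau-Levenshtein (optimal string alignment) distance. Treating this variant as the metric is an assumption, and osa_distance is a hypothetical name used only for illustration, not the library's implementation.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Illustrative only; assumed metric, not the symspellpy code.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters counts as one edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("memebers", "members"))  # 1 (delete the extra "e")
```

Under this sketch, a lookup with max_edit_distance_lookup = 2 would accept "members" for "memebers" because the distance of 1 is within the limit.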

CHANGELOG

6.3.1 (2018-08-30)


  • Create a package for symspellpy

6.3.0 (2018-08-13)


