Python spell checker library; supports English, German, Italian, Spanish, French, and even Code!
Project description
Python Spell Checker
The spell checker is a python package that can check your spelling. It supports 6 languages!
It contains the following languages:
- English
- German
- Spanish
- French
- Italian
- Code
Code is a special language that contains the english language along with syntaxes and keywords of programming languages. This is to prevent the spell checker from marking them as misspelt words when the checker is used on a codebase. The words are stored in code.txt
in languages
folder.
All valid words of a language are stored in a text file in the languages
folder. The words are sorted alphabetically and are in lowercase. You can add your own words to it. You can also delete words from the text file to exclude them as valid words.
Installation
The easiest way to install the package is to use pip
or pip3
:
pip install python-spell-checker
or you can clone the repository and use the files:
git clone https://github.com/dev0cloo/python-spell-checker.git
cd python-spell-checker
Usage
The main file is checker.py
in the python_spell
folder.
The SpellChecker
class is used to check the spelling of a text. It takes two arguments:
text
: The text to be checkedlanguage
: The language of the text. It is set toenglish
by default. The language must be one of the following:english
german
spanish
french
italian
code
The check()
method is used to check the spelling of the text. It returns a dictionary object with the following relevant keys:
total_words
: The total number of words in the textmisspelled_num
: The number of misspelled words in the textwords_in_dictionary
: The number of words in the dictionary of the languagemisspelled_words
: A list of misspelled words
The has_typos
attribute returns a boolean for if the text has typos or not.
The number_of_typos()
method returns the number of typos in the text. It takes an optional argument duplicates
which is set to False
by default. If it is set to True
, then each duplicate word is counted as a typo. If it is set to False
, then each duplicate word is counted as a single typo.
The get_typos()
method returns a list of misspelt words. It takes the optional arguments duplicates
and exclude
. duplicates
is set to False
by default and works the same as in number_of_typos()
. exclude
is a string or a list of strings that are excluded from the returned list of misspelt words. It is set to None
by default.
exclude
can be used to exclude words that are not in the dictionary but are valid words without changing the language file. It takes an optional argument temp
which is set to False
by default. If it is set to True
, then the words are excluded only for the current method call and not the whole instance. If it is set to False
, then the words are excluded for the whole instance. This is useful if you want to exclude words for a single method call but not for the whole instance.
Note
The SpellChecker class calls the check()
method when it is initialized. So, you can directly use the has_typos
, number_of_typos()
, get_typos()
and exclude
methods without first calling the check()
method. You can access the dictionary object returned by the check()
method using the checked
attribute.
Example:
from python_spell.checker import SpellChecker
text = "The... ! quick browmn fox jumps-over the lazi dog"
checker = SpellChecker(text, "english")
# This returns a dictionary object with all information
check_text = checker.checked
print(check_text)
# This returns a boolean for if the text has typos or not
is_valid = checker.has_typos
print("Does the text have typos? " , is_valid)
# This returns the number of typos (duplicate words are )
num_typos = checker.number_of_typos()
print("Number of typos are " , num_typos)
# This returns a list of misspelt words
misspelt_words = checker.get_typos()
print("Misspelt words are " , misspelt_words)
# This returns a list of misspelt words excluding the words in the list
misspelt_words = checker.get_typos(exclude="lazi")
print("Misspelt words with exlusion are " , misspelt_words)
# This returns a list of misspelt words excluding the words in the list for the current instance
checker.exclude("browmn") # This excludes the word "browmn" for the whole instance
misspelt_words = checker.get_typos()
print("Misspelt words after using the exclude method first are " , misspelt_words)
Output:
{
"Total words": 9
"Number of misspelled words": 2
"Number of words in dictionary": 194433
"Misspelled words": ['browmn', 'lazi']
"Lookup time(s)": 0.0005731582641601562
}
Does the text have typos? True
Number of typos are 2
Misspelt words are ['browmn', 'lazi']
Misspelt words with exlusion are ['browmn']
Misspelt words after using the exclude method first are ['lazi']
Contributing
We welcome contributions to this project, especially those that help us fix bugs, add new words or improve the docs. If you would like to contribute, please fork the repo and submit a pull request.
Credits
Credits to Harvard90873 for creating spell-checker which is the base of this package.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for python_spell_checker-1.1.0.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0276d6309be924689746cd7f0d92446b0cce03f78e0bf295613f872627ecfbb1 |
|
MD5 | 5087823898d0d9c75da3f17d22527c5e |
|
BLAKE2b-256 | d5b809248c3c2967be4463c298b67cb104a13644e63be2a7be87e63c35f5c95c |
Hashes for python_spell_checker-1.1.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 686ce8f0a765e6724b1c0445d66998cf5ce3e38915ec23e8715b7c6a80dbdf81 |
|
MD5 | f7ed798284c0fe177edf0c1dff938b2f |
|
BLAKE2b-256 | 15f8335f77cb0bf9b2db5a630698a96e380d3614d0c9e9805f92b2f7f6557e70 |