A script that automatically spell checks the comments of a codebase.

CommentSpellCheck

The CommentSpellCheck (CSC) package provides a script that automatically spell checks the comments of a code base. It was originally developed to be run on the SimpleITK and ITK code bases.

Here is how it is typically run:

python comment_spell_check.py --exclude Ancillary $SIMPLEITK_SOURCE_DIR/Code

This command will recursively find all the '.h' files in a directory, extract the C/C++ comments from the code, and run a spell checker on them. The '--exclude' flag tells the script to ignore any file that has 'Ancillary' in its full path name. This flag will accept any regular expression.
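
The file discovery step described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual implementation: the function name `find_files` and its parameters are hypothetical, and it assumes the exclude patterns are matched with `re.search` against the resolved full path.

```python
import re
from pathlib import Path


def find_files(root, suffix=".h", exclude_patterns=()):
    """Recursively collect files under root that end in suffix.

    Hypothetical helper: any file whose full path matches one of the
    exclude regular expressions is skipped.
    """
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    found = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        full_path = str(path.resolve())
        # Assumption: a pattern excludes a file if it matches anywhere in the path.
        if any(regex.search(full_path) for regex in compiled):
            continue
        found.append(path)
    return found
```

With this sketch, `find_files("Code", ".h", ["Ancillary"])` would return every '.h' file under `Code` except those with 'Ancillary' somewhere in their path.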

In addition to the spell checker's default English dictionary, we use the words in additional_dictionary.txt. These words are proper names and technical terms harvested by hand from the SimpleITK and ITK code bases.

If a word is not found in the dictionaries, we try two additional checks.

  1. If the word starts with some known prefix, the prefix is removed and the remaining word is checked against the dictionary. The prefixes used by default are 'sitk', 'itk', and 'vtk'. Additional prefixes can be specified with the '--prefix' command line argument.

  2. We attempt to split the word by capitalization and check each sub-word against the dictionary. This method is an attempt to detect camel-case words such as 'GetArrayFromImage', which would get split into 'Get', 'Array', 'From', and 'Image'. Camel-case words are very commonly used for code elements.
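
The two fallback checks above can be sketched as follows. This is an illustrative approximation only: the helper names (`strip_prefix`, `split_camel_case`, `is_known`) are hypothetical, a plain set of lowercase words stands in for the real dictionary, and case-insensitive prefix matching is an assumption rather than documented behavior.

```python
import re

# Default prefixes named in the text above.
DEFAULT_PREFIXES = ("sitk", "itk", "vtk")


def strip_prefix(word, prefixes=DEFAULT_PREFIXES):
    """Return the word without its known prefix, or None if no prefix matches."""
    for prefix in prefixes:
        # Assumption: prefixes are matched case-insensitively.
        if word.lower().startswith(prefix) and len(word) > len(prefix):
            return word[len(prefix):]
    return None


def split_camel_case(word):
    """Split a word at capitalization boundaries, keeping acronyms together."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", word)


def is_known(word, dictionary, prefixes=DEFAULT_PREFIXES):
    """Check the dictionary first, then the prefix rule, then the camel-case rule."""
    if word.lower() in dictionary:
        return True
    stripped = strip_prefix(word, prefixes)
    if stripped is not None and stripped.lower() in dictionary:
        return True
    parts = split_camel_case(word)
    return len(parts) > 1 and all(part.lower() in dictionary for part in parts)
```

For example, with a dictionary containing 'get', 'array', 'from', and 'image', both `is_known("GetArrayFromImage", dictionary)` and `is_known("sitkImage", dictionary)` evaluate to True, while a misspelled sub-word such as in 'GetArrayFromImge' makes the check fail.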

The script can also process other file types. With the '--suffix' option, the following file types are available: Python (.py), C/C++ (.c/.cxx), CSharp (.cs), Text (.txt), reStructuredText (.rst), Markdown (.md), Ruby (.ruby), R (.R), and Java (.java). Note that reStructuredText files are treated as standard text. Consequently, all markup keywords that are not actual words will need to be added to the additional/exception dictionary.

Disabling Spell Checking

Spell checking can be disabled for sections of code by using special comments. The following comments will disable spell checking until the corresponding end comment is found.

// spell-check-disable

// This comment will not be spell checked.

// spell-check-enable

Note that for C-style, multi-line comments, the disable and enable comments must be in separate comments. If the disable command is found in a multi-line comment, spell checking will be disabled for the entire multi-line comment.

/*
spell-check-disable
spell-check-enable
This comment will NOT be spell checked
*/
/* spell-check-enable */
/* This comment WILL be spell checked */
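
The disable/enable behavior described above can be modeled with a small sketch. It assumes the comments have already been extracted into a list of strings, one per comment, and the function name `comments_to_check` is hypothetical. It also assumes that a comment containing a marker is itself never spell checked, which the text above does not state explicitly for the enable marker.

```python
DISABLE = "spell-check-disable"
ENABLE = "spell-check-enable"


def comments_to_check(comments):
    """Return only the comments that should be spell checked.

    Each item in comments is the text of one comment, which may span
    several lines for C-style multi-line comments.
    """
    enabled = True
    selected = []
    for text in comments:
        if DISABLE in text:
            # The whole comment is skipped, and an enable marker inside the
            # same comment is ignored, so checking stays off afterwards.
            enabled = False
            continue
        if ENABLE in text:
            # Assumption: the marker comment itself is not spell checked.
            enabled = True
            continue
        if enabled:
            selected.append(text)
    return selected
```

Applied to the three comments in the example above, this sketch returns only the last one, since the first comment contains the disable marker and the second only re-enables checking.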

Dictionary notes

We use PySpellChecker as the underlying spellchecker and its associated default dictionary. Previously we used the pyenchant package and its default dictionary. The pyenchant package requires an underlying C library, which is not available on all platforms. PySpellChecker is a pure Python package and works on all platforms with no additional dependencies.
