Finding and correcting real-word spelling errors using masked language models.

rwse-checker

Real-word spelling errors (RWSEs) are hard to detect because they ‘hide’ in the form of another existing word and in many cases even fit syntactically, for example ‘there’ written in place of ‘their’. rwse-checker is a Transformer-based implementation of earlier probabilistic methods based on confusion sets, using a masked language model to judge the words in a confusion set against the sentence context. It aims for a good balance between missing errors and raising too many false alarms. The confusion sets are dynamically configurable, allowing teachers to easily adjust which errors trigger feedback.
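To make the confusion-set idea concrete, here is a minimal sketch of the general decision rule. It is not rwse-checker's actual implementation. It assumes a hypothetical `score(word, masked_sentence)` function returning the probability a masked language model assigns to `word` at the `[MASK]` position; a toy lookup table with made-up probabilities stands in for the model so the example runs without downloading anything. The names `suggest`, `toy_score`, and the `magnitude` threshold are illustrative assumptions.

```python
from typing import Callable, Optional, Sequence

# A scorer maps (word, masked_sentence) to a probability-like value.
Scorer = Callable[[str, str], float]


def suggest(
    word: str,
    masked_sentence: str,
    confusion_sets: Sequence[Sequence[str]],
    score: Scorer,
    magnitude: float = 10.0,
) -> Optional[str]:
    """Return a likelier alternative from the word's confusion set, or None.

    An alternative is suggested only if its score is more than `magnitude`
    times the score of the written word. A larger `magnitude` means fewer
    false alarms but more missed errors.
    """
    for confusion_set in confusion_sets:
        if word not in confusion_set:
            continue
        original_score = score(word, masked_sentence)
        best = max(confusion_set, key=lambda w: score(w, masked_sentence))
        best_score = score(best, masked_sentence)
        if best != word and best_score > magnitude * original_score:
            return best
    return None


# Toy stand-in for a masked language model (hypothetical probabilities).
TOY_PROBS = {
    ("their", "I want to buy [MASK] cars."): 0.60,
    ("there", "I want to buy [MASK] cars."): 0.01,
}


def toy_score(word: str, masked_sentence: str) -> float:
    return TOY_PROBS.get((word, masked_sentence), 0.0)


sets = [["their", "there"], ["to", "too", "two"]]
print(suggest("there", "I want to buy [MASK] cars.", sets, toy_score))  # their
print(suggest("their", "I want to buy [MASK] cars.", sets, toy_score))  # None
```

In this sketch the threshold is the knob that trades missed errors against false alarms, and the confusion sets decide which words are checked at all.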

Example Usage

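from rwse import RWSE_Checker

checker = RWSE_Checker()
# Each inner list is one confusion set: words that are easily mistaken for one another.
checker.set_confusion_sets([['their','there'],['to','too','two']])

# Pass the word as written and the sentence with that word replaced by [MASK].
print(checker.check("there", "I want to buy [MASK] cars."))
print(checker.check("too", "I want [MASK] buy their cars."))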

This prints:

('their', 0.003510827198624611)
('to', 0.9989504218101501)
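The example above passes sentences in which the target word has already been replaced by `[MASK]`. For running text, one way to produce such inputs is sketched below, assuming simple whitespace tokenization. The helper name `masked_candidates` is hypothetical and not part of rwse-checker. It yields each word that belongs to a confusion set together with the sentence in which that one occurrence is masked, so each pair can be handed to `checker.check(word, masked_sentence)` as in the example above.

```python
from typing import Iterable, Iterator, Sequence, Tuple


def masked_candidates(
    sentence: str,
    confusion_sets: Iterable[Sequence[str]],
    mask_token: str = "[MASK]",
) -> Iterator[Tuple[int, str, str]]:
    """Yield (token_index, word, masked_sentence) for each confusable word.

    Assumes whitespace tokenization and only strips common trailing
    punctuation; real text would need a proper tokenizer.
    """
    vocabulary = {w.lower() for cs in confusion_sets for w in cs}
    tokens = sentence.split()
    for i, token in enumerate(tokens):
        core = token.rstrip(".,;:!?")
        if core.lower() not in vocabulary:
            continue
        suffix = token[len(core):]  # keep punctuation after the mask
        masked = tokens[:i] + [mask_token + suffix] + tokens[i + 1:]
        yield i, core, " ".join(masked)


sets = [["their", "there"], ["to", "too", "two"]]
for index, word, masked in masked_candidates("I want too buy there cars.", sets):
    print(index, word, masked)
# 2 too I want [MASK] buy there cars.
# 4 there I want too buy [MASK] cars.
```

Only one occurrence is masked per yielded sentence, so the rest of the sentence, including any other confusable words, stays available as context.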

Citation

If you use this tool, please cite Transformer-Based Real-Word Spelling Error Feedback with Configurable Confusion Sets (Zesch et al., BEA 2025).

The experimental code for this paper is available at https://github.com/zesch/rwse-experiments

@inproceedings{zesch-etal-2025-transformer,
    title = "Transformer-Based Real-Word Spelling Error Feedback with Configurable Confusion Sets",
    author = "Zesch, Torsten  and
      Gardner, Dominic  and
      Bexte, Marie",
    booktitle = "Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.bea-1.29/",
    doi = "10.18653/v1/2025.bea-1.29",
    pages = "375--383",
    ISBN = "979-8-89176-270-1",
}
