
Finding and correcting real-word spelling errors using masked language models.


rwse-checker

Real-word spelling errors (RWSEs) pose special challenges for detection methods, as they ‘hide’ in the form of another existing word and in many cases even fit in syntactically. rwse-checker is a modern Transformer-based implementation of earlier probabilistic methods based on confusion sets. It detects RWSEs with a good balance between missing errors and raising too many false alarms. The confusion sets are dynamically configurable, allowing teachers to easily adjust which errors trigger feedback.
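The decision step of a confusion-set approach can be sketched as follows. This is a minimal illustration, not the package's actual implementation: the function name `pick_replacement`, the `min_ratio` threshold, and the toy probabilities are all assumptions made for this example. In a real system the probabilities would come from a masked language model scoring each confusion-set member in the masked position.

```python
from typing import Dict, List, Optional, Tuple


def pick_replacement(
    original: str,
    confusion_sets: List[List[str]],
    candidate_probs: Dict[str, float],
    min_ratio: float = 10.0,  # hypothetical threshold, not a package parameter
) -> Optional[Tuple[str, float]]:
    """Return (alternative, probability) if an alternative clearly beats `original`."""
    # Find the confusion set that contains the original word, if any.
    members = next((s for s in confusion_sets if original in s), None)
    if members is None:
        return None  # the word is not configured for checking

    # Probability of each member in the masked slot; missing members count as 0.
    scored = {w: candidate_probs.get(w, 0.0) for w in members}
    best = max(scored, key=lambda w: scored[w])

    # Flag only when the best alternative is much more likely than the original.
    # A higher min_ratio gives fewer false alarms but more missed errors.
    if best != original and scored[best] >= min_ratio * scored[original]:
        return best, scored[best]
    return None


sets = [["their", "there"], ["to", "too", "two"]]
# Toy probabilities for the slot in "I want [MASK] buy their cars."
toy_probs = {"to": 0.95, "too": 0.01, "two": 0.02}

print(pick_replacement("too", sets, toy_probs))  # ('to', 0.95)
print(pick_replacement("to", sets, toy_probs))   # None
```

Because the confusion sets are passed in as plain data, changing which errors trigger feedback only requires editing the list, not the decision logic.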

Example Usage

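from rwse import RWSE_Checker

checker = RWSE_Checker()
# Each inner list is one confusion set: words that are easily mistaken for one another.
checker.set_confusion_sets([['their','there'],['to','too','two']])

# First argument: the word as written. Second: the sentence with that word replaced by [MASK].
print(checker.check("there", "I want to buy [MASK] cars."))
print(checker.check("too", "I want [MASK] buy their cars."))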

which yields

('their', 0.003510827198624611)
('to', 0.9989504218101501)
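The example above passes the word and the masked sentence by hand. To check a whole sentence, one could generate those pairs automatically. The helper below is a sketch under stated assumptions: `masked_candidates` is a hypothetical name, not part of the package, and the simple regex tokenization and lowercasing are illustrative choices.

```python
import re
from typing import Iterator, List, Tuple


def masked_candidates(
    sentence: str,
    confusion_sets: List[List[str]],
    mask_token: str = "[MASK]",
) -> Iterator[Tuple[str, str]]:
    """Yield (word, masked_sentence) for each word that belongs to a confusion set."""
    vocabulary = {w.lower() for s in confusion_sets for w in s}
    # Match runs of letters and apostrophes so that punctuation stays in place.
    for match in re.finditer(r"[A-Za-z']+", sentence):
        word = match.group()
        if word.lower() in vocabulary:
            masked = sentence[:match.start()] + mask_token + sentence[match.end():]
            yield word.lower(), masked


sets = [["their", "there"], ["to", "too", "two"]]
for word, masked in masked_candidates("I want too buy there cars.", sets):
    print(word, "|", masked)
    # With rwse-checker installed, each pair could be passed on:
    # print(checker.check(word, masked))
```

Each yielded pair has the same shape as the arguments in the usage example, so the output can be fed to `checker.check` one position at a time.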

Citation

If you use this tool, please cite "Transformer-Based Real-Word Spelling Error Feedback with Configurable Confusion Sets" (Zesch et al., BEA 2025).

The experimental code for this paper is available at https://github.com/zesch/rwse-experiments

@inproceedings{zesch-etal-2025-transformer,
    title = "Transformer-Based Real-Word Spelling Error Feedback with Configurable Confusion Sets",
    author = "Zesch, Torsten  and
      Gardner, Dominic  and
      Bexte, Marie",
    booktitle = "Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.bea-1.29/",
    doi = "10.18653/v1/2025.bea-1.29",
    pages = "375--383",
    ISBN = "979-8-89176-270-1",
}

File details

Details for the file rwse_checker-0.0.4.tar.gz.

File metadata

  • Download URL: rwse_checker-0.0.4.tar.gz
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.13.0 Darwin/24.6.0

File hashes

Hashes for rwse_checker-0.0.4.tar.gz

Algorithm    Hash digest
SHA256       7568682f57767ad0a35848095bc8dc31b139cb454cf451a923c79368aa0141c8
MD5          5aa916be48574b0da743e466b1351dec
BLAKE2b-256  4019531b39cf1b96d2a508f3a5de10f9582fd805ce38e13ef0a721a5ecde87d1


File details

Details for the file rwse_checker-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: rwse_checker-0.0.4-py3-none-any.whl
  • Size: 7.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.13.0 Darwin/24.6.0

File hashes

Hashes for rwse_checker-0.0.4-py3-none-any.whl

Algorithm    Hash digest
SHA256       157cb3951143ac5d98f48982f006bcbe1052416d98ed0256fc9d2be67cc3487e
MD5          ec46371505acc26356bda580bf2c7d07
BLAKE2b-256  a47d38f5c4dba720998e61ae6e0323cc7bf6c75785f34d2f199a52461c4788cb

