Finding and correcting real-word spelling errors using masked language models.
Project description
rwse-checker
Real-word spelling errors (RWSEs) are hard to detect because the misspelling is itself an existing word (for example, "there" written for "their") and in many cases even fits syntactically. rwse-checker is a Transformer-based implementation of earlier probabilistic methods that rely on confusion sets, i.e. groups of words that are commonly mistaken for one another. It aims for a good balance between missed errors and false alarms. The confusion sets are dynamically configurable, so teachers can easily adjust which errors trigger feedback.
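The confusion-set idea can be illustrated with a small decision rule. The sketch below is not the package's implementation: the function name `suggest_correction`, the `min_ratio` threshold, and the ratio test itself are assumptions made for illustration. It only shows how probabilities from a masked language model, restricted to one confusion set, could be turned into a flag-or-accept decision that trades missed errors against false alarms.

```python
from typing import Dict, Optional


def suggest_correction(
    original: str,
    candidate_probs: Dict[str, float],
    min_ratio: float = 10.0,
) -> Optional[str]:
    """Return a replacement for `original`, or None if it looks acceptable.

    candidate_probs: probability assigned by a masked language model to each
        word of one confusion set at the position of `original`.
    min_ratio: hypothetical knob. A higher value raises fewer false alarms
        but misses more errors.
    """
    # Most probable member of the confusion set at the masked position.
    best = max(candidate_probs, key=lambda word: candidate_probs[word])
    if best == original:
        return None  # the written word is already the best fit

    p_original = candidate_probs.get(original, 0.0)
    # Only flag when the alternative is clearly more likely than the original.
    if candidate_probs[best] >= min_ratio * p_original:
        return best
    return None


if __name__ == "__main__":
    probs = {"to": 0.99, "too": 0.001, "two": 0.0001}  # made-up numbers
    print(suggest_correction("too", probs))  # flags "too", suggests "to"
    print(suggest_correction("to", probs))   # nothing to correct
```

Lowering `min_ratio` makes this sketch flag more words, raising it makes it more conservative.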
Example Usage
from rwse import RWSE_Checker
checker = RWSE_Checker()
checker.set_confusion_sets([['their','there'],['to','too','two']])
print(checker.check("there", "I want to buy [MASK] cars."))
print(checker.check("too", "I want [MASK] buy their cars."))
which yields
('their', 0.003510827198624611)
('to', 0.9989504218101501)
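To check a whole sentence rather than a single pre-masked position, each word that belongs to a confusion set can be masked in turn. The following is a minimal sketch under stated assumptions: `scan_sentence` is a hypothetical helper, not part of rwse-checker, and it uses naive whitespace tokenization. It expects a `check` callable with the same shape as `checker.check` in the example above, i.e. it takes the written word and the masked sentence and returns a `(word, probability)` tuple.

```python
from typing import Callable, List, Sequence, Tuple

# (written word, masked sentence) -> (suggested word, probability)
CheckFn = Callable[[str, str], Tuple[str, float]]

PUNCTUATION = ".,;:!?\"'()"


def scan_sentence(
    sentence: str,
    confusion_sets: Sequence[Sequence[str]],
    check: CheckFn,
) -> List[Tuple[int, str, str, float]]:
    """Return (token index, written word, suggestion, probability) findings."""
    tokens = sentence.split()  # naive tokenization, an assumption of this sketch
    vocabulary = {word for group in confusion_sets for word in group}
    findings = []
    for i, token in enumerate(tokens):
        core = token.strip(PUNCTUATION)
        if core.lower() not in vocabulary:
            continue  # only words from a confusion set are checked
        # Replace the word with [MASK] but keep attached punctuation.
        masked_token = token.replace(core, "[MASK]", 1)
        masked = " ".join(tokens[:i] + [masked_token] + tokens[i + 1:])
        suggestion, probability = check(core.lower(), masked)
        if suggestion != core.lower():
            findings.append((i, core, suggestion, probability))
    return findings


if __name__ == "__main__":
    # Stand-in for a real checker so the sketch runs without a model.
    def fake_check(word: str, masked: str) -> Tuple[str, float]:
        return ("their", 0.9) if word == "there" else (word, 0.9)

    sets = [["their", "there"], ["to", "too", "two"]]
    print(scan_sentence("I want to buy there cars.", sets, fake_check))
```

With the real package, `checker.check` could presumably be passed as `check`. Note that this sketch flags every mismatch regardless of the returned probability, so a real application would likely add a threshold.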
Citation
If you use this tool, please cite "Transformer-Based Real-Word Spelling Error Feedback with Configurable Confusion Sets" (Zesch et al., BEA 2025).
The experimental code for this paper is available at https://github.com/zesch/rwse-experiments
@inproceedings{zesch-etal-2025-transformer,
title = "Transformer-Based Real-Word Spelling Error Feedback with Configurable Confusion Sets",
author = "Zesch, Torsten and
Gardner, Dominic and
Bexte, Marie",
booktitle = "Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.bea-1.29/",
doi = "10.18653/v1/2025.bea-1.29",
pages = "375--383",
ISBN = "979-8-89176-270-1",
}
Download files
File details
Details for the file rwse_checker-0.0.3.tar.gz.
File metadata
- Download URL: rwse_checker-0.0.3.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.13.0 Darwin/24.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c3389af743806ccb72b3a3e283dd5e9b3035b7a3c9b4fe2db6b6f42ef5ecca8 |
| MD5 | c1a2147027c0b6ab86684a38b08c4e53 |
| BLAKE2b-256 | 5fef00b41f84dbeb7b683edda9b5d2eb3f92e0c3ca4f535e46c04dfb3ecc9093 |
File details
Details for the file rwse_checker-0.0.3-py3-none-any.whl.
File metadata
- Download URL: rwse_checker-0.0.3-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.13.0 Darwin/24.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d91726596e4db89bd7f30f8e274056e5e16ddba397ab631df8fcc30445cf9e3a |
| MD5 | ae92e63b2b225dabe9b37b2328bff390 |
| BLAKE2b-256 | d9bc636d3de2430f47b0b8ad3d20beff0052f9ad82aace11541a31ce062487f8 |