Arabic Spelling Correction
Description
A simple library for checking the spelling of Arabic sentences. It uses a vocabulary of more than 500K words and looks for corrections of misspelled words among candidates at edit distance 1 and edit distance 2. It also uses a 1-ngram language model to correct words based on the preceding context.
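To illustrate the edit-distance idea described above, here is a minimal sketch of candidate generation and frequency ranking. It is not the library's actual implementation: the names `edits1`, `edits2`, `suggest`, and `toy_vocab` are hypothetical, the alphabet is limited to basic Arabic letters, and the frequencies are illustrative values only.

```python
from collections import Counter

# Basic Arabic letters used to generate candidate edits (assumption: no diacritics).
ARABIC_LETTERS = "ءآأؤإئابةتثجحخدذرزسشصضطظعغفقكلمنهوىي"

def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ARABIC_LETTERS]
    inserts = [l + c + r for l, r in splits for c in ARABIC_LETTERS]
    return set(deletes + transposes + replaces + inserts)

def edits2(word):
    """All strings two edits away from word."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}

def suggest(word, vocab, top_n=5):
    """Return up to top_n (candidate, frequency) pairs, most frequent first."""
    if word in vocab:
        return [(word, vocab[word])]
    # Prefer candidates one edit away; fall back to two edits if none are known.
    candidates = {w for w in edits1(word) if w in vocab}
    if not candidates:
        candidates = {w for w in edits2(word) if w in vocab}
    ranked = sorted(((w, vocab[w]) for w in candidates), key=lambda p: (-p[1], p[0]))
    return ranked[:top_n]

# Toy vocabulary with illustrative frequencies (hypothetical data).
toy_vocab = Counter({"بكتب": 61, "برتب": 22, "بختم": 21, "من": 100})
print(suggest("بختب", toy_vocab, top_n=2))
```

With a real vocabulary of several hundred thousand words, the two-edit fallback is the expensive step, which is why this sketch only tries it when no one-edit candidate is known.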
Installation
pip install ar-corrector
Usage
Correct word spelling
from ar_corrector.corrector import Corrector
corr = Corrector()
corr.spell_correct('بختب')  # returns the 5 most frequent corrections
# [('بكتب', 61), ('برتب', 22), ('بختم', 21), ('بختي', 9), ('بخت', 7)]
corr.spell_correct('بختب', 2)  # returns the 2 most frequent corrections
# [('بكتب', 61), ('برتب', 22)]
corr.spell_correct('بختب', 1)  # returns the single most frequent correction
# [('بكتب', 61)]
corr.spell_correct('لتمشتلميتلكب', 4)  # returns the same word, unchanged
# لتمشتلميتلكب
corr.spell_correct('من')  # returns True
# True
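As the examples above show, the result can be a list of (word, frequency) pairs, the input word itself, or True. A small wrapper can reduce these three shapes to a single word. The helper below is a hypothetical sketch, not part of the library, and it assumes the return values are exactly those shown in the examples.

```python
def best_correction(result, original):
    """Pick a single word from a spell_correct-style result.

    result   -- value shaped like the outputs in the examples above
    original -- the word that was passed in
    """
    if result is True:
        # True is shown for a word that needs no correction.
        return original
    if isinstance(result, str):
        # The same word is echoed back when no correction is offered.
        return result
    if result:
        # List of (word, frequency) pairs, most frequent first.
        return result[0][0]
    return original

print(best_correction([("بكتب", 61), ("برتب", 22)], "بختب"))
```

With the library installed, this could be called as `best_correction(corr.spell_correct(word), word)`.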
Correct word spelling using the context
from ar_corrector.corrector import Corrector
corr = Corrector()
sent = 'أكدت قواءص التمذد في تشاد أنها تواضضل طريقها للعاحمة'
print(corr.contextual_correct(sent))
# أكدت قوات التمرد في تشاد أنها تواصل طريقها للعاصمة
sent = 'اتتنتهى حدث آبل المنتظو بالإعلاخ عن مموعة من المنتجات'
print(corr.contextual_correct(sent))
# انتهى حدث آبل المنتظر الإعلان عن مجموعة من المنتجات
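To give a rough idea of how the preceding context can decide between candidate corrections, the sketch below scores each candidate by how often it follows the previous word. This is an assumption about the general technique, not the library's actual model: `pick_with_context` and the counts are hypothetical, and a simple bigram count is used here for illustration.

```python
from collections import Counter

def pick_with_context(prev_word, candidates, bigram_counts, unigram_counts):
    """Choose the candidate that most often follows prev_word.

    Ties, including the case where no bigram was seen at all,
    are broken by plain word frequency.
    """
    def score(cand):
        # Counter returns 0 for unseen keys, so missing pairs score lowest.
        return (bigram_counts[(prev_word, cand)], unigram_counts[cand])
    return max(candidates, key=score)

# Illustrative counts only (hypothetical data).
unigrams = Counter({"قوات": 40, "قواعد": 55})
bigrams = Counter({("أكدت", "قوات"): 7, ("أكدت", "قواعد"): 1})

print(pick_with_context("أكدت", ["قواعد", "قوات"], bigrams, unigrams))
```

In this toy data the less frequent word wins because it follows the previous word more often, which is the kind of decision frequency ranking alone cannot make.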