A simple tool that automatically corrects misspelled Portuguese words.

Spell Corrector PT

Automatically correct misspelled words in Portuguese.

How to use

  • Get a word list (a domain-specific list is best, since it lowers the computational cost)
  • Train the model (see example-train.py)
  • Specify a path to save the model so it can be reused afterward
  • Load the model and correct words (see example.py)
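
The workflow above (train once, save to a path, load later, correct) can be sketched roughly as follows. This is only an illustration of the flow, not the package's API: `ToyCorrector`, `save_model`, `load_model`, the JSON file format, and the sample words are all hypothetical, and `difflib` is a stand-in matcher used to keep the sketch short. The actual usage is in example-train.py and example.py.

```python
import json
import tempfile
from difflib import get_close_matches
from pathlib import Path


class ToyCorrector:
    """Hypothetical stand-in model; NOT the spell-corrector-pt API."""

    def __init__(self, vocabulary=None):
        self.vocabulary = list(vocabulary or [])

    def train(self, words):
        # "Training" here only stores the lowercased, de-duplicated word list.
        self.vocabulary = sorted({w.lower() for w in words})
        return self

    def correct(self, word):
        # difflib is a swapped-in matcher, not the package's method.
        matches = get_close_matches(word.lower(), self.vocabulary, n=1, cutoff=0.0)
        return matches[0] if matches else word


def save_model(model, path):
    # Persist the trained state so training does not have to be repeated.
    Path(path).write_text(json.dumps(model.vocabulary), encoding="utf-8")


def load_model(path):
    # Rebuild the model from the saved state.
    return ToyCorrector(json.loads(Path(path).read_text(encoding="utf-8")))


# 1. Get a (domain-specific) word list.
words = ["banana", "laranja", "abacaxi", "morango"]

# 2-3. Train once and save the model to a chosen path.
model_path = Path(tempfile.gettempdir()) / "toy_corrector.json"
save_model(ToyCorrector().train(words), model_path)

# 4. Later, load the saved model and correct words.
loaded = load_model(model_path)
corrected = loaded.correct("bananna")
print(corrected)
```

Keeping training and correction in separate steps means the (potentially slow) training over the word list runs once, while each later run only pays for loading the saved file.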

How the model works (high level)

  • Preprocess the dictionary by removing accents and converting to lowercase
  • Extract character n-grams from the dictionary
  • Create a sparse matrix from the dictionary using the bag-of-words strategy
  • Create a sparse matrix from the preprocessed input word
  • Compare the two sparse matrices by cosine similarity
  • Return the most similar word
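
The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the function names, the n-gram range (2 to 3 characters), the space padding at word boundaries, and the sample dictionary are all illustrative choices, and a `Counter` per word stands in for a row of the sparse matrix.

```python
import math
import unicodedata
from collections import Counter


def preprocess(word):
    """Remove accents and convert to lowercase."""
    decomposed = unicodedata.normalize("NFKD", word)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_accents.lower()


def char_ngrams(word, n_min=2, n_max=3):
    """Bag of character n-grams, stored sparsely as a Counter (assumed range 2-3)."""
    padded = f" {word} "  # assumption: pad so word boundaries form n-grams too
    bag = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            bag[padded[i:i + n]] += 1
    return bag


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(dictionary):
    # One sparse n-gram vector per dictionary word, computed once.
    return [(word, char_ngrams(preprocess(word))) for word in dictionary]


def correct(word, index):
    # Vectorize the input the same way, then pick the closest dictionary entry.
    query = char_ngrams(preprocess(word))
    best_word, _ = max(index, key=lambda entry: cosine_similarity(query, entry[1]))
    return best_word


dictionary = ["coração", "cachorro", "janela", "computador"]
index = build_index(dictionary)
print(correct("coracao", index))
print(correct("cachoro", index))
```

In this sketch the original dictionary spelling is returned, so accents are restored in the output; whether the package returns the accented or the preprocessed form is not stated in the description above.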

Files for spell-corrector-pt, version 0.0.2

  • spell_corrector_pt-0.0.2-py2.py3-none-any.whl (4.2 kB), wheel, Python py2.py3
  • spell_corrector_pt-0.0.2-py3-none-any.whl (4.2 kB), wheel, Python py3
  • spell_corrector_pt-0.0.2.tar.gz (3.0 kB), source