A simple tool that automatically corrects Portuguese misspellings.

Project description

Spell Corrector PT

Automatically corrects misspelled words in Portuguese.

How to use

  • Get a word list (domain-specific words work best, since a smaller vocabulary lowers the computational cost)
  • Train the model (see example-train.py)
  • Specify the path where the model is saved so it can be reused afterward
  • Load the model and correct words (see example.py)
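
The train, save, load, and correct workflow above can be sketched as follows. This is a minimal illustration and not the package's API: the function names (`train`, `save_model`, `load_model`, `correct`), the JSON file format, and the word list are all assumptions, and the standard library's `difflib.get_close_matches` stands in for the package's own similarity model. See example-train.py and example.py for the real calls.

```python
import difflib
import json
import os
import tempfile


def train(word_list):
    """Hypothetical 'training': build a sorted, deduplicated lowercase vocabulary."""
    return {"vocabulary": sorted({w.lower() for w in word_list})}


def save_model(model, path):
    """Persist the model so it can be reused without retraining."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(model, fh, ensure_ascii=False)


def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def correct(model, word):
    """Return the closest vocabulary word, or the input unchanged if none is close."""
    matches = difflib.get_close_matches(
        word.lower(), model["vocabulary"], n=1, cutoff=0.6
    )
    return matches[0] if matches else word


# Train once on a small domain-specific word list and save the result.
model_path = os.path.join(tempfile.gettempdir(), "spell_model_example.json")
save_model(train(["fatura", "boleto", "pagamento", "cliente"]), model_path)

# Later (possibly in another process): load the saved model and correct words.
model = load_model(model_path)
print(correct(model, "pagamneto"))
```

Keeping the word list small and domain-specific means fewer candidates to compare against on every correction, which is the cost the first step refers to.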

How the model works (high level)

  • Preprocess the dictionary by removing accents and converting to lowercase
  • Extract character n-grams from the dictionary
  • Create a sparse matrix from the dictionary using the bag-of-words strategy
  • Create a sparse matrix from the preprocessed input word
  • Compare the two sparse matrices using cosine similarity
  • Return the most similar word
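
The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: the n-gram size (2), the space padding at word boundaries, and all function names are hypothetical, and a `collections.Counter` per word stands in for a row of the sparse bag-of-words matrix.

```python
import math
import unicodedata
from collections import Counter


def preprocess(word):
    """Lowercase the word and strip accents (e.g. 'Coração' becomes 'coracao')."""
    decomposed = unicodedata.normalize("NFKD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def char_ngrams(word, n=2):
    """Return a sparse bag of character n-grams (assumed size n=2) as a Counter."""
    padded = f" {word} "  # assumption: pad so word boundaries form n-grams too
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())  # missing keys count as 0
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def correct(word, vocabulary, n=2):
    """Return the vocabulary entry whose n-gram vector is most similar to the word's."""
    query = char_ngrams(preprocess(word), n)
    return max(
        vocabulary,
        key=lambda candidate: cosine(query, char_ngrams(preprocess(candidate), n)),
    )


vocabulary = ["coração", "cachorro", "casa"]
print(correct("cachoro", vocabulary))
```

A real implementation would vectorize the dictionary once and reuse the matrix, rather than recomputing the n-grams of every candidate on each call as this sketch does.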

Download files

Source Distribution

spell_corrector_pt-0.0.2.tar.gz (3.0 kB)

Uploaded Source

Built Distributions

spell_corrector_pt-0.0.2-py3-none-any.whl (4.2 kB)

Uploaded Python 3

spell_corrector_pt-0.0.2-py2.py3-none-any.whl (4.2 kB)

Uploaded Python 2 Python 3

File details

Details for the file spell_corrector_pt-0.0.2.tar.gz.

File metadata

  • Download URL: spell_corrector_pt-0.0.2.tar.gz
  • Size: 3.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5

File hashes

Hashes for spell_corrector_pt-0.0.2.tar.gz
Algorithm    Hash digest
SHA256       cbeef8a8d02da5d1364cf3e4e47321163d3d3ed6281f1d26661b91858a1963f1
MD5          93c2bb8769cfb751b70c55e46a117165
BLAKE2b-256  1040d7de5dba0f3acd79accf2db30d0fc48b4d9ba2d95cbd9716468caa13b07e

File details

Details for the file spell_corrector_pt-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: spell_corrector_pt-0.0.2-py3-none-any.whl
  • Size: 4.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5

File hashes

Hashes for spell_corrector_pt-0.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       162aa2f80908c5e15d912f19aabe8736f3d00452576d30c9c99b0bfe8425b84d
MD5          69558a7b455fd1c5d98303292b6f7421
BLAKE2b-256  905c0cb1c349320674b45189797330e3244dbfb3d8e4eaedb56feecf08fae4f3

File details

Details for the file spell_corrector_pt-0.0.2-py2.py3-none-any.whl.

File metadata

  • Download URL: spell_corrector_pt-0.0.2-py2.py3-none-any.whl
  • Size: 4.2 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5

File hashes

Hashes for spell_corrector_pt-0.0.2-py2.py3-none-any.whl
Algorithm    Hash digest
SHA256       8584cdc3b7daff3a687394d818cc3ddb667cc59ecd375267efcaac2b6b553c7c
MD5          dd49bdf531b2304e5e153aa39f6a0c9f
BLAKE2b-256  78af9ab36ac7d96a7d840d743098a977a2eb40028fd2b2971d40fa1d14c10a0c
