
Corrects English spelling mistakes and normalizes lengthened words (e.g., "cooooooooooooooollllllllllllll" to "cool").

Project description

pytypo


pytypo corrects English spelling mistakes. The corrections are based on TYPO CORPUS (http://luululu.com/tweet/) and Wikipedia's list of common misspellings (https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines).
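As a rough illustration of dictionary-based correction, the sketch below builds a lookup table from lines in the `misspelling->correction` layout used by the Wikipedia list. The function names `parse_misspellings` and `correct_word` and the sample lines are hypothetical, chosen for this example only; this is not pytypo's actual implementation.

```python
def parse_misspellings(lines):
    """Build a {misspelling: correction} dict from 'wrong->right' lines."""
    table = {}
    for line in lines:
        wrong, sep, right = line.strip().partition("->")
        if not sep:
            continue  # skip lines that are not in 'wrong->right' form
        # Some entries offer several candidates separated by commas;
        # this sketch simply keeps the first one.
        table[wrong.lower()] = right.split(",")[0].strip()
    return table


def correct_word(word, table):
    """Return the listed correction, or the word unchanged if it is not listed."""
    return table.get(word.lower(), word)


# Hypothetical sample lines in the 'misspelling->correction' layout.
sample = [
    "abandonned->abandoned",
    "acheive->achieve",
    "accension->accession, ascension",
]
table = parse_misspellings(sample)
print(correct_word("acheive", table))
```

Words that are not in the table pass through unchanged, so a lookup of this kind never alters correctly spelled input.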

This module also normalizes lengthened English expressions that contain repeated letters (e.g., it converts “cooooooooooooooollllllllllllll” to “cool”).

This normalization is based on the following paper: Samuel Brody and Nicholas Diakopoulos. Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs. In EMNLP 2011, pp. 562-570, 2011. http://aclweb.org/anthology//D/D11/D11-1052.pdf
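The general idea of mapping a lengthened word back to a canonical spelling can be sketched as follows: collapse every run of repeated letters to get a condensed key, group known spellings by that key, and pick the most frequent spelling in each group. The helper names and the word counts below are invented toy data for illustration; pytypo's own normalization table may be built differently.

```python
import re
from collections import Counter, defaultdict


def condense(word):
    """Collapse every run of a repeated character to a single character."""
    return re.sub(r"(.)\1+", r"\1", word)


def build_canonical_table(counts):
    """Map each condensed key to the most frequent spelling that shares it."""
    groups = defaultdict(list)
    for word in counts:
        groups[condense(word)].append(word)
    return {key: max(words, key=lambda w: counts[w]) for key, words in groups.items()}


def normalize(word, table):
    """Return the canonical spelling for a word, or the word itself if unknown."""
    return table.get(condense(word), word)


# Toy frequency counts (assumed, not taken from any real corpus).
toy_counts = Counter({"cool": 50, "coool": 3, "cooooll": 1, "col": 2})
table = build_canonical_table(toy_counts)
print(normalize("cooooooooooooooollllllllllllll", table))
```

Because "cool" is the most frequent spelling among those that condense to "col", every lengthened variant in that group maps to "cool".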

Contributions are welcome!

Installation

$ pip install pytypo

Usage

Import pytypo

>>> import pytypo

Correct a sentence

>>> pytypo.correct_sentence('you are coooolll!!!')
you are cool!
  • correct_sentence(str)

Correct a word

>>> pytypo.correct('okayyyyy')
okay
  • correct(str)

Shorten repeated substrings down to a threshold, without using a dictionary

>>> pytypo.cut_repeat('mamisaaaaaan', 1)
mamisan
>>> pytypo.cut_repeat('okayyyyy', 2)
okayy
  • cut_repeat(str, threshould)

    • Note that this method does not use the normalization table for lengthened expressions (e.g., cooll -> cool). To normalize such expressions, use correct() or correct_sentence() instead.
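The dictionary-free shortening described above can be approximated with a single regular expression. This is a minimal sketch, not pytypo's implementation: the name `cut_repeat_sketch` is hypothetical, and it only handles runs of one repeated character, not longer repeated substrings.

```python
import re


def cut_repeat_sketch(text, threshold):
    """Trim runs of one repeated character down to at most `threshold` copies."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    # (.) captures one character and \1{threshold,} matches `threshold` or more
    # further copies, so each match is a run longer than `threshold`.
    pattern = re.compile(r"(.)\1{%d,}" % threshold, re.DOTALL)
    return pattern.sub(lambda m: m.group(1) * threshold, text)


print(cut_repeat_sketch("mamisaaaaaan", 1))
print(cut_repeat_sketch("okayyyyy", 2))
```

With a threshold of 2, legitimate double letters such as the "oo" in "cool" are left alone, which is why a lookup table is still needed to turn "cooll" into "cool".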

License

  • This module is licensed under the MIT License.

CHANGES

0.3 (2017-10-18)

Add many cases from Wikipedia

0.2 (2016-04-15)

Add many cases

0.1 (2016-04-14)

First release.
