A Python library to clean English swear words in strings

better_profanity

A Python library to clean swear words in strings.

Inspired by Ben Friedland's profanity package, this library is much faster than the original because it uses string comparison instead of regex.

Requirements

To make use of Python static typing, this package only works with Python 3.6+.

Unicode characters

A huge thanks to @Derfirm for adding support for Unicode characters.

For release 0.3-beta.0, only Unicode characters from categories Ll, Lu, Mc and Mn are added. More on Unicode categories can be found here.
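
To make the four category codes concrete, the following sketch uses the standard library's unicodedata module to look up the category of a few sample characters. The helper name is_supported_character is hypothetical and only illustrates what the categories mean; how the library itself consumes them is not described here.

```python
import unicodedata

# The four Unicode general categories named in the release note above.
SUPPORTED_CATEGORIES = {"Ll", "Lu", "Mc", "Mn"}


def is_supported_character(ch: str) -> bool:
    """Hypothetical helper: True if ch belongs to one of the four categories."""
    return unicodedata.category(ch) in SUPPORTED_CATEGORIES


if __name__ == "__main__":
    samples = ["a", "Я", "\u0301", "\u093e", "1", "_"]
    for ch in samples:
        # Print the code point, its category and whether it is in the set.
        print("U+%04X" % ord(ch), unicodedata.category(ch), is_supported_character(ch))
```

Ll and Lu are lowercase and uppercase letters; Mn and Mc are non-spacing and spacing combining marks, such as the stress accents in the Russian example further down.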

Usage

By default, on the first .censor() call, profanity initializes a set of words, from profanity_wordlist.txt, to be used to compare against the input texts. This set of words will be stored in memory (~5MB+).
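
The load-on-first-use behaviour described above can be sketched roughly as follows. This is a minimal illustration of lazy initialization, not the library's actual code: the class LazyWordFilter, its methods and the placeholder words are all assumptions made for the example.

```python
from typing import Iterable, Optional, Set


class LazyWordFilter:
    """Hypothetical filter that builds its word set only when first needed."""

    def __init__(self, source_words: Iterable[str]) -> None:
        # Stand-in for the lines of a wordlist file.
        self._source_words = source_words
        self._words: Optional[Set[str]] = None
        self.load_count = 0  # how many times the set has been built

    def _ensure_loaded(self) -> Set[str]:
        # Build the set on the first call; later calls reuse it from memory.
        if self._words is None:
            self._words = {w.strip().lower() for w in self._source_words if w.strip()}
            self.load_count += 1
        return self._words

    def contains(self, token: str) -> bool:
        # Set membership is a plain string comparison, no regex involved.
        return token.lower() in self._ensure_loaded()


if __name__ == "__main__":
    word_filter = LazyWordFilter(["darn\n", "heck\n", "\n"])
    print(word_filter.load_count)        # nothing loaded yet
    print(word_filter.contains("DARN"))  # triggers the one-time load
    print(word_filter.load_count)
```

Keeping the words in a set trades memory for speed: the cost is paid once, and every later lookup is a constant-time membership test.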

1. Censor swear words from a text

By default, profanity replaces each swear word with 4 asterisks (****).

from better_profanity import profanity

if __name__ == "__main__":
    text = "You p1ec3 of sHit."

    censored_text = profanity.censor(text)
    print(censored_text)
    # You **** of ****.

2. Censor doesn't care about word dividers

The .censor() function also hides words that are separated not just by a space but also by other dividers, such as _, , and .. The exceptions are @, $, ^, *, &, \, \.

from better_profanity import profanity

if __name__ == "__main__":
    text = "...shit...hello_cat_fuck,,,,123"

    censored_text = profanity.censor(text)
    print(censored_text)
    # "...****...hello_cat_****,,,,123"
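
One way to picture divider handling is a single pass over the text that collects runs of letters and digits and copies every other character through unchanged. The sketch below is an assumption-laden illustration, not the library's implementation: censor_sketch is a hypothetical function, it treats every non-alphanumeric character as a divider (so it ignores the exception list above), and it uses mild placeholder words.

```python
from typing import List, Set


def censor_sketch(text: str, bad_words: Set[str], censor_char: str = "*") -> str:
    """Hypothetical censor: replace listed words, keep dividers untouched."""
    result: List[str] = []
    current: List[str] = []  # characters of the word currently being collected

    def flush() -> None:
        # Emit the collected word, censored if it is in the list.
        word = "".join(current)
        if word:
            result.append(censor_char * 4 if word.lower() in bad_words else word)
        current.clear()

    for ch in text:
        if ch.isalnum():
            current.append(ch)
        else:
            # Any other character ends the current word and is copied as-is.
            flush()
            result.append(ch)
    flush()
    return "".join(result)


if __name__ == "__main__":
    words = {"darn", "heck"}
    print(censor_sketch("...darn...hello_cat_heck,,,,123", words))
    # ...****...hello_cat_****,,,,123
```

Because dividers are copied through rather than stripped, the punctuation of the original text survives censoring.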

3. Censor swear words with custom character

The second parameter of .censor() sets the replacement character: each swear word is replaced with 4 instances of it.

from better_profanity import profanity

if __name__ == "__main__":
    text = "You p1ec3 of sHit."

    censored_text = profanity.censor(text, '-')
    print(censored_text)
    # You ---- of ----.

4. Check if the string contains any swear words

from better_profanity import profanity

if __name__ == "__main__":
    dirty_text = "That l3sbi4n did a very good H4ndjob."

    print(profanity.contains_profanity(dirty_text))
    # True

5. Censor swear words with a custom wordlist

The provided list of words will replace the default wordlist.

Matched words are replaced with 4 asterisks by default; pass a character as the second parameter of .censor() to use that character instead.

from better_profanity import profanity

if __name__ == "__main__":
    custom_badwords = ['happy', 'jolly', 'merry']
    profanity.load_censor_words(custom_badwords)

    # The default wordlist has been replaced, so this text is left unchanged.
    print(profanity.censor("Fuck you!"))
    # Fuck you!

    print(profanity.censor("Have a merry day! :)"))
    # Have a **** day! :)

6. Censor Unicode characters

from better_profanity import profanity

if __name__ == "__main__":
    bad_text = "Эффекти́вного противоя́дия от я́да фу́гу не существу́ет до сих пор"
    profanity.load_censor_words(["противоя́дия"])

    censored_text = profanity.censor(bad_text)
    print(censored_text)
    # Эффекти́вного **** от я́да фу́гу не существу́ет до сих пор
