A package for censoring profanity in text

censore

A tool for censoring obscene language

About The Project

This tool helps identify and censor profanity with high accuracy. It supports multiple languages and can be used in various scenarios, such as chats, social networks, and more.

⚙️ Installation

pip3 install censore

🚀 Usage

ProfanityFilter

First, initialize the ProfanityFilter class:

from censore import ProfanityFilter
pf = ProfanityFilter()

By default, ProfanityFilter is initialized with all available languages.

If you only need to process text in certain languages, pass them explicitly through the languages parameter. Loading fewer languages greatly speeds up text processing. The example below uses only English (en) and Ukrainian (uk):

from censore import ProfanityFilter
pf = ProfanityFilter(languages=['en', 'uk'])

custom_patterns

You can add custom patterns for offensive words that the built-in lists do not cover:

pf = ProfanityFilter(custom_patterns=["bad"])
text = "This is a very bad text, let's say 'baddest'"
pf.censor(text)
# "This is a very ### text, let's say '#######'"

As you can see, it also censored the word "baddest" because it contains "bad".

But if you don't want it to censor that word, you can use the custom_exclude_patterns parameter:

pf = ProfanityFilter(custom_patterns=["bad"], custom_exclude_patterns=["baddest"])
text = "This is a very bad text, let's say 'baddest'"
pf.censor(text)
# "This is a very ### text, let's say 'baddest'"
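To make the interaction between custom_patterns and custom_exclude_patterns more concrete, here is a minimal standalone sketch of substring matching with an exclusion list. It is not the library's implementation: the function name censor_with_patterns and its parameters are hypothetical, and the sketch assumes matching is case-insensitive and applied word by word.

```python
import re


def censor_with_patterns(text, patterns, exclude=(), symbol="#"):
    """Mask every word containing one of `patterns`, unless the word is in `exclude`.

    Hypothetical helper for illustration only; censore may work differently.
    """
    excluded = {word.lower() for word in exclude}
    lowered_patterns = [pattern.lower() for pattern in patterns]

    def mask(match):
        word = match.group(0)
        lowered = word.lower()
        if lowered in excluded:
            # Explicitly allowed words are returned unchanged.
            return word
        if any(pattern in lowered for pattern in lowered_patterns):
            # Replace the whole word, one symbol per character.
            return symbol * len(word)
        return word

    # \w+ matches runs of letters, digits, and underscores (Unicode-aware for str).
    return re.sub(r"\w+", mask, text)


sample = "This is a very bad text, let's say 'baddest'"
print(censor_with_patterns(sample, ["bad"]))
print(censor_with_patterns(sample, ["bad"], exclude=["baddest"]))
```

The exclusion check runs before the pattern check, which is why "baddest" survives in the second call even though it contains "bad".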

contains_profanity

The contains_profanity method checks if the text contains obscene language:

text = "This is a fucking bad text"
pf.contains_profanity(text)
# True

censor

The censor method replaces obscene language with hashes (#):

text = "This is a fucking bad text"
pf.censor(text)
# 'This is a ####### bad text'
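In a chat or comment pipeline, contains_profanity and censor are often used together: flag the message first, then clean it. The sketch below shows one way to combine them. The function moderate_message is a hypothetical helper, not part of censore; it only assumes that the object passed in exposes the two methods described above.

```python
def moderate_message(message, profanity_filter):
    """Return (text, was_flagged) for a single message.

    `profanity_filter` is any object with contains_profanity(text) and
    censor(text) methods, such as a ProfanityFilter instance.
    This helper is illustrative and not part of the censore package.
    """
    if profanity_filter.contains_profanity(message):
        # Only call censor when something was actually detected.
        return profanity_filter.censor(message), True
    return message, False
```

With the package installed, you would pass a real instance, for example moderate_message(text, pf) using the pf object created earlier. The boolean flag can then drive logging or user warnings.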

partial_censor

You can also partially censor text using the partial_censor option:

pf.censor(text, partial_censor=True)
# 'This is a fu###ng bad text'
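The output above keeps the first two and last two characters of the word and masks the middle. A minimal sketch of that masking rule follows. The function partial_mask and its keep parameter are hypothetical, and the handling of short words (masking them fully) is an assumption, since the README does not say what the library does in that case.

```python
def partial_mask(word, keep=2, symbol="#"):
    """Mask the middle of `word`, leaving `keep` characters visible on each side.

    Illustrative only; censore's actual rule for short words is not documented here.
    """
    if len(word) <= 2 * keep:
        # Assumption: words too short to keep both edges are masked entirely.
        return symbol * len(word)
    hidden = len(word) - 2 * keep
    return word[:keep] + symbol * hidden + word[-keep:]


print(partial_mask("anyword"))  # an###rd
```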

censor_symbol

You can replace the hashes with any symbol, such as a monkey emoji 🙈:

pf.censor(text, censor_symbol="🙈")
# 'This is a 🙈🙈🙈🙈🙈🙈🙈 bad text'

languages

You may have initialized the filter with only English and Ukrainian but later need Polish as well. In that case, pass the languages parameter to censor:

text = "This is a kurwa блять bad text"
pf.censor(text, languages=['en', 'uk', 'pl'])
# 'This is a ##### ##### bad text'

This automatically initialized and loaded Polish and successfully censored everything. However, passing the full list of languages on every call quickly becomes tedious.

To simplify this, use the additional_languages option:

pf.censor(text, additional_languages=['pl'])
# 'This is a ##### ##### bad text'

Polish is now added on top of the initialized languages, so there is no need to pass the full list.
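As a mental model for the difference between languages and additional_languages, the following sketch resolves which languages a single call would check. It is an assumption-based illustration, not the library's code: resolve_languages is a hypothetical function, and it makes no claim about whether censore keeps the added languages for later calls or how it behaves when both options are passed together.

```python
def resolve_languages(initialized, languages=None, additional_languages=None):
    """Return the list of language codes one call would check.

    - `languages` replaces the initialized list for this call.
    - `additional_languages` extends whichever list is in effect.
    Hypothetical helper for illustration only.
    """
    active = list(languages) if languages is not None else list(initialized)
    for code in additional_languages or []:
        if code not in active:
            # Keep order stable and avoid duplicates.
            active.append(code)
    return active


print(resolve_languages(["en", "uk"], additional_languages=["pl"]))  # ['en', 'uk', 'pl']
```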


Other methods

censor_word

This method censors any single word, whether or not it is profane:

pf.censor_word("anyword")
# '#######'

partial_censor

You can also partially censor a single word using the partial_censor option:

pf.censor_word("anyword", partial_censor=True)
# 'an###rd'

🤝 Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

Distributed under the MIT License. See LICENSE for more information.

📨 Contact

Telegram - @okineadev
