A package for censoring profanity in text
About The Project
This tool helps identify and censor profanity with high accuracy. It supports multiple languages and can be used in various scenarios, such as chats, social networks, and more.
⚙️ Installation
pip3 install censore
🚀 Usage
ProfanityFilter
First, initialize the ProfanityFilter class:
from censore import ProfanityFilter
pf = ProfanityFilter()
By default, ProfanityFilter is initialized with all available languages.
If you only need to process text in certain languages, configure the list of languages explicitly; this greatly speeds up text processing:
from censore import ProfanityFilter
pf = ProfanityFilter(languages=['en', 'uk'])
custom_patterns
You can add custom patterns for offensive words that are not already covered:
pf = ProfanityFilter(custom_patterns=["bad"])
text = "This is a very bad text, let's say 'baddest'"
pf.censor(text)
# 'This is a very ### text, let's say '#######''
As you can see, it also censored the word "baddest" because it contains "bad".
If you don't want that word censored, use the custom_exclude_patterns parameter:
pf = ProfanityFilter(custom_patterns=["bad"], custom_exclude_patterns=["baddest"])
text = "This is a very bad text, let's say 'baddest'"
pf.censor(text)
# 'This is a very ### text, let's say 'baddest''
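The interaction between custom patterns and exclusions can be pictured with a small stand-alone sketch. This is not the library's implementation: `censor_by_patterns` is a hypothetical helper, and it assumes that a word is masked when it contains a pattern as a substring, unless the whole word appears in the exclusion list.

```python
import re


def censor_by_patterns(text, patterns, exclude=(), symbol="#"):
    """Mask every word that contains a pattern, unless the word is excluded.

    Hypothetical helper for illustration only; not part of censore.
    """
    excluded = {word.lower() for word in exclude}
    lowered_patterns = [p.lower() for p in patterns]

    def mask(match):
        word = match.group(0)
        lowered = word.lower()
        if lowered in excluded:
            return word  # explicitly allowed, leave untouched
        if any(p in lowered for p in lowered_patterns):
            return symbol * len(word)  # one symbol per character
        return word

    # \w+ matches runs of letters, digits and underscores, so quotes and
    # punctuation around a word are preserved.
    return re.sub(r"\w+", mask, text)


text = "This is a very bad text, let's say 'baddest'"
print(censor_by_patterns(text, ["bad"]))
print(censor_by_patterns(text, ["bad"], exclude=["baddest"]))
```

Under these assumptions the first call masks both "bad" and "baddest", while the second leaves "baddest" intact.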
A filter created with languages=['en', 'uk'], as shown earlier, checks only English (en) and Ukrainian (uk).
contains_profanity
The contains_profanity method checks whether the text contains obscene language:
text = "This is a fucking bad text"
pf.contains_profanity(text)
# True
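A typical use of this check is to gate messages before they are posted. The sketch below assumes only the documented `contains_profanity(text)` method; `accept_message` and `StubFilter` are hypothetical names, and the stub exists solely so the example runs without the package installed. In real code you would pass a ProfanityFilter instance instead.

```python
from typing import Protocol


class SupportsProfanityCheck(Protocol):
    """Anything that exposes a contains_profanity(text) -> bool method."""

    def contains_profanity(self, text: str) -> bool: ...


def accept_message(message: str, pf: SupportsProfanityCheck) -> bool:
    """Return True when the message may be posted as-is."""
    return not pf.contains_profanity(message)


class StubFilter:
    """Hypothetical stand-in for ProfanityFilter, used only for this demo."""

    def __init__(self, words):
        self._words = {w.lower() for w in words}

    def contains_profanity(self, text: str) -> bool:
        # Naive whitespace tokenisation with surrounding punctuation stripped.
        tokens = (token.strip(".,!?'\"").lower() for token in text.split())
        return any(token in self._words for token in tokens)


pf = StubFilter(["darn"])
print(accept_message("Well, darn it", pf))  # False
print(accept_message("Hello there", pf))    # True
```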
censor
The censor method replaces obscene language with hashes (#):
text = "This is a fucking bad text"
pf.censor(text)
# 'This is a ####### bad text'
partial_censor
You can also partially censor text using the partial_censor option:
pf.censor(text, partial_censor=True)
# 'This is a fu###ng bad text'
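The partially censored output above is consistent with keeping the first two and last two characters of a word and masking the rest. The sketch below reproduces that shape; `partially_mask` is a hypothetical helper, and both the fixed `keep` width and the full masking of short words are assumptions, not documented behaviour of the library.

```python
def partially_mask(word: str, symbol: str = "#", keep: int = 2) -> str:
    """Keep `keep` characters at each end of the word and mask the middle.

    Hypothetical helper for illustration only; not part of censore.
    """
    if len(word) <= 2 * keep:
        # Assumption: words too short to have a hidden middle are masked fully.
        return symbol * len(word)
    hidden = len(word) - 2 * keep
    # Slice from an explicit index so that keep=0 also works correctly.
    return word[:keep] + symbol * hidden + word[len(word) - keep:]


print(partially_mask("fucking"))  # fu###ng
print(partially_mask("anyword"))  # an###rd
```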
censor_symbol
You can replace the hashes with any symbol, such as a monkey emoji 🙈:
pf.censor(text, censor_symbol="🙈")
# 'This is a 🙈🙈🙈🙈🙈🙈🙈 bad text'
languages
You may have initialized only English and Ukrainian but later need to check Polish as well. For this, use the languages parameter:
text = "This is a kurwa блять bad text"
pf.censor(text, languages=['en', 'uk', 'pl'])
# 'This is a ##### ##### bad text'
This automatically initialized and loaded Polish and censored everything. However, passing the full list of languages on every call is tedious.
To avoid that, use the additional_languages option:
pf.censor(text, additional_languages=['pl'])
# 'This is a ##### ##### bad text'
This adds Polish to the languages that are already initialized, so you no longer need to pass the full list of languages.
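The difference between the two options can be summarised as "replace the list" versus "extend the list". The following sketch models that rule with a hypothetical `resolve_languages` helper; how the library behaves when both options are passed together is not stated above, so combining them here is an assumption.

```python
from typing import Iterable, List, Optional


def resolve_languages(
    initialized: Iterable[str],
    languages: Optional[Iterable[str]] = None,
    additional_languages: Optional[Iterable[str]] = None,
) -> List[str]:
    """Work out which languages a single call would check.

    Hypothetical helper for illustration only; not part of censore.
    """
    # `languages` replaces the initialized list for this call.
    base = list(languages) if languages is not None else list(initialized)
    # `additional_languages` extends whatever list is in effect.
    for lang in additional_languages or ():
        if lang not in base:
            base.append(lang)
    return base


initialized = ["en", "uk"]
print(resolve_languages(initialized, languages=["en", "uk", "pl"]))  # ['en', 'uk', 'pl']
print(resolve_languages(initialized, additional_languages=["pl"]))   # ['en', 'uk', 'pl']
print(resolve_languages(initialized))                                # ['en', 'uk']
```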
Other methods
censor_word
This method censors any word, whether or not it is profane:
pf.censor_word("anyword")
# '###'
partial_censor
You can also partially censor a word using the partial_censor option:
pf.censor_word("anyword", partial_censor=True)
# 'an###rd'
🤝 Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📝 License
Distributed under the MIT License. See LICENSE for more information.
📨 Contact
Telegram - @okineadev