Fast profanity filtering tool for multiple languages
🤔 why safetext?
Detect. Filter. Protect.
- Effortless Profanity Management: Instantly identify and censor profanity with just one line of code.
- Multilingual Capability: Supports 13 languages, designed for easy expansion.
- Optimized for Content Moderation: Perfect for efficiently moderating and cleaning up text in various applications.
- Automated: Smart language detection for quick setup.
📦 installation
easily install safetext with pip:
pip install safetext
🎯 quickstart
check and censor profanity
>>> from safetext import SafeText
>>> st = SafeText(language='en')
>>> results = st.check_profanity(text='Some text with <profanity-word>.')
>>> results
{'word': '<profanity-word>', 'index': 4, 'start': 15, 'end': 31}
>>> text = st.censor_profanity(text='Some text with <profanity-word>.')
>>> text
"Some text with ***."
using whitelist
exclude specific words from profanity detection:
# Using a list of words
>>> st = SafeText(language='en', whitelist=['word1', 'word2'])
# Using a file (one word per line)
>>> st = SafeText(language='en', whitelist='path/to/whitelist.txt')
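To make the result fields and the whitelist behaviour concrete, here is a minimal sketch of word-list matching. It is not safetext's implementation: `BAD_WORDS`, `load_whitelist`, `find_profanity` and `censor` are hypothetical names, and the sketch assumes `index` is the 1-based word position and `start`/`end` are character offsets, which is consistent with the sample output in the quickstart.

```python
import re
from pathlib import Path

# Hypothetical word list; mild placeholders stand in for real entries.
BAD_WORDS = {"darn", "heck"}


def load_whitelist(whitelist):
    """Accept None, an iterable of words, or a path to a one-word-per-line file."""
    if whitelist is None:
        return set()
    if isinstance(whitelist, (str, Path)):
        # A string is treated as a file path, mirroring the two usages above.
        lines = Path(whitelist).read_text(encoding="utf-8").splitlines()
    else:
        lines = list(whitelist)
    return {line.strip().lower() for line in lines if line.strip()}


def find_profanity(text, bad_words=BAD_WORDS, whitelist=None):
    """Return one dict per match: word, 1-based word index, character span."""
    blocked = {w.lower() for w in bad_words} - load_whitelist(whitelist)
    results = []
    for index, match in enumerate(re.finditer(r"\w+", text), start=1):
        if match.group().lower() in blocked:
            results.append({
                "word": match.group(),
                "index": index,
                "start": match.start(),
                "end": match.end(),
            })
    return results


def censor(text, bad_words=BAD_WORDS, whitelist=None, mask="***"):
    """Replace each match with the mask, working right to left so offsets stay valid."""
    for hit in reversed(find_profanity(text, bad_words, whitelist)):
        text = text[: hit["start"]] + mask + text[hit["end"]:]
    return text
```

With this sketch, `censor("Some text with darn.")` masks the fourth word, while passing `whitelist=["darn"]` leaves the text unchanged. Plain token matching like this misses multi-word phrases and obfuscated spellings, so treat it as an illustration only.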
automated language detection
- from text:
>>> from safetext import SafeText
>>> eng_text = "This story is about to take a dark turn."
>>> st = SafeText(language=None)
>>> st.set_language_from_text(eng_text)
>>> st.language
'en'
- from .srt (subtitle) file:
>>> from safetext import SafeText
>>> turkish_srt_file_path = "turkish.srt"
>>> st = SafeText(language=None)
>>> st.set_language_from_srt(turkish_srt_file_path)
>>> st.language
'tr'
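Detecting a language from a subtitle file presumably means looking at the dialogue rather than the cue numbers and timestamps. The sketch below shows one way to pull the dialogue out of SubRip content; `extract_srt_text` is a hypothetical helper, not part of safetext, and it assumes the standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timestamp layout.

```python
import re

# Matches a SubRip timing line such as "00:00:01,000 --> 00:00:02,500".
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}"
)


def extract_srt_text(srt_content):
    """Join the dialogue lines of an .srt document into one string."""
    dialogue = []
    for line in srt_content.splitlines():
        line = line.strip()
        # Skip blank separators, numeric cue counters, and timing lines.
        # Caveat: a dialogue line made only of digits is dropped as well.
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        dialogue.append(line)
    return " ".join(dialogue)
```

The resulting string could then be passed to `set_language_from_text` as in the first example.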
🌍 supported languages
safetext currently supports profanity detection in 13 languages:
Language | ISO 639-1 Code | Language Name |
---|---|---|
🇸🇦 | ar | Arabic |
🇦🇿 | az | Azerbaijani |
🇩🇪 | de | German |
🇬🇧 | en | English |
🇪🇸 | es | Spanish |
🇮🇷 | fa | Persian (Farsi) |
🇫🇷 | fr | French |
🇮🇳 | hi | Hindi |
🇯🇵 | ja | Japanese |
🇵🇹 | pt | Portuguese |
🇷🇺 | ru | Russian |
🇹🇷 | tr | Turkish |
🇨🇳 | zh | Chinese |
🤝 contribute to safetext
join our mission in refining content moderation!
contribute by:
- adding new languages: create a folder with the ISO 639-1 code and include a words.txt.
- enhancing word lists: improve detection accuracy.
- sharing feedback: your ideas can shape safetext.
see our contributing guidelines for more.
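Before submitting a new language folder, a quick sanity check of the word list can catch simple mistakes. The sketch below is a hypothetical helper, not part of safetext: `check_word_list` assumes `words.txt` holds one entry per line (the README does not spell out the file format) and that the folder name is a two-letter lowercase ISO 639-1 code.

```python
from pathlib import Path


def check_word_list(lang_dir):
    """Return a list of human-readable problems found in <lang_dir>/words.txt."""
    lang_dir = Path(lang_dir)
    problems = []

    # ISO 639-1 codes are two lowercase letters, e.g. "en" or "tr".
    code = lang_dir.name
    if not (len(code) == 2 and code.isalpha() and code.islower()):
        problems.append(f"folder name {code!r} is not a two-letter lowercase code")

    path = lang_dir / "words.txt"
    if not path.is_file():
        problems.append("words.txt is missing")
        return problems

    seen = set()
    lines = path.read_text(encoding="utf-8").splitlines()
    for lineno, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            problems.append(f"line {lineno}: blank line")
        elif entry.lower() in seen:
            problems.append(f"line {lineno}: duplicate entry {entry!r}")
        else:
            seen.add(entry.lower())
    return problems
```

An empty list means the folder passed these basic checks; the contributing guidelines remain the authority on what a submission must contain.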
🏆 contributors
meet our awesome contributors who make safetext better every day!