This is a library for effective moderation of content.

Maintainers

FlacSSy

🚫 BadWords

High-performance profanity filter for Python, Rust, and JavaScript (WebAssembly)
with multilingual support and evasion detection.

Installation • Quick Start • Benchmarks • Supported Languages • Evasion Detection • Documentation

---

📖 Description

BadWords is a sophisticated profanity filtering library designed to clean up user-generated content. Unlike simple keyword matching, it uses similarity scoring, homoglyph detection, and transliteration to catch even the most cleverly disguised insults.

Architecture: The core is implemented in Rust for performance. Python provides a thin API layer with full type hints for IDE/linter support. The Rust library can also be used directly from Rust projects.

📦 Installation

Requirements

Recommended: Python 3.13
Minimum: Python 3.10+

Install via GitHub

pip install git+https://github.com/FlacSy/badwords.git

Install via PyPI

pip install badwords-py

⚡ Quick Start

Basic Initialization

from badwords import ProfanityFilter

# Initialize filter
p = ProfanityFilter()

# Load specific languages (e.g., English and Russian)
p.init(languages=["en", "ru"])

# Or load all supported languages
p.init()

Checking and Filtering Text

text = "Some very b4d text here"

# 1. Simple check (Returns Boolean)
is_bad = p.filter_text(text)
print(is_bad) # True

# 2. Censoring text (Returns String)
clean_text = p.filter_text(text, replace_character="*")
print(clean_text) # "Some very *** text here"
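
Because filter_text returns a boolean without replace_character and a string with it, a caller that wants both the verdict and the censored text has to call it twice. The sketch below wraps that pattern. `TextFilter`, `ModerationResult`, `moderate`, and `FakeFilter` are hypothetical names introduced for illustration; `FakeFilter` only mimics the documented signature so the snippet runs without the library installed.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Union


class TextFilter(Protocol):
    """Anything exposing the filter_text signature documented in this README."""

    def filter_text(
        self,
        text: str,
        match_threshold: float = 1.0,
        replace_character: Optional[str] = None,
    ) -> Union[bool, str]: ...


@dataclass(frozen=True)
class ModerationResult:
    flagged: bool
    censored: str


def moderate(f: TextFilter, text: str, mask: str = "*") -> ModerationResult:
    """Run the boolean check and, only if it fires, the censoring pass."""
    flagged = bool(f.filter_text(text))
    censored = str(f.filter_text(text, replace_character=mask)) if flagged else text
    return ModerationResult(flagged, censored)


class FakeFilter:
    """Stand-in for ProfanityFilter (assumption): flags the literal token 'b4d'."""

    def filter_text(self, text, match_threshold=1.0, replace_character=None):
        if replace_character is None:
            return "b4d" in text
        return text.replace("b4d", replace_character * 3)


result = moderate(FakeFilter(), "Some very b4d text here")
print(result)
```

With the real library, an initialized ProfanityFilter would presumably be passed in place of `FakeFilter()`.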

⏱ Benchmarks

CPU	GPU	RAM	OS
Intel® Core™ i7-10700KF × 16 (x86_64)	NVIDIA GeForce RTX™ 3070	64 GB DDR4 3200 MHz	Ubuntu 24.04.2 LTS

Rule-based matching (en+ru, match_threshold=1.0). Run: make bench

Scenario	Rust (badwords-core)	Python (badwords-py)
Clean text (no match)	~7.6 µs (~130 K/s)	~7.7 µs (~130 K/s)
Bad word (match)	~3.1 µs (~320 K/s)	~2.7 µs (~370 K/s)
Censor (replace)	~2.8 µs (~360 K/s)	~2.5 µs (~400 K/s)
5 texts batch	~15 µs (~330 K/s)	~16 µs (~310 K/s)

The Python package calls the Rust core through PyO3, so the binding overhead is minimal.
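
To relate the two numbers in each cell, throughput is simply items per call divided by latency. The sketch below shows that conversion along with a simple way to time any callable. It is not the project's make bench harness; `median_latency_s` and `throughput_per_s` are hypothetical helper names, and a trivial string operation stands in for p.filter_text.

```python
import time
from statistics import median
from typing import Callable


def median_latency_s(fn: Callable[[], object], iterations: int = 1000) -> float:
    """Median wall-clock seconds per call over `iterations` runs."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def throughput_per_s(latency_s: float, items_per_call: int = 1) -> float:
    """Convert a per-call latency into items processed per second."""
    return items_per_call / latency_s


# Latencies taken from the Rust column of the table above.
clean_text_rate = throughput_per_s(7.6e-6)       # about 131,579/s, i.e. ~130 K/s
batch_rate = throughput_per_s(15e-6, 5)          # about 333,333 texts/s, i.e. ~330 K/s
print(round(clean_text_rate), round(batch_rate))

# Time an arbitrary callable; swap the lambda for a real filter call.
latency = median_latency_s(lambda: "some text".lower(), iterations=200)
print(f"{latency * 1e6:.2f} µs per call")
```

The median is used rather than the mean so that occasional scheduler or GC pauses do not skew microsecond-scale measurements.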

vs glin-profanity

Rule-based mode, en+ru. Run: make bench-compare (requires pip install glin-profanity)

Scenario	BadWords	glin-profanity
Clean text	~7 µs (~140 K/s)	~4.4 ms (~230/s)
Bad word	~1.3 µs (~770 K/s)	~0.2 ms (~5 K/s)
Censor	~1.8 µs (~560 K/s)	~1.4 ms (~700/s)
5 texts batch	~16 µs (~310 K/s)	~10 ms (~500/s)

In these runs BadWords is roughly 150–780× faster (Rust core vs. pure Python).

ML mode

Requires pip install glin-profanity[ml]. Run: make bench-compare. Each scenario runs for 100 iterations.

Scenario	BadWords ML (ONNX)	glin transformer
Clean text (43 chars)	~6.5 ms (~150/s)	~27 ms (~37/s)
Bad word (8 chars)	~4.6 ms (~220/s)	~21 ms (~47/s)
5 texts batch (82 chars)	~24 ms (~210/s)	~107 ms (~47/s)

In these runs BadWords ML (XLM-RoBERTa) is roughly 4–4.6× faster than the glin transformer.

🛠 Methods & API

`filter_text(text, match_threshold=1.0, replace_character=None)`

The core method of the library.

Parameter	Type	Default	Description
`text`	`str`	Required	Input text to check.
`match_threshold`	`float`	`1.0`	Similarity threshold (1.0 = exact match, 0.95 = fuzzy).
`replace_character`	`str/None`	`None`	If provided, returns censored string. If None, returns bool.

Warning (performance tip): Setting match_threshold below 1.0 enables fuzzy matching, which is slower. Use 1.0 for high-traffic real-time filtering, or 0.95 for a good balance.
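
To make the threshold idea concrete, here is a minimal sketch of similarity-based matching built on Python's difflib. This is not the library's algorithm, and its scores are not comparable to BadWords' match_threshold values; `similarity` and `matches` are hypothetical names. It only illustrates why 1.0 can take a cheap equality path while lower thresholds need a costlier comparison.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1]; 1.0 means the strings are identical (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def matches(word: str, banned: str, match_threshold: float = 1.0) -> bool:
    """Exact comparison at 1.0, otherwise a slower similarity comparison."""
    if match_threshold >= 1.0:
        return word.lower() == banned.lower()
    return similarity(word, banned) >= match_threshold


banned = "badword"
for candidate in ("badword", "baddword", "goodword"):
    print(
        candidate,
        round(similarity(candidate, banned), 3),
        matches(candidate, banned, 1.0),
        matches(candidate, banned, 0.9),
    )
```

In this sketch the deliberate typo "baddword" is missed at 1.0 and caught at 0.9, while "goodword" is rejected at both settings.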

🧩 Advanced Evasion Detection

Standard filters are easy to bypass. BadWords is built to detect:

Homoglyphs: Detects look-alike characters, such as hеllo (Cyrillic 'е'), and digit substitutions, such as h4llo.
Transliteration: Automatically handles mapping between Cyrillic and Latin alphabets.
Normalization: Strips diacritics, special characters, and decorative Unicode symbols.
Similarity Analysis: Uses fuzzy matching to find words with deliberate typos.

Examples of detected evasions:

from badwords import ProfanityFilter

_filter = ProfanityFilter()
_filter.init(languages=["en", "ru"])

_filter.filter_text("hеllо")  # Mixed alphabets (Cyrillic + Latin) -> DETECTED
_filter.filter_text("h3ll0")  # Character substitution -> DETECTED
_filter.filter_text("h⍺llo")  # Mathematical/Greek symbols -> DETECTED
_filter.filter_text("привет") # Transliterated matches -> DETECTED
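
The normalization steps listed above can be sketched with the standard library alone. This is an illustration of the general technique, not BadWords' implementation: the `HOMOGLYPHS` and `LEET` tables are tiny hand-picked assumptions, and `normalize` is a hypothetical helper name.

```python
import unicodedata

# Small illustrative maps (assumptions); a production table would be far larger.
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic а
    "\u0435": "e",  # Cyrillic е
    "\u043e": "o",  # Cyrillic о
    "\u0440": "p",  # Cyrillic р
    "\u0441": "c",  # Cyrillic с
    "\u0445": "x",  # Cyrillic х
    "\u03b1": "a",  # Greek α
    "\u237a": "a",  # APL ⍺
}
LEET = {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}


def normalize(text: str) -> str:
    # 1. Decompose accented letters and drop the combining marks (é -> e).
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 2. Map look-alike and leetspeak characters to plain Latin letters.
    mapped = "".join(HOMOGLYPHS.get(ch, LEET.get(ch, ch)) for ch in stripped)
    # 3. Drop separators such as dots or dashes inserted between letters.
    return "".join(ch for ch in mapped if ch.isalnum() or ch.isspace())


for sample in ("h\u0435ll\u043e", "h3ll0", "b.a.d w-o-r-d", "Caf\u00e9"):
    print(repr(sample), "->", repr(normalize(sample)))
```

A mapping this aggressive also rewrites legitimate digits (for example "100" becomes "ioo"), so the normalized form is only suitable for matching against a word list, not for display.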

🌍 Supported Languages

BadWords supports 25 languages out of the box:

Code	Language	Code	Language	Code	Language
`en`	English	`ru`	Russian	`ua`	Ukrainian
`de`	German	`fr`	French	`it`	Italian
`sp`	Spanish	`pl`	Polish	`cz`	Czech
`ja`	Japanese	`ko`	Korean	`th`	Thai
`br`	Portuguese (BR)	`da`	Danish	`du`	Dutch
`fi`	Finnish	`gr`	Greek	`hu`	Hungarian
`in`	Indonesian	`lt`	Lithuanian	`no`	Norwegian
`po`	Portuguese	`ro`	Romanian	`sw`	Swedish
`tu`	Turkish

Call p.get_all_languages() to list the supported codes at runtime. The full list with word counts is at badwords.flacsy.dev.
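
Several codes in the table differ from ISO 639-1 (for example `sp` rather than `es`), and two of them, `sw` and `br`, are ISO codes for other languages (Swahili and Breton). If an application stores ISO-style locale tags, a small translation step may help. The sketch below is an assumption-laden helper, not part of the library: the library codes come from the table above, while the ISO keys and the name `to_badwords_code` are added for illustration.

```python
# ISO 639-1 tags (keys) mapped to the library codes from the table above.
ISO_TO_BADWORDS = {
    "es": "sp",     # Spanish
    "uk": "ua",     # Ukrainian
    "cs": "cz",     # Czech
    "nl": "du",     # Dutch
    "el": "gr",     # Greek
    "id": "in",     # Indonesian
    "sv": "sw",     # Swedish
    "tr": "tu",     # Turkish
    "pt": "po",     # Portuguese
    "pt-br": "br",  # Portuguese (BR)
}

# Codes that are the same in ISO 639-1 and in the table above.
SAME_AS_ISO = {
    "en", "ru", "de", "fr", "it", "pl", "ja", "ko",
    "th", "da", "fi", "hu", "lt", "no", "ro",
}


def to_badwords_code(locale: str) -> str:
    """Translate an ISO-style tag such as 'es', 'pt_BR', or 'en-US'."""
    tag = locale.strip().lower().replace("_", "-")
    base = tag.split("-")[0]
    if tag in ISO_TO_BADWORDS:
        return ISO_TO_BADWORDS[tag]
    if base in ISO_TO_BADWORDS:
        return ISO_TO_BADWORDS[base]
    if base in SAME_AS_ISO:
        return base
    # Includes ISO 'sw' (Swahili) and 'br' (Breton), which must not pass through.
    raise ValueError(f"no BadWords language code for {locale!r}")


print([to_badwords_code(t) for t in ("es", "pt_BR", "pt-PT", "en-US", "uk")])
```

The resulting codes could then be passed to p.init(languages=[...]).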

🚀 Full Integration Example

from badwords import ProfanityFilter

def monitor_chat():
    # Setup for a global chat
    profanity_filter = ProfanityFilter()
    profanity_filter.init(["en", "ru", "de"])
    
    # Custom project-specific banned words
    profanity_filter.add_words(["spam_link_v1", "scam_bot_99"])

    user_input = "Hey! Check out this b.a.d.w.o.r.d"
    
    # Moderate with high accuracy
    is_offensive = profanity_filter.filter_text(user_input, match_threshold=0.95)
    
    if is_offensive:
        print("Message blocked: Contains restricted language.")
    else:
        # Proceed with processing
        pass

if __name__ == "__main__":
    monitor_chat()

🦀 Rust API (badwords-core)

Published on crates.io:

[dependencies]
badwords-core = "2"

use badwords_core::{ProfanityFilter, default_resource_dir};

fn main() {
    let resource_dir = default_resource_dir();
    let mut filter = ProfanityFilter::new(&resource_dir, true, true, true, true);
    filter.init(None).unwrap();
    // Add project-specific words on top of the loaded lists.
    filter.add_words(&["custom".to_string()]);
    // The first tuple element reports whether a match was found.
    let (found, _) = filter.filter_text("hello", 1.0, None);
}

🌐 WebAssembly (JavaScript/TypeScript)

Same Rust code for browser and Node.js, compiled to WASM.

Build

# Browser
make wasm

# Node.js
make wasm-nodejs

Frontend (browser)

<script type="module">
  import init, { ProfanityFilter } from './path/to/badwords_wasm.js';
  await init();
  const filter = new ProfanityFilter();
  console.log(filter.isBad('text'));      // boolean
  console.log(filter.censor('text', '*')); // string
</script>

Backend (Node.js)

const { ProfanityFilter } = require('badwords-wasm');
const filter = new ProfanityFilter();
filter.isBad('hello');           // false
filter.censor('bad word', '*');  // "*** word"
filter.addWords(['custom']);

Optional languages (npm)

The WASM build ships with en and ru built in. Additional languages are available through @badwords/languages:

npm install badwords-wasm @badwords/languages

import init, { ProfanityFilter } from 'badwords-wasm';
import de from '@badwords/languages/de';
import ua from '@badwords/languages/ua';

await init();
const filter = new ProfanityFilter();
filter.addWords(de);
filter.addWords(ua);

Available: br, cz, da, de, du, en, fi, fr, gr, hu, in, it, ja, ko, lt, no, pl, po, ro, ru, sp, sw, th, tu, ua. See @badwords/languages.

Examples: examples/wasm/browser/, examples/wasm/node/

🔧 Building from source

Requires: Rust, Python, maturin

python -m venv .venv && source .venv/bin/activate  # Linux/macOS
pip install maturin
make develop
# or: cd python && maturin build && pip install target/wheels/badwords_py-*.whl

🌐 WebAssembly (browser & Node.js)

Build the WASM package (requires wasm-pack):

cargo install wasm-pack
make wasm

Output: rust/badwords-wasm/pkg/ (npm package badwords-wasm)

Browser: Use the generated JS with a bundler or static server. See examples/wasm/browser/
Node.js: import init, { ProfanityFilter } from 'badwords-wasm' after npm install. See examples/wasm/node/
Publish to npm: make wasm or make wasm-nodejs, then make npm-publish
Optional languages: @badwords/languages — make lang-packages then make npm-publish-languages

📚 Documentation

Full documentation (Python, Rust, JavaScript) with examples and API reference: badwords.flacsy.dev (EN / RU).

🤝 Contributing

Contributions are what make the open-source community an amazing place to learn, inspire, and create.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

Distributed under the MIT License. See LICENSE for more information.

Developed with ❤️ by FlacSy

Release history

2.3.1 (this version): Mar 3, 2026
2.2.0: Mar 2, 2026
2.1.0: Feb 5, 2026

Download files

No source distribution is available for this release.

Built distributions (uploaded Mar 3, 2026; manylinux, glibc 2.17+, x86-64):

badwords_py-2.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (531.6 kB, CPython 3.12)
badwords_py-2.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (531.6 kB, CPython 3.11)
badwords_py-2.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (531.6 kB, CPython 3.10)
badwords_py-2.3.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (531.8 kB, CPython 3.9)
badwords_py-2.3.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (532.0 kB, CPython 3.8)
