Fast, deterministic profanity filter using WebAssembly - works across all languages

Gangajal — Global Profanity Filtering Engine

Secure • Fast • Cross-language • No raw bad words in repo

Gangajal is a deterministic, high-performance profanity filter built on a single WebAssembly core, so it can be called from any language that has a WebAssembly runtime.
It uses Unicode normalization, Bloom filters, and SHA-256 hashing, so the dictionary ships only as hashes and filter bits, never as plaintext words, while lookups stay fast.
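
To make the normalization and hashing step concrete, here is a minimal sketch using only the Python standard library. It is not Gangajal's actual code: the helper names `normalize_token` and `token_digest` are hypothetical, and the case folding step is an assumption (the description above only mentions NFKC). The authoritative rules are in SPEC.md.

```python
import hashlib
import unicodedata


def normalize_token(token: str) -> str:
    """NFKC-normalize a token, then case-fold it (case folding is an assumption)."""
    return unicodedata.normalize("NFKC", token).casefold()


def token_digest(token: str) -> bytes:
    """Return the SHA-256 digest of the normalized token's UTF-8 bytes."""
    return hashlib.sha256(normalize_token(token).encode("utf-8")).digest()


# Full-width letters collapse to their ASCII forms under NFKC, so a
# visually disguised spelling hashes to the same digest as the plain one.
print(token_digest("ｂａｄｗｏｒｄ") == token_digest("BadWord"))  # True
```

Because only digests like these need to be stored, a dictionary built this way never has to contain the words themselves.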

✨ Features

  • Supports all world languages (Unicode NFKC + letters + combining marks)
  • Zero raw profanity words ever in the repository
  • Extremely fast (Bloom filter + binary search)
  • One WASM core — works in Node.js, Python, Go, .NET, Java, Rust, browsers, etc.
  • Deterministic & reproducible
  • Easy admin tools to update the dictionary safely
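
The "Bloom filter + binary search" combination listed above can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: the `BloomFilter` class, its size, the number of hash positions, and the way positions are derived from the digest are all hypothetical choices made for the example.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Tiny Bloom filter; size and hash count are illustrative assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, digest: bytes):
        # Derive bit positions from successive 4-byte slices of the digest.
        for i in range(self.num_hashes):
            chunk = digest[4 * i: 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, digest: bytes) -> None:
        for pos in self._positions(digest):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, digest: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(digest)
        )


def sha256(word: str) -> bytes:
    return hashlib.sha256(word.encode("utf-8")).digest()


def build_index(words):
    """Return a Bloom filter plus a sorted list of digests for exact checks."""
    digests = sorted(sha256(w) for w in words)
    bloom = BloomFilter()
    for d in digests:
        bloom.add(d)
    return bloom, digests


def is_listed(word: str, bloom: BloomFilter, digests) -> bool:
    d = sha256(word)
    if not bloom.might_contain(d):
        return False  # a Bloom filter never gives false negatives
    # Possible hit: confirm with a binary search over the sorted digests.
    i = bisect_left(digests, d)
    return i < len(digests) and digests[i] == d


bloom, digests = build_index(["badword", "worseword"])
print(is_listed("badword", bloom, digests))  # True
print(is_listed("hello", bloom, digests))    # False
```

The Bloom filter rejects most clean words with a few bit tests, and the binary search removes the filter's false positives, which is what keeps the result deterministic.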

🚀 Quick Start

Install

pip install gangajal

Alternative: Install from GitHub release

  1. Download gangajal-python.tar.xz from Releases
  2. Extract it
  3. pip install ./gangajal-python

Usage

from gangajal import validate, reload_assets

# Full mask mode (0): masks the entire word
print(validate("hello badword here", 0))  # "hello ******* here"

# Partial mask mode (1): keeps the first character
print(validate("hello badword here", 1))  # "hello b****** here"

# Reload assets without restarting Python
reload_assets()

### Modes

- `mode=0`: Full mask - replaces the entire word with `*` (e.g., `badword` → `*******`)
- `mode=1`: Partial mask - keeps the first character (e.g., `badword` → `b******`)
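
As an illustration of the two modes, the masking rule for a single matched word can be sketched in plain Python. `mask_word` is a hypothetical helper written for this example, not a function exported by `gangajal`, and it counts length in code points, which is an assumption about how the engine measures words.

```python
def mask_word(word: str, mode: int) -> str:
    """Mask one matched word: mode 0 hides everything, mode 1 keeps the first character."""
    if mode == 0:
        return "*" * len(word)
    if mode == 1:
        return word[:1] + "*" * (len(word) - 1)
    raise ValueError(f"unsupported mode: {mode}")


print(mask_word("badword", 0))  # *******
print(mask_word("badword", 1))  # b******
```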

---

## Architecture

- **Admin Tools** (private) → generate safe binary assets  
- **Binary Assets** (public) → `badwords.bloom` + `badwords.hash.bin`  
- **WASM Core** → same engine everywhere  
- **Language Bindings** → JS, Python, Go, .NET, etc.

Full specification → [SPEC.md](https://github.com/SaugatEDITH/gangajal/blob/master/SPEC.md)
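
One way the split between private admin tools and public binary assets could work is sketched below. The layout shown (sorted, de-duplicated 32-byte SHA-256 digests concatenated back to back) is an assumption made for illustration; the real format of `badwords.hash.bin` is defined in SPEC.md, and `pack_digests` and `contains` are hypothetical names.

```python
import hashlib

DIGEST_SIZE = 32  # SHA-256 digest length in bytes


def pack_digests(words) -> bytes:
    """Admin side: hash each word and concatenate the sorted, unique digests."""
    digests = sorted({hashlib.sha256(w.encode("utf-8")).digest() for w in words})
    return b"".join(digests)


def contains(blob: bytes, word: str) -> bool:
    """Runtime side: binary search over fixed-width records in the packed blob."""
    target = hashlib.sha256(word.encode("utf-8")).digest()
    lo, hi = 0, len(blob) // DIGEST_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        record = blob[mid * DIGEST_SIZE:(mid + 1) * DIGEST_SIZE]
        if record < target:
            lo = mid + 1
        elif record > target:
            hi = mid
        else:
            return True
    return False


# The private word list stays with the admin; only `blob` would be published.
blob = pack_digests(["badword", "worseword", "badword"])
print(len(blob))                  # 64
print(contains(blob, "badword"))  # True
print(contains(blob, "hello"))    # False
```

Fixed-width records let the runtime search the file in place without parsing it, which suits a WASM core that receives the asset as a flat byte buffer.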

---

## License

MIT © Saugat

Download files

Source Distribution

gangajal-0.1.0.tar.gz (2.3 MB)

Uploaded Source

Built Distribution

gangajal-0.1.0-py3-none-any.whl (2.3 MB)

Uploaded Python 3

File details

Details for the file gangajal-0.1.0.tar.gz.

File metadata

  • Download URL: gangajal-0.1.0.tar.gz
  • Upload date:
  • Size: 2.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for gangajal-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3632319be10e4c07dcf8fbb9f1e54b7b576be20d28c9fc22536997224b596782
MD5 02c2f90cbbed141bb66223f23502e44e
BLAKE2b-256 58d48176264d8dc0b1520744fc24038c8902112bed7bd48b39f68f9a2f2e2ee4
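
A downloaded file can be checked against the published SHA256 digest before installing. The sketch below uses only Python's standard `hashlib`; `sha256_of_file` and `verify_download` are illustrative helper names, not part of gangajal or of any PyPI tooling.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA-256 digest with the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (assumes the sdist was saved in the current directory):
# verify_download(
#     "gangajal-0.1.0.tar.gz",
#     "3632319be10e4c07dcf8fbb9f1e54b7b576be20d28c9fc22536997224b596782",
# )
```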

File details

Details for the file gangajal-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: gangajal-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 2.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for gangajal-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 9928534928db326b96e50b844ff63a33710b55e27ce15dcdd6adb91b87a74c98
MD5 50a92651a5b766310b399cc0fb7127fe
BLAKE2b-256 10102cc3fc015a5b42a23d09c635f1445d665ad86c684da6b50c94fd0a7fd79b
