Software that uses a curated open-source keyword list to detect hate.

Red Flagger 🙅🚩

A simple, dependency-free tool for keyword-based abuse flagging. It is designed with a philosophy of sensitivity, favoring recall over precision. It matches documents against a term list whose terms were discovered through lexicons and corpora containing hate speech, chatbot system abuse, toxicity, violence, and/or general profanity/obscenity.

🚨🚨 Important Note on Keyword Detection Systems 🚨🚨

We strongly discourage using a keyword system as the only line of defense for toxicity and/or abuse detection. Problems with keyword detection systems range from performance issues to socially harmful false positives in cases like reclamation. We encourage using this system in conjunction with a review process and/or a stronger classification system such as a deep learning model.

Examples of suggested, more robust use cases:

Keyword detection as part of an ensemble.
Keyword detection -> Deep learning classification on positives (Useful when there's a lot of data and not a lot of compute).
Using detected keywords as BOW features and then training classical models on said features.
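
The second pattern in the list above (keyword detection first, a stronger classifier only on the positives) can be sketched roughly as follows. This is an illustration under stated assumptions, not part of the library: keyword_hits is a hypothetical stand-in for a keyword detector such as rf.detect_abuse, expensive_classifier is a placeholder for a real model, and the term list contains harmless dummy words.

```python
from typing import Callable, Iterable, List, Set

# Placeholder terms for illustration only; the real word list is not reproduced here.
PLACEHOLDER_TERMS: Set[str] = {"badword", "slurword"}


def keyword_hits(document: str, terms: Set[str] = PLACEHOLDER_TERMS) -> List[str]:
    """Hypothetical stand-in for a keyword detector: return the terms found in the document."""
    tokens = [tok.strip('".,!?') for tok in document.lower().split()]
    return [tok for tok in tokens if tok in terms]


def expensive_classifier(document: str) -> bool:
    """Placeholder for a stronger model (for example a deep learning classifier)."""
    # Assumption for this sketch only: text that quotes a term is treated as non-abusive.
    return '"' not in document


def cascade(
    documents: Iterable[str],
    classify: Callable[[str], bool] = expensive_classifier,
) -> List[str]:
    """Run the cheap keyword filter first, then the costly classifier only on keyword positives."""
    flagged = []
    for doc in documents:
        if not keyword_hits(doc):
            continue  # the high-recall filter found nothing, so skip the expensive step
        if classify(doc):
            flagged.append(doc)
    return flagged


docs = [
    "a perfectly fine sentence",
    "you are a badword",
    'the article discussed the term "badword" in context',
]
flagged_docs = cascade(docs)
```

In this arrangement the keyword stage keeps recall high and cheaply discards most clean documents, while the second stage is responsible for precision. A human review queue could take the place of the classifier in the same structure.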

For performance, see evaluation/.

Usage 🔨

For basic usage, all you have to do is:

$ pip install red-flagger

from red_flagger import RedFlagger

rf = RedFlagger()
document = "Something hateful"
hate_words = rf.detect_abuse(document)

There's also a method to get a bag-of-words from the wordlist:

...

hate_bow = rf.get_abuse_vector(document)
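
To make the bag-of-words idea concrete, here is a small sketch of how a count vector over a fixed word list could be built. The exact shape and contents of what get_abuse_vector returns are not documented here, so the abuse_vector helper, the placeholder vocabulary, and the whitespace tokenization below are all assumptions made for illustration.

```python
from collections import Counter
from typing import List

# Placeholder vocabulary; its ordering fixes the meaning of each vector position.
VOCAB: List[str] = ["badword", "slurword", "threatword"]


def abuse_vector(document: str, vocab: List[str] = VOCAB) -> List[int]:
    """Hypothetical helper: count how often each vocabulary term appears in the document."""
    # Lowercase, split on whitespace, and trim common punctuation from each token.
    counts = Counter(tok.strip('".,!?') for tok in document.lower().split())
    # Counter returns 0 for terms that never occur, so every position is filled.
    return [counts[term] for term in vocab]


vector = abuse_vector("Badword, badword and a threatword!")
```

A vector like this can be passed as a feature row to a classical model, which is the third use case suggested earlier.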

The library is also designed to work with word lists that are not built into it. Use the add_words and remove_words methods to manage them.

Directory 📁

abuse_flagger/ contains the package code and the main logic.
data_building/ contains the code and documentation of how the initial dataset was created.
evaluation/ contains the evaluation of the keyword system on the test splits of the corpora used to discover the keywords.

Dataset Obscurity 😶‍🌫️

The word list is obscured as a base16 representation of the list of hate words. We did not feel comfortable exposing this list and we discourage any non-base16 representations of the wordlist being uploaded elsewhere. For details on dataset creation, see README.md in data_building/.
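
For readers unfamiliar with base16, the sketch below shows the round trip using Python's standard base64 module. Whether the project encodes each term separately or the list as a whole, and how the encoded file is laid out, is not stated here, so the per-term encoding and the helper names encode_terms and decode_terms are assumptions. The terms are harmless placeholders.

```python
import base64
from typing import List


def encode_terms(terms: List[str]) -> List[str]:
    """Encode each term as an uppercase base16 (hex) string."""
    return [base64.b16encode(term.encode("utf-8")).decode("ascii") for term in terms]


def decode_terms(encoded: List[str]) -> List[str]:
    """Reverse encode_terms: turn base16 strings back into plain text."""
    return [base64.b16decode(entry).decode("utf-8") for entry in encoded]


encoded = encode_terms(["example", "placeholder"])
decoded = decode_terms(encoded)
# encoded[0] is "6578616D706C65"
```

Base16 is an encoding, not encryption: anyone can reverse it. It only keeps the terms from being readable or searchable at a glance.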

Other Resources 📚

We acknowledge that there are other great resources and link some of them below:

Contributions 🤝

We welcome contributions to both the word list and the software using a fork-and-pull model. For additions to the word list, please ensure there are no duplicates or overlaps and that the additional data is obscured in base16. For any new RedFlagger features, open an issue for discussion before opening a pull request.
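
Before submitting word list additions, a contributor might check for duplicates with something like the sketch below. It is not a project tool: the function names are hypothetical, it assumes one base16 string per term, and it only catches exact duplicates after lowercasing and trimming whitespace, not looser kinds of overlap.

```python
import base64
from typing import List, Set


def to_b16(term: str) -> str:
    """Encode a plain-text term as uppercase base16."""
    return base64.b16encode(term.encode("utf-8")).decode("ascii")


def new_unique_terms(existing_encoded: List[str], candidate_encoded: List[str]) -> List[str]:
    """Return candidate entries (still base16) that are not already present after normalization."""

    def normalize(entry: str) -> str:
        # casefold=True also accepts lowercase hex digits in the input.
        return base64.b16decode(entry, casefold=True).decode("utf-8").strip().lower()

    seen: Set[str] = {normalize(entry) for entry in existing_encoded}
    fresh = []
    for entry in candidate_encoded:
        key = normalize(entry)
        if key not in seen:
            seen.add(key)  # also guards against duplicates within the candidates
            fresh.append(entry)
    return fresh


existing = [to_b16("example")]
candidates = [to_b16("Example"), to_b16("placeholder"), to_b16("placeholder")]
fresh = new_unique_terms(existing, candidates)
```

Only the first "placeholder" entry survives here: "Example" matches an existing term once case is ignored, and the repeated candidate is dropped.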

Release history

0.1.2 (Apr 25, 2025)
0.1.0 (Apr 25, 2025, this version)

Download files

Source Distribution

rfwc-0.1.0.tar.gz (34.1 kB)

Uploaded Apr 25, 2025 Source

Built Distribution

rfwc-0.1.0-py3-none-any.whl (33.4 kB)

Uploaded Apr 25, 2025 Python 3

File details

Details for the file rfwc-0.1.0.tar.gz.

File metadata

Download URL: rfwc-0.1.0.tar.gz
Upload date: Apr 25, 2025
Size: 34.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for rfwc-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e149c35424f64f7a2c2c539d0ff19d671ad760d45ed4f85c1fdf94bcbab70856`
MD5	`e70097edff48cb0f5a9e3cb814b88df9`
BLAKE2b-256	`590218331571688b5f4bf496f44490203740c12445f944cc620a12a923b2c646`

File details

Details for the file rfwc-0.1.0-py3-none-any.whl.

File metadata

Download URL: rfwc-0.1.0-py3-none-any.whl
Upload date: Apr 25, 2025
Size: 33.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for rfwc-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2ccdcea20e171901f7e013740300bd18475f1f98ebe3f99bce7102513708a04b`
MD5	`11216afc100e814026fc3c0c9aa52d59`
BLAKE2b-256	`e5e1b3764ded1b16307048c87bb7b359078498f3adac0952232c0b7dbff5dd6c`
