A library to detect undesired, unbranded, or harmful content
unwanted_content_detector
Usage
In Python:
pip install unwanted-content-detector
Minimal
from unwanted_content_detector import Detector
detector = Detector(models=['hatefult_content_generic_distil_bert_finetuned'])
if detector.is_unwanted('content generated by llm'):
    print("Won't continue")
With spark
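is_unwanted_udf = udf(lambda text: bool(detector.is_unwanted(text)), 'boolean')  # needs: from pyspark.sql.functions import udf
spark_df = spark_df.withColumn('is_rejected', is_unwanted_udf('text'))  # 'text' is an assumed column name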
In the terminal
./cli.py inference infer 'text to be validated'
Training
Fine-tuning
from unwanted_content_detector import Detector
model = Detector({'data_source': df}).train()
In the terminal
./cli.py train
Target Architecture / Features
- multiple swappable models
- multiple evaluation datasets
- the option to configure a custom dataset for fine-tuning
- a single performance evaluation criterion
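The swappable-model and single-criterion goals above could look roughly like the following. This is a minimal sketch, not the library's implementation: it assumes only that each model exposes is_unwanted(text) as in the usage section. KeywordDetector, f1_score, and evaluate are hypothetical names, the toy dataset is invented, and F1 is just one possible choice of single criterion.

```python
from typing import Iterable, Protocol


class UnwantedDetector(Protocol):
    """Anything exposing is_unwanted(text) -> bool, like Detector in the usage section."""

    def is_unwanted(self, text: str) -> bool: ...


class KeywordDetector:
    """Hypothetical stand-in model so the sketch runs without any model weights."""

    def __init__(self, blocked: Iterable[str]) -> None:
        self.blocked = {word.lower() for word in blocked}

    def is_unwanted(self, text: str) -> bool:
        return any(word in self.blocked for word in text.lower().split())


def f1_score(predicted: list[bool], expected: list[bool]) -> float:
    """F1 on the 'unwanted' class; returns 0.0 when there are no true positives."""
    tp = sum(p and e for p, e in zip(predicted, expected))
    fp = sum(p and not e for p, e in zip(predicted, expected))
    fn = sum(e and not p for p, e in zip(predicted, expected))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def evaluate(models: dict[str, UnwantedDetector],
             dataset: list[tuple[str, bool]]) -> dict[str, float]:
    """Score every model on the same labelled dataset with the same metric."""
    texts = [text for text, _ in dataset]
    labels = [label for _, label in dataset]
    return {
        name: f1_score([model.is_unwanted(t) for t in texts], labels)
        for name, model in models.items()
    }


# Toy evaluation set of (text, is_unwanted) pairs, invented for illustration.
dataset = [
    ("you are an idiot", True),
    ("have a nice day", False),
    ("what a stupid idea", True),
    ("thanks for the help", False),
]
models = {
    "narrow": KeywordDetector(["idiot"]),
    "broad": KeywordDetector(["idiot", "stupid"]),
}
scores = evaluate(models, dataset)
best = max(scores, key=scores.get)  # model name with the highest score
```

Because every model is scored by the same function on the same data, swapping a model in or out only changes one entry in the models dictionary.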
Use cases it could be applied to
- detecting the generation of harmful content from LLMs
- preventing harmful prompts from being injected into LLMs
- validating that generated content follows the brand guidelines
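The first two use cases amount to checking text on both sides of an LLM call. Here is a rough sketch under stated assumptions: the detector only needs the is_unwanted(text) method shown in the usage section. GuardResult, guarded_generate, StubDetector, and echo_llm are hypothetical names that stand in for a real model and a real LLM client.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GuardResult:
    status: str           # "ok", "prompt_rejected", or "response_rejected"
    text: Optional[str]   # the LLM response, only set when status == "ok"


def guarded_generate(prompt: str,
                     generate: Callable[[str], str],
                     detector) -> GuardResult:
    """Check the prompt before the LLM call and the response after it."""
    if detector.is_unwanted(prompt):
        # Unwanted prompt: it never reaches the LLM.
        return GuardResult("prompt_rejected", None)
    response = generate(prompt)
    if detector.is_unwanted(response):
        # Unwanted generation: it is withheld from the caller.
        return GuardResult("response_rejected", None)
    return GuardResult("ok", response)


class StubDetector:
    """Hypothetical stand-in; in practice pass a Detector instance instead."""

    def is_unwanted(self, text: str) -> bool:
        return "forbidden" in text.lower()


def echo_llm(prompt: str) -> str:
    """Hypothetical LLM client that just echoes its prompt."""
    return f"You said: {prompt}"


detector = StubDetector()
clean = guarded_generate("hello", echo_llm, detector)
blocked = guarded_generate("a FORBIDDEN request", echo_llm, detector)
leaky = guarded_generate("hello", lambda p: "something forbidden", detector)
```

Returning a status instead of raising lets the caller decide what to do with a rejection, for example retrying the generation or showing a fallback message.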
Liability
This tool aims to help you detect harmful content, but it is not meant to be the sole decision maker.
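One way to keep a person in the loop is to queue flagged items for review instead of discarding them automatically. A minimal sketch, assuming only the is_unwanted(text) method from the usage section. The names triage and StubDetector are hypothetical.

```python
from typing import Iterable


def triage(texts: Iterable[str], detector) -> tuple[list[str], list[str]]:
    """Split texts into (accepted, needs_review); nothing is dropped outright."""
    accepted: list[str] = []
    needs_review: list[str] = []
    for text in texts:
        if detector.is_unwanted(text):
            needs_review.append(text)  # a human makes the final call
        else:
            accepted.append(text)
    return accepted, needs_review


class StubDetector:
    """Hypothetical stand-in for a Detector instance."""

    def is_unwanted(self, text: str) -> bool:
        return "hate" in text.lower()


accepted, needs_review = triage(
    ["I hate Mondays", "Lovely weather"], StubDetector()
)
```

The stub flags "I hate Mondays", a harmless sentence, which is the kind of false positive a human reviewer can catch.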