A library to detect undesired, off-brand, or harmful content

Unwanted Content Detector

Usage

Install the package with pip:

pip install unwanted-content-detector

In Python, with pandas:

import pandas as pd
from unwanted_content_detector import Detector

# Create the detector once and reuse it for every row.
detector = Detector()
df = pd.DataFrame({"text": [
    "this is hate speech",
    "We should all do our part to protect the environment.",
    "Everyone has the right to love who they want.",
]})

# Flag each text and store the result in a new column.
df["is_unwanted"] = df["text"].apply(detector.is_unwanted)

In the terminal, run the command without arguments to get the manual:

unwanted_detector

Models

Model name            Size
distilbert-finetuned  3 GB

Training

unwanted_detector train

Target Architecture / Features

  • multiple swappable models
  • multiple evaluation datasets
  • fine-tuning on a custom, user-supplied dataset
  • a single performance evaluation criterion
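
As a rough illustration of the swappable-model and single-criterion goals, the sketch below scores any model that exposes is_unwanted with one metric. It is not the library's actual design: UnwantedModel, KeywordModel, and f1_score are hypothetical names, and F1 is only an assumed choice for the single criterion.

```python
from typing import Iterable, Protocol


class UnwantedModel(Protocol):
    """Hypothetical interface: anything with is_unwanted can be swapped in."""

    def is_unwanted(self, text: str) -> bool: ...


class KeywordModel:
    """Toy stand-in model for illustration only (not a shipped model)."""

    def __init__(self, keywords: Iterable[str]) -> None:
        self.keywords = [k.lower() for k in keywords]

    def is_unwanted(self, text: str) -> bool:
        lowered = text.lower()
        return any(k in lowered for k in self.keywords)


def f1_score(model: UnwantedModel, dataset: list[tuple[str, bool]]) -> float:
    """Score a model on (text, label) pairs with one metric (F1, assumed)."""
    tp = fp = fn = 0
    for text, label in dataset:
        pred = model.is_unwanted(text)
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1
        elif not pred and label:
            fn += 1
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because f1_score depends only on the is_unwanted method, the Detector from the usage example could in principle be passed in place of KeywordModel, and the same labeled dataset could be reused to compare models.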

Use cases it could be applied to

  • detecting harmful content generated by LLMs
  • preventing harmful prompts from being injected into LLMs
  • validating that generated content follows brand guidelines
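
The first two use cases can be combined into a guard around an LLM call, sketched below under stated assumptions. guarded_generate, REFUSAL, and the generate callable are hypothetical names introduced for illustration; the only thing taken from this README is the idea of a function that maps a text to True or False, as detector.is_unwanted does.

```python
from typing import Callable

# Hypothetical placeholder returned whenever something is flagged.
REFUSAL = "[blocked: flagged as unwanted content]"


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    is_unwanted: Callable[[str], bool],
) -> str:
    """Check the prompt before the LLM call and the completion after it."""
    if is_unwanted(prompt):
        return REFUSAL  # stop the prompt before it reaches the model
    completion = generate(prompt)
    if is_unwanted(completion):
        return REFUSAL  # stop harmful or off-brand output
    return completion
```

In practice, is_unwanted would be something like Detector().is_unwanted and generate would wrap whichever LLM client is in use.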

Liability

This tool aims to help you detect harmful content, but it is not meant to be the sole final decision maker.
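
One way to keep a person in the loop is to treat a positive result as a request for review, not as a verdict. The sketch below is only an illustration of that idea: TriageResult and triage are hypothetical names, and is_unwanted stands for any text-to-bool function such as detector.is_unwanted.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TriageResult:
    """Texts split into those that passed and those awaiting human review."""

    passed: list[str] = field(default_factory=list)
    needs_review: list[str] = field(default_factory=list)


def triage(texts: list[str], is_unwanted: Callable[[str], bool]) -> TriageResult:
    """Route flagged texts to a review list without rejecting them outright."""
    result = TriageResult()
    for text in texts:
        if is_unwanted(text):
            result.needs_review.append(text)  # a human makes the final call
        else:
            result.passed.append(text)
    return result
```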

Download files

Source Distribution

unwanted_content_detector-0.1.1.tar.gz (4.0 kB)

Built Distribution

unwanted_content_detector-0.1.1-py3-none-any.whl (6.4 kB)
