full set of words that trigger toxicity

Toxic Triggers

Toxic Triggers is a Python package that provides access to a dataset of text triggers that have been categorized as offensive or harmless to minorities. This package can be useful for researchers and developers who want to understand how offensive language can affect marginalized groups and how it can be detected and prevented.

Features

  • Access to a dataset of text triggers categorized as offensive or harmless to minorities
  • Provides easy access to subsets of the dataset based on categorization
  • Built with Pandas for easy data manipulation
  • Analyze the text for the presence of toxic triggers
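
To make the last feature concrete, here is a minimal sketch of lexicon-based trigger detection. It is not the package's implementation: the `LEXICON` entries are neutral placeholders rather than real dataset content, and `find_triggers` is a hypothetical helper that assumes simple whole-token matching.

```python
import re
from typing import Dict, List

# Hypothetical placeholder lexicon (NOT the package's real data):
# maps each trigger term to its category label.
LEXICON: Dict[str, str] = {
    "placeholder_offensive": "offensive",
    "placeholder_harmless": "harmless",
}


def find_triggers(text: str, lexicon: Dict[str, str]) -> Dict[str, List[str]]:
    """Group the lexicon terms found in `text` by their category."""
    # Lowercase and split into word tokens so matching is case-insensitive.
    tokens = set(re.findall(r"\w+", text.lower()))
    found: Dict[str, List[str]] = {}
    for term, category in lexicon.items():
        if term in tokens:
            found.setdefault(category, []).append(term)
    return found


print(find_triggers("A tweet with placeholder_harmless in it.", LEXICON))
```

Whole-token matching avoids flagging a term that merely appears inside a longer word, but it misses multi-word triggers. The package may handle these cases differently.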

Installation

Install the package with pip:

pip install toxicTrig

Usage

Here's an example of how to use the package:

import time

import pandas as pd  # type: ignore
from toxicTrig import toxicTrig

# Load the example data; the "tweet" column holds the input strings.
df = pd.read_csv("example_data/ND_founta_trn_dial_pAPI.csv")
text = df["tweet"].tolist()  # A list of strings as the input

# Time the analysis of the full list.
tt = toxicTrig()
start_time = time.perf_counter()
tt.text_analysis(text, batch_size=100)
end_time = time.perf_counter()

time_taken = round(end_time - start_time, 2)
print("Total time taken:", time_taken, "seconds")

# Count whitespace-separated tokens in the text
total_tokens = sum(len(t.split()) for t in text)
print("Total tokens in text:", total_tokens)
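
The timing and token count above can be combined into a throughput figure. The sketch below is an illustration only: `batches` and `tokens_per_second` are hypothetical helpers, not part of toxicTrig, and the slicing shown is just one common way a `batch_size` parameter is interpreted. The package's internals may differ.

```python
from typing import Iterator, List


def batches(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def tokens_per_second(texts: List[str], seconds: float) -> float:
    """Whitespace-token throughput for a run that took `seconds`."""
    total_tokens = sum(len(t.split()) for t in texts)
    return total_tokens / seconds if seconds > 0 else float("inf")


sample = ["a b c", "d e", "f"]
print(list(batches(sample, batch_size=2)))
print(tokens_per_second(sample, seconds=2.0))
```

Reporting tokens per second rather than raw seconds makes runs on datasets of different sizes easier to compare.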

Contribution

Install the dev extras

pip install -e ".[dev]"
mypy --install-types --non-interactive toxicTrig
pip install pre-commit
pre-commit install

New branch for each feature

Create a branch with git checkout -b feature/feature-name, then open a pull request against the main branch.

Before committing

Run pytest to make sure all tests pass (this also exercises the runtime type checks enforced by beartype) and mypy --strict . to check static typing. You can also run pre-commit run --all-files to run all checks.

Citation

If you use this package in your research, please cite the following paper:

@inproceedings{zhou-etal-2020-debiasing,
  title = {Challenges in Automated Debiasing for Toxic Language Detection},
  author = {Zhou, Xuhui and Sap, Maarten and Swayamdipta, Swabha and Choi, Yejin and Smith, Noah A.},
  booktitle = {EACL},
  year = {2021},
}

License

This package is released under the MIT License. See the LICENSE file for more details.

Credits

This package was developed by Xuhui Zhou and Maarten Sap.
