Pollutes documents with terms biased towards specific genres
Project description
Document Polluter
Overview
Document Polluter replaces gendered words in documents to create test data for machine learning models in order to identify bias.
Check out the examples in the interactive notebook.
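The intended workflow is to run the same model over each polluted variant of a document and compare the outputs; any divergence suggests the model keys on gendered terms. A minimal sketch of that comparison, where `score` is a stand-in for a real model (here a deliberately biased toy scorer, not part of this library):

```python
# Compare a model's output across gendered variants of the same document.
# `score` stands in for any real model; this toy version is deliberately
# biased (it rewards the word 'he') so the comparison below flags it.
def score(document):
    return 1.0 if 'he' in document.split() else 0.5

# Paired female/male variants of the same underlying documents.
female_docs = ['she shouted', 'my daughter']
male_docs = ['he shouted', 'my son']

for f_doc, m_doc in zip(female_docs, male_docs):
    diff = score(m_doc) - score(f_doc)
    if diff != 0:
        print(f'possible bias: {f_doc!r} vs {m_doc!r} (delta={diff})')
```

A real test would substitute a trained classifier or scoring function for `score`; the pairing and comparison logic stays the same.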
Installation
document-polluter
is available on PyPI
http://pypi.python.org/pypi/document-polluter
Install via pip
$ pip install document-polluter
Install via easy_install
$ easy_install document-polluter
Install from repo
Git repo: https://github.com/gregology/document-polluter
$ git clone --recursive git://github.com/gregology/document-polluter.git
$ cd document-polluter
$ python setup.py install
Basic usage
>>> from document_polluter import DocumentPolluter
>>> documents = ['she shouted', 'my son', 'the parent']
>>> dp = DocumentPolluter(documents=documents, genre='gender')
>>> print(dp.polluted_documents['female'])
['she shouted', 'my daughter', 'the mother']
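Under the hood, pollution amounts to swapping genre-specific terms via a lookup table. A self-contained sketch of that idea (the word pairs here are illustrative, not the library's actual word lists):

```python
# Illustrative female->male word pairs (not the library's actual data).
GENDER_PAIRS = {
    'she': 'he', 'her': 'his', 'daughter': 'son', 'mother': 'father',
}

def pollute(documents, target='female'):
    """Replace gendered words so each document uses the target genre's terms."""
    # Build a lookup from every known word to its target-genre equivalent.
    if target == 'female':
        lookup = {male: female for female, male in GENDER_PAIRS.items()}
    else:
        lookup = dict(GENDER_PAIRS)
    polluted = []
    for doc in documents:
        words = [lookup.get(word, word) for word in doc.split()]
        polluted.append(' '.join(words))
    return polluted

print(pollute(['she shouted', 'my son'], target='female'))
# ['she shouted', 'my daughter']
```

The real library works on whole word lists per genre and handles more cases; this sketch only shows the substitution mechanism.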
Running tests
$ python document_polluter/tests.py
Project details
Download files
Source Distribution
File details
Details for the file document-polluter-0.0.7.tar.gz.
File metadata
- Download URL: document-polluter-0.0.7.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.8.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 04bd027e7916937c58e840dacd6ceaa38f715230c18bc879577e4f8dc2ac6e2b
MD5 | 842dce27de6f5efeea935d76d65b80e5
BLAKE2b-256 | cfe49e6ae2dfe1465ac49d6d856967598af29a6aa973be633a88b4ae0ffa7e0b