A package for quantifying bias in Danish language models.
The GenDa Lens:
Quantifying Gender Bias in Danish language models
Thea Rolskov Sloth & Astrid Sletten Rybner
A Python package for investigating gender bias in Danish language models within the following domains:

- Language Modeling (for pre-trained models)
- Coreference Resolution (for coreference models)
- Named Entity Recognition (for NER models)
Whether you want to test a pre-trained model, a coreference model, or a NER model, you can read more about each of these three types of tests in the User Guide.
There you will also find a section on the definitions of harm, gender, and bias that we adopt in the GenDa Lens package.
🔎 Documentation
| Documentation | |
|---|---|
| 📚 User Guide | Instructions on how to understand the implemented gender bias tests |
| 💡 Definitions | Definitions of harm, bias, and gender applied in GenDa Lens |
| 💻 API References | The detailed reference for the GenDa Lens API, including function documentation |
| 🧐 About | Learn more about how this project came about and who is behind the implemented frameworks |
🤗 Integration
Note that for NER and Language Modeling, the GenDa Lens evaluator is integrated with Hugging Face.
🔧 Installation
You can install GenDa Lens via pip from PyPI:
```
pip install genda_lens
```
👩💻 Usage
You can test your model by instantiating an Evaluator and running the appropriate evaluation function:

```python
from genda_lens import Evaluator

# initiate evaluator
ev = Evaluator(model_name="huggingface-modelname")

# run test
output = ev.evaluate_ner(n=20)

# retrieve output
simple_output = output[0]
detailed_output = output[1]
```
Subsequently, the output can be visualized using the Visualizer:

```python
from genda_lens import Visualizer

# initiate visualizer
viz = Visualizer()

# visualize ner results
plot = viz.visualize_results(data=detailed_output, framework="ner", model_name="my-model-name")
```
Acknowledgements
This project uses code from three existing frameworks for quantifying gender bias in Danish. While all code written by others is properly attributed at the top of the relevant scripts in the repository, we would also like to acknowledge the authors of the work we draw on here:
- The original ABC framework: González, A. V., Barrett, M., Hvingelby, R., Webster, K., & Søgaard, A. (2020). Type B reflexivization as an unambiguous testbed for multilingual multi-task gender bias.
- The original augmented DaNe framework: Lassen, I. M., Almasi, M., Enevoldsen, K., & Kristensen-McLachlan, R. (2023, May). Detecting intersectionality in NER models: A data-driven approach.
- The original WinoBias framework: Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2018). Gender bias in coreference resolution: Evaluation and debiasing methods.
- The Danish translation of the WinoBias framework, DaWinoBias: Signe Kirk and Kiri Koppelgaard.