Template component for Leolani
cltl-dialogueclassification
Description
Detects dialogue acts in texts and annotates the signals with the dialogue act labels and scores. The annotations are pushed to the event bus and can be taken up for further processing.
We implemented three dialogue act classifiers:
- DeBERTa fine-tuned with the SILICONE data set:
Based on: https://huggingface.co/diwank/silicone-deberta-pair
- RoBERTa fine-tuned with the MIDAS data set:
Based on: https://github.com/DianDYu/MIDAS_dialog_act
- XLM-RoBERTa fine-tuned with the MIDAS data set:
Based on: https://github.com/DianDYu/MIDAS_dialog_act
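To illustrate the idea of annotating a text with dialogue act labels and scores, here is a minimal sketch. It does not use this package's API: `DialogueActAnnotation`, `annotate` and `toy_classifier` are hypothetical names, and the toy classifier is a punctuation heuristic with made-up labels and scores that stands in for one of the fine-tuned models listed above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DialogueActAnnotation:
    """Hypothetical annotation record: classifier name, label and score."""
    classifier: str
    label: str
    score: float


def annotate(text: str,
             classify: Callable[[str], List[Tuple[str, float]]],
             classifier_name: str,
             threshold: float = 0.0) -> List[DialogueActAnnotation]:
    """Run a classifier on a text and wrap its (label, score) pairs as annotations."""
    return [DialogueActAnnotation(classifier_name, label, score)
            for label, score in classify(text)
            if score >= threshold]


def toy_classifier(text: str) -> List[Tuple[str, float]]:
    """Stand-in for a fine-tuned model (assumption): labels and scores are invented."""
    if text.strip().endswith("?"):
        return [("question", 0.9), ("statement", 0.1)]
    return [("statement", 0.8), ("question", 0.2)]


# Keep only labels that reach the score threshold.
annotations = annotate("Do you like cats?", toy_classifier, "toy", threshold=0.5)
for annotation in annotations:
    print(annotation)
```

Passing the classifier as a callable keeps the annotation step independent of the model, so any of the three classifiers could in principle be plugged in behind the same interface.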
Getting started
Prerequisites
This repository uses Python >= 3.9.
Be sure to run it in a Python virtual environment (e.g. conda, venv, mkvirtualenv).
Installation
- In the root directory of this repo, run:
pip install -e .
- Download the fine-tuned RoBERTa model from:
https://vu.data.surfsara.nl/index.php/s/xLou1DPl739Lbq6
and put the file "classifier.pt" in the directory:
resources/midas-da-roberta
Alternatively, download the XLM-RoBERTa model from:
https://vu.data.surfsara.nl/index.php/s/dw0YCJAVFM870DT
and put the file "pytorchmodel.bin" in the directory:
resources/midas-da-xlmroberta
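A quick way to confirm that the downloaded files ended up in the expected place is sketched below. The paths are taken from the installation steps above; the helper name `find_models` is hypothetical and not part of this package.

```python
from pathlib import Path
from typing import Dict

# Paths as given in the installation steps, relative to the repository root.
EXPECTED_MODEL_FILES = {
    "midas-da-roberta": Path("resources/midas-da-roberta/classifier.pt"),
    "midas-da-xlmroberta": Path("resources/midas-da-xlmroberta/pytorchmodel.bin"),
}


def find_models(repo_root: Path = Path(".")) -> Dict[str, bool]:
    """Report for each model whether its file exists under repo_root."""
    return {name: (repo_root / path).is_file()
            for name, path in EXPECTED_MODEL_FILES.items()}


# Run from the repository root; only one of the two models needs to be present.
for name, present in find_models().items():
    print(f"{name}: {'found' if present else 'missing'}")
```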
Usage
To apply the classifier to EMISSOR conversations:
```bash
python3 examples/annotato_emissor_conversation_with_emotions.py --emissor "../data/emissor"
```
To use this repository as a package in a different project and in a different virtual environment, you can
- install a published version from PyPI:
pip install cltl.dialogue_act_classification
- or, for the latest snapshot, run:
pip install git+https://github.com/leolani/cltl-dialogueclassification.git@main
Then you can import it in a Python script as:
import cltl.dialogue_act_classification
To test the classifier run:
PYTHONPATH=src python -m unittest
References:
- Chapuis, Emile, Pierre Colombo, Matteo Manica, Matthieu Labeau, and Chloe Clavel. "Hierarchical pre-training for sequence labelling in spoken dialog." arXiv preprint arXiv:2009.11152 (2020).
- Yu, Dian, and Zhou Yu. "Midas: A dialog act annotation scheme for open domain human machine spoken conversations." arXiv preprint arXiv:1908.10023 (2019).
Integration in the Leolani event-bus
The classifier can be integrated into the event bus to generate annotations in EMISSOR through the included service.py. In the configuration file of the event bus, the input and output topics need to be specified, as well as the emotion detectors.
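As an illustration of what specifying input and output topics could look like, the sketch below parses a configuration fragment with Python's standard `configparser`. The section name, key names and topic names are assumptions made for this example and may differ from those the actual service expects.

```python
from configparser import ConfigParser
from typing import Tuple

# Hypothetical configuration fragment; section, keys and topic names
# are illustrative assumptions, not the names used by the real service.
EXAMPLE_CONFIG = """
[cltl.dialogue_act_classification.events]
topic_input = cltl.topic.text_in
topic_output = cltl.topic.dialogue_act
"""

SECTION = "cltl.dialogue_act_classification.events"


def read_topics(config_text: str, section: str) -> Tuple[str, str]:
    """Return the (input, output) topic names configured in the given section."""
    parser = ConfigParser()
    parser.read_string(config_text)
    return parser.get(section, "topic_input"), parser.get(section, "topic_output")


topic_in, topic_out = read_topics(EXAMPLE_CONFIG, SECTION)
print(f"{topic_in} -> {topic_out}")
```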