Template component for Leolani

cltl-dialogueclassification

Description

Detects dialogue acts in texts and annotates the signals with the dialogue act labels and scores. The annotations are pushed to the event bus and can be taken up for further processing.
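As a rough illustration of what "labels and scores" can look like, the sketch below turns raw classifier scores into ranked dialogue act annotations. The `DialogueActAnnotation` class, the `annotate` function, the label names, and the input scores are all hypothetical and made up for the example; they are not taken from this package.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogueActAnnotation:
    """Hypothetical container for one dialogue act prediction."""
    label: str
    score: float


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def annotate(logits, labels, top_k=2):
    """Return the top_k labels with their probabilities, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [DialogueActAnnotation(label, score) for label, score in ranked[:top_k]]


# Illustrative label names and logits only; a real classifier supplies both.
labels = ["statement", "open_question", "thanking"]
annotations = annotate([0.2, 2.1, -1.0], labels)
for annotation in annotations:
    print(f"{annotation.label}: {annotation.score:.3f}")
```

Keeping more than one ranked label per text lets downstream components decide for themselves how much confidence they require.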

We implemented three dialogue act classifiers:

  1. DeBERTa fine-tuned with the SILICONE data set:

Based on: https://huggingface.co/diwank/silicone-deberta-pair

  2. RoBERTa fine-tuned with the MIDAS data set:

Based on: https://github.com/DianDYu/MIDAS_dialog_act

  3. XLM-RoBERTa fine-tuned with the MIDAS data set:

Based on: https://github.com/DianDYu/MIDAS_dialog_act

Getting started

Prerequisites

This repository uses Python >= 3.9

Be sure to run in a Python virtual environment (e.g. conda, venv, or mkvirtualenv).

Installation

  1. In the root directory of this repo run

    pip install -e .
    
  2. Download the fine-tuned RoBERTa model from:

https://vu.data.surfsara.nl/index.php/s/xLou1DPl739Lbq6

and put the file "classifier.pt" in the directory:

resources/midas-da-roberta

Alternatively, download the XLM-RoBERTa model from:

https://vu.data.surfsara.nl/index.php/s/dw0YCJAVFM870DT

and put the "pytorchmodel.bin" in the directory:

resources/midas-da-xlmroberta
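Before running a classifier, it can help to check that the downloaded files ended up in the expected places. The following is a minimal sketch that uses the directories and file names from the steps above; the `available_models` helper and the `EXPECTED_MODEL_FILES` mapping are hypothetical and not part of this package.

```python
from pathlib import Path

# Paths taken from the installation steps, relative to the repository root.
EXPECTED_MODEL_FILES = {
    "midas-da-roberta": Path("resources/midas-da-roberta/classifier.pt"),
    "midas-da-xlmroberta": Path("resources/midas-da-xlmroberta/pytorchmodel.bin"),
}


def available_models(repo_root="."):
    """Return the names of the models whose weight file exists under repo_root."""
    root = Path(repo_root)
    return [name for name, rel_path in EXPECTED_MODEL_FILES.items()
            if (root / rel_path).is_file()]


if __name__ == "__main__":
    found = available_models()
    if found:
        print("Found model files for:", ", ".join(found))
    else:
        print("No model files found; download one of the models first.")
```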

Usage

To apply this to emissor conversations:

```bash
    python3 examples/annotato_emissor_conversation_with_emotions.py --emissor "../data/emissor"
```

To use this repository as a package in a different project and in a different virtual environment, you may

  • install a published version from PyPI:

    pip install cltl.dialogueclassification
    
  • or, for the latest snapshot, run:

    pip install git+https://github.com/leolani/cltl-dialogueclassification.git@main
    

Then you can import it in a Python script as:

    import cltl.dialogue_act_classification

To test the classifier, run:

    PYTHONPATH=src python -m unittest

References:

  • Chapuis, Emile, Pierre Colombo, Matteo Manica, Matthieu Labeau, and Chloe Clavel. "Hierarchical pre-training for sequence labelling in spoken dialog." arXiv preprint arXiv:2009.11152 (2020).
  • Yu, Dian, and Zhou Yu. "Midas: A dialog act annotation scheme for open domain human machine spoken conversations." arXiv preprint arXiv:1908.10023 (2019).

Integration in the Leolani event-bus

The component can be integrated into the event bus to generate annotations in EMISSOR through the included service.py. In the configuration file of the event bus, the input and output topics need to be specified, as well as the emotion detectors.
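The role of the input and output topics can be sketched with a toy publish/subscribe loop. This is only an illustration under stated assumptions: `InMemoryEventBus`, `start_service`, the placeholder `classify` function, and the topic names `text.in` and `dialogue_act.out` are hypothetical and do not reflect the actual Leolani event bus API or the included service.py.

```python
from collections import defaultdict


class InMemoryEventBus:
    """Hypothetical stand-in for an event bus: topic name -> list of handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in list(self._handlers[topic]):
            handler(payload)


def classify(text):
    """Placeholder classifier; a real one would call a fine-tuned model."""
    label = "question" if text.strip().endswith("?") else "statement"
    return {"label": label, "score": 1.0}


def start_service(bus, topic_in, topic_out):
    """Annotate every text signal from topic_in and publish it on topic_out."""
    def on_text(text):
        bus.publish(topic_out, {"text": text, "dialogue_act": classify(text)})
    bus.subscribe(topic_in, on_text)


# Hypothetical topic names; the real ones come from the event-bus configuration.
bus = InMemoryEventBus()
received = []
bus.subscribe("dialogue_act.out", received.append)
start_service(bus, "text.in", "dialogue_act.out")
bus.publish("text.in", "Where are you from?")
print(received)
```

Because the service only knows the two topic names, other components can consume the annotations without depending on the classifier directly.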
