Joint Disfluency Detection and Constituency Parsing

This package provides a joint disfluency detection and constituency parsing model for transcribed speech. It is based on "Neural Constituency Parsing of Speech Transcripts" (NAACL 2019), with additional changes (e.g. self-training and ensembling) described in "Improving Disfluency Detection by Self-Training a Self-Attentive Model" (ACL 2020).

This repository adapts the original repository to focus on inference with the pretrained swbd_fisher_bert_Edev.0.9078.pt model.

Installation

$ pip install disfluency-constituency-parser

Usage

Download the pretrained model and the BERT files:

$ wget https://github.com/pariajm/joint-disfluency-detector-and-parser/releases/download/naacl2019/swbd_fisher_bert_Edev.0.9078.pt
$ wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
$ wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz

Then load the model and parse a sentence in Python:

from dc_parser import DC_Model

# Point each path at the corresponding downloaded file.
model = DC_Model(
    model_path="/path/to/swbd_fisher_bert_Edev.0.9078.pt",
    bert_model_path="/path/to/bert-base-uncased.tar.gz",
    bert_vocab_path="/path/to/bert-base-uncased-vocab.txt",
)

# parse() is called with a list of sentences.
model.parse(["Today is a very good day!"])
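The return format of model.parse is not documented here. As a minimal sketch, assume each parse is available as a Penn Treebank-style bracketed string in which disfluent spans sit under EDITED nodes (the Switchboard convention). Under that assumption, the fluent words can be recovered by skipping those subtrees. The helper name fluent_words and the sample tree are hypothetical and not part of this package.

```python
from typing import FrozenSet, List


def fluent_words(tree: str, drop_labels: FrozenSet[str] = frozenset({"EDITED"})) -> List[str]:
    """Return the words of a bracketed parse, skipping subtrees whose label is in drop_labels.

    Hypothetical helper: assumes a PTB-style bracketed string, e.g. "(S (NP (PRP I)) ...)".
    """
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    words: List[str] = []
    dropped_stack: List[bool] = []  # one flag per open bracket: True if inside a dropped subtree
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            # The token after "(" is the node label, unless the node is unlabeled.
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            label = "" if nxt in ("(", ")") else nxt
            inside = (bool(dropped_stack) and dropped_stack[-1]) or label in drop_labels
            dropped_stack.append(inside)
            i += 2 if label else 1
        elif tok == ")":
            dropped_stack.pop()
            i += 1
        else:
            # A leaf word: keep it only when no enclosing node is dropped.
            if not (dropped_stack and dropped_stack[-1]):
                words.append(tok)
            i += 1
    return words


# Hand-written sample tree (not actual model output).
sample = "(S (EDITED (NP (PRP I))) (NP (PRP I)) (VP (VBP like) (NP (NN tea))))"
print(" ".join(fluent_words(sample)))
```

Passing a larger set, such as frozenset({"EDITED", "INTJ"}), would also drop interjections, if the trees use that label.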

Citation

If you use this model, please cite the following papers:

@inproceedings{jamshid-lou-2019-neural,
    title = "Neural Constituency Parsing of Speech Transcripts",
    author = "Jamshid Lou, Paria and Wang, Yufei and Johnson, Mark",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = "June",
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-1282",
    doi = "10.18653/v1/N19-1282",
    pages = "2756--2765"
}
@inproceedings{jamshid-lou-2020-improving,
    title = "Improving Disfluency Detection by Self-Training a Self-Attentive Model",
    author = "Jamshid Lou, Paria and Johnson, Mark",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = "July",
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.346",
    pages = "3754--3763"
}
