Joint Disfluency Detection and Constituency Parsing

A joint disfluency detection and constituency parsing model for transcribed speech, based on "Neural Constituency Parsing of Speech Transcripts" (NAACL 2019), with additional changes (e.g. self-training and ensembling) as described in "Improving Disfluency Detection by Self-Training a Self-Attentive Model" (ACL 2020).

This repository adapts the original repository to focus on inference with the pretrained swbd_fisher_bert_Edev.0.9078.pt model.

Installation

$ pip install disfluency-constituency-parser

Usage

Download the pretrained model and the BERT files:

$ wget https://github.com/pariajm/joint-disfluency-detector-and-parser/releases/download/naacl2019/swbd_fisher_bert_Edev.0.9078.pt
$ wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
$ wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz

Then load the model in Python and parse a list of sentences:

from dc_parser import DC_Model

# Point each path at the corresponding file downloaded above.
model = DC_Model(
    model_path="/path/to/swbd_fisher_bert_Edev.0.9078.pt",
    bert_model_path="/path/to/bert-base-uncased.tar.gz",
    bert_vocab_path="/path/to/bert-base-uncased-vocab.txt",
)
model.parse(["Today is a very good day!"])
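Because the model needs three separately downloaded files, it can help to check that they are all present before constructing it. The following is a minimal sketch, assuming all three files were saved into a single directory. The helper names `missing_model_files` and `load_model` are hypothetical and not part of the package; only the `DC_Model` constructor arguments come from the usage shown above.

```python
from pathlib import Path

# File names taken from the download commands in the Usage section.
REQUIRED_FILES = (
    "swbd_fisher_bert_Edev.0.9078.pt",
    "bert-base-uncased.tar.gz",
    "bert-base-uncased-vocab.txt",
)


def missing_model_files(model_dir):
    """Return the required file names absent from model_dir (hypothetical helper)."""
    base = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


def load_model(model_dir):
    """Build a DC_Model from files in model_dir, failing early if any are missing."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(f"Missing in {model_dir}: {', '.join(missing)}")
    # Imported lazily so the file check above works without the package installed.
    from dc_parser import DC_Model

    base = Path(model_dir)
    return DC_Model(
        model_path=str(base / REQUIRED_FILES[0]),
        bert_model_path=str(base / REQUIRED_FILES[1]),
        bert_vocab_path=str(base / REQUIRED_FILES[2]),
    )
```

With this in place, `load_model("/path/to/models")` either returns a model or reports exactly which downloads are missing, rather than failing somewhere inside model loading.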

Citation

If you use this model, please cite the following papers:

@inproceedings{jamshid-lou-2019-neural,
    title = "Neural Constituency Parsing of Speech Transcripts",
    author = "Jamshid Lou, Paria and Wang, Yufei and Johnson, Mark",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = "June",
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-1282",
    doi = "10.18653/v1/N19-1282",
    pages = "2756--2765"
}
@inproceedings{jamshid-lou-2020-improving,
    title = "Improving Disfluency Detection by Self-Training a Self-Attentive Model",
    author = "Jamshid Lou, Paria and Johnson, Mark",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = "July",
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.346",
    pages = "3754--3763"
}
