Joint Disfluency Detection and Constituency Parsing

A joint disfluency detection and constituency parsing model for transcribed speech, based on "Neural Constituency Parsing of Speech Transcripts" (NAACL 2019), with additional changes (e.g. self-training and ensembling) described in "Improving Disfluency Detection by Self-Training a Self-Attentive Model" (ACL 2020).

This repository adapts the original repository to focus on inference with the pretrained swbd_fisher_bert_Edev.0.9078.pt model.

Installation

$ pip install disfluency-constituency-parser

Usage

Download the pretrained model and the BERT files:

$ wget https://github.com/pariajm/joint-disfluency-detector-and-parser/releases/download/naacl2019/swbd_fisher_bert_Edev.0.9078.pt
$ wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
$ wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz

Then load the model and parse a list of sentences in Python:

from dc_parser import DC_Model

# Point each argument at the corresponding downloaded file.
model = DC_Model(
    model_path="/path/to/swbd_fisher_bert_Edev.0.9078.pt",
    bert_model_path="/path/to/bert-base-uncased.tar.gz",
    bert_vocab_path="/path/to/bert-base-uncased-vocab.txt",
)
# parse() is called with a list of sentences.
model.parse(["Today is a very good day!"])
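Since DC_Model needs three separate files, it can help to check that all of them are present before loading. The following is a minimal sketch, assuming the three files from the wget commands above were saved into a single directory under their original names. The helpers resolve_model_files and load_parser are hypothetical and not part of dc_parser; only DC_Model and its three keyword arguments come from the usage shown above.

```python
from pathlib import Path

# File names taken from the wget commands above; the keys match the
# DC_Model keyword arguments shown in the usage example.
REQUIRED_FILES = {
    "model_path": "swbd_fisher_bert_Edev.0.9078.pt",
    "bert_model_path": "bert-base-uncased.tar.gz",
    "bert_vocab_path": "bert-base-uncased-vocab.txt",
}


def resolve_model_files(download_dir):
    """Return DC_Model keyword arguments, or raise if a file is missing.

    Hypothetical helper for illustration; not part of dc_parser.
    """
    base = Path(download_dir)
    paths = {key: base / name for key, name in REQUIRED_FILES.items()}
    missing = [str(p) for p in paths.values() if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing files: " + ", ".join(missing))
    return {key: str(p) for key, p in paths.items()}


def load_parser(download_dir):
    """Build a DC_Model from the files found in download_dir."""
    kwargs = resolve_model_files(download_dir)
    # Imported lazily so the path check can run without the package installed.
    from dc_parser import DC_Model
    return DC_Model(**kwargs)
```

With this in place, load_parser("/path/to/downloads") would fail early with a list of the missing files instead of failing partway through model loading.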

Citation

If you use this model, please cite the following papers:

@inproceedings{jamshid-lou-2019-neural,
    title = "Neural Constituency Parsing of Speech Transcripts",
    author = "Jamshid Lou, Paria and Wang, Yufei and Johnson, Mark",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = "jun",
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-1282",
    doi = "10.18653/v1/N19-1282",
    pages = "2756--2765"
}
@inproceedings{jamshid-lou-2020-improving,
    title = "Improving Disfluency Detection by Self-Training a Self-Attentive Model",
    author = "Jamshid Lou, Paria and Johnson, Mark",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = "jul",
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.346",
    pages = "3754--3763"
}
