A pip-installable version of jcvazquezc's DisVoice library
Project description
DisVoice
DisVoice is a Python framework designed to compute features from speech files. DisVoice computes glottal, phonation, articulation, prosody, and phonological-based features from both sustained vowels and continuous speech utterances, with the aim of recognizing paralinguistic aspects of speech. The features can be used in classifiers to recognize emotions or the communication capabilities of patients with different speech disorders, including diseases with functional origin such as larynx cancer or nodules; craniofacial disorders such as hypernasality caused by cleft lip and palate; or neurodegenerative disorders such as Parkinson's or Huntington's disease.
For additional information, please see the glottal, phonation, articulation, prosody, and phonological directories.
Install
To install the requirements, please run
install.sh
Kaldi output requires Kaldi to be installed beforehand.
Reference
If you use DisVoice for research purposes, please cite the following papers, depending on the features you use:
glottal features
[1] Belalcázar-Bolaños, E. A., Orozco-Arroyave, J. R., Vargas-Bonilla, J. F., Haderlein, T., & Nöth, E. (2016, September). Glottal Flow Patterns Analyses for Parkinson’s Disease Detection: Acoustic and Nonlinear Approaches. In International Conference on Text, Speech, and Dialogue (pp. 400-407). Springer.
phonation features
[1] T. Arias-Vergara, J. C. Vásquez-Correa, J. R. Orozco-Arroyave. "Parkinson's Disease and Aging: Analysis of Their Effect in Phonation and Articulation of Speech." Cognitive Computation (2017).
[2] Vásquez-Correa, J. C., et al. "Towards an automatic evaluation of the dysarthria level of patients with Parkinson's disease." Journal of Communication Disorders 76 (2018): 21-36.
articulation features
[1] Vásquez-Correa, J. C., et al. "Towards an automatic evaluation of the dysarthria level of patients with Parkinson's disease." Journal of Communication Disorders 76 (2018): 21-36.
[2] J. R. Orozco-Arroyave, J. C. Vásquez-Correa, et al. "NeuroSpeech: An open-source software for Parkinson's speech analysis." Digital Signal Processing (2017).
prosody features
[1] N. Dehak, P. Dumouchel, and P. Kenny. "Modeling prosodic features with joint factor analysis for speaker verification." IEEE Transactions on Audio, Speech, and Language Processing 15.7 (2007): 2095-2103.
[2] Vásquez-Correa, J. C., et al. "Towards an automatic evaluation of the dysarthria level of patients with Parkinson's disease." Journal of Communication Disorders 76 (2018): 21-36.
phonological features
[1] Vásquez-Correa, J. C., Klumpp, P., Orozco-Arroyave, J. R., & Nöth, E. (2019). Phonet: a Tool Based on Gated Recurrent Neural Networks to Extract Phonological Posteriors from Speech. Proc. Interspeech 2019, 549-553.
License
MIT
Download files
File details
Details for the file disvoice-lurein-0.0.1.tar.gz.
File metadata
- Download URL: disvoice-lurein-0.0.1.tar.gz
- Upload date:
- Size: 35.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.54.0 CPython/3.8.6
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | cb0e8a063b105bb40f5ed8c743e400470ad279ead67ee78cb6d0b625e3df4309 |
MD5 | c77a71243d4c0a699f4c60d51e219561 |
BLAKE2b-256 | eabd9b7a0abd7ca1b08ff00a7f43b1e1ffa1d06e6125ac546df629f56403fa8c |
File details
Details for the file disvoice_lurein-0.0.1-py3-none-any.whl.
File metadata
- Download URL: disvoice_lurein-0.0.1-py3-none-any.whl
- Upload date:
- Size: 39.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.54.0 CPython/3.8.6
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | fb756bdf39e5e6ff5f7a4ff5c0ebfbcf1be33fcde43ac844db6532a1a121217b |
MD5 | 49f24e44b193d4b8588f4863e4b71b31 |
BLAKE2b-256 | 4baa8e5f8788bff4dd37044354a84a1d20d5385cfd373dae7468b1f863fab49c |
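The SHA256 digests listed for the two files can be used to check a downloaded copy before installing it. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify_download` are hypothetical and not part of DisVoice, and the sketch assumes the downloaded file keeps its published file name.

```python
import hashlib
import sys
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables on this page.
EXPECTED_SHA256 = {
    "disvoice-lurein-0.0.1.tar.gz":
        "cb0e8a063b105bb40f5ed8c743e400470ad279ead67ee78cb6d0b625e3df4309",
    "disvoice_lurein-0.0.1-py3-none-any.whl":
        "fb756bdf39e5e6ff5f7a4ff5c0ebfbcf1be33fcde43ac844db6532a1a121217b",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's SHA256 matches the digest listed above.

    Hypothetical helper: looks the digest up by file name and raises
    KeyError if the name is not one of the two published files.
    """
    path = Path(path)
    expected = EXPECTED_SHA256[path.name]
    return sha256_of(path) == expected


if __name__ == "__main__":
    # Check every file path given on the command line.
    for name in sys.argv[1:]:
        status = "OK" if verify_download(name) else "MISMATCH"
        print(f"{name}: {status}")
```

A mismatch means the local file differs from the one published here and should not be installed.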