Toolkit to standardize voice disorder diagnostic labels
Project description
DiVR (Disordered Voice Recognition) - Diagnosis
This repository contains work used to standardize diagnostic labels to the different classification systems found across the literature, including a new classification system constructed at USVAC.
Installation
pip install divr-diagnosis
How to use
from divr_diagnosis import diagnosis_maps
# Select a diagnosis map (options in `divr_diagnosis/diagnosis_maps` directory)
diagnosis_map = diagnosis_maps.USVAC_2025()
# get a specific diagnosis
diagnosis = diagnosis_map.get("laryngeal_tuberculosis")
# dictionary syntax is also supported
diagnosis = diagnosis_map["laryngeal_tuberculosis"]
assert "laryngeal_tuberculosis" in diagnosis_map
# fetching by aliases is also supported
assert "laryngeal tuberculosis" in diagnosis_map
# check if diagnosis is of a type
assert diagnosis.satisfies("pathological")
assert not diagnosis.satisfies("normal")
# get diagnosis parents
assert diagnosis.at_level(3) == "organic_inflammatory_infective"
assert diagnosis.at_level(2) == "organic_inflammatory"
assert diagnosis.at_level(1) == "organic"
assert diagnosis.at_level(0) == "pathological"
assert diagnosis.root == "pathological"
# check if a diagnosis was not classified
diag_inc = diagnosis_map.get("internal_weakness")
assert diag_inc.incompletely_classified
# compare diagnoses by consensus: here laryngeal_tuberculosis has more consensus than intubation_granuloma
diag_1 = diagnosis_map.get("intubation_granuloma")
diag_2 = diagnosis_map.get("laryngeal_tuberculosis")
assert diag_1 < diag_2
# To map a diagnosis to a single parent, the classification with the most votes is used
diag_dissensus = diagnosis_map.get("intubation_granuloma")
assert diag_dissensus.best_parent_link.parent.name == "organic_trauma_internal"
# Get all possible parents of a class, with their vote percentage
diag_dissensus = diagnosis_map.get("intubation_granuloma")
expected_parents = [
    "organic_inflammatory_non_infective",
    "organic_structural_structural_abnormality",
    "organic_trauma_internal",
]
expected_votes = [0.29, 0.29, 0.43]
for parent_link in diag_dissensus.parents:
    assert parent_link.parent.name in expected_parents
    assert parent_link.weight in expected_votes
# Get all votes of different clinicians
diag_dissensus = diagnosis_map.get("intubation_granuloma")
assert diag_dissensus.votes["clinician 1"] == "organic > trauma > internal"
assert diag_dissensus.votes["clinician 2"] == "organic > trauma > internal"
assert diag_dissensus.votes["clinician 3"] == "organic > trauma > internal"
assert diag_dissensus.votes["clinician 4"] == "organic > inflammatory > non_infective"
assert diag_dissensus.votes["clinician 5"] == "organic > inflammatory > non_infective"
assert diag_dissensus.votes["clinician 6"] == "organic > structural > structural_abnormality"
assert diag_dissensus.votes["clinician 7"] == "organic > structural > structural_abnormality"
# List all pathologies under a parent
expected_diags = [
    "arytenoid_dislocation",
    "laryngeal_trauma",
    "laryngeal_trauma_blunt",
    "organic_trauma_external",
]
for diag in diagnosis_map.find(name="organic_trauma_external"):
    assert diag.name in expected_diags
How was this created
Databases used
AVFAD [1] , MEEI [2], SVD [3], Torgo [4], UASpeech [5], Uncommon Voice [6], Voiced [7]
USVAC 2025
Classification labels from all the databases mentioned above were extracted, translated to English where needed (e.g. German to English for SVD), and de-duplicated. These labels were then presented to 7 clinicians (2 otorhinolaryngologists, 5 speech pathologists) at the University of Sydney Voice Activity Clinic (USVAC), who classified them into a classification system [8] in Qualtrics. The votes were then exported from Qualtrics as a spreadsheet, which was processed into a classification map.
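The vote aggregation can be sketched as follows. This is an illustrative Python snippet, not the actual processing pipeline: it assumes each vote is recorded as a path in the hierarchy and that a parent's weight is simply its share of the votes, consistent with the 0.29/0.29/0.43 weights shown in the usage example above.

```python
from collections import Counter

# The seven clinicians' votes for "intubation_granuloma", as shown in
# the usage example above (each vote is a path in the hierarchy)
votes = [
    "organic > trauma > internal",
    "organic > trauma > internal",
    "organic > trauma > internal",
    "organic > inflammatory > non_infective",
    "organic > inflammatory > non_infective",
    "organic > structural > structural_abnormality",
    "organic > structural > structural_abnormality",
]

# Each candidate parent's weight is its fraction of the total votes
weights = {parent: round(n / len(votes), 2) for parent, n in Counter(votes).items()}

assert weights["organic > trauma > internal"] == 0.43
assert weights["organic > inflammatory > non_infective"] == 0.29
assert weights["organic > structural > structural_abnormality"] == 0.29

# The "best parent" is the one with the maximum vote share
best_parent = max(weights, key=weights.get)
assert best_parent == "organic > trauma > internal"
```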
Other systems
Research on multi-class classification systems was identified as part of a scoping review [9]. While much of that research used a subset of the data available to it, or classified data by exact diagnostic labels, some studies grouped certain classes together. We extracted the classification systems used by these papers and implemented them in this module. After each classification system was initially implemented, we went through all the classes available in the databases listed above and ensured that each was allocated to one of the classes in the system. Since we cannot assume how the clinicians behind the original research would have mapped the remaining labels, only labels that closely matched an existing label were assigned as such; all other labels were marked as unclassified.
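As an illustration of that matching rule, the sketch below uses hypothetical `normalize` and `assign` helpers (not part of this module's API): a label is assigned only when it matches an existing label after trivial normalization, and everything else is left unclassified.

```python
from typing import Optional

def normalize(label: str) -> str:
    # Hypothetical normalization: lowercase and unify separators, so that
    # e.g. "Laryngeal Tuberculosis" matches "laryngeal_tuberculosis"
    return label.strip().lower().replace(" ", "_").replace("-", "_")

def assign(label: str, system: dict) -> Optional[str]:
    # Assign only on a close (here: exact-after-normalization) match;
    # anything else stays unclassified (None)
    return system.get(normalize(label))

# A single entry of a hypothetical classification system
system = {"laryngeal_tuberculosis": "organic_inflammatory_infective"}

assert assign("Laryngeal Tuberculosis", system) == "organic_inflammatory_infective"
assert assign("internal weakness", system) is None  # left unclassified
```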
How to cite
Coming soon
References
[1] L. M. T. Jesus, I. Belo, J. Machado, and A. Hall, “The Advanced Voice Function Assessment Databases (AVFAD): Tools for Voice Clinicians and Speech Research,” in Advances in Speech-language Pathology, F. D. M. Fernandes, Ed., InTech, 2017. doi: 10.5772/intechopen.69643.
[2] Massachusetts Eye and Ear Infirmary, “Voice disorders database, version 1.03 (CD-ROM).” Lincoln Park, NJ: Kay Elemetrics Corporation.
[3] B. Woldert-Jokisz, “Saarbruecken voice database.” 2007.
[4] F. Rudzicz, A. K. Namasivayam, and T. Wolff, “The TORGO database of acoustic and articulatory speech from speakers with dysarthria,” Lang Resources & Evaluation, vol. 46, no. 4, pp. 523–541, Dec. 2012, doi: 10.1007/s10579-011-9145-0.
[5] H. K. Kim et al., “UASpeech.” IEEE DataPort. doi: 10.21227/F9TC-AB45.
[6] M. Moore, P. Papreja, M. Saxon, V. Berisha, and S. Panchanathan, “UncommonVoice: A Crowdsourced Dataset of Dysphonic Speech,” in Interspeech 2020, ISCA, Oct. 2020, pp. 2532–2536. doi: 10.21437/Interspeech.2020-3093.
[7] U. Cesari, G. De Pietro, E. Marciano, C. Niri, G. Sannino, and L. Verde, “A new database of healthy and pathological voices,” Computers & Electrical Engineering, vol. 68, pp. 310–321, May 2018, doi: 10.1016/j.compeleceng.2018.04.008.
[8] C. L. Payten, G. Chiapello, K. A. Weir, and C. J. Madill, “Frameworks, Terminology and Definitions Used for the Classification of Voice Disorders: A Scoping Review,” J Voice, 2022, doi: 10.1016/j.jvoice.2022.02.009.
[9] R. Gupta, D. R. Gunjawate, D. D. Nguyen, C. Jin, and C. Madill, “Voice disorder recognition using machine learning: a scoping review protocol,” BMJ Open, vol. 14, no. 2, 2024, doi: 10.1136/bmjopen-2023-076998.
File details
Details for the file divr_diagnosis-0.1.2.tar.gz.
File metadata
- Download URL: divr_diagnosis-0.1.2.tar.gz
- Upload date:
- Size: 29.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1082864a2441f0a64da7b9512f254ecadf34cbc094a6e1b6c702542f501c035f |
| MD5 | 347889b1dd42c152fc508634899909cd |
| BLAKE2b-256 | d3f10d2dd1fecdacf485f2a6782840ded3cc7f9b49fdf41408c59ff60c6e3177 |
Provenance
The following attestation bundles were made for divr_diagnosis-0.1.2.tar.gz:
Publisher: release.yml on ComputationalAudioResearchLab/divr-diagnosis
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: divr_diagnosis-0.1.2.tar.gz
- Subject digest: 1082864a2441f0a64da7b9512f254ecadf34cbc094a6e1b6c702542f501c035f
- Sigstore transparency entry: 183009529
- Sigstore integration time:
- Permalink: ComputationalAudioResearchLab/divr-diagnosis@ce70f5f973fc410c5edd239a56785f4e296dfb62
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/ComputationalAudioResearchLab
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ce70f5f973fc410c5edd239a56785f4e296dfb62
- Trigger Event: release
File details
Details for the file divr_diagnosis-0.1.2-py3-none-any.whl.
File metadata
- Download URL: divr_diagnosis-0.1.2-py3-none-any.whl
- Upload date:
- Size: 32.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a0d6ef9ede3a03f6615d2eda2d8288dc7d601bd41bc8279c2e17ed4f96f7375 |
| MD5 | 681eef6c8fa3aa7b42bff1ba396a6da3 |
| BLAKE2b-256 | 30a5ff45a4667b015fb6b8eebdfd78efad245daeb42a457367227018a52a0588 |
Provenance
The following attestation bundles were made for divr_diagnosis-0.1.2-py3-none-any.whl:
Publisher: release.yml on ComputationalAudioResearchLab/divr-diagnosis
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: divr_diagnosis-0.1.2-py3-none-any.whl
- Subject digest: 4a0d6ef9ede3a03f6615d2eda2d8288dc7d601bd41bc8279c2e17ed4f96f7375
- Sigstore transparency entry: 183009530
- Sigstore integration time:
- Permalink: ComputationalAudioResearchLab/divr-diagnosis@ce70f5f973fc410c5edd239a56785f4e296dfb62
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/ComputationalAudioResearchLab
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ce70f5f973fc410c5edd239a56785f4e296dfb62
- Trigger Event: release