Toolkit to work with disordered voice databases
DiVR (Disordered Voice Recognition) - Benchmark
This repository contains the work that enables working with various disordered voice databases using the divr-diagnosis label standardization toolkit.
Installation
pip install divr-benchmark
How to use
While you can generate your own tasks, we provide a battery of tasks that we have used across a wide range of experiments. You can read more about them in Tasks.
Generating tasks
You can generate new tasks from the supported databases (AVFAD, MEEI, SVD, Torgo, UASpeech, UncommonVoice, VOICED). Of these, SVD, Torgo and VOICED are publicly accessible, and the scripts can download their data automatically, provided each database is still available at the expected URL.
from divr_diagnosis import diagnosis_maps
from divr_benchmark import Benchmark, Diagnosis
# NOTE: DatabaseFunc and Dataset (used below) must also be imported;
# their import path is not shown in this example.

benchmark = Benchmark(
    storage_path="/home/user/divr_benchmark/storage",
    version="v1",
    sample_rate=16000,
)
diag_map = diagnosis_maps.CaRLab_2025()


async def filter_func(database_func: DatabaseFunc):
    # You can filter the data by min_tasks, so that every speaker has at least N audios.
    # This is called 'task' because in most datasets the audios represent different vocal tasks.
    db = await database_func(name="svd", min_tasks=None)
    diag_level = diag_map.max_diag_level

    def filter_unclassified(tasks):  # example of filtering tasks by label
        # You can also get task.speaker_id, which can be used to count the
        # number of diag/speaker and restrict which diags are used for the dataset.
        return [task for task in tasks if not task.label.incompletely_classified]

    return Dataset(
        train=filter_unclassified(db.all_train(level=diag_level)),
        val=filter_unclassified(db.all_val(level=diag_level)),
        test=filter_unclassified(db.all_test(level=diag_level)),
    )


benchmark.generate_task(
    filter_func=filter_func,
    task_path="/home/user/divr_benchmark/tasks/all",
    # Pass the diagnosis map itself; diag_level only exists inside filter_func.
    diagnosis_map=diag_map,
    allow_incomplete_classification=False,
)
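The comment inside filter_func notes that task.speaker_id can be used to restrict which diagnoses end up in the dataset. One possible reading of that idea is sketched below: keep only diagnoses that are represented by a minimum number of distinct speakers. The `Task` dataclass and the `keep_diags_with_min_speakers` helper are hypothetical stand-ins for illustration and are not part of divr-benchmark; only the `speaker_id` and `label` attribute names come from the example above, and the label is simplified to a plain string.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for a benchmark task (not the library's class)."""
    speaker_id: str
    label: str  # simplified: the real label is a diagnosis object


def keep_diags_with_min_speakers(tasks, min_speakers):
    """Keep tasks whose diagnosis is seen in at least `min_speakers` distinct speakers."""
    speakers_per_diag = defaultdict(set)
    for task in tasks:
        speakers_per_diag[task.label].add(task.speaker_id)
    return [
        task for task in tasks
        if len(speakers_per_diag[task.label]) >= min_speakers
    ]


tasks = [
    Task("spk1", "normal"),
    Task("spk1", "normal"),
    Task("spk2", "normal"),
    Task("spk3", "rare_diag"),
]
filtered = keep_diags_with_min_speakers(tasks, min_speakers=2)
print([(task.speaker_id, task.label) for task in filtered])
```

A helper of this shape could be applied to each split inside filter_func, in the same place where filter_unclassified is applied.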
Using existing tasks
Almost all functions of the library accept a level parameter that selects the diagnosis level on which the operation is performed. When left as None, it defaults to the maximum diagnostic level, i.e. the narrowest diagnosis, furthest from binary detection.
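To make the notion of a level concrete, here is a small self-contained sketch of a diagnosis hierarchy in which level 0 is the binary split and deeper levels are narrower. Everything in it is an assumption for illustration: the `PARENT` table, the placeholder diagnosis names, and the `at_level` helper are hypothetical and do not reflect the actual divr-diagnosis maps or API. In particular, treating None as "the deepest level available for this diagnosis" and clamping shallow diagnoses are simplifications.

```python
# Hypothetical hierarchy: each diagnosis maps to its parent; roots map to None.
PARENT = {
    "normal": None,
    "pathological": None,
    "group_a": "pathological",
    "group_b": "pathological",
    "diag_a1": "group_a",
    "diag_b1": "group_b",
}


def path_from_root(diag):
    """Return the chain of diagnoses from the root (level 0) down to `diag`."""
    path = []
    while diag is not None:
        path.append(diag)
        diag = PARENT[diag]
    return path[::-1]


def at_level(diag, level=None):
    """Resolve `diag` at `level`; None means the deepest level available for it."""
    path = path_from_root(diag)
    if level is None:
        return path[-1]
    # A diagnosis shallower than `level` stays at its own deepest level.
    return path[min(level, len(path) - 1)]


print(at_level("diag_a1", level=0))  # pathological (binary detection)
print(at_level("diag_a1", level=1))  # group_a
print(at_level("diag_a1"))           # diag_a1
```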
import torch
from torch import nn

from divr_diagnosis import diagnosis_maps
from divr_benchmark import Benchmark, Diagnosis

benchmark = Benchmark(
    storage_path="/home/user/divr_benchmark/storage",
    version="v1",
    sample_rate=16000,
)

# The diagnosis map here can be different from the one used for generating the tasks.
# The library automatically maps every diagnosis that can be mapped to the new map;
# unmapped items are left as unclassified.
diag_map = diagnosis_maps.CaRLab_2025()
task = benchmark.load_task(
    task_path="/home/user/divr_benchmark/tasks/all",
    diag_level=None,
    diagnosis_map=diag_map,
    load_audios=True,
)

# Training at the default level of diagnosis
for train_point in task.train:
    point_id = train_point.id
    audio = train_point.audio
    label = task.diag_to_index(
        diag=train_point.label,
        level=None,
    )

# Training at the root/0th level of diagnosis, equivalent to binary detection
for train_point in task.train:
    point_id = train_point.id
    audio = train_point.audio
    label = task.diag_to_index(
        diag=train_point.label,
        level=0,
    )

# Validating
for val_point in task.val:
    point_id = val_point.id
    audio = val_point.audio
    label = task.diag_to_index(
        diag=val_point.label,
        level=None,
    )

# Testing
for test_point in task.test:
    point_id = test_point.id
    audio = test_point.audio
    label = task.diag_to_index(
        diag=test_point.label,
        level=None,
    )

# Class weights for cross entropy loss
class_weights = task.train_class_weights(level=None)  # level defaults to max level of label
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor(class_weights))

# Convert a predicted index to a diagnosis.
# `index` is the class index predicted by your model.
diagnosis = task.index_to_diag(
    index=index,
    level=None,
)
print(diagnosis.name)

# Get all unique diagnoses in the data
diagnosis_names = task.unique_diagnosis(level=None)
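The formula behind train_class_weights is not stated here. As a rough illustration of what class weights for a cross entropy loss usually look like, the sketch below computes inverse-frequency weights from a list of class indices. This is an assumption about a common technique, not a description of the library's implementation, and `inverse_frequency_weights` is a hypothetical helper.

```python
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by total / (num_classes * count); unseen classes get 0.0."""
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


# Class indices, e.g. as returned by diag_to_index for each training point.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels, num_classes=2)
print(weights)
```

With this scheme the rarer class receives the larger weight, so errors on it contribute more to the loss.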
How to cite
Coming soon