Skip to main content

Saarbrueken Voice Database Downloader and Reader

Project description

PyPI PyPI - Status PyPI - Python Version GitHub

This Python module provides capability to download and organize Saarbrücker Stimmdatenbank (Saarbrücken Voice Database, https://stimmdb.coli.uni-saarland.de/) with SQLAlchemy (sqlalchemy.org).

Features

  • Auto-download the database file at https://stimmdb.coli.uni-saarland.de

  • Auto-download the associated datasets from Zonedo: https://zenodo.org/records/16874898

  • Supports incremental, on-demand download per-pathology

  • Stores database information as a local SQLite3 file

  • Database and datasets are accessed via SQLAlchemy ORM (Object Relational Mapper) classes for ease of use

  • Acoustic and EGG signals can be retrieved as NumPy arrays directly

  • Supports filters to specify study conditions on pathologies, speaker’s gender and age, recording types, etc.

  • Fixes known errors in the dataset (i.e., corrupted files and swapping of acoustic/EGG data)

Install

pip install sbvoicedb

If you prefer manually downloading the full dataset from Zonedo (data.zip, the full dataset, 17.9 GB) you may download the file first and unzip the content to a directory. Make sure that the zip file’s internal structure is preserved. If you’re placing your downloaded database in my_svd folder, its directory structure should appear like this:

.../my_svd/
└── data/
    ├── 1/
    │   ├── sentnces
    │   │   ├── 1-phrase.nsp
    │   │   └── 1-phrase-egg.egg
    │   └── vowels
    │       ├── 1-a_h.nsp
    │       ├── 1-a_h.nsp
    │       ⋮
    │       └── 1-u_n-egg.egg
    ├── 2/
    │   │   ├── 2-phrase.nsp
    │   │   └── 2-phrase-egg.egg
    │   └── vowels
    │       ├── 2-a_h.nsp
    │       ├── 2-a_h.nsp
    │       ⋮
    │       └── 2-u_n-egg.egg
    ⋮

The default my_svd location is OS dependent:

Windows

~\AppData\Local\sbvoicedb\sbvoicedb

Linux

~/.local/share/sbvoicedb

macOS

~/Library/Application Support/sbvoicedb

Tutorial/Examples

from sbvoicedb import SbVoiceDb

# to open the database in the default location
db = SbVoiceDb()

# OR to open the database at a specific data location
dbpath = '<path to the root directory of the extracted database>'
db = SbVoiceDb(dbpath)

This creates a new database instance. If dbpath does not contain the SQLite database file, sbvoice.db, it gets populated from the dataset’s summary CSV file.

If any portion of the dataset is already available in data subdirectory, it further populates the recordings table. These database population processes are visualized with progress bars in the console.

By default, no dataset will be downloaded at this point. You can check how much of the datasets are available by

print(f"{db.number_of_sessions_downloaded}/{db.number_of_all_sessions}")

The db.number_of_all_sessions property should always return 2043.

Your study may only use a subset of the database. In such case, you can set filters on each table when creating the database object. For example, the following creates a subset of the database which only consists of recordings of sustained /a/ or /i/ at normal pitch, uttered by women of age between 50 and 70 with normal voice or with a diagnosis of Laryngitis or Dysphonie:

from sbvoicedb import Pathology, Speaker, RecordingSession, Recording

db_laryngitis = SbVoiceDb(
    dbdir,
    pathology_filter=Pathology.name.in_(["Laryngitis", "Dysphonie"]),
    include_healthy=True,
    speaker_filter=Speaker.gender == "w",
    session_filter=RecordingSession.speaker_age.between(50, 70),
    recording_filter=Recording.utterance.in_(("a_n", "i_n")),
)

The overview of the database can be found in the three SQLite views: recording_session_summary, recording_summary, and pathology_summary. These summary tables can be queried (possibly with additional constraints) and could easily be turned into a Pandas’ DataFrame. To list the recording sessions, use SbVoiceDb.recording_session_summary():

import pandas as pd

print(pd.DataFrame(db_laryngitis.recording_session_summary()))

prints

    speaker_id gender  age  session_id type  nb_recordings
0          142      w   54         495    p             14
1          732      w   63        1057    p             14
2         1358      w   58         449    p             14
3         1360      w   56         451    p             14
4         1379      w   59         568    p             14
...        ...    ...  ...         ...  ...            ...
38        2539      w   68        2405    p             14
39        2582      w   58        2537    p             14
40        2650      w   54        2516    p             14
41        2678      w   50        2544    p             14
42        2728      w   60        2602    p             14
[43 rows x 7 columns]

To list the recordings, use SbVoiceDb.recording_summary():

print(pd.DataFrame(db_laryngitis.recording_summary()))

prints

       id  speaker_id gender  age  session_id type utterance  duration
0    2602         142      w   54         495    p       a_n   1.89166
1    2599         142      w   54         495    p       i_n   1.42548
2    7553         732      w   63        1057    p       a_n   0.75482
3    7552         732      w   63        1057    p       i_n   1.40040
4    2505        1358      w   58         449    p       a_n   1.46502
..    ...         ...    ...  ...         ...  ...       ...       ...
81  27241        2650      w   54        2516    p       i_n   1.49020
82  27629        2678      w   50        2544    p       a_n   0.59864
83  27630        2678      w   50        2544    p       i_n   0.65198
84  28383        2728      w   60        2602    p       a_n   1.06466
85  28382        2728      w   60        2602    p       i_n   1.42588

[86 rows x 8 columns]

Finally, to list pathologies, use SbVoiceDb.pathology_summary():

print(pd.DataFrame(db_laryngitis.pathology_summary()))

prints

   id        name  nb_speakers  nb_sessions
0  25   Dysphonie           99          101
1   5  Laryngitis          128          140

In the SQLite database, the actual tables holding data are: pathologies, speakers, recording_sessions, and recordings. The contents of these tables can be accessed by

db.get_pathology_count()
db.get_speaker_count()
db.get_session_count()
db.get_recording_count()

db.iter_pathologies()
db.iter_speakers()
db.iter_sessions()
db.iter_recordings()
print(f"number of pathologies found: {db_laryngitis.get_pathology_count()}")
print(f"number of recording sessions found: {db_laryngitis.get_session_count()}")
print(f"number of unique speakers: {db_laryngitis.get_speaker_count()}")
print(f"number of recordings: {db_laryngitis.get_recording_count()}")
number of pathologies found: 2
number of recording sessions found: 43
number of unique speakers: 42
number of recordings: 86

To navigate through the actual database, you can iterate over the rows of any of the tables:

# iterate over included pathologies
for patho in db_laryngitis.iter_pathologies():
  print(f'{patho.id)}: {patho.name} ({patho.downloaded})'

# iterate over included speakers
for speaker in db_laryngitis.iter_speakers():
  print(f'{speaker.id)}: {speaker.gender}'

# iterate over included recording sessions
for session in db_laryngitis.iter_sessions():
  print(f'{session.id)}: speaker_id={session.speaker_id}, speaker_age={session.speaker_age}, speaker_health={session.type}'

# iterate over included recordings
for rec in db_laryngitis.iter_recordings():
  print(f'{rec.id)}: session_id={rec.session_id}, utterance={rec.utterance}, nspfile={rec.nspfile}, eggfile={rec.eggfile}'

To retrieve the acoustic and egg data, use Recording.nspdata and Recording.eggdata:

from os import path
import numpy as np
from matplotlib import pyplot as plt

rec = next(db_laryngitis.iter_recordings())

t = np.arange(rec.length)/rec.rate

fig, axes = plt.subplots(2, 1, sharex=True)
axes[0].plot(t, rec.nspdata)
axes[0].set_ylabel("acoustic data")
axes[0].set_title(path.relpath(rec.nspfile, db_laryngitis.datadir))
axes[1].plot(t, rec.eggdata)
axes[1].set_ylabel("EGG data")
axes[1].set_title(path.relpath(rec.eggfile, db_laryngitis.datadir))
axes[1].set_xlabel("time (s)")
plt.tight_layout()
plt.show()
Plots of acoustic and audio data

Data Corrections

SVD Dataset has a several recordings that are incorrectly stored. sbvoicedb corrects the following data files during file retrieval. (Downloaded files are not renamed.)

  • 139-xxx, 141-xxx - acoustic & egg swapped

  • 713-i_n/713-iau - corrupted NSP/EGG files

  • 980-iau/980-iau-egg - acoustic and EGG waveforms were flipflopped at n = 583414

  • 980-phrase/980-phrase-egg - acoustic & Egg files were named backwards

  • 1697-iau/1697-iau-egg - acoustic & Egg files were named backwards

  • 1697-phrase/1697-phrase-egg - acoustic & Egg files were named backwards

Uncorrected entries

  • 1573-a_n, 1573-i_n, 1573-u_n not cut correctly. It is possible to obtain correct intervals from 1573-iau.

License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sbvoicedb-0.6.1.tar.gz (87.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sbvoicedb-0.6.1-py3-none-any.whl (86.2 kB view details)

Uploaded Python 3

File details

Details for the file sbvoicedb-0.6.1.tar.gz.

File metadata

  • Download URL: sbvoicedb-0.6.1.tar.gz
  • Upload date:
  • Size: 87.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sbvoicedb-0.6.1.tar.gz
Algorithm Hash digest
SHA256 d3728b03ada6ba5e253258cabe409908573748f0980189ec0de824acb3c856e9
MD5 32e62b16d795349fb8e1ae106d678355
BLAKE2b-256 59d3d93437fbcd30f8b92783ff36d20d545987f16f9663ea6f3ba559474bbf55

See more details on using hashes here.

Provenance

The following attestation bundles were made for sbvoicedb-0.6.1.tar.gz:

Publisher: pub.yml on tikuma-lsuhsc/python-sbvoicedb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sbvoicedb-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: sbvoicedb-0.6.1-py3-none-any.whl
  • Upload date:
  • Size: 86.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sbvoicedb-0.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 2d7c7413d157eaeb090a73f04937d39d5f267c3e7f238dd16a626084de9955f4
MD5 9c2137694fb50c003ccfa99d1ba8d2b2
BLAKE2b-256 c8ea325d8856828dd3e485351f4b57981edfa9afc3329d45ac757f425d7f8811

See more details on using hashes here.

Provenance

The following attestation bundles were made for sbvoicedb-0.6.1-py3-none-any.whl:

Publisher: pub.yml on tikuma-lsuhsc/python-sbvoicedb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page