
A pip-installable version of the phonological feature extractor from jcvazquezc's DisVoice library


Phonological features


phonological.py

Compute phonological features from continuous speech files.

18 descriptors are computed, based on 18 different phonological classes from the phonet toolkit [1] (https://phonet.readthedocs.io/en/latest/?badge=latest).

The script computes the phonological log-likelihood ratio (PLLR) features [2] from the phonet outputs.
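As a rough illustration of what a log-likelihood ratio feature is: for a posterior probability p of a phonological class, the log-likelihood ratio in the sense of [2] is the logit, log(p / (1 - p)). The sketch below assumes that definition. The function name `pllr` and the clipping constant `eps` are hypothetical and are not taken from phonet or DisVoice.

```python
import math

def pllr(posterior, eps=1e-10):
    """Map a posterior probability in [0, 1] to a log-likelihood ratio (logit)."""
    # Clip away from 0 and 1 so the logarithm stays finite (eps is an assumed value).
    p = min(max(posterior, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

# Posteriors of one phonological class over four frames (made-up numbers).
posteriors = [0.5, 0.9, 0.1, 1.0]
print([round(pllr(p), 3) for p in posteriors])
```

A posterior of 0.5 maps to 0, and values near 0 or 1 map to large negative or positive ratios, which is why the features are unbounded.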

Static or dynamic matrices can be computed:

The static matrix contains 108 features: (18 descriptors) x (6 functionals: mean, std, skewness, kurtosis, max, min).

The dynamic matrix contains the 18 descriptors computed for frames of 25 ms with a time shift of 10 ms.
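The relation between the two matrices can be sketched in plain Python. This is a minimal illustration, not the library's implementation. It assumes a 16 kHz sampling rate, no padding at the signal edges, population (biased) moments with excess kurtosis, and a descriptor-by-descriptor ordering of the 108 values. None of these details is stated above, and the names `frame_count`, `functionals`, and `static_from_dynamic` are hypothetical.

```python
import math

N_DESCRIPTORS = 18
FUNCTIONALS = ("mean", "std", "skewness", "kurtosis", "max", "min")

def frame_count(n_samples, sample_rate=16000, win_ms=25, hop_ms=10):
    """Number of full 25 ms frames with a 10 ms shift, assuming no padding."""
    win = round(sample_rate * win_ms / 1000)
    hop = round(sample_rate * hop_ms / 1000)
    if n_samples < win:
        return 0
    return 1 + (n_samples - win) // hop

def functionals(track):
    """Six functionals of one descriptor track (a list of floats)."""
    n = len(track)
    mean = sum(track) / n
    # Population central moments (an assumption about the exact estimator).
    m2 = sum((v - mean) ** 2 for v in track) / n
    m3 = sum((v - mean) ** 3 for v in track) / n
    m4 = sum((v - mean) ** 4 for v in track) / n
    std = math.sqrt(m2)
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurt = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0  # excess kurtosis
    return [mean, std, skew, kurt, max(track), min(track)]

def static_from_dynamic(dynamic):
    """Collapse a (frames x descriptors) matrix into one flat static vector."""
    static = []
    for j in range(len(dynamic[0])):
        track = [frame[j] for frame in dynamic]
        static.extend(functionals(track))
    return static

n_frames = frame_count(16000)  # one second of audio at 16 kHz
# Toy dynamic matrix standing in for the real descriptors.
dynamic = [[math.sin(0.1 * i + j) for j in range(N_DESCRIPTORS)]
           for i in range(n_frames)]
static = static_from_dynamic(dynamic)
print(n_frames, len(dynamic[0]), len(static))  # 98 18 108
```

The dynamic matrix grows with the length of the recording, while the static vector always has 18 x 6 = 108 entries, which makes it convenient for classifiers that need fixed-size input.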

Running

The script is called as follows:

python phonological.py <file_or_folder_audio> <file_features> <static (true or false)> <plots (true or false)> <format (csv, txt, npy, kaldi, torch)>

Examples:

Extract features from the command line

python phonological.py "../audios/001_ddk1_PCGITA.wav" "phonologicalfeaturesAst.txt" "true" "true" "txt"
python phonological.py "../audios/001_ddk1_PCGITA.wav" "phonologicalfeaturesUst.csv" "true" "true" "csv"
python phonological.py "../audios/001_ddk1_PCGITA.wav" "phonologicalfeaturesUdyn.pt" "false" "true" "torch"

python phonological.py "../audios/" "phonologicalfeaturesst.txt" "true" "false" "txt"
python phonological.py "../audios/" "phonologicalfeaturesst.csv" "true" "false" "csv"
python phonological.py "../audios/" "phonologicalfeaturesdyn.pt" "false" "false" "torch"
python phonological.py "../audios/" "phonologicalfeaturesdyn.csv" "false" "false" "csv"

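# The kaldi format needs Kaldi's featbin tools on the PATH; adjust the example path below to your local Kaldi installation.
KALDI_ROOT=/home/camilo/Camilo/codes/kaldi-master2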
export PATH=$PATH:$KALDI_ROOT/src/featbin/
python phonological.py "../audios/001_ddk1_PCGITA.wav" "phonologicalfeaturesddk1dyn" "false" "false" "kaldi"

python phonological.py "../audios/" "phonologicalfeaturesdyn" "false" "false" "kaldi"
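When the script has to be called for many inputs, the argument order above can be wrapped in a small helper. The sketch below only builds the argument list in the documented order. The helper name `build_command` is hypothetical, and the format check simply mirrors the formats listed in the usage line.

```python
ALLOWED_FORMATS = ("csv", "txt", "npy", "kaldi", "torch")

def build_command(audio, output, static, plots, fmt, script="phonological.py"):
    """Return the argv list for one call of the script, in the documented order."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # The script expects the two booleans as the strings "true" / "false".
    flags = ["true" if value else "false" for value in (static, plots)]
    return ["python", script, audio, output, *flags, fmt]

cmd = build_command("../audios/", "phonologicalfeaturesst.csv",
                    static=True, plots=False, fmt="csv")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.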

Extract features directly in Python

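# The Phonological class must be imported first; the import path depends on how the package is installed.
phonological=Phonological()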
file_audio="../audios/001_ddk1_PCGITA.wav"
features1=phonological.extract_features_file(file_audio, static=True, plots=True, fmt="npy")
features2=phonological.extract_features_file(file_audio, static=True, plots=True, fmt="dataframe")
features3=phonological.extract_features_file(file_audio, static=False, plots=True, fmt="torch")
phonological.extract_features_file(file_audio, static=False, plots=False, fmt="kaldi", kaldi_file="./test")

path_audio="../audios/"
features1=phonological.extract_features_path(path_audio, static=True, plots=False, fmt="npy")
features2=phonological.extract_features_path(path_audio, static=True, plots=False, fmt="csv")
features3=phonological.extract_features_path(path_audio, static=False, plots=True, fmt="torch")
phonological.extract_features_path(path_audio, static=False, plots=False, fmt="kaldi", kaldi_file="./test.ark")

Jupyter notebook

Results:

Phonological analysis (three figures)

References

[1] Vásquez-Correa, J. C., Klumpp, P., Orozco-Arroyave, J. R., & Nöth, E. (2019). Phonet: A Tool Based on Gated Recurrent Neural Networks to Extract Phonological Posteriors from Speech. In INTERSPEECH (pp. 549-553).

[2] Diez, M., Varona, A., Penagarikano, M., Rodriguez-Fuentes, L. J., & Bordel, G. (2014). On the projection of PLLRs for unbounded feature distributions in spoken language recognition. IEEE Signal Processing Letters, 21(9), 1073-1077.
