A Python package for advanced speech quality assessment using the SCOREQ model

SCOREQ: Speech Contrastive Regression for Quality Assessment

SCOREQ is a framework for speech quality assessment based on pre-training the encoder with the SCOREQ loss.

This repo provides four speech quality metrics trained with the SCOREQ framework.

Domain           | Usage Mode                             | Prediction
Natural speech   | No-reference                           | Mean Opinion Score
Natural speech   | Non-matching reference, full-reference | Euclidean distance to clean speech
Synthetic speech | No-reference                           | Mean Opinion Score
Synthetic speech | Non-matching reference                 | Euclidean distance to clean speech

Installation

SCOREQ is hosted on PyPI. It can be installed in your Python environment with the following command:

pip install scoreq

The expected sampling rate is 16 kHz. The script automatically resamples audio with different sampling rates. SCOREQ models accept variable input length.
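
Because resampling is automatic, no preprocessing is required. It can still be handy to know in advance which files will be resampled. The following is a minimal sketch using only the standard library, assuming uncompressed PCM WAV input; the helper name needs_resampling is hypothetical and not part of scoreq.

```python
import wave

TARGET_SR = 16_000  # sampling rate expected by SCOREQ, per the text above


def needs_resampling(wav_path: str, target_sr: int = TARGET_SR) -> bool:
    """Return True if the WAV file's sampling rate differs from target_sr.

    Hypothetical helper: it only inspects the WAV header and does not
    resample anything itself.
    """
    with wave.open(wav_path, "rb") as wf:
        return wf.getframerate() != target_sr
```

For example, needs_resampling('./data/opus.wav') reports whether that file is already at 16 kHz.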

First run

The PyTorch weights are hosted on Zenodo. The first run might be slower due to model download.

Using SCOREQ

SCOREQ can be used in two modes and for two domains by setting the arguments data_domain and mode.
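
The four domain-mode combinations can be validated before a model is loaded. The sketch below is an illustration only: check_scoreq_args is a hypothetical helper, not part of the scoreq package, and the accepted values are the ones used in this README ('natural', 'synthetic', 'nr', 'ref').

```python
from typing import Optional

VALID_DOMAINS = ("natural", "synthetic")
VALID_MODES = ("nr", "ref")


def check_scoreq_args(data_domain: str, mode: str,
                      ref_path: Optional[str] = None) -> None:
    """Raise ValueError for argument combinations this README does not describe.

    Hypothetical helper; scoreq itself may validate its inputs differently.
    """
    if data_domain not in VALID_DOMAINS:
        raise ValueError(f"data_domain must be one of {VALID_DOMAINS}, got {data_domain!r}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # ref mode compares the test audio against clean speech, so a reference is needed.
    if mode == "ref" and ref_path is None:
        raise ValueError("mode='ref' requires a clean reference recording (ref_path)")
```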

Using SCOREQ from the command line

Domain           | Usage Mode                             | CLI
Natural speech   | No-reference                           | python -m scoreq data_domain natural mode nr /path/to/test_audio
Natural speech   | Non-matching reference, full-reference | python -m scoreq data_domain natural mode ref /path/to/test_audio --ref_path /path/to/ref_audio
Synthetic speech | No-reference                           | python -m scoreq data_domain synthetic mode nr /path/to/test_audio
Synthetic speech | Non-matching reference                 | python -m scoreq data_domain synthetic mode ref /path/to/test_audio --ref_path /path/to/ref_audio

Using SCOREQ inside Python

Inside Python, first import the package. The examples below use the WAV files provided in the data directory.

import scoreq

# Predict quality of natural speech in NR mode
nr_scoreq = scoreq.Scoreq(data_domain='natural', mode='nr')
pred_mos = nr_scoreq.predict(test_path='./data/opus.wav', ref_path=None)

# Predict quality of natural speech in REF mode
ref_scoreq = scoreq.Scoreq(data_domain='natural', mode='ref')
pred_distance = ref_scoreq.predict(test_path='./data/opus.wav', ref_path='./data/ref.wav')

# Predict quality of synthetic speech in NR mode
nr_scoreq = scoreq.Scoreq(data_domain='synthetic', mode='nr')
pred_mos = nr_scoreq.predict(test_path='./data/opus.wav', ref_path=None)

# Predict quality of synthetic speech in REF mode
ref_scoreq = scoreq.Scoreq(data_domain='synthetic', mode='ref')
pred_distance = ref_scoreq.predict(test_path='./data/opus.wav', ref_path='./data/ref.wav')
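
To score many files with one loaded model, the predict call can be wrapped in a loop. The sketch below is an assumption-laden illustration: score_directory is a hypothetical helper, it takes any callable with the predict(test_path=..., ref_path=...) signature shown above, and it assumes the returned value can be converted to a float.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def score_directory(predict: Callable[..., float], audio_dir: str,
                    ref_path: Optional[str] = None,
                    pattern: str = "*.wav") -> Dict[str, float]:
    """Apply predict to every file in audio_dir matching pattern.

    Hypothetical helper: returns a dict mapping file name to score,
    in sorted file-name order.
    """
    scores: Dict[str, float] = {}
    for wav in sorted(Path(audio_dir).glob(pattern)):
        # Same keyword arguments as the predict calls in the examples above.
        scores[wav.name] = float(predict(test_path=str(wav), ref_path=ref_path))
    return scores
```

With a loaded model this could be called as score_directory(nr_scoreq.predict, './data'), or with ref_path='./data/ref.wav' for a model created with mode='ref'.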

Other

We provide the best model for each domain-mode pair.

Use mode=ref for both non-matching reference and full-reference. Which of the two applies depends on the clean speech passed as the reference.

If you pass the clean counterpart of the test signal, the metric runs in full-reference mode. If you pass any other clean speech, the metric runs in non-matching reference mode.

Full-reference mode is expected to be used only for natural speech, where the clean copy is available.

SCOREQ learns a distance and expects clean speech as the non-matching reference. The model has not been evaluated with other kinds of non-matching references.
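
As a worked illustration of what a Euclidean distance prediction means, the sketch below computes the distance between two small vectors. The two-dimensional vectors are made up for the example; in SCOREQ the embeddings come from the trained encoder, and the function here is not the package's internal code.

```python
import math
from typing import Sequence


def euclidean_distance(test_emb: Sequence[float], ref_emb: Sequence[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(test_emb) != len(ref_emb):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((t - r) ** 2 for t, r in zip(test_emb, ref_emb)))


# Toy embeddings, invented for illustration only.
test_emb = [0.0, 3.0]
ref_emb = [4.0, 0.0]
distance = euclidean_distance(test_emb, ref_emb)  # sqrt(4^2 + 3^2) = 5.0
```

A smaller distance means the test embedding lies closer to the clean reference embedding.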

Paper (available soon)

The SCOREQ code is licensed under the MIT license. Dependencies of the project are available under separate license terms.

Copyright © 2024 Alessandro Ragano

Download files

Source Distribution

scoreq-0.0.1.tar.gz (5.6 kB)

Uploaded Source

Built Distribution

scoreq-0.0.1-py3-none-any.whl (6.4 kB)

Uploaded Python 3

File details

Details for the file scoreq-0.0.1.tar.gz.

File metadata

  • Download URL: scoreq-0.0.1.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.10

File hashes

Hashes for scoreq-0.0.1.tar.gz
Algorithm   | Hash digest
SHA256      | 9cee7c4f1c13f5cf3e85fffd19b1c3f4b19fe9726daad4ce9f94a9875bcc8cac
MD5         | d9dcaa30333a9a1bf03dbbaa9556eca0
BLAKE2b-256 | f85e1c9d363cd2ee58b5f85c7b78968ee84aeec7abfac6724e819c45cf54de83

File details

Details for the file scoreq-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: scoreq-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 6.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.10

File hashes

Hashes for scoreq-0.0.1-py3-none-any.whl
Algorithm   | Hash digest
SHA256      | 6de2f3e30e2f5e67e713b541c30c591a25e2b6445c7a5b04a3232a3fedc50ada
MD5         | 01145030b9ddd0e5f7a32eda448a64b3
BLAKE2b-256 | 84f47e3a1736b799334455897407360baa935752fe7b5ea33a5b62acf9cd3acd
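
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using the standard library; the function names are hypothetical and the local file path is whatever location you downloaded the archive to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_digest('scoreq-0.0.1.tar.gz', '9cee7c4f1c13f5cf3e85fffd19b1c3f4b19fe9726daad4ce9f94a9875bcc8cac') should return True for an intact copy of the source distribution.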
