ViSQOL - Virtual Speech Quality Objective Listener (Pure Python)

Maintainers

jacobsjiang

Project description

ViSQOL (Python)

A pure Python implementation of Google's ViSQOL (Virtual Speech Quality Objective Listener) v3.3.3 for objective audio/speech quality assessment.

ViSQOL compares a reference audio signal with a degraded version and outputs a MOS-LQO (Mean Opinion Score - Listening Quality Objective) score on a scale of 1.0 – 5.0.

Features

- Two modes: Audio mode (music/general audio at 48 kHz) and Speech mode (speech at 16 kHz)
- High accuracy: 11/11 conformance tests pass against the official C++ implementation
  - Audio mode: 9/10 tests produce identical MOS scores (diff = 0.000000); the remaining test differs by 0.000117
  - Speech mode: diff = 0.006715
- Pure Python: no C/C++ compilation required
- Minimal dependencies: only 4 pip packages (numpy, scipy, soundfile, libsvm-official)
- Faster than real-time: Audio RTF ≈ 0.71x, Speech RTF ≈ 0.38x

Installation

pip install visqol-python

Or install from source:

git clone https://github.com/talker93/visqol-python.git
cd visqol-python
pip install -e .

Quick Start

Python API

from visqol import VisqolApi

# Audio mode (default) - for music and general audio
api = VisqolApi()
api.create(mode="audio")
result = api.measure("reference.wav", "degraded.wav")
print(f"MOS-LQO: {result.moslqo:.4f}")

# Speech mode - for speech signals
api = VisqolApi()
api.create(mode="speech")
result = api.measure("ref_speech.wav", "deg_speech.wav")
print(f"MOS-LQO: {result.moslqo:.4f}")

Using NumPy Arrays

import numpy as np
import soundfile as sf
from visqol import VisqolApi

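# sf.read returns (samples as a NumPy array, sample rate)
ref, sr = sf.read("reference.wav")
deg, deg_sr = sf.read("degraded.wav")

# A single sample_rate is passed below, so both signals must share it
if deg_sr != sr:
    raise ValueError("reference and degraded files must share a sample rate")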

api = VisqolApi()
api.create(mode="audio")
result = api.measure_from_arrays(ref, deg, sample_rate=sr)
print(f"MOS-LQO: {result.moslqo:.4f}")

Command Line

# Audio mode (default)
python -m visqol -r reference.wav -d degraded.wav

# Speech mode
python -m visqol -r reference.wav -d degraded.wav --speech_mode

# Verbose output (per-patch details)
python -m visqol -r reference.wav -d degraded.wav -v

CLI options:

Flag	Description
`-r`, `--reference`	Path to reference WAV file (required)
`-d`, `--degraded`	Path to degraded WAV file (required)
`--speech_mode`	Use speech mode (16 kHz, polynomial mapping)
`--model`	Custom SVR model file path (audio mode only)
`--search_window`	Search window radius (default: 60)
`-v`, `--verbose`	Show detailed per-patch results
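
For scoring many file pairs, the CLI can be driven from a script. The following is a minimal sketch, not part of the package: `build_visqol_command` is a hypothetical helper, it uses only the flags listed in the table above, and it assumes `--search_window` takes an integer value on the command line.

```python
import sys
from typing import List, Optional


def build_visqol_command(reference: str, degraded: str,
                         speech_mode: bool = False,
                         search_window: Optional[int] = None,
                         verbose: bool = False) -> List[str]:
    """Assemble an argument list for `python -m visqol` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "visqol", "-r", reference, "-d", degraded]
    if speech_mode:
        cmd.append("--speech_mode")
    if search_window is not None:
        # Assumption: the flag is followed by an integer radius.
        cmd += ["--search_window", str(search_window)]
    if verbose:
        cmd.append("-v")
    return cmd


if __name__ == "__main__":
    # File names are placeholders; nothing is executed here, the command is only printed.
    print(" ".join(build_visqol_command("reference.wav", "degraded.wav", speech_mode=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the package is installed and the files exist.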

Output

The measure() method returns a SimilarityResult object with:

Field	Description
`moslqo`	MOS-LQO score (1.0 – 5.0)
`vnsim`	Mean NSIM across all patches
`fvnsim`	Per-frequency-band mean NSIM
`fstdnsim`	Per-frequency-band std of NSIM
`fvdegenergy`	Per-frequency-band degraded energy
`patch_sims`	List of per-patch similarity details
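
As an illustration of how these fields might be consumed, the sketch below finds the frequency band with the lowest mean NSIM. It assumes `fvnsim` behaves like a sequence of floats with one entry per band; `summarize` is a hypothetical helper, and the stand-in object holds made-up values rather than real output.

```python
from types import SimpleNamespace


def summarize(result):
    """Return headline scores plus the band with the lowest mean NSIM."""
    band_scores = [float(x) for x in result.fvnsim]
    worst_band = min(range(len(band_scores)), key=band_scores.__getitem__)
    return {
        "moslqo": float(result.moslqo),
        "vnsim": float(result.vnsim),
        "worst_band": worst_band,
        "worst_band_nsim": band_scores[worst_band],
    }


# Stand-in with invented values; a real SimilarityResult would come from api.measure().
fake_result = SimpleNamespace(moslqo=4.2, vnsim=0.93, fvnsim=[0.98, 0.95, 0.81, 0.97])
print(summarize(fake_result))
```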

Modes

Audio Mode (default)

- Target sample rate: 48 kHz
- 32 Gammatone frequency bands (50 Hz – 15 000 Hz)
- Quality mapping: SVR (Support Vector Regression) model
- Best for: music, environmental audio, codecs

Speech Mode

- Target sample rate: 16 kHz
- 32 Gammatone frequency bands (50 Hz – 8 000 Hz)
- Quality mapping: exponential polynomial fit
- VAD (Voice Activity Detection) based patch selection
- Best for: speech, VoIP, telephony
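
Before choosing a mode, it can help to check whether a file already sits at that mode's target sample rate. The sketch below uses only the standard-library `wave` module, so it handles plain PCM WAV files only. `TARGET_SAMPLE_RATE` and both functions are hypothetical names, not part of the package, and the sketch makes no claim about how the library treats files at other rates.

```python
import wave

# Target rates as listed for each mode above.
TARGET_SAMPLE_RATE = {"audio": 48_000, "speech": 16_000}


def wav_sample_rate(path):
    """Read the sample rate from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def matches_target_rate(path, mode):
    """True if the file's sample rate equals the target rate of the given mode."""
    return wav_sample_rate(path) == TARGET_SAMPLE_RATE[mode]
```

For example, `matches_target_rate("ref_speech.wav", "speech")` would return `True` for a 16 kHz recording.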

Performance

Measured on Apple M-series, Python 3.13:

Mode	Avg RTF	Typical Time
Audio (48 kHz)	0.71x	7 – 12 s per file pair
Speech (16 kHz)	0.38x	~1 s per file pair

RTF (Real-Time Factor) is processing time divided by audio duration; a value below 1.0 means faster than real-time.
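
To reproduce such a measurement on other hardware, a rough sketch follows. It assumes the conventional definition of RTF (processing time divided by audio duration); `real_time_factor` and `timed` are hypothetical helpers, not part of the package.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time / audio duration (conventional definition)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def timed(func, *args, **kwargs):
    """Call func and return (its result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    value = func(*args, **kwargs)
    return value, time.perf_counter() - start
```

With the API above, this could be used as `result, elapsed = timed(api.measure, "reference.wav", "degraded.wav")`, followed by `real_time_factor(elapsed, duration)`, where `duration` is the length of the reference file in seconds.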

Project Structure

visqol-python/
├── visqol/                    # Main package
│   ├── __init__.py            # Package exports & version
│   ├── api.py                 # Public API (VisqolApi)
│   ├── visqol_manager.py      # Pipeline orchestrator
│   ├── visqol_core.py         # Core algorithm
│   ├── audio_utils.py         # Audio I/O & SPL normalization
│   ├── signal_utils.py        # Envelope, cross-correlation
│   ├── analysis_window.py     # Hann window
│   ├── gammatone.py           # ERB + Gammatone filterbank + spectrogram
│   ├── patch_creator.py       # Patch creation (Image + VAD modes)
│   ├── patch_selector.py      # DP-based optimal patch matching
│   ├── alignment.py           # Global alignment via cross-correlation
│   ├── nsim.py                # NSIM similarity metric
│   ├── quality_mapper.py      # SVR & exponential quality mapping
│   ├── __main__.py            # CLI entry point
│   └── model/                 # Bundled SVR model
│       └── libsvm_nu_svr_model.txt
├── tests/                     # Tests (pytest)
│   ├── conftest.py            # Shared fixtures & CLI options
│   ├── test_quick.py          # Smoke tests (no external data needed)
│   └── test_conformance.py    # Full conformance tests (needs testdata)
├── .github/workflows/
│   ├── ci.yml                 # CI: test on Python 3.9–3.13
│   └── publish.yml            # Auto-publish to PyPI on tag push
├── pyproject.toml             # Package metadata & build config
├── CHANGELOG.md
├── LICENSE
└── README.md

Conformance Test Results

Tested against the official C++ ViSQOL v3.3.3 expected values:

Test Case	Mode	Expected MOS	Python MOS	Δ
strauss_lp35	Audio	1.3889	1.3889	0.000000
steely_lp7	Audio	2.2502	2.2502	0.000000
sopr_256aac	Audio	4.6823	4.6823	0.000000
ravel_128opus	Audio	4.4651	4.4651	0.000000
moonlight_128aac	Audio	4.6843	4.6843	0.000000
harpsichord_96mp3	Audio	4.2237	4.2237	0.000000
guitar_64aac	Audio	4.3497	4.3497	0.000000
glock_48aac	Audio	4.3325	4.3325	0.000000
contrabassoon_24aac	Audio	2.3469	2.3468	0.000117
castanets_identity	Audio	4.7321	4.7321	0.000000
speech_CA01	Speech	3.3745	3.3678	0.006715
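
The differences can be re-derived from the table as a quick sanity check. The sketch below copies the values shown above; because the table rounds MOS to four decimals, the recomputed differences only approximate the Δ column (for example, about 0.0001 rather than 0.000117). `max_abs_diff` is a hypothetical helper.

```python
# (test case, mode, expected MOS, Python MOS), copied from the table above
ROWS = [
    ("strauss_lp35", "Audio", 1.3889, 1.3889),
    ("steely_lp7", "Audio", 2.2502, 2.2502),
    ("sopr_256aac", "Audio", 4.6823, 4.6823),
    ("ravel_128opus", "Audio", 4.4651, 4.4651),
    ("moonlight_128aac", "Audio", 4.6843, 4.6843),
    ("harpsichord_96mp3", "Audio", 4.2237, 4.2237),
    ("guitar_64aac", "Audio", 4.3497, 4.3497),
    ("glock_48aac", "Audio", 4.3325, 4.3325),
    ("contrabassoon_24aac", "Audio", 2.3469, 2.3468),
    ("castanets_identity", "Audio", 4.7321, 4.7321),
    ("speech_CA01", "Speech", 3.3745, 3.3678),
]


def max_abs_diff(rows, mode):
    """Largest |expected - python| among the rows of the given mode."""
    return max(abs(expected - actual)
               for _, row_mode, expected, actual in rows
               if row_mode == mode)


for mode in ("Audio", "Speech"):
    print(mode, f"{max_abs_diff(ROWS, mode):.4f}")
```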

References

Google ViSQOL (C++) — the original implementation this project is ported from
Hines, A., Gillen, E., Kelly, D., Skoglund, J., Kokaram, A., & Harte, N. (2015). ViSQOLAudio: An Objective Audio Quality Metric for Low Bitrate Codecs. The Journal of the Acoustical Society of America.
Chinen, M., Lim, F. S., Skoglund, J., Gureev, N., O'Gorman, F., & Hines, A. (2020). ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric. 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX).

License

Apache License 2.0. See LICENSE for details.

This project is a Python port of Google's ViSQOL, which is also licensed under Apache 2.0.

Release history

Version	Release date
3.4.0	Mar 23, 2026
3.3.6	Mar 23, 2026
3.3.5 (this version)	Mar 23, 2026
3.3.4	Mar 23, 2026
3.3.3	Mar 23, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

visqol_python-3.3.5.tar.gz (89.3 kB view details)

Uploaded Mar 23, 2026 Source

Built Distribution

visqol_python-3.3.5-py3-none-any.whl (88.6 kB view details)

Uploaded Mar 23, 2026 Python 3

File details

Details for the file visqol_python-3.3.5.tar.gz.

File metadata

Download URL: visqol_python-3.3.5.tar.gz
Upload date: Mar 23, 2026
Size: 89.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for visqol_python-3.3.5.tar.gz
Algorithm	Hash digest
SHA256	`755739c6133ce92d50feaa6348424615047ba459a2a2f891d2c2104fa0e9f258`
MD5	`54c02a618db17ab14ed7114ee4f52da5`
BLAKE2b-256	`65332a4aa43d70f21a1c1783b5ec3212b1590694373e140c4ca17f193e1b47b6`

Provenance

The following attestation bundles were made for visqol_python-3.3.5.tar.gz:

Publisher: publish.yml on talker93/visqol-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: visqol_python-3.3.5.tar.gz
- Subject digest: 755739c6133ce92d50feaa6348424615047ba459a2a2f891d2c2104fa0e9f258
- Sigstore transparency entry: 1157639778
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: talker93/visqol-python@d7af98294396b8a7fb5fb8a46d0059d69e422055
- Branch / Tag: refs/tags/v3.3.5
- Owner: https://github.com/talker93
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d7af98294396b8a7fb5fb8a46d0059d69e422055
- Trigger Event: push

File details

Details for the file visqol_python-3.3.5-py3-none-any.whl.

File metadata

Download URL: visqol_python-3.3.5-py3-none-any.whl
Upload date: Mar 23, 2026
Size: 88.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for visqol_python-3.3.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cccccc426d460ab4f6f037a9076ed69fdebaee5cadcb1305362a73adbbbe382d`
MD5	`2e4188f0c805efc2abb3b612c76de822`
BLAKE2b-256	`565efe6b7a3221750a869fd714fe07803ef545fe1cea4d19889a97eec8075601`

Provenance

The following attestation bundles were made for visqol_python-3.3.5-py3-none-any.whl:

Publisher: publish.yml on talker93/visqol-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: visqol_python-3.3.5-py3-none-any.whl
- Subject digest: cccccc426d460ab4f6f037a9076ed69fdebaee5cadcb1305362a73adbbbe382d
- Sigstore transparency entry: 1157639838
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: talker93/visqol-python@d7af98294396b8a7fb5fb8a46d0059d69e422055
- Branch / Tag: refs/tags/v3.3.5
- Owner: https://github.com/talker93
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d7af98294396b8a7fb5fb8a46d0059d69e422055
- Trigger Event: push
