Real-Time Vocal Fatigue Monitoring for Continuous Speech Analytics

VoiceMonitor

VoiceMonitor is a Python library for real-time vocal fatigue monitoring built on top of the auralis_vfs vocal fatigue scoring framework. It enables continuous microphone monitoring, fatigue scoring using sliding audio windows, and session-level analytics for voice health monitoring.

The system is designed for researchers, speech engineers, and voice professionals who require automated analysis of vocal strain during prolonged speech activity.

VoiceMonitor processes live microphone input, extracts standardized audio segments, computes fatigue scores, and produces real-time alerts and session analytics.

Overview

Prolonged speaking can lead to vocal fatigue, a condition characterized by strain, reduced vocal efficiency, and potential long-term damage to vocal health.

VoiceMonitor provides a real-time monitoring pipeline that:

captures microphone audio streams
processes sliding audio windows
computes vocal fatigue scores
tracks fatigue progression over time
generates session analytics and warnings

The system leverages the ECAPA-TDNN-VHE vocal fatigue estimation model and the auralis_vfs scoring framework developed as part of ongoing research in speech health monitoring.

Key Features

Real-time microphone audio monitoring
Continuous vocal fatigue scoring
Sliding window fatigue analysis
Fatigue threshold warnings
Session-level analytics and reports
Chunk-based audio processing pipeline
JSON session export for downstream analysis
Lightweight CLI for quick experiments

Architecture

VoiceMonitor uses a sliding window inference pipeline for continuous analysis.

Microphone Input
        │
        ▼
Audio Stream Buffer
        │
        ▼
Sliding Window Segmentation (5s)
        │
        ▼
auralis_vfs Preprocessing
        │
        ▼
Vocal Fatigue Scoring
        │
        ▼
Session Analytics Engine
        │
        ▼
Fatigue Alerts + Reports

Each processed window produces a fatigue score, enabling real-time tracking of vocal strain progression during speech sessions.
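
As a rough illustration of the segmentation stage, the sketch below cuts a mono buffer into 5-second windows. It assumes a 16 kHz sample rate and that consecutive windows share 1 second of audio (the overlap listed under Configuration). `sliding_windows` is a hypothetical helper written for this example, not part of the voicemonitor or auralis_vfs API, and the real pipeline may handle trailing audio differently.

```python
import numpy as np

SAMPLE_RATE = 16_000   # assumed; the real pipeline may use a different rate
WINDOW_SEC = 5.0       # window length from the architecture diagram
OVERLAP_SEC = 1.0      # overlap listed under Configuration


def sliding_windows(samples, sample_rate=SAMPLE_RATE,
                    window_sec=WINDOW_SEC, overlap_sec=OVERLAP_SEC):
    """Yield (start_sec, window) pairs of full-length windows over a mono buffer."""
    window_len = int(window_sec * sample_rate)
    hop_len = int((window_sec - overlap_sec) * sample_rate)
    if hop_len <= 0:
        raise ValueError("overlap must be shorter than the window")
    # Trailing audio shorter than one full window is dropped in this sketch.
    for start in range(0, len(samples) - window_len + 1, hop_len):
        yield start / sample_rate, samples[start:start + window_len]


# 14 seconds of silence stands in for captured microphone audio.
buffer = np.zeros(14 * SAMPLE_RATE, dtype=np.float32)
starts = [start for start, _ in sliding_windows(buffer)]
print(starts)
```

Under these assumptions each window begins 4 seconds after the previous one, so every window would be passed to the scoring stage with 1 second of shared context.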

Installation

Requirements

Python ≥ 3.10
FFmpeg installed on the system
Microphone access

Install from PyPI

pip install voicemonitor

Install from source

git clone https://github.com/<your-username>/voicemonitor.git
cd voicemonitor
pip install -e .

Dependencies

VoiceMonitor relies on the following core libraries:

auralis_vfs
numpy
sounddevice
soundfile
pydub
tqdm

FFmpeg must be installed for audio processing.

Quick Start

CLI Usage

Start real-time vocal fatigue monitoring:

voicemonitor

Monitor for a fixed duration:

voicemonitor --duration 120

Set a custom fatigue warning threshold:

voicemonitor --threshold 65

Example output:

[20260312_182001] Score: 22.51
[20260312_182006] Score: 31.02
[20260312_182011] Score: 45.44
[20260312_182016] Score: 72.90

⚠ fatigue threshold crossed
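
The warning behaviour shown in this output can be approximated with a few lines of Python. This is a minimal sketch, not the library's implementation: `find_threshold_crossings` is a hypothetical function, and it assumes that a warning fires once when a score first reaches or exceeds the threshold rather than on every window above it. The readings are the ones from the example output, checked against the default threshold of 70.

```python
def find_threshold_crossings(readings, threshold=70.0):
    """Return the first reading of each run of scores at or above the threshold."""
    crossings = []
    above = False
    for timestamp, score in readings:
        # Assumption: ">=" counts as crossing; the library may use ">" instead.
        if score >= threshold and not above:
            crossings.append((timestamp, score))
        above = score >= threshold
    return crossings


# Scores copied from the example CLI output.
readings = [
    ("20260312_182001", 22.51),
    ("20260312_182006", 31.02),
    ("20260312_182011", 45.44),
    ("20260312_182016", 72.90),
]

for timestamp, score in find_threshold_crossings(readings):
    print(f"[{timestamp}] warning: score {score:.2f} crossed the threshold")
```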

After the session completes, a report is generated:

session_report.json

Python API

VoiceMonitor can also be used directly in Python applications.

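from voicemonitor import VoiceMonitor

# Set the fatigue warning threshold (70 is the default).
monitor = VoiceMonitor(threshold=70)

# Monitor the microphone for 120 seconds.
session = monitor.start(duration_sec=120)

# Write the session report to a JSON file.
session.export_json("session_report.json")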

Session Analytics

Each monitoring session records:

average fatigue score
maximum fatigue score
timestamps of processed windows
processed audio chunk paths
fatigue warning events

Example report:

{
  "summary": {
    "average_fatigue": 38.2,
    "max_fatigue": 74.1,
    "readings": 25
  },
  "records": [
    {
      "timestamp": "20260312_182001",
      "chunk": "chunks/20260312_182001.wav",
      "score": 22.51
    }
  ]
}
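
For downstream analysis, a report with this shape can be read with the standard library alone. The sketch below assumes the field names shown in the example report and a `YYYYMMDD_HHMMSS` timestamp format inferred from the sample values. `summarize` and `parse_timestamp` are hypothetical helpers, and the second record is added for illustration using a reading from the example CLI output, so the recomputed numbers will not match the 25-reading summary above.

```python
import json
from datetime import datetime
from statistics import mean

# Inline stand-in for the contents of session_report.json.
report_text = """
{
  "summary": {"average_fatigue": 38.2, "max_fatigue": 74.1, "readings": 25},
  "records": [
    {"timestamp": "20260312_182001", "chunk": "chunks/20260312_182001.wav", "score": 22.51},
    {"timestamp": "20260312_182006", "chunk": "chunks/20260312_182006.wav", "score": 31.02}
  ]
}
"""


def summarize(report):
    """Recompute summary statistics from the per-window records."""
    scores = [record["score"] for record in report["records"]]
    return {
        "average_fatigue": mean(scores),
        "max_fatigue": max(scores),
        "readings": len(scores),
    }


def parse_timestamp(stamp):
    """Parse a timestamp such as 20260312_182001 (format assumed from the examples)."""
    return datetime.strptime(stamp, "%Y%m%d_%H%M%S")


report = json.loads(report_text)
recomputed = summarize(report)
first = parse_timestamp(report["records"][0]["timestamp"])
last = parse_timestamp(report["records"][-1]["timestamp"])
elapsed_sec = (last - first).total_seconds()
print(recomputed, elapsed_sec)
```

To run the same analysis on a real session, replace `report_text` with the contents of the exported file, for example by passing an open file handle to `json.load`.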

Configuration

Consecutive audio windows overlap by 1 second.
The default fatigue warning threshold is 70.

Research Background

VoiceMonitor is built upon the auralis_vfs vocal fatigue scoring framework, which was developed as part of research on automated vocal fatigue detection.

The underlying fatigue estimation model is based on an ECAPA-TDNN architecture adapted for vocal health estimation.

Research paper:

Modeling Vocal Fatigue as Embedding-Space Deviation Using Contrastively Trained ECAPA-TDNNs

Model repository:

huggingface.co/Khubaib01/ECAPA-TDNN-VHE

The model estimates vocal fatigue levels from short speech segments and provides a continuous fatigue score representing vocal strain.

VoiceMonitor extends this work by enabling real-time fatigue monitoring and session analytics.

Applications

VoiceMonitor can be used in a variety of speech-intensive environments:

speech research
voice health monitoring
call center voice analytics
teacher vocal load monitoring
podcast and streaming voice tracking
speech therapy experiments
human-computer interaction studies

Project Structure

voicemonitor/
├── voicemonitor/
│   ├── audio_stream.py
│   ├── analytics.py
│   ├── session.py
│   ├── utils.py
│   ├── config.py
│   └── cli.py
│
├── examples/
│   └── live.py
│
├── tests/
│   └── test_session.py
│
├── LICENSE
├── setup.cfg
├── requirements.txt
├── pyproject.toml
└── README.md

Future Development

Planned enhancements include:

real-time visualization dashboard
web API for remote monitoring
desktop GUI
voice activity detection integration
fatigue trend prediction models
speaker-aware monitoring

Citation

If you use VoiceMonitor in research, please cite the underlying work:

Ahmad, M. K. (2026). Modeling Vocal Fatigue as Embedding-Space Deviation Using Contrastively Trained ECAPA-TDNNs. Zenodo. https://doi.org/10.5281/zenodo.18366305

License

This project is released under the MIT License.

Author

Muhammad Khubaib Ahmad
AI / ML Engineer
Speech Intelligence and Audio AI Systems

Release history

1.0.0 (Mar 13, 2026)
0.1.0 (Mar 13, 2026), this version

