VocalID is an open-source Python library for voice authentication using ECAPA-TDNN speaker embeddings. Train, evaluate, and verify speaker identity from audio files or live microphone input, and use voice as a biometric to unlock apps and systems.

VocalID: A Lightweight Voice Authentication Toolkit

VocalID is a practical, lightweight voice authentication library built around ECAPA-TDNN speaker embeddings and a simple classification layer. It lets you train your own voice model, evaluate its performance, and verify identities from recorded or live audio, so that voice can serve as a biometric for unlocking apps and systems and for protecting privacy. The goal is to make voice verification simple to run, easy to extend, and stable across devices.

Features

ECAPA-TDNN speaker embeddings (speechbrain/spkrec-ecapa-voxceleb)
Easy training workflow for positive (owner) and negative samples
Evaluation with accuracy and a full classification report
File-based verification and optional live microphone verification
Clean CLI for training, testing and verification
Modular, readable codebase
Simple model storage using pickle
Test suite included
Simple FastAPI server for verification

How It Works

1. Audio Processing

Audio is loaded or recorded, resampled to the target rate, converted to mono, and padded to a minimum length.
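
The preprocessing steps above can be sketched in plain Python. This is an illustrative sketch, not VocalID's own code: the function names, the 16 kHz target rate, the 1-second minimum length, and the linear-interpolation resampler are all assumptions chosen for readability.

```python
def to_mono(channels):
    """Average equal-length channels sample by sample."""
    n = len(channels)
    return [sum(frame) / n for frame in zip(*channels)]


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (a simple stand-in for a real resampler)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate          # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def pad_to_min_length(samples, min_len):
    """Zero-pad at the end so the clip has at least min_len samples."""
    return list(samples) + [0.0] * max(0, min_len - len(samples))


def preprocess(channels, src_rate, target_rate=16000, min_seconds=1.0):
    """Mono mix-down, resampling, then padding (hypothetical defaults)."""
    mono = to_mono(channels)
    resampled = resample_linear(mono, src_rate, target_rate)
    return pad_to_min_length(resampled, int(target_rate * min_seconds))
```

A production pipeline would normally use a band-limited resampler from an audio library; linear interpolation is used here only to keep the sketch dependency-free.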

2. Embedding Extraction

We extract fixed-dimensional embeddings using ECAPA-TDNN. These embeddings capture speaker-specific characteristics.
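
As a rough illustration of this step, the sketch below loads the pretrained model named in the feature list directly through SpeechBrain. The function name `extract_embedding` and the `savedir` default are hypothetical, and this is not necessarily how VocalID's embeddings.py is written. It assumes `speechbrain` and `torchaudio` are installed and that the input file is already sampled at 16 kHz.

```python
def extract_embedding(wav_path, savedir="pretrained_ecapa"):
    """Return one ECAPA-TDNN embedding for a 16 kHz audio file.

    Imports are kept inside the function so this module can be imported
    even when speechbrain and torchaudio are not installed.
    """
    import torchaudio
    try:
        from speechbrain.inference.speaker import EncoderClassifier  # SpeechBrain 1.x
    except ImportError:
        from speechbrain.pretrained import EncoderClassifier  # older releases

    encoder = EncoderClassifier.from_hparams(
        source="speechbrain/spkrec-ecapa-voxceleb", savedir=savedir
    )
    signal, sample_rate = torchaudio.load(wav_path)  # shape: [channels, time]
    if sample_rate != 16000:
        raise ValueError("this sketch expects 16 kHz audio; resample first")
    signal = signal.mean(dim=0, keepdim=True)        # mix down to mono: [1, time]
    embedding = encoder.encode_batch(signal)         # shape: [1, 1, emb_dim]
    return embedding.squeeze()                       # 1-D tensor of length emb_dim
```

The first call downloads the model weights, so it needs network access; later calls reuse the files cached in `savedir`.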

3. Training

A logistic regression classifier is trained on positive and negative embeddings.
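
To make the idea concrete, here is a dependency-free sketch of logistic regression trained by batch gradient descent on toy 2-D vectors standing in for embeddings. The data, the learning rate, and the function names are made up for illustration; VocalID's actual training code is not shown here and likely relies on a standard library implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias with batch gradient descent on the log loss."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that vector x belongs to the owner (label 1)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy "embeddings": owner samples (label 1) vs. other speakers (label 0).
positive = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9]]
negative = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
X = positive + negative
y = [1] * len(positive) + [0] * len(negative)

w, b = train_logistic_regression(X, y)
print(predict_proba(w, b, [0.95, 0.85]))  # well above 0.5 for an owner-like vector
print(predict_proba(w, b, [0.05, 0.15]))  # well below 0.5 for an other-speaker vector
```

Real ECAPA-TDNN embeddings have many more dimensions, but the training loop has the same shape.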

4. Verification

When verifying a sample:

Extract the embedding
Run it through the model
Get a probability score
Compare with the threshold from config.py
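
The final comparison can be sketched as follows. The threshold value of 0.5, the use of `>=`, and the helper names are assumptions for illustration; the real threshold lives in config.py. The second helper shows how moving the threshold trades false accepts against false rejects.

```python
THRESHOLD = 0.5  # hypothetical value; VocalID reads its threshold from config.py


def verify(probability, threshold=THRESHOLD):
    """Accept when the owner probability reaches the threshold (assumed >=)."""
    return probability >= threshold, probability


def error_rates(owner_scores, impostor_scores, threshold=THRESHOLD):
    """False-reject rate on owner clips and false-accept rate on impostor clips."""
    frr = sum(s < threshold for s in owner_scores) / len(owner_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return frr, far


owner = [0.93, 0.81, 0.47]      # made-up scores for the enrolled speaker
impostor = [0.12, 0.55, 0.08]   # made-up scores for other speakers

print(verify(0.93))                       # (True, 0.93)
print(error_rates(owner, impostor, 0.5))  # one owner clip rejected, one impostor accepted
print(error_rates(owner, impostor, 0.6))  # stricter: no impostor accepted, one owner still rejected
```

A higher threshold lowers the false-accept rate at the cost of rejecting the owner more often, so the right value depends on how the unlock is used.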

Package Structure

VocalID/
│
├── voice_verifier/
│   ├── trainer.py        # Training, evaluation, saving, loading
│   ├── verifier.py       # Verification from file or tensor
│   ├── embeddings.py     # ECAPA-TDNN embedding extractor
│   ├── audio_utils.py    # Audio loading / microphone recording
│   ├── config.py         # Threshold, sample rate, model config
│   ├── model_store.py    # Pickle storage helpers
│   ├── cli.py            # CLI interface
│
├── tests/                # Pytest suite
├── examples/             # Example scripts
├── requirements.txt
├── api/app.py            # Optional API server example
└── README.md

Installation

For Windows users

pip install vocalid

Or install from source:

git clone https://github.com/Khubaib8281/VocalID.git
cd VocalID
pip install -e .

For Linux users

Also install libportaudio2, which sounddevice relies on:

apt-get install -y libportaudio2

Dataset Layout

Your dataset should be organized as:

dataset/
│
├── my_voice/            # Positive samples (your voice)
│   sample1.wav
│   sample2.wav
│   ...
│
└── other_voices/        # Negative samples (others)
    voice1.wav
    voice2.wav
    ...

Each sample should ideally be 4–6 seconds long. Vary tone, microphone distance, and background conditions across samples.
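
Before training, it can help to check the clips against the recommended length. The sketch below is one way to do that with the standard library only; `check_dataset` and `wav_duration_seconds` are hypothetical helpers, not part of VocalID, and they assume uncompressed PCM WAV files in the two folders shown above.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Duration of a PCM WAV file, computed from its header."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def check_dataset(root, min_s=4.0, max_s=6.0):
    """Return {folder: [(file name, duration), ...]} for clips outside the range."""
    outliers = {}
    for folder in ("my_voice", "other_voices"):
        bad = []
        for wav in sorted(Path(root, folder).glob("*.wav")):
            duration = wav_duration_seconds(wav)
            if not (min_s <= duration <= max_s):
                bad.append((wav.name, round(duration, 2)))
        outliers[folder] = bad
    return outliers
```

Calling `check_dataset("dataset")` returns an empty list for each folder when every clip falls inside the 4–6 second range.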

Example Usage (Python)

Train

from vocalid.trainer import VoiceTrainer
import glob

pos_files = glob.glob("dataset/my_voice/*.wav")
neg_files = glob.glob("dataset/other_voices/*.wav")

trainer = VoiceTrainer()
trainer.train(pos_files, neg_files, save_path="my_voice_model.pkl")

Evaluate

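import glob
from vocalid.trainer import VoiceTrainer

# Load the model saved during training
trainer = VoiceTrainer()
trainer.load("my_voice_model.pkl")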

test_pos = glob.glob("dataset/my_voice_test/*.wav")
test_neg = glob.glob("dataset/other_voices_test/*.wav")

metrics = trainer.evaluate(test_pos, test_neg)
print("Accuracy:", metrics["accuracy"])
print(metrics["report"])
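
For readers who want to see what figures such as accuracy summarize, the sketch below computes accuracy, precision, and recall for a binary owner-vs-others task from plain label lists. It is a generic illustration with a hypothetical `binary_metrics` helper; how `evaluate` produces its report internally is not shown in this README.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for labels where 1 means 'owner'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Three owner clips and three other-speaker clips, with two mistakes.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

For voice unlock, precision reflects how often an accepted speaker really is the owner, while recall reflects how often the owner is let in.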

Verify a file

from vocalid.verifier import VoiceVerifier

verifier = VoiceVerifier("my_voice_model.pkl")
ok, score = verifier.verify_file("verify_samples/unknown.wav")

print(ok, score)

Verify live audio

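# Reuses the verifier created above; audio_tensor is a waveform tensor captured from the microphone
ok, score = verifier.verify_live(audio_tensor)
print(ok, score)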

Live recording only works on systems with a real microphone. It will not run in cloud notebooks.
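
One way to capture microphone audio yourself is with sounddevice, the recording dependency mentioned in the installation notes. The sketch below is an assumption-laden illustration: `record_audio` is a hypothetical helper, the 16 kHz rate and 4-second duration are placeholder defaults, and the exact tensor shape `verify_live` expects is not documented here.

```python
def record_audio(seconds=4.0, sample_rate=16000):
    """Record mono audio from the default microphone as a float32 NumPy array."""
    import sounddevice as sd  # needs PortAudio (libportaudio2 on Linux)

    frames = int(seconds * sample_rate)
    recording = sd.rec(frames, samplerate=sample_rate, channels=1, dtype="float32")
    sd.wait()               # block until the recording is finished
    return recording[:, 0]  # drop the channel axis: shape [frames]
```

The returned array can be converted with `torch.from_numpy(...)` before being passed on, assuming a 1-D waveform tensor is what the verifier accepts.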

CLI Usage

Train:

vocalid train --positive my_voice --negative others --output model.pkl

Evaluate:

vocalid evaluate --model model.pkl --positive my_voice --negative others

Verify a file:

vocalid verify sample.wav --model model.pkl

Live verification:

vocalid live --model model.pkl --seconds 4

Use Cases

Personal voice-unlock for apps and systems
Lightweight speaker verification
Research in speaker embeddings
Prototyping identity checks
Classroom or research demonstrations
Testing spoofing and adversarial audio

Why This Matters

VocalID helps developers learn how practical speaker verification works without dealing with heavy frameworks. The library focuses on transparency, modularity and simplicity:

Clear separation of embedding extraction and classification
Easy to swap in a different classifier
Works on CPU
No special hardware needed for training

Contributing

Pull requests are welcome. To run tests:

pytest -v

Feel free to open issues for bugs, improvement ideas, or feature requests.

Author

Muhammad Khubaib Ahmad
AI/ML Engineer | Data Scientist | Voice Intelligence Researcher

License

MIT License

Release history

0.2.0 (this version): Nov 27, 2025
0.1.9: Nov 27, 2025
0.1.8: Nov 26, 2025
0.1.7: Nov 26, 2025
0.1.6: Nov 26, 2025
0.1.5: Nov 26, 2025

Download files

Source Distribution

vocalid-0.2.0.tar.gz (11.6 kB)

Uploaded Nov 27, 2025 Source

Built Distribution

vocalid-0.2.0-py3-none-any.whl (9.2 kB)

Uploaded Nov 27, 2025 Python 3

File details

Details for the file vocalid-0.2.0.tar.gz.

File metadata

Download URL: vocalid-0.2.0.tar.gz
Upload date: Nov 27, 2025
Size: 11.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

Hashes for vocalid-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`a62e96b747a04c61fa84583288cdab6722f97a8a188bd2776cbaebbf24f9fcd3`
MD5	`1666a32fbef7ed8a701421b02e4cd81b`
BLAKE2b-256	`872b6c0020e56bd936fdbdf8228cddf011ce84548516d790d0ca8d6b7bf1219a`

File details

Details for the file vocalid-0.2.0-py3-none-any.whl.

File metadata

Download URL: vocalid-0.2.0-py3-none-any.whl
Upload date: Nov 27, 2025
Size: 9.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

Hashes for vocalid-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3910db9fc613a2b260437dbe8cec7276df6cf369a799d059923a647b47763a18`
MD5	`bd692b22d515110ef11e161af5b0c994`
BLAKE2b-256	`431509e93b2e622029d33b00c496cb5ec253e30b8f879006b24b290361094dcf`
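
The SHA256 digests listed above can be checked against a downloaded file with the standard library. The helper names below are hypothetical; the sketch simply hashes the file in chunks and compares the result with the published hex string.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 16):
    """Hex SHA256 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True when the file's SHA256 equals the published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call, assuming the source archive sits in the current directory:
# matches_published_hash(
#     "vocalid-0.2.0.tar.gz",
#     "a62e96b747a04c61fa84583288cdab6722f97a8a188bd2776cbaebbf24f9fcd3",
# )
```

A mismatch means the file differs from the one that was uploaded and should not be installed.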
