4-layer anti-hallucination filter for Whisper speech-to-text output
whisper-guard
whisper-guard is a small Python package that removes common Whisper hallucinations before you ship subtitles or transcripts downstream.
Problem
Whisper output can degrade on silence-heavy or low-confidence clips:
- repeated phrases
- phantom subtitles on silence
- short character loops such as 哈哈哈哈 or xyzxyzxyz
This package extracts the anti-hallucination logic from arkiv into a reusable package with a minimal API.
4-Layer Guard
| Layer | What it does | Default |
|---|---|---|
| L1 Silence | Reject all-silence batches | avg no_speech_prob > 0.6 |
| L2 Segment | Filter weak segments | no_speech_prob > 0.8, avg_logprob < -1.5 (short <1.6s: -1.7), compression_ratio > 3.0 |
| L3 Repetition | Reject repetitive text blocks | unique chunk ratio < 0.35 |
| L4 Char loops | Remove looped patterns | 2-4 chars repeated 3+ times |
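The batch-level and text-level layers (L1, L3, L4) can be sketched in plain Python. This is not the package's implementation: the function names are hypothetical, "chunk" is assumed to mean a whitespace-separated word, and L4 is assumed to collapse a loop to one copy rather than delete it. Only the numeric defaults come from the table above.

```python
import re
from typing import Sequence

SILENCE_AVG_MAX = 0.6    # L1 default from the table
UNIQUE_RATIO_MIN = 0.35  # L3 default from the table
# L4: a unit of 2-4 characters followed by at least two more copies of itself.
CHAR_LOOP = re.compile(r"(.{2,4}?)\1{2,}")


def is_silent_batch(no_speech_probs: Sequence[float]) -> bool:
    """L1: reject the batch when the mean no_speech_prob exceeds the limit."""
    if not no_speech_probs:
        return False  # assumption: an empty batch is not treated as silence
    return sum(no_speech_probs) / len(no_speech_probs) > SILENCE_AVG_MAX


def unique_chunk_ratio(text: str) -> float:
    """L3: share of distinct chunks (assumption: chunks are words)."""
    chunks = text.lower().split()
    if not chunks:
        return 1.0
    return len(set(chunks)) / len(chunks)


def is_repetitive(text: str) -> bool:
    """L3: flag text whose unique chunk ratio falls below the limit."""
    return unique_chunk_ratio(text) < UNIQUE_RATIO_MIN


def collapse_char_loops(text: str) -> str:
    """L4: keep one copy of each looped unit (assumption: collapse, not delete)."""
    return CHAR_LOOP.sub(r"\1", text)


print(is_silent_batch([0.9, 0.7, 0.8]))                             # True
print(is_repetitive("thank you thank you thank you thank you"))     # True
print(collapse_char_loops("hello xyzxyzxyz world"))                 # hello xyz world
```

Under this reading of L4, a single repeated character only matches once the run is at least six characters long, and word-based chunking does not suit unspaced text such as Chinese or Japanese. The package may handle both cases differently.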
A/B Test Results
These numbers are from the arkiv guard benchmark set, run in April 2026.
| Config | Repetitions | Reduction | Time |
|---|---|---|---|
| Raw Whisper | 16 | baseline | 47.7s |
| Guard only | 2 | -87.5% | 46.5s |
| LLM polish only | 16 | 0% | 106.1s |
| Guard + LLM | 2 | -87.5% | 119.1s |
| VAD + Guard + LLM | 1 | -93.8% | 128.4s |
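The Reduction column appears to be the relative drop in the repetition count against the Raw Whisper baseline, for example (16 - 2) / 16 = 87.5% and (16 - 1) / 16 = 93.75%, shown as 93.8%. The short script below recomputes that column and adds a time ratio against the baseline. The numbers are copied from the table; the helper name `reduction_pct` is hypothetical.

```python
# (repetition count, seconds) copied from the table above
RESULTS = {
    "Raw Whisper": (16, 47.7),
    "Guard only": (2, 46.5),
    "LLM polish only": (16, 106.1),
    "Guard + LLM": (2, 119.1),
    "VAD + Guard + LLM": (1, 128.4),
}


def reduction_pct(baseline: int, reps: int) -> float:
    """Relative drop in repetitions versus the baseline, in percent."""
    return (baseline - reps) / baseline * 100


baseline_reps, baseline_secs = RESULTS["Raw Whisper"]
for name, (reps, secs) in RESULTS.items():
    print(
        f"{name:<18} reduction {reduction_pct(baseline_reps, reps):5.1f}%  "
        f"time x{secs / baseline_secs:.2f}"
    )
```

Read this way, the guard accounts for all of the repetition reduction while adding no measurable time, and the LLM polish step accounts for most of the added runtime.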
Install
pip install whisper-guard
For local development:
pip install -e .
Quick Start
from faster_whisper import WhisperModel
from whisper_guard import WhisperGuard
model = WhisperModel("small")
segments, info = model.transcribe("sample.wav")
guard = WhisperGuard()
result = guard.process([segment._asdict() for segment in segments])
if result.passed:
    print(result.text)
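Depending on the installed faster-whisper version, segment objects may be named tuples or dataclasses, so `_asdict()` might not always be available. A version-agnostic conversion can be sketched as below. `segment_to_dict` is a hypothetical helper, not part of whisper-guard; it copies only the fields listed in the API section of this README.

```python
from typing import Any, Dict

# Fields that WhisperGuard.process() is documented to read.
FIELDS = (
    "text",
    "no_speech_prob",
    "avg_logprob",
    "compression_ratio",
    "start",
    "end",
)


def segment_to_dict(segment: Any) -> Dict[str, Any]:
    """Copy the guard-relevant fields from a dict or attribute-style segment."""
    if isinstance(segment, dict):
        return {key: segment[key] for key in FIELDS if key in segment}
    return {key: getattr(segment, key) for key in FIELDS if hasattr(segment, key)}
```

With this helper the Quick Start call becomes `guard.process([segment_to_dict(s) for s in segments])`. openai-whisper already returns plain dictionaries under `result["segments"]`, which pass through the dict branch unchanged apart from dropping unused keys.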
API
from whisper_guard import WhisperGuard, GuardConfig, filter_hallucinations
WhisperGuard.process() expects segment dictionaries shaped like:
{
    "text": "hello world",
    "no_speech_prob": 0.12,
    "avg_logprob": -0.44,
    "compression_ratio": 1.2,
    "start": 0.0,  # optional; enables dynamic logprob threshold
    "end": 2.5,    # optional; enables dynamic logprob threshold
}
When start/end are provided, segments shorter than 1.6s use a stricter logprob threshold (-1.7) to catch hallucinations in brief audio gaps. Segments without timing info fall back to the normal threshold.
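The threshold selection described above can be sketched as follows. This is an illustration, not the package's code: the constant and function names are hypothetical, the comparison direction follows the L2 row of the table (`avg_logprob` below the threshold marks a weak segment), and combining the three checks with "any one fails" is an assumption.

```python
from typing import Any, Mapping

NO_SPEECH_MAX = 0.8        # L2 defaults from the table
LOGPROB_MIN = -1.5
LOGPROB_MIN_SHORT = -1.7   # used for segments shorter than SHORT_SECONDS
SHORT_SECONDS = 1.6
COMPRESSION_MAX = 3.0


def logprob_threshold(segment: Mapping[str, Any]) -> float:
    """Pick the logprob threshold; fall back to the normal one without timing."""
    start, end = segment.get("start"), segment.get("end")
    if start is None or end is None:
        return LOGPROB_MIN
    return LOGPROB_MIN_SHORT if (end - start) < SHORT_SECONDS else LOGPROB_MIN


def is_weak(segment: Mapping[str, Any]) -> bool:
    """Assumption: a segment is weak when any single check trips."""
    return (
        segment["no_speech_prob"] > NO_SPEECH_MAX
        or segment["avg_logprob"] < logprob_threshold(segment)
        or segment["compression_ratio"] > COMPRESSION_MAX
    )


example = {
    "text": "hello world",
    "no_speech_prob": 0.12,
    "avg_logprob": -0.44,
    "compression_ratio": 1.2,
    "start": 0.0,
    "end": 2.5,
}
print(logprob_threshold(example))  # -1.5
print(is_weak(example))            # False
```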
Compatible With
- faster-whisper
- openai-whisper
- mlx-whisper
Optional Vocab Helpers
from whisper_guard.vocab import build_hotwords_prompt, filter_filler_words
Part Of
Built for the arkiv transcription pipeline and split out as a standalone package for reuse.
License
MIT