Robust speech recognition pipeline that prevents audio drops

Hearken

Robust speech recognition pipeline for Python that prevents audio drops during transcription.

The Problem

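In a typical speech recognition loop, audio capture and transcription share one thread, so capture is blocked while a segment is being transcribed. When transcription waits on slow network I/O, the frames that arrive in the meantime are dropped and speech is missed.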

The Solution

Hearken decouples capture, voice activity detection (VAD), and transcription into independent threads with queue-based communication. The capture thread never blocks, preventing audio loss even during slow transcription.
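
The decoupling described above can be illustrated with a minimal standard-library sketch. This is not Hearken's implementation: the function names, the integer "chunks", and the fixed-size grouping that stands in for VAD are all assumptions made for illustration. It only shows why a producer that writes to a queue is unaffected by a slow consumer.

```python
import queue
import threading
import time


def capture(chunks_out, n_chunks):
    """Producer: emits fake audio chunks and never waits on later stages."""
    for i in range(n_chunks):
        chunks_out.put(i)      # unbounded queue, so put() returns immediately
    chunks_out.put(None)       # sentinel: no more audio


def detect(chunks_in, segments_out, segment_len=5):
    """Groups chunks into fixed-size segments (a stand-in for real VAD)."""
    segment = []
    while (chunk := chunks_in.get()) is not None:
        segment.append(chunk)
        if len(segment) == segment_len:
            segments_out.put(segment)
            segment = []
    if segment:
        segments_out.put(segment)   # flush the final partial segment
    segments_out.put(None)


def transcribe(segments_in, results):
    """Slow consumer: the sleep stands in for a network round trip."""
    while (segment := segments_in.get()) is not None:
        time.sleep(0.01)
        results.append(f"segment {segment[0]}-{segment[-1]}")


def run_pipeline(n_chunks=20):
    chunks, segments, results = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=capture, args=(chunks, n_chunks)),
        threading.Thread(target=detect, args=(chunks, segments)),
        threading.Thread(target=transcribe, args=(segments, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline())
```

The capture function finishes as fast as it can enqueue, regardless of how long each transcription takes; the queues absorb the difference in speed.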

Installation

# Basic installation (includes EnergyVAD)
pip install hearken

# With speech_recognition support
# (quoting the extras keeps shells such as zsh from expanding the brackets)
pip install "hearken[sr]"

# With WebRTC VAD support
pip install "hearken[webrtc]"

# With Silero VAD support (neural network)
pip install "hearken[silero]"

# All optional dependencies
pip install "hearken[all]"

Quick Start

import speech_recognition as sr
from hearken import Listener, EnergyVAD
from hearken.adapters.sr import SpeechRecognitionSource, SRTranscriber

# Setup
recognizer = sr.Recognizer()
mic = sr.Microphone()

with mic as source:
    recognizer.adjust_for_ambient_noise(source)

# Create listener
listener = Listener(
    source=SpeechRecognitionSource(mic),
    transcriber=SRTranscriber(recognizer),
    on_transcript=lambda text, seg: print(f"You said: {text}")
)

# Run
listener.start()
try:
    listener.wait()
except KeyboardInterrupt:
    listener.stop()

Features

- No dropped frames: Capture thread never blocks on downstream processing
- Two modes: Passive (callbacks) and active (wait_for_speech())
- Clean abstractions: Bring your own audio source and transcriber
- Production-ready FSM: Robust 4-state finite-state machine filters false starts and handles pauses
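
To make the idea of a 4-state detector concrete, here is a hedged sketch. The state names (IDLE, ONSET, SPEAKING, PAUSE), the class name, and both thresholds are hypothetical and chosen for illustration; the source does not name Hearken's states or parameters. The sketch assumes both thresholds are at least 2.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # no speech
    ONSET = auto()     # possible speech, not yet confirmed
    SPEAKING = auto()  # confirmed speech
    PAUSE = auto()     # silence inside speech; may resume or end


class SpeechFSM:
    """Illustrative detector; thresholds are counted in chunks."""

    def __init__(self, min_speech_chunks=3, max_pause_chunks=5):
        self.min_speech = min_speech_chunks
        self.max_pause = max_pause_chunks
        self.state = State.IDLE
        self.count = 0  # chunks spent in the current ONSET or PAUSE run

    def feed(self, is_speech: bool) -> bool:
        """Advance by one chunk; return True when a segment just ended."""
        if self.state is State.IDLE:
            if is_speech:
                self.state, self.count = State.ONSET, 1
        elif self.state is State.ONSET:
            if not is_speech:
                self.state = State.IDLE          # false start: discard
            else:
                self.count += 1
                if self.count >= self.min_speech:
                    self.state = State.SPEAKING
        elif self.state is State.SPEAKING:
            if not is_speech:
                self.state, self.count = State.PAUSE, 1
        elif self.state is State.PAUSE:
            if is_speech:
                self.state = State.SPEAKING      # short pause: same segment
            else:
                self.count += 1
                if self.count >= self.max_pause:
                    self.state = State.IDLE      # long silence: segment done
                    return True
        return False
```

A single noisy chunk only reaches ONSET and is dropped, while a brief silence in the middle of an utterance stays inside the same segment.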

Voice Activity Detection (VAD)

- EnergyVAD: Simple energy-based detection with dynamic threshold calibration
- WebRTCVAD: Google WebRTC VAD for improved accuracy in noisy environments
  - Supported sample rates: 8000, 16000, 32000, or 48000 Hz
  - Configurable aggressiveness (0-3)
  - Install with: pip install "hearken[webrtc]"
- SileroVAD: Neural network-based VAD for higher accuracy
  - Requires 16 kHz audio
  - Configurable sensitivity threshold
  - Automatic model download and caching
  - Install with: pip install "hearken[silero]"
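
As a rough illustration of energy-based detection with a dynamically calibrated threshold, the sketch below computes the RMS amplitude of a chunk and lets the threshold drift toward the ambient level while no speech is detected. The class name, parameters, and default values are assumptions for this example and do not describe Hearken's EnergyVAD; input is assumed to be 16-bit little-endian mono PCM.

```python
import math
import struct


def rms(pcm: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian mono PCM."""
    n = len(pcm) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


class SimpleEnergyVAD:
    """Hypothetical energy detector with an adaptive threshold."""

    def __init__(self, threshold=300.0, ratio=1.5, damping=0.15, floor=50.0):
        self.threshold = threshold  # current speech/non-speech boundary
        self.ratio = ratio          # threshold sits this far above ambient
        self.damping = damping      # how quickly the threshold adapts
        self.floor = floor          # threshold never drops below this

    def is_speech(self, pcm: bytes) -> bool:
        energy = rms(pcm)
        speech = energy > self.threshold
        if not speech:
            # While quiet, move the threshold toward ratio * ambient energy.
            target = energy * self.ratio
            adjusted = self.threshold + self.damping * (target - self.threshold)
            self.threshold = max(self.floor, adjusted)
        return speech
```

Adapting only on non-speech chunks keeps a long utterance from raising the threshold above the speaker's own level.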

Architecture

Microphone → [Capture Thread] → Queue → [Detect Thread] → Queue → [Transcribe Thread] → Callback
                   ↓                          ↓                         ↓
            AudioChunk (30ms)      SpeechSegment (complete)    Text transcription
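
For a sense of scale, the size of a 30 ms chunk follows directly from the sample rate. The helper below is illustrative only and assumes 16-bit mono PCM, which the source does not state.

```python
def chunk_size(sample_rate_hz, chunk_ms=30, sample_width_bytes=2, channels=1):
    """Return (samples, bytes) for one chunk of the given duration."""
    samples = sample_rate_hz * chunk_ms // 1000
    return samples, samples * sample_width_bytes * channels


if __name__ == "__main__":
    for rate in (8000, 16000, 32000, 48000):
        samples, nbytes = chunk_size(rate)
        print(f"{rate} Hz: {samples} samples, {nbytes} bytes per 30 ms chunk")
```

At 16 kHz this works out to 480 samples (960 bytes) per chunk, so the capture thread hands off roughly 33 chunks per second.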

Active Mode

# Reuses sr, mic, and recognizer from the Quick Start example
listener = Listener(
    source=SpeechRecognitionSource(mic),
    transcriber=SRTranscriber(recognizer),
)

listener.start()

try:
    while True:
        print("Waiting for speech...")
        segment = listener.wait_for_speech()

        if segment:
            try:
                text = listener.transcriber.transcribe(segment)
                print(f"You said: {text}")
            except sr.UnknownValueError:
                print("Could not understand")
except KeyboardInterrupt:
    # Stop the pipeline threads on Ctrl+C
    listener.stop()
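
The blocking call in active mode can be pictured as a read from the queue that the detection stage fills. The sketch below is an assumption about the general pattern, not Hearken's code: the SegmentMailbox class, its publish method, and the timeout parameter are all hypothetical.

```python
import queue


class SegmentMailbox:
    """Hypothetical hand-off point between a detector thread and the caller."""

    def __init__(self):
        self._segments = queue.Queue()

    def publish(self, segment):
        """Called by the detector when a complete segment is ready."""
        self._segments.put(segment)

    def wait_for_speech(self, timeout=None):
        """Block until a segment arrives; return None if the timeout expires."""
        try:
            return self._segments.get(timeout=timeout)
        except queue.Empty:
            return None
```

Returning None on timeout is one way a caller ends up needing a check such as `if segment:` before transcribing.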

Documentation

See examples/ for more usage patterns.

Development

# Clone repository
git clone https://github.com/hipsterbrown/hearken.git
cd hearken

# Install with dev dependencies
uv sync --all-extras

# Run tests
pytest

# Run tests with coverage
pytest --cov=hearken --cov-report=term-missing

# Format code
black hearken/ tests/

# Type checking
mypy hearken/

# Linting
ruff check hearken/ tests/

Roadmap

✅ v0.1: EnergyVAD, core pipeline
✅ v0.2: WebRTC VAD support
✅ v0.3: Silero VAD (neural network)
v0.4: Async transcriber support

License

Apache 2.0

Contributing

Contributions welcome! Please open an issue or PR on GitHub.

Credits

Created by Nick Hehr (@hipsterbrown)

Release history

- 0.4.0: Dec 16, 2025
- 0.3.0: Dec 5, 2025 (this version)

Download files

Source Distribution

hearken-0.3.0.tar.gz (137.2 kB)

Uploaded Dec 5, 2025 Source

Built Distribution

hearken-0.3.0-py3-none-any.whl (21.4 kB)

Uploaded Dec 5, 2025 Python 3

File details

Details for the file hearken-0.3.0.tar.gz.

File metadata

Download URL: hearken-0.3.0.tar.gz
Upload date: Dec 5, 2025
Size: 137.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hearken-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`8c8aae87363c7dc25d85b3edb572481c52519ae764dd57b17b89c2566186657f`
MD5	`5013522ba810c094ecbbeb7840ab2409`
BLAKE2b-256	`f91dce12e17b4ccf7d92eb9f9cfd7bd58213b695d558a104f703645d9d211cac`
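
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the function names are illustrative, and it assumes the file has been saved locally under the path you pass in.

```python
import hashlib

# SHA256 digest published for hearken-0.3.0.tar.gz
EXPECTED_SHA256 = "8c8aae87363c7dc25d85b3edb572481c52519ae764dd57b17b89c2566186657f"


def sha256_of(path, block_size=65536):
    """Hash a file in blocks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("hearken-0.3.0.tar.gz")` returns True only if the local file matches the published digest.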

Provenance

The following attestation bundles were made for hearken-0.3.0.tar.gz:

Publisher: publish.yml on HipsterBrown/hearken

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hearken-0.3.0.tar.gz
- Subject digest: 8c8aae87363c7dc25d85b3edb572481c52519ae764dd57b17b89c2566186657f
- Sigstore transparency entry: 743469630
- Sigstore integration time: Dec 5, 2025
Source repository:
- Permalink: HipsterBrown/hearken@b26afa97e69e9ab277ec68c4f0babd3346e50656
- Branch / Tag: refs/heads/main
- Owner: https://github.com/HipsterBrown
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b26afa97e69e9ab277ec68c4f0babd3346e50656
- Trigger Event: workflow_dispatch

File details

Details for the file hearken-0.3.0-py3-none-any.whl.

File metadata

Download URL: hearken-0.3.0-py3-none-any.whl
Upload date: Dec 5, 2025
Size: 21.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hearken-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7c40711182a4c211d4fba3bee269e3e2d9c691aaa14e3eef6c417e1a66d3606e`
MD5	`5f988706d829e5c36995d18eeddd9eab`
BLAKE2b-256	`5b5f897933b609cd992df7e87a5cd4791bd54f4295846423d036230f7006157e`

Provenance

The following attestation bundles were made for hearken-0.3.0-py3-none-any.whl:

Publisher: publish.yml on HipsterBrown/hearken

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hearken-0.3.0-py3-none-any.whl
- Subject digest: 7c40711182a4c211d4fba3bee269e3e2d9c691aaa14e3eef6c417e1a66d3606e
- Sigstore transparency entry: 743469634
- Sigstore integration time: Dec 5, 2025
Source repository:
- Permalink: HipsterBrown/hearken@b26afa97e69e9ab277ec68c4f0babd3346e50656
- Branch / Tag: refs/heads/main
- Owner: https://github.com/HipsterBrown
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b26afa97e69e9ab277ec68c4f0babd3346e50656
- Trigger Event: workflow_dispatch
