Rust speech gate with Python bindings

speechgate-rs

Rust implementation of the FASR energy speech gate with Python bindings.

Install

Install the released package from PyPI:

pip install speechgate-rs

Install the latest version directly from GitHub:

pip install "git+https://github.com/di-osc/speechgate-rs.git"

For local development, install an editable release build with maturin:

uv sync
env -u CONDA_PREFIX VIRTUAL_ENV=.venv maturin develop --release

Usage

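import numpy as np
from speechgate_rs import EnergySpeechGate

# One second of silent float32 audio at 16 kHz.
audio = np.zeros(16000, dtype=np.float32)
gate = EnergySpeechGate(base_thresh=0.008, max_thresh=0.035)
# Mask marking which audio the gate keeps.
mask = gate.compute_keep_mask(audio, sample_rate=16000)
# Audio with rejected regions attenuated (fully muted at the default attenuation=0.0).
gated = gate.apply_array(audio, sample_rate=16000)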

For streaming audio, keep one stateful gate per stream:

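import numpy as np
from speechgate_rs import StreamingEnergySpeechGate

# Placeholder input: one second of silence split into ten 100 ms chunks.
chunks = np.split(np.zeros(16000, dtype=np.float32), 10)

stream_gate = StreamingEnergySpeechGate(stream_context_ms=3000, fade_ms=5)
for chunk in chunks:
    gated_chunk = stream_gate.process_chunk(
        chunk,
        sample_rate=16000,
        is_last=False,  # pass True for the final chunk of the stream
    )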
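
When the input is a finite buffer rather than a live feed, the final chunk has to be flagged with is_last=True. The sketch below shows one way to produce (chunk, is_last) pairs. The helper name iter_chunks is hypothetical and is not part of speechgate_rs; it only relies on slicing and len, so it should work for lists and NumPy arrays alike.

```python
from typing import Iterator, Sequence, Tuple


def iter_chunks(
    samples: Sequence[float], chunk_size: int
) -> Iterator[Tuple[Sequence[float], bool]]:
    """Yield (chunk, is_last) pairs. Hypothetical helper, not a speechgate_rs API."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(samples)
    for start in range(0, total, chunk_size):
        end = min(start + chunk_size, total)
        # The chunk that reaches the end of the buffer is the last one.
        yield samples[start:end], end == total


# One second at 16 kHz split into 100 ms chunks of 1600 samples each.
pairs = list(iter_chunks([0.0] * 16000, 1600))
flags = [is_last for _, is_last in pairs]
```

Each pair can then be passed on as stream_gate.process_chunk(chunk, sample_rate=16000, is_last=is_last).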

For realtime services that process interleaved streams, use the multi-stream wrapper:

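import numpy as np
from speechgate_rs import MultiStreamEnergySpeechGate

chunk = np.zeros(1600, dtype=np.float32)  # placeholder 100 ms chunk at 16 kHz

gate = MultiStreamEnergySpeechGate()
gated_chunk = gate.process_chunk(
    "session-1",  # stream identifier
    chunk,
    sample_rate=16000,
    is_last=False,
)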

The binding keeps the same energy-gate semantics as the Python EnergySpeechGate implementation in fasr-service-realtime: adaptive RMS thresholding, short voice-burst removal, short silence-gap filling, padding, silence pass windows, streaming context, cross-chunk fade continuity, and fade envelopes.
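
As a rough illustration of the adaptive RMS thresholding mentioned above, here is a minimal pure-Python sketch. It is not the library's implementation: the helper names, the use of a single threshold over the whole signal, the linearly interpolated percentile, and the strict ">" comparison are all assumptions made for the example. Only the parameter names and defaults come from this README.

```python
import math
from typing import List, Sequence


def window_rms(samples: Sequence[float], window_size: int) -> List[float]:
    """RMS energy of each consecutive, non-overlapping window."""
    out = []
    for start in range(0, len(samples), window_size):
        window = samples[start:start + window_size]
        out.append(math.sqrt(sum(x * x for x in window) / len(window)))
    return out


def smooth(values: Sequence[float], alpha: float) -> List[float]:
    """Exponential smoothing: y[i] = alpha * x[i] + (1 - alpha) * y[i - 1]."""
    out: List[float] = []
    for x in values:
        prev = out[-1] if out else x
        out.append(alpha * x + (1.0 - alpha) * prev)
    return out


def percentile(values: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def adaptive_threshold(
    smoothed: Sequence[float],
    base_thresh: float = 0.008,
    max_thresh: float = 0.035,
    threshold_ratio: float = 2.0,
    noise_floor_percentile: float = 20.0,
) -> float:
    """Scale the noise-floor estimate, then clamp to [base_thresh, max_thresh]."""
    noise_floor = percentile(smoothed, noise_floor_percentile)
    return min(max(noise_floor * threshold_ratio, base_thresh), max_thresh)


# 0.5 s of quiet background followed by 0.5 s of louder signal at 16 kHz.
sample_rate, window_ms = 16000, 10
window_size = sample_rate * window_ms // 1000  # 160 samples per window
audio = [0.001] * 8000 + [0.1] * 8000

smoothed = smooth(window_rms(audio, window_size), alpha=0.2)
threshold = adaptive_threshold(smoothed)
keep = [value > threshold for value in smoothed]
```

In this example the scaled noise floor (about 0.002) falls below base_thresh, so the threshold is clamped up to 0.008 and only the louder half of the signal is kept.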

Parameters

| Parameter | Default | Suggested range | Meaning | Increase / decrease effect |
| --- | --- | --- | --- | --- |
| enabled | True | True or False | Enables the gate. When False, apply_array returns the input audio unchanged. | Turn on to filter silence/noise; turn off to bypass the gate completely. |
| window_ms | 10 | 5-30 ms | Analysis window length. RMS energy is computed once per window. | Larger is steadier but slower to react; smaller reacts faster but is more sensitive to clicks and short spikes. |
| base_thresh | 0.008 | 0.001-0.03 RMS | Minimum RMS threshold. The adaptive threshold will never go below this value. | Larger rejects more quiet speech/noise; smaller keeps softer speech but may pass more background noise. |
| threshold_ratio | 2.0 | 1.0-5.0 | Multiplier applied to the estimated noise floor before clamping. | Larger makes the gate stricter in noisy audio; smaller opens the gate more easily. |
| max_thresh | 0.035 | 0.01-0.1 RMS | Maximum RMS threshold. The adaptive threshold will never go above this value. | Larger allows the adaptive threshold to become stricter in loud noise; smaller protects quieter speech from being rejected. |
| smooth_alpha | 0.2 | 0.01-1.0 | Exponential smoothing factor for per-window RMS values. | Larger follows energy changes faster but may flicker; smaller is steadier but can lag at speech boundaries. |
| min_voice_windows | 5 | 1-20 windows | Minimum consecutive voice windows required to keep a speech region. | Larger removes more short bursts but can drop very short words; smaller keeps brief sounds but may pass clicks. |
| attenuation | 0.0 | 0.0-1.0 gain | Gain applied to rejected audio. 0.0 fully mutes it. | Larger keeps more background ambience; smaller makes rejected regions quieter. |
| noise_floor_percentile | 20.0 | 1.0-50.0 | Percentile of smoothed RMS values used as the adaptive noise floor estimate. | Larger estimates a higher noise floor and becomes stricter; smaller estimates quieter background and opens more easily. |
| max_silence_gap_windows | 8 | 0-30 windows | Maximum silent gap to fill between two voice regions. | Larger preserves pauses inside speech but may keep noise between phrases; smaller cuts internal pauses more aggressively. |
| fade_ms | 5 | 0-50 ms | Fade length when switching between kept and rejected audio. | Larger makes transitions smoother but may smear boundaries; smaller is tighter but can click or sound abrupt. |
| stream_context_ms | 3000 | 0-10000 ms | Context duration used by streaming integrations to preserve recent audio history. Stateless array APIs keep it for config parity. | Larger gives streaming code more history but uses more memory; smaller is lighter but has less context. |
| pad_voice_windows | 2 | 0-20 windows | Windows added before and after detected voice regions. | Larger protects speech starts/ends but keeps more surrounding noise; smaller trims tighter but can clip onsets or offsets. |
| pass_windows | 0 | 0-20 windows | Non-voice windows kept after a voice region as a trailing hold. | Larger makes streaming output less abrupt; smaller removes trailing silence sooner. |
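
To make the window-count parameters more concrete, the following sketch applies short-burst removal, gap filling, and padding to a per-window boolean mask. It is an illustration only: the function names are hypothetical, the order of the three steps is an assumption, and the actual binding may treat edge cases differently. With window_ms=10, each window below stands for 10 ms of audio.

```python
from typing import List, Tuple


def runs(mask: List[bool]) -> List[Tuple[bool, int, int]]:
    """Run-length encode a mask as (value, start, end_exclusive) tuples."""
    out = []
    start = 0
    for i in range(1, len(mask) + 1):
        if i == len(mask) or mask[i] != mask[start]:
            out.append((mask[start], start, i))
            start = i
    return out


def drop_short_voice(mask: List[bool], min_voice_windows: int) -> List[bool]:
    """Reject voice runs shorter than min_voice_windows."""
    out = list(mask)
    for value, start, end in runs(mask):
        if value and end - start < min_voice_windows:
            out[start:end] = [False] * (end - start)
    return out


def fill_short_gaps(mask: List[bool], max_silence_gap_windows: int) -> List[bool]:
    """Keep silent runs that sit between two voice runs and are short enough."""
    out = list(mask)
    for value, start, end in runs(mask):
        interior = start > 0 and end < len(mask)
        if not value and interior and end - start <= max_silence_gap_windows:
            out[start:end] = [True] * (end - start)
    return out


def pad_voice(mask: List[bool], pad_voice_windows: int) -> List[bool]:
    """Extend every voice run by pad_voice_windows on both sides."""
    out = list(mask)
    for value, start, end in runs(mask):
        if value:
            lo = max(0, start - pad_voice_windows)
            hi = min(len(mask), end + pad_voice_windows)
            out[lo:hi] = [True] * (hi - lo)
    return out


# A 2-window click, then two speech regions separated by a 4-window pause.
raw = ([False] * 3 + [True] * 2 + [False] * 10 + [True] * 6
       + [False] * 4 + [True] * 7 + [False] * 12)

cleaned = drop_short_voice(raw, min_voice_windows=5)
cleaned = fill_short_gaps(cleaned, max_silence_gap_windows=8)
cleaned = pad_voice(cleaned, pad_voice_windows=2)
kept_ms = sum(cleaned) * 10  # window_ms = 10
```

With the default values, the 20 ms click is discarded, the 40 ms pause is bridged, and the merged speech region grows by 20 ms on each side.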

Verify

For performance-sensitive checks, build the native extension in release mode before running tests:

env -u CONDA_PREFIX VIRTUAL_ENV=.venv maturin develop --release
uv run pytest tests -q

The test suite includes a NumPy reference implementation. It verifies that the Rust binding returns identical masks, gated output, and compacted output, and that it runs faster than the NumPy reference on the benchmark audio.

Download files

| File | Python | Platform | Size |
| --- | --- | --- | --- |
| speechgate_rs-0.1.2.tar.gz | Source | Source | 34.5 kB |
| speechgate_rs-0.1.2-cp312-cp312-win_amd64.whl | CPython 3.12 | Windows x86-64 | 184.8 kB |
| speechgate_rs-0.1.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | CPython 3.12 | manylinux: glibc 2.17+ x86-64 | 334.4 kB |
| speechgate_rs-0.1.2-cp312-cp312-macosx_11_0_arm64.whl | CPython 3.12 | macOS 11.0+ ARM64 | 292.7 kB |
| speechgate_rs-0.1.2-cp311-cp311-win_amd64.whl | CPython 3.11 | Windows x86-64 | 186.2 kB |
| speechgate_rs-0.1.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | CPython 3.11 | manylinux: glibc 2.17+ x86-64 | 337.3 kB |
| speechgate_rs-0.1.2-cp311-cp311-macosx_11_0_arm64.whl | CPython 3.11 | macOS 11.0+ ARM64 | 294.7 kB |
| speechgate_rs-0.1.2-cp310-cp310-win_amd64.whl | CPython 3.10 | Windows x86-64 | 186.0 kB |
| speechgate_rs-0.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | CPython 3.10 | manylinux: glibc 2.17+ x86-64 | 337.4 kB |
| speechgate_rs-0.1.2-cp310-cp310-macosx_11_0_arm64.whl | CPython 3.10 | macOS 11.0+ ARM64 | 294.8 kB |
