Rust speech gate with Python bindings

speechgate-rs

Rust implementation of the FASR energy speech gate with Python bindings.

Install

Install the package from PyPI after a release is published:

pip install speechgate-rs

Install the latest version directly from GitHub:

pip install "git+https://github.com/di-osc/speechgate-rs.git"

For local development, install an editable release build with maturin:

uv sync
env -u CONDA_PREFIX VIRTUAL_ENV=.venv maturin develop --release

Usage

import numpy as np
from speechgate_rs import EnergySpeechGate

audio = np.zeros(16000, dtype=np.float32)
gate = EnergySpeechGate(base_thresh=0.008, max_thresh=0.035)
mask = gate.compute_keep_mask(audio, sample_rate=16000)
gated = gate.apply_array(audio, sample_rate=16000)

For streaming audio, keep one stateful gate per stream:

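import numpy as np
from speechgate_rs import StreamingEnergySpeechGate

# Placeholder input: one second of silence split into ten 100 ms chunks.
chunks = np.split(np.zeros(16000, dtype=np.float32), 10)

stream_gate = StreamingEnergySpeechGate(stream_context_ms=3000, fade_ms=5)
for i, chunk in enumerate(chunks):
    gated_chunk = stream_gate.process_chunk(
        chunk,
        sample_rate=16000,
        is_last=(i == len(chunks) - 1),  # mark the final chunk of the stream
    )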

For realtime services that process interleaved streams, use the multi-stream wrapper:

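import numpy as np
from speechgate_rs import MultiStreamEnergySpeechGate

# Placeholder input: a 100 ms chunk of silence for the stream "session-1".
chunk = np.zeros(1600, dtype=np.float32)

gate = MultiStreamEnergySpeechGate()
gated_chunk = gate.process_chunk(
    "session-1",  # stream identifier; each stream keeps its own state
    chunk,
    sample_rate=16000,
    is_last=False,
)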

The binding keeps the same energy-gate semantics as the Python EnergySpeechGate implementation in fasr-service-realtime: adaptive RMS thresholding, short voice-burst removal, short silence-gap filling, padding, silence pass windows, streaming context, cross-chunk fade continuity, and fade envelopes.
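
As a rough illustration of the adaptive RMS thresholding mentioned above, the following pure-Python sketch computes per-window RMS, smooths it, estimates a noise floor from a percentile, and clamps the scaled floor between base_thresh and max_thresh. It is not the library's implementation: the helper names (window_rms, smooth, percentile, adaptive_threshold), the exact smoothing formula, and the percentile interpolation are assumptions made for this example. Only the parameter names and defaults are taken from this README.

```python
import math


def window_rms(samples, sample_rate, window_ms=10):
    """Return one RMS value per non-overlapping analysis window."""
    win = max(1, int(sample_rate * window_ms / 1000))
    rms = []
    for start in range(0, len(samples), win):
        frame = samples[start:start + win]
        rms.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return rms


def smooth(values, alpha=0.2):
    """Exponential smoothing (assumed form): y[i] = alpha*x[i] + (1-alpha)*y[i-1]."""
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def percentile(values, pct):
    """Linearly interpolated percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def adaptive_threshold(smoothed, base_thresh=0.008, max_thresh=0.035,
                       threshold_ratio=2.0, noise_floor_percentile=20.0):
    """Scale the estimated noise floor, then clamp to [base_thresh, max_thresh]."""
    floor = percentile(smoothed, noise_floor_percentile)
    return min(max_thresh, max(base_thresh, threshold_ratio * floor))


# Synthetic input: 0.5 s of very quiet signal followed by 0.5 s of a 220 Hz tone.
sample_rate = 16000
quiet = [0.001] * 8000
tone = [0.1 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(8000)]
smoothed = smooth(window_rms(quiet + tone, sample_rate))
thresh = adaptive_threshold(smoothed)
voiced = [value >= thresh for value in smoothed]
```

With this input the scaled noise floor is below base_thresh, so the threshold is clamped up to base_thresh. A uniformly loud input would instead be clamped down to max_thresh.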

Parameters

| Parameter | Default | Suggested range | Meaning | Increase / decrease effect |
|---|---|---|---|---|
| enabled | True | True or False | Enables the gate. When False, apply_array returns the input audio unchanged. | Turn on to filter silence/noise; turn off to bypass the gate completely. |
| window_ms | 10 | 5-30 ms | Analysis window length. RMS energy is computed once per window. | Larger is steadier but slower to react; smaller reacts faster but is more sensitive to clicks and short spikes. |
| base_thresh | 0.008 | 0.001-0.03 RMS | Minimum RMS threshold. The adaptive threshold will never go below this value. | Larger rejects more quiet speech/noise; smaller keeps softer speech but may pass more background noise. |
| threshold_ratio | 2.0 | 1.0-5.0 | Multiplier applied to the estimated noise floor before clamping. | Larger makes the gate stricter in noisy audio; smaller opens the gate more easily. |
| max_thresh | 0.035 | 0.01-0.1 RMS | Maximum RMS threshold. The adaptive threshold will never go above this value. | Larger allows the adaptive threshold to become stricter in loud noise; smaller protects quieter speech from being rejected. |
| smooth_alpha | 0.2 | 0.01-1.0 | Exponential smoothing factor for per-window RMS values. | Larger follows energy changes faster but may flicker; smaller is steadier but can lag at speech boundaries. |
| min_voice_windows | 5 | 1-20 windows | Minimum consecutive voice windows required to keep a speech region. | Larger removes more short bursts but can drop very short words; smaller keeps brief sounds but may pass clicks. |
| attenuation | 0.0 | 0.0-1.0 gain | Gain applied to rejected audio. 0.0 fully mutes it. | Larger keeps more background ambience; smaller makes rejected regions quieter. |
| noise_floor_percentile | 20.0 | 1.0-50.0 | Percentile of smoothed RMS values used as the adaptive noise floor estimate. | Larger estimates a higher noise floor and becomes stricter; smaller estimates quieter background and opens more easily. |
| max_silence_gap_windows | 8 | 0-30 windows | Maximum silent gap to fill between two voice regions. | Larger preserves pauses inside speech but may keep noise between phrases; smaller cuts internal pauses more aggressively. |
| fade_ms | 5 | 0-50 ms | Fade length when switching between kept and rejected audio. | Larger makes transitions smoother but may smear boundaries; smaller is tighter but can click or sound abrupt. |
| stream_context_ms | 3000 | 0-10000 ms | Context duration used by streaming integrations to preserve recent audio history. Stateless array APIs keep it for config parity. | Larger gives streaming code more history but uses more memory; smaller is lighter but has less context. |
| pad_voice_windows | 2 | 0-20 windows | Windows added before and after detected voice regions. | Larger protects speech starts/ends but keeps more surrounding noise; smaller trims tighter but can clip onsets or offsets. |
| pass_windows | 0 | 0-20 windows | Non-voice windows kept after a voice region as a trailing hold. | Larger makes streaming output less abrupt; smaller removes trailing silence sooner. |
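
To make the window-count parameters more concrete, here is a small pure-Python sketch of how min_voice_windows, max_silence_gap_windows, and pad_voice_windows could reshape a per-window voice mask. The helper names (runs, drop_short_voice, fill_short_gaps, pad_voice) are hypothetical, and the order of the three steps is an assumption. The Rust binding may order or implement them differently.

```python
def runs(mask):
    """Yield (value, start, end) for each run of equal values; end is exclusive."""
    start = 0
    for i in range(1, len(mask) + 1):
        if i == len(mask) or mask[i] != mask[start]:
            yield mask[start], start, i
            start = i


def drop_short_voice(mask, min_voice_windows=5):
    """Reject voice runs shorter than min_voice_windows."""
    out = list(mask)
    for value, start, end in runs(mask):
        if value and end - start < min_voice_windows:
            out[start:end] = [False] * (end - start)
    return out


def fill_short_gaps(mask, max_silence_gap_windows=8):
    """Keep silent runs of limited length that sit between two voice runs."""
    out = list(mask)
    for value, start, end in runs(mask):
        interior = start > 0 and end < len(mask)
        if not value and interior and end - start <= max_silence_gap_windows:
            out[start:end] = [True] * (end - start)
    return out


def pad_voice(mask, pad_voice_windows=2):
    """Extend every voice run by pad_voice_windows on both sides."""
    out = list(mask)
    for value, start, end in runs(mask):
        if value:
            lo = max(0, start - pad_voice_windows)
            hi = min(len(mask), end + pad_voice_windows)
            out[lo:hi] = [True] * (hi - lo)
    return out


# A 2-window click, a long silence, then two speech runs split by a 3-window pause.
raw = ([False] * 3 + [True] * 2 + [False] * 10 + [True] * 6
       + [False] * 3 + [True] * 7 + [False] * 5)

# Assumed order: burst removal, then gap filling, then padding.
cleaned = pad_voice(fill_short_gaps(drop_short_voice(raw)))
```

In this example the 2-window click is removed, the 3-window pause is kept as part of the speech region, and the merged region grows by two windows on each side.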

Verify

For performance-sensitive checks, build the native extension in release mode before running tests:

env -u CONDA_PREFIX VIRTUAL_ENV=.venv maturin develop --release
uv run pytest tests -q

The test suite includes a NumPy reference implementation. It verifies that the Rust binding returns masks, gated output, and compacted output identical to the reference, and that it runs faster than the NumPy reference on the benchmark audio.
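
The general shape of such a parity-and-speed check can be sketched as follows. This is not the project's test code: compare_implementations, reference_mask, and candidate_mask are hypothetical stand-ins written in pure Python so the example runs without the native extension.

```python
import time


def compare_implementations(reference, candidate, audio, repeats=5):
    """Return (outputs_match, reference_seconds, candidate_seconds)."""
    outputs_match = list(reference(audio)) == list(candidate(audio))

    def best_time(fn):
        # Take the fastest of several runs to reduce timing noise.
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(audio)
            best = min(best, time.perf_counter() - start)
        return best

    return outputs_match, best_time(reference), best_time(candidate)


def reference_mask(audio, thresh=0.008):
    """Stand-in reference: keep samples whose magnitude reaches thresh."""
    return [abs(x) >= thresh for x in audio]


def candidate_mask(audio, thresh=0.008):
    """Stand-in candidate: same result computed a different way."""
    mask = []
    for x in audio:
        mask.append(x >= thresh or -x >= thresh)
    return mask


audio = [0.0, 0.01, -0.02, 0.005] * 1000
match, ref_s, cand_s = compare_implementations(reference_mask, candidate_mask, audio)
```

A real test would assert match first and then compare cand_s against ref_s, ideally on a release build as noted above.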

Download files

Source distribution

speechgate_rs-0.1.3.tar.gz (34.6 kB)

Built distributions

| File | Size | Python | Platform |
|---|---|---|---|
| speechgate_rs-0.1.3-cp312-cp312-win_amd64.whl | 184.8 kB | CPython 3.12 | Windows x86-64 |
| speechgate_rs-0.1.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 334.4 kB | CPython 3.12 | manylinux: glibc 2.17+ x86-64 |
| speechgate_rs-0.1.3-cp312-cp312-macosx_11_0_arm64.whl | 292.7 kB | CPython 3.12 | macOS 11.0+ ARM64 |
| speechgate_rs-0.1.3-cp311-cp311-win_amd64.whl | 186.2 kB | CPython 3.11 | Windows x86-64 |
| speechgate_rs-0.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 337.3 kB | CPython 3.11 | manylinux: glibc 2.17+ x86-64 |
| speechgate_rs-0.1.3-cp311-cp311-macosx_11_0_arm64.whl | 294.7 kB | CPython 3.11 | macOS 11.0+ ARM64 |
| speechgate_rs-0.1.3-cp310-cp310-win_amd64.whl | 186.0 kB | CPython 3.10 | Windows x86-64 |
| speechgate_rs-0.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 337.4 kB | CPython 3.10 | manylinux: glibc 2.17+ x86-64 |
| speechgate_rs-0.1.3-cp310-cp310-macosx_11_0_arm64.whl | 294.8 kB | CPython 3.10 | macOS 11.0+ ARM64 |
