Silero plugin for GetStream

Silero Voice Activity Detection Plugin

A fast and accurate Voice Activity Detection (VAD) plugin for GetStream that uses the Silero VAD model.

Installation

pip install getstream-plugins-silero

Usage

from getstream.plugins.silero import SileroVAD
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings
vad = SileroVAD()

# Or customize parameters
vad = SileroVAD(
    sample_rate=16000,
    frame_size=512,
    silence_threshold=0.3,
    speech_pad_ms=300,
    min_speech_ms=250,
    max_speech_ms=60000,
)

# Register event handlers
@vad.on("audio")
async def on_audio(pcm_data, user):
    print(f"Detected speech: {pcm_data.duration:.2f} seconds")
    # Process the detected speech with an STT engine
    # await stt.process_audio(pcm_data)

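# Process incoming audio
# audio_bytes is raw 16-bit PCM audio that you supply.
# The await calls below must run inside a coroutine (an async function).
incoming_audio = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await vad.process_audio(incoming_audio)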

# Reset state if needed
await vad.reset()
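
The usage snippet above leaves `audio_bytes` undefined. As a minimal sketch, the following builds one second of 16-bit, 16 kHz mono PCM using only the standard library, which can stand in for real input when wiring up the pipeline. The helper name `sine_pcm_s16` is hypothetical and not part of the plugin. Passing raw bytes to `PcmData` is assumed only because the usage snippet does so. A pure tone is not speech, so it will likely not trigger the `audio` event.

```python
import math
import struct


def sine_pcm_s16(duration_s: float, sample_rate: int = 16000,
                 freq_hz: float = 440.0, amplitude: float = 0.5) -> bytes:
    """Return a mono sine tone as little-endian signed 16-bit PCM bytes.

    Hypothetical helper for illustration; not part of the plugin.
    """
    n = int(duration_s * sample_rate)   # total number of samples
    peak = int(amplitude * 32767)       # scale into the int16 range
    samples = (
        int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    )
    # "<" = little-endian, "h" = signed 16-bit integer
    return struct.pack(f"<{n}h", *samples)


audio_bytes = sine_pcm_s16(1.0)
print(len(audio_bytes))  # 16000 samples * 2 bytes each = 32000
```

To exercise speech detection itself, replace the tone with recorded speech in the same format.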

Configuration Options

  • sample_rate: Audio sample rate in Hz (default: 16000)
  • frame_size: Size of audio frames to process (default: 512)
  • silence_threshold: Threshold for detecting silence (0.0 to 1.0) (default: 0.5)
  • speech_pad_ms: Number of milliseconds to pad before/after speech (default: 300)
  • min_speech_ms: Minimum milliseconds of speech to emit (default: 250)
  • max_speech_ms: Maximum milliseconds of speech before forced flush (default: 30000)
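
The millisecond options above are easier to reason about once converted into frame counts. The sketch below does that arithmetic for the default values. It assumes `frame_size` is measured in samples, which the list above does not state. The helper names are hypothetical, and the plugin's internal rounding may differ.

```python
import math


def frame_duration_ms(frame_size: int, sample_rate: int) -> float:
    """Duration of one frame in milliseconds (frame_size assumed to be in samples)."""
    return 1000.0 * frame_size / sample_rate


def frames_needed(duration_ms: float, frame_size: int, sample_rate: int) -> int:
    """Smallest number of whole frames that covers duration_ms."""
    return math.ceil(duration_ms * sample_rate / (1000 * frame_size))


SAMPLE_RATE = 16000
FRAME_SIZE = 512

print(frame_duration_ms(FRAME_SIZE, SAMPLE_RATE))     # 32.0 ms per frame
print(frames_needed(250, FRAME_SIZE, SAMPLE_RATE))    # min_speech_ms -> 8 frames
print(frames_needed(300, FRAME_SIZE, SAMPLE_RATE))    # speech_pad_ms -> 10 frames
print(frames_needed(30000, FRAME_SIZE, SAMPLE_RATE))  # max_speech_ms -> 938 frames
```

Under these assumptions, each frame covers 32 ms, so the default `min_speech_ms` of 250 corresponds to roughly 8 consecutive frames of speech.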
