Silero plugin for GetStream

Silero Voice Activity Detection Plugin

A fast and accurate Voice Activity Detection (VAD) plugin for GetStream that uses the Silero VAD model.

Installation

pip install getstream-plugins-silero

Usage

from getstream.plugins.silero import SileroVAD
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings
vad = SileroVAD()

# Or customize parameters
vad = SileroVAD(
    sample_rate=16000,
    frame_size=512,
    silence_threshold=0.3,
    speech_pad_ms=300,
    min_speech_ms=250,
    max_speech_ms=60000,
)

# Register event handlers
@vad.on("audio")
async def on_audio(pcm_data, user):
    print(f"Detected speech: {pcm_data.duration:.2f} seconds")
    # Process the detected speech with an STT engine
    # await stt.process_audio(pcm_data)

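# Process incoming audio. `audio_bytes` is a placeholder for your 16-bit PCM input,
# and the `await` calls below must run inside a coroutine (an `async def` function).
incoming_audio = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await vad.process_audio(incoming_audio)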

# Reset state if needed
await vad.reset()
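
To try the snippet above without a live audio source, `audio_bytes` needs some signed 16-bit PCM data. The sketch below builds one second of a 440 Hz tone using only the standard library. The helper name `make_tone_s16` is hypothetical and not part of the plugin, and the sketch assumes `PcmData` accepts raw bytes for `samples`, as the usage example suggests. A pure tone is not speech, so the detector may never fire on it; the buffer is only meant for checking the plumbing.

```python
import math
from array import array


def make_tone_s16(duration_s=1.0, sample_rate=16000, freq_hz=440.0, amplitude=0.5):
    """Return mono signed 16-bit PCM bytes (native byte order) for a sine tone.

    Hypothetical helper for illustration; not part of the plugin.
    """
    n_samples = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # stay well inside the int16 range
    samples = array(
        "h",  # signed 16-bit integers
        (
            int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(n_samples)
        ),
    )
    return samples.tobytes()


audio_bytes = make_tone_s16()
print(len(audio_bytes))  # 16000 samples * 2 bytes each
```

For a realistic test, replace the tone with a short recording of speech converted to 16 kHz mono 16-bit PCM.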

Configuration Options

  • sample_rate: Audio sample rate in Hz (default: 16000)
  • frame_size: Size of audio frames to process (default: 512)
  • silence_threshold: Threshold for detecting silence (0.0 to 1.0) (default: 0.5)
  • speech_pad_ms: Padding, in milliseconds, added before and after each detected speech segment (default: 300)
  • min_speech_ms: Minimum speech duration, in milliseconds, required before a segment is emitted (default: 250)
  • max_speech_ms: Maximum speech duration, in milliseconds, before the segment is force-flushed (default: 30000)
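
The millisecond options are easier to reason about once converted to frames. The sketch below does that arithmetic under the assumption that `frame_size` counts samples, which this page does not state explicitly; the helper names are hypothetical, and the plugin's own rounding may differ.

```python
import math


def frame_duration_ms(frame_size=512, sample_rate=16000):
    """Duration of one frame in milliseconds, assuming frame_size counts samples."""
    return 1000.0 * frame_size / sample_rate


def ms_to_frames(duration_ms, frame_size=512, sample_rate=16000):
    """Smallest number of whole frames that covers duration_ms."""
    samples = duration_ms * sample_rate / 1000.0
    return math.ceil(samples / frame_size)


print(frame_duration_ms())   # 32.0
print(ms_to_frames(300))     # 10  (speech_pad_ms default)
print(ms_to_frames(250))     # 8   (min_speech_ms default)
print(ms_to_frames(30000))   # 938 (max_speech_ms default)
```

With the defaults, one frame covers 32 ms, so the minimum emitted segment spans roughly 8 frames and a forced flush happens after roughly 938 frames.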

Download files

Source Distribution

vision_agents_plugins_silero-0.1.0.tar.gz (15.9 kB view details)

Uploaded Source

Built Distribution

vision_agents_plugins_silero-0.1.0-py3-none-any.whl (21.8 kB view details)

Uploaded Python 3

File details

Details for the file vision_agents_plugins_silero-0.1.0.tar.gz.

File hashes

Hashes for vision_agents_plugins_silero-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       fb465a0247ca68fa4149e2467f5663eb674297466e28c8b5704356babb9a3fe4
MD5          69a735447ed73f5d5aea0d259e075b55
BLAKE2b-256  c0c0d9fd8d4be8db6674087df474de0637d3d16d0142e964b6a6befc42c5e651

File details

Details for the file vision_agents_plugins_silero-0.1.0-py3-none-any.whl.

File hashes

Hashes for vision_agents_plugins_silero-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       5ecfbedc39b275dde88d2775654580a7171ceab73e402eafeb4f9f8c835192ad
MD5          246ff719dc983ca4eb53088d17d1ea09
BLAKE2b-256  3b412db7d3c65671eceae3a67bb32205a35fab16cd70ee005564b6b3e9fe6749
