Silero plugin for GetStream

Silero Voice Activity Detection Plugin

A fast and accurate Voice Activity Detection (VAD) plugin for GetStream that uses the Silero VAD model.

Installation

pip install getstream-plugins-silero

Usage

from getstream.plugins.silero import SileroVAD
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings
vad = SileroVAD()

# Or customize parameters
vad = SileroVAD(
    sample_rate=16000,
    frame_size=512,
    silence_threshold=0.3,
    speech_pad_ms=300,
    min_speech_ms=250,
    max_speech_ms=60000,
)

# Register event handlers
@vad.on("audio")
async def on_audio(pcm_data, user):
    print(f"Detected speech: {pcm_data.duration:.2f} seconds")
    # Process the detected speech with an STT engine
    # await stt.process_audio(pcm_data)

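# Process incoming audio (call this from inside an async function;
# audio_bytes holds the raw 16-bit PCM samples at 16 kHz)
incoming_audio = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await vad.process_audio(incoming_audio)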

# Reset state if needed
await vad.reset()
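
The usage snippet above leaves audio_bytes undefined. For a quick local test, a stand-in buffer of signed 16-bit little-endian PCM can be generated with the standard library alone. This is a minimal sketch: the helper name sine_pcm_s16 is hypothetical and not part of the plugin, and whether PcmData expects raw bytes or an array type is not confirmed here. A pure tone is not speech, so it will likely not trigger the "audio" event; it only exercises the audio path.

```python
import math
import struct


def sine_pcm_s16(duration_s: float = 1.0,
                 freq_hz: float = 440.0,
                 sample_rate: int = 16000,
                 amplitude: float = 0.5) -> bytes:
    """Return a mono sine tone as signed 16-bit little-endian PCM bytes.

    Hypothetical helper for testing; not part of the Silero plugin.
    """
    n = int(duration_s * sample_rate)   # total number of samples
    peak = int(amplitude * 32767)       # scale into the int16 range
    samples = (
        int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    )
    # "<" = little-endian, "h" = signed 16-bit integer, repeated n times
    return struct.pack(f"<{n}h", *samples)


# One second at 16 kHz: 16000 samples, 2 bytes each
audio_bytes = sine_pcm_s16()
print(len(audio_bytes))
```

For realistic results, replace the tone with recorded speech at the same sample rate and format.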

Configuration Options

  • sample_rate: Audio sample rate in Hz (default: 16000)
  • frame_size: Size of audio frames to process (default: 512)
  • silence_threshold: Threshold for detecting silence (0.0 to 1.0) (default: 0.5)
  • speech_pad_ms: Number of milliseconds to pad before/after speech (default: 300)
  • min_speech_ms: Minimum milliseconds of speech to emit (default: 250)
  • max_speech_ms: Maximum milliseconds of speech before forced flush (default: 30000)
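
The millisecond options are easier to reason about when converted to frames. With the defaults, one frame of 512 samples at 16000 Hz covers 32 ms. The sketch below does that arithmetic; the function names are hypothetical, and rounding up to whole frames is an assumption, since the plugin's internal rounding is not documented here.

```python
import math


def frame_duration_ms(frame_size: int = 512, sample_rate: int = 16000) -> float:
    """Duration of one audio frame in milliseconds."""
    return 1000.0 * frame_size / sample_rate


def frames_for(duration_ms: float,
               frame_size: int = 512,
               sample_rate: int = 16000) -> int:
    """Whole frames needed to cover duration_ms (rounding up is an assumption)."""
    return math.ceil(duration_ms / frame_duration_ms(frame_size, sample_rate))


print(frame_duration_ms())   # 32.0 ms per frame
print(frames_for(250))       # min_speech_ms default -> 8 frames
print(frames_for(300))       # speech_pad_ms default -> 10 frames
print(frames_for(30000))     # max_speech_ms default -> 938 frames
```

Changing frame_size or sample_rate changes how coarse these durations are, so it is worth rechecking the millisecond values when either one is customized.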
