Silero plugin for GetStream

Silero Voice Activity Detection Plugin

A fast and accurate Voice Activity Detection (VAD) plugin for GetStream that uses the Silero VAD model.

Installation

pip install getstream-plugins-silero

Usage

from getstream.plugins.silero import SileroVAD
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings
vad = SileroVAD()

# Or customize parameters
vad = SileroVAD(
    sample_rate=16000,
    frame_size=512,
    silence_threshold=0.3,
    speech_pad_ms=300,
    min_speech_ms=250,
    max_speech_ms=60000,
)

# Register event handlers
@vad.on("audio")
async def on_audio(pcm_data, user):
    print(f"Detected speech: {pcm_data.duration:.2f} seconds")
    # Process the detected speech with an STT engine
    # await stt.process_audio(pcm_data)

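# Process incoming audio. `audio_bytes` is a placeholder for your raw audio
# samples, and the `await` calls below must run inside a coroutine.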
incoming_audio = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await vad.process_audio(incoming_audio)

# Reset state if needed
await vad.reset()
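
The usage snippet above leaves audio_bytes undefined. As a minimal sketch for experimentation, the following builds a buffer of mono signed 16-bit PCM at 16000 Hz using only the standard library. The helper name make_s16_tone is hypothetical and not part of the plugin. Whether PcmData accepts raw bytes or expects an array type depends on the installed getstream version, so treat the hand-off as an assumption to verify.

```python
import math
import struct

SAMPLE_RATE = 16000  # Hz, matching the sample_rate used in the usage snippet


def make_s16_tone(duration_s: float, freq_hz: float = 440.0, amplitude: float = 0.5) -> bytes:
    """Return mono signed 16-bit little-endian PCM for a sine tone.

    Hypothetical helper for testing only; it is not part of the plugin.
    """
    n_samples = int(duration_s * SAMPLE_RATE)
    peak = int(amplitude * 32767)  # stay inside the int16 range
    samples = [
        int(peak * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]
    return struct.pack(f"<{n_samples}h", *samples)


audio_bytes = make_s16_tone(1.0)
print(len(audio_bytes))  # 2 bytes per sample
```

A pure tone is not speech, so a VAD model may well classify it as silence. The buffer is useful for checking that the pipeline runs end to end, not for checking detection quality.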

Configuration Options

  • sample_rate: Audio sample rate in Hz (default: 16000)
  • frame_size: Number of samples per processed audio frame (default: 512, which is 32 ms at 16000 Hz)
  • silence_threshold: Threshold for detecting silence (0.0 to 1.0) (default: 0.5)
  • speech_pad_ms: Number of milliseconds to pad before/after speech (default: 300)
  • min_speech_ms: Minimum milliseconds of speech to emit (default: 250)
  • max_speech_ms: Maximum milliseconds of speech before forced flush (default: 30000)
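
To make the interplay of these options concrete, here is a simplified sketch that converts the millisecond settings into frame counts and groups per-frame speech probabilities into segments. It is an illustration under stated assumptions, not the plugin's implementation: VadConfig and segment are hypothetical names, a frame is assumed to count as speech when its probability is at or above silence_threshold, and speech_pad_ms is used here only as the amount of trailing silence that closes a segment.

```python
import math
from dataclasses import dataclass


@dataclass
class VadConfig:
    """Hypothetical container mirroring the documented option names and defaults."""
    sample_rate: int = 16000
    frame_size: int = 512
    silence_threshold: float = 0.5
    speech_pad_ms: int = 300
    min_speech_ms: int = 250
    max_speech_ms: int = 30000

    @property
    def frame_ms(self) -> float:
        """Duration of one frame in milliseconds."""
        return 1000.0 * self.frame_size / self.sample_rate

    def frames_for(self, ms: int) -> int:
        """Smallest number of whole frames covering `ms` milliseconds."""
        return math.ceil(ms / self.frame_ms)


def segment(probs, cfg):
    """Group per-frame speech probabilities into (start, end) frame ranges.

    `end` is exclusive. Assumed semantics, not the plugin's actual logic.
    """
    pad = cfg.frames_for(cfg.speech_pad_ms)
    min_frames = cfg.frames_for(cfg.min_speech_ms)
    max_frames = cfg.frames_for(cfg.max_speech_ms)
    segments = []
    start = None       # first frame of the currently open segment
    last_speech = -1   # most recent frame classified as speech
    for i, p in enumerate(probs):
        if p >= cfg.silence_threshold:
            if start is None:
                start = i
            last_speech = i
        if start is None:
            continue
        trailing_silence = i - last_speech
        length = i - start + 1
        # Close on enough trailing silence, or force a flush at the maximum length.
        if trailing_silence >= pad or length >= max_frames:
            end = last_speech + 1
            if end - start >= min_frames:  # drop segments that are too short
                segments.append((start, end))
            start = None
    # Flush whatever is still open when the input ends.
    if start is not None and last_speech + 1 - start >= min_frames:
        segments.append((start, last_speech + 1))
    return segments


cfg = VadConfig()
probs = [0.1] * 5 + [0.9] * 20 + [0.1] * 15 + [0.9] * 3 + [0.1] * 15
print(cfg.frame_ms, cfg.frames_for(cfg.min_speech_ms))
print(segment(probs, cfg))
```

With the defaults, one frame lasts 32 ms, so min_speech_ms=250 rounds up to 8 frames and speech_pad_ms=300 to 10 frames. In the sample input, the 20-frame burst is kept while the 3-frame burst is discarded as too short.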


