A local, persistent Voice Activity Detection (VAD) worker for the cjm-plugin-system that provides high-accuracy speech segmentation using Silero VAD with SQLite result caching.

cjm-media-plugin-silero-vad

Install

pip install cjm_media_plugin_silero_vad

Project Structure

nbs/
├── meta.ipynb   # Metadata introspection for the Silero VAD plugin used by cjm-ctl to generate the registration manifest.
└── plugin.ipynb # Plugin implementation for Voice Activity Detection using Silero VAD with SQLite result caching.

Total: 2 notebooks

Module Dependencies

graph LR
    meta["meta<br/>Metadata"]
    plugin["plugin<br/>Silero VAD Plugin"]

    plugin --> meta

1 cross-module dependency detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Metadata (meta.ipynb)

Metadata introspection for the Silero VAD plugin used by cjm-ctl to generate the registration manifest.

Import

from cjm_media_plugin_silero_vad.meta import (
    get_plugin_metadata
)

Functions

def get_plugin_metadata() -> Dict[str, Any]:  # Plugin metadata for manifest generation
    "Return metadata required to register this plugin with the PluginManager."
    # Fallback base path (current behavior for backward compatibility)
    base_path = os.path.dirname(os.path.dirname(sys.executable))

    # Use CJM config if available
    cjm_data_dir = os.environ.get("CJM_DATA_DIR")

    # Plugin data directory
    plugin_name = "cjm-media-plugin-silero-vad"
    if cjm_data_dir: ...

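The excerpt of get_plugin_metadata shows an environment-variable override (CJM_DATA_DIR) with a fallback derived from sys.executable. The sketch below illustrates that pattern in isolation. The helper name resolve_plugin_data_dir and both directory layouts are assumptions made for illustration; they are not taken from the plugin's source.

```python
import os
import sys
from typing import Mapping, Optional

PLUGIN_NAME = "cjm-media-plugin-silero-vad"


def resolve_plugin_data_dir(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: choose a data directory for the plugin."""
    env = os.environ if env is None else env

    # Fallback base path: two levels above the interpreter binary,
    # which is usually the root of the active environment.
    base_path = os.path.dirname(os.path.dirname(sys.executable))

    cjm_data_dir = env.get("CJM_DATA_DIR")
    if cjm_data_dir:
        # Assumed layout: <CJM_DATA_DIR>/<plugin name>
        return os.path.join(cjm_data_dir, PLUGIN_NAME)

    # Assumed layout: <environment root>/data
    return os.path.join(base_path, "data")
```

Passing the environment as an argument keeps the helper easy to test without mutating os.environ.
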
Silero VAD Plugin (plugin.ipynb)

Plugin implementation for Voice Activity Detection using Silero VAD with SQLite result caching.
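
To make the idea of SQLite result caching concrete, here is a minimal sketch of a cache keyed on the media path plus the effective configuration, with a force flag that bypasses the lookup. The table name, the key scheme, and the ResultCache class are all assumptions for illustration; the plugin's actual schema and invalidation rules may differ.

```python
import hashlib
import json
import sqlite3
from typing import Any, Callable, Dict, List


def make_cache_key(media_path: str, config: Dict[str, Any]) -> str:
    """Hash the path and config together so a config change misses the cache."""
    payload = json.dumps({"path": media_path, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResultCache:
    """Hypothetical SQLite-backed cache for VAD segments."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vad_results "
            "(key TEXT PRIMARY KEY, segments TEXT NOT NULL)"
        )

    def get_or_compute(
        self,
        media_path: str,
        config: Dict[str, Any],
        compute: Callable[[], List[Dict[str, float]]],
        force: bool = False,
    ) -> List[Dict[str, float]]:
        key = make_cache_key(media_path, config)
        if not force:
            row = self.conn.execute(
                "SELECT segments FROM vad_results WHERE key = ?", (key,)
            ).fetchone()
            if row is not None:
                return json.loads(row[0])  # cache hit: skip the model entirely

        segments = compute()  # cache miss or forced re-run
        self.conn.execute(
            "INSERT OR REPLACE INTO vad_results (key, segments) VALUES (?, ?)",
            (key, json.dumps(segments)),
        )
        self.conn.commit()
        return segments
```

A production cache would likely also fold the file's modification time or a content hash into the key so that edited media is re-analyzed.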

Import

from cjm_media_plugin_silero_vad.plugin import (
    SileroVADConfig,
    SileroVADPlugin
)

Classes

@dataclass
class SileroVADConfig:
    "Configuration for Silero VAD parameters."
    
    threshold: float = field(...)
    min_speech_duration_ms: int = field(...)
    min_silence_duration_ms: int = field(...)
    speech_pad_ms: int = field(...)
    sampling_rate: int = field(...)
    use_onnx: bool = field(...)
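
The field defaults are elided above, so the following sketch uses a hypothetical stand-in, ExampleVADConfig, with placeholder values that are not necessarily the plugin's defaults. It shows one way per-run overrides (such as the **kwargs accepted by execute) could be layered on a dataclass config without mutating the stored instance.

```python
from dataclasses import asdict, dataclass, fields, replace
from typing import Any


@dataclass
class ExampleVADConfig:
    """Hypothetical stand-in for SileroVADConfig; defaults are placeholders."""

    threshold: float = 0.5
    min_speech_duration_ms: int = 250
    min_silence_duration_ms: int = 100
    speech_pad_ms: int = 30
    sampling_rate: int = 16000
    use_onnx: bool = False


def with_overrides(config: ExampleVADConfig, **overrides: Any) -> ExampleVADConfig:
    """Return a copy of config with overrides applied, rejecting unknown keys."""
    known = {f.name for f in fields(config)}
    unknown = set(overrides) - known
    if unknown:
        raise ValueError(f"Unknown config keys: {sorted(unknown)}")
    return replace(config, **overrides)


base = ExampleVADConfig()
run_config = with_overrides(base, threshold=0.7)
print(asdict(run_config))
```

Because replace returns a new instance, the base configuration is unchanged after the run.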

class SileroVADPlugin:
    "Voice Activity Detection plugin using Silero VAD."

    def __init__(self):
        "Initialize the Silero VAD plugin."
        self.logger = logging.getLogger(f"{__name__}.{type(self).__name__}")
        self.config: SileroVADConfig = None

    def name(self) -> str:  # Plugin name identifier
        "Get the plugin name identifier."
        return "silero-vad"

    @property
    def version(self) -> str:  # Plugin version string
        "Get the plugin version string."
        from cjm_media_plugin_silero_vad import __version__
        return __version__

    @property
    def supported_media_types(self) -> List[str]:  # Supported media types
        "Get the list of supported media types."
        return ["audio", "video"]

    def get_current_config(self) -> Dict[str, Any]:  # Current configuration as dictionary
        "Return current configuration state."
        return config_to_dict(self.config) if self.config else {}

    def get_config_schema(self) -> Dict[str, Any]:  # JSON Schema for configuration
        "Return JSON Schema for UI generation."
        return dataclass_to_jsonschema(SileroVADConfig)

    def initialize(
        self,
        config: Optional[Any] = None  # Configuration dataclass, dict, or None
    ) -> None:
        """First-time setup. CR-4: config application is factored into _apply_config;
        the substrate's reconfigure(old, new) handles deltas - it fires _release_model
        on a use_onnx change (RELOAD_TRIGGER) then re-applies config."""

    def execute(
        self,
        media_path: Union[str, Path],  # Path to media file to analyze
        force: bool = False,           # If True, ignore cache and re-run
        **kwargs                       # Override config parameters for this run
    ) -> MediaAnalysisResult:  # Analysis result with detected speech segments
        "Run VAD on the audio file."

    def is_available(self) -> bool:  # True if Silero VAD is available
        "Check if Silero VAD is available."
        return SILERO_AVAILABLE

    def prefetch(self) -> None:
        """CR-4 (SG-19): eagerly load the model so the first execute() doesn't pay
        the load cost. Idempotent via _load_model's None-guard."""
        self._load_model()

    def on_disable(self) -> None:
        """CR-2: release the model when the operator disables the plugin (the worker
        stays alive); the model lazily reloads on the next execute after re-enable."""
        self._release_model()

    def cleanup(self) -> None:
        "Release resources on unload."

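The prefetch and on_disable docstrings describe a lazy-load lifecycle: a None-guard makes loading idempotent, and releasing the model lets the next execute reload it. The sketch below models that behavior with a hypothetical LazyModelHolder class and a caller-supplied loader. It is an illustration of the pattern and does not reproduce the plugin's own _load_model or _release_model.

```python
from typing import Any, Callable, Optional


class LazyModelHolder:
    """Hypothetical sketch of the prefetch / execute / on_disable lifecycle."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._model: Optional[Any] = None
        self.load_count = 0  # counts real loads, for demonstration only

    def _load_model(self) -> None:
        if self._model is not None:  # None-guard: repeated calls are no-ops
            return
        self._model = self._loader()
        self.load_count += 1

    def _release_model(self) -> None:
        self._model = None  # drop the reference so memory can be reclaimed

    def prefetch(self) -> None:
        """Eagerly load so the first execute() does not pay the load cost."""
        self._load_model()

    def execute(self, media_path: str) -> str:
        self._load_model()  # lazily reloads after on_disable()
        return f"{self._model}:{media_path}"

    def on_disable(self) -> None:
        """Release the model while the owning process stays alive."""
        self._release_model()


holder = LazyModelHolder(lambda: "model")
holder.prefetch()
holder.prefetch()  # second call does nothing
```

After on_disable, the next execute call triggers one fresh load, which mirrors the behavior the docstrings describe.
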