An OpenAI Whisper plugin for the cjm-transcription-plugin-system library. It provides local speech-to-text transcription with configurable model selection and parameter control.

cjm-transcription-plugin-whisper

Install

pip install cjm_transcription_plugin_whisper

Project Structure

nbs/
├── meta.ipynb   # Metadata introspection for the Whisper plugin used by cjm-ctl to generate the registration manifest.
└── plugin.ipynb # Plugin implementation for OpenAI Whisper transcription

Total: 2 notebooks across 1 directory

Module Dependencies

graph LR
    meta[meta<br/>Metadata]
    plugin[plugin<br/>Whisper Plugin]

    plugin --> meta

1 cross-module dependency detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Metadata (meta.ipynb)

Metadata introspection for the Whisper plugin used by cjm-ctl to generate the registration manifest.

Import

from cjm_transcription_plugin_whisper.meta import (
    get_plugin_metadata
)

Functions

def get_plugin_metadata() -> Dict[str, Any]: # Plugin metadata for manifest generation
    "Return metadata required to register this plugin with the PluginManager."
    # Fallback base path (current behavior for backward compatibility)
    base_path = os.path.dirname(os.path.dirname(sys.executable))

    # Use CJM config if available, else fall back to env-relative paths
    cjm_data_dir = os.environ.get("CJM_DATA_DIR")
    cjm_models_dir = os.environ.get("CJM_MODELS_DIR")

    # Plugin data directory
    plugin_name = "cjm-transcription-plugin-whisper"
    if cjm_data_dir: ...

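The listing above stops at the `CJM_DATA_DIR` check, so the exact directory layout is not visible here. As a minimal sketch of the "configured directory, else environment-relative fallback" pattern that the comments describe, the following may help; the helper name `resolve_plugin_data_dir`, the per-plugin subdirectory, and the `data` folder name are all assumptions made for illustration, not the plugin's actual code.

```python
import os
import sys
from typing import Mapping, Optional

PLUGIN_NAME = "cjm-transcription-plugin-whisper"


def resolve_plugin_data_dir(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a data directory: CJM_DATA_DIR if set, else an environment-relative path."""
    env = os.environ if env is None else env
    cjm_data_dir = env.get("CJM_DATA_DIR")
    if cjm_data_dir:
        # Assumption: each plugin gets its own subdirectory under CJM_DATA_DIR.
        return os.path.join(cjm_data_dir, PLUGIN_NAME)
    # Fallback: two levels above the interpreter, typically the environment root.
    base_path = os.path.dirname(os.path.dirname(sys.executable))
    # Assumption: the "data" folder name is illustrative only.
    return os.path.join(base_path, "data")
```

Passing a plain dict as `env` keeps the lookup easy to exercise without touching the real process environment.
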
Whisper Plugin (plugin.ipynb)

Plugin implementation for OpenAI Whisper transcription

Import

from cjm_transcription_plugin_whisper.plugin import (
    WhisperPluginConfig,
    WhisperLocalPlugin
)

Classes

@dataclass
class WhisperPluginConfig:
    "Configuration for Whisper transcription plugin."
    
    model: str = field(...)
    device: str = field(...)
    language: Optional[str] = field(...)
    task: str = field(...)
    temperature: float = field(...)
    temperature_increment_on_fallback: Optional[float] = field(...)
    beam_size: int = field(...)
    best_of: int = field(...)
    patience: float = field(...)
    length_penalty: Optional[float] = field(...)
    suppress_tokens: str = field(...)
    initial_prompt: Optional[str] = field(...)
    condition_on_previous_text: bool = field(...)
    fp16: bool = field(...)
    compression_ratio_threshold: float = field(...)
    logprob_threshold: float = field(...)
    no_speech_threshold: float = field(...)
    word_timestamps: bool = field(...)
    prepend_punctuations: str = field(...)
    append_punctuations: str = field(...)
    threads: int = field(...)
    model_dir: Optional[str] = field(...)
    compile_model: bool = field(...)
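
The plugin exposes this dataclass as a JSON Schema through `get_config_schema`, which calls `dataclass_to_jsonschema`. That helper's implementation is not shown in this document, so the following is only a rough sketch of how a dataclass can be mapped to a schema. `DemoConfig`, its default values, and `sketch_jsonschema` are hypothetical stand-ins, not the library's code or the plugin's real defaults.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, Optional, Union, get_args, get_origin, get_type_hints

# Mapping from Python scalar types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class DemoConfig:
    """Hypothetical stand-in for a few WhisperPluginConfig fields."""
    model: str = "base"             # illustrative default
    language: Optional[str] = None  # illustrative default
    beam_size: int = 5              # illustrative default
    fp16: bool = True               # illustrative default


def sketch_jsonschema(cls: type) -> Dict[str, Any]:
    """Build a small JSON Schema object from a dataclass's fields."""
    hints = get_type_hints(cls)
    properties: Dict[str, Any] = {}
    for f in fields(cls):
        tp = hints[f.name]
        nullable = False
        if get_origin(tp) is Union:
            # Optional[X] is Union[X, None]; strip None and remember nullability.
            non_none = [a for a in get_args(tp) if a is not type(None)]
            nullable = len(non_none) < len(get_args(tp))
            tp = non_none[0]
        json_type = _JSON_TYPES.get(tp, "string")
        prop: Dict[str, Any] = {"type": [json_type, "null"] if nullable else json_type}
        if f.default is not MISSING:
            prop["default"] = f.default
        properties[f.name] = prop
    return {"title": cls.__name__, "type": "object", "properties": properties}
```

A UI layer could walk `properties` to render one input per field, which is presumably why the plugin returns a schema rather than the dataclass alone.
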

class WhisperLocalPlugin:
    "OpenAI Whisper transcription plugin."
    
    def __init__(self):
        "Initialize the Whisper plugin with default configuration."
        self.logger = logging.getLogger(f"{__name__}.{type(self).__name__}")
        self.config: Optional[WhisperPluginConfig] = None
    
    def name(self) -> str: # Plugin name identifier
        "Get the plugin name identifier."
        return "whisper_local"
    
    @property
    def version(self) -> str: # Plugin version string
        "Get the plugin version string."
        return "1.0.0"
    
    @property
    def supported_formats(self) -> List[str]: # List of supported audio file formats
        "Get the list of supported audio file formats."
        return ["wav", "mp3", "flac", "m4a", "ogg", "webm", "mp4", "avi", "mov"]
    
    def get_current_config(self) -> Dict[str, Any]: # Current configuration as dictionary
        "Return current configuration state."
        if not self.config: ...
    
    def get_config_schema(self) -> Dict[str, Any]: # JSON Schema for configuration
        "Return JSON Schema for UI generation."
        return dataclass_to_jsonschema(WhisperPluginConfig)
    
    @staticmethod
    def get_config_dataclass() -> WhisperPluginConfig: # Configuration dataclass
        "Return dataclass describing the plugin's configuration options."
        return WhisperPluginConfig
    
    def initialize(
        self,
        config: Optional[Any] = None # Configuration dataclass, dict, or None
    ) -> None:
        "Initialize or re-configure the plugin (idempotent)."
    
    def execute(
        self,
        audio: Union[str, Path], # Path to the audio file to transcribe
        **kwargs # Additional arguments to override config
    ) -> TranscriptionResult: # Transcription result with text and metadata
        """Transcribe audio using Whisper.

        `audio` is a path to a decodable audio file; the caller guarantees it is
        model-ready (format / sample-rate / channels handled upstream).
        """
    
    def is_available(self) -> bool: # True if Whisper and its dependencies are available
        "Check if Whisper is available."
        return WHISPER_AVAILABLE
    
    def cleanup(self) -> None:
        "Clean up resources."
        if self.model is not None: ...

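The signatures above say that `initialize` accepts a configuration dataclass, a dict, or `None`, and that `execute` takes `**kwargs` that override the configuration for a single call. The method bodies are not shown, so the following is only a sketch of one way such handling could work. `DemoConfig`, its defaults, and the three helper functions are hypothetical names introduced for illustration; only the format list is copied from `supported_formats`.

```python
from dataclasses import dataclass, fields, replace
from pathlib import Path
from typing import Any, Optional, Union

# Copied from the supported_formats listing above.
SUPPORTED_FORMATS = ["wav", "mp3", "flac", "m4a", "ogg", "webm", "mp4", "avi", "mov"]


@dataclass
class DemoConfig:
    """Hypothetical stand-in with illustrative defaults."""
    model: str = "base"
    language: Optional[str] = None
    temperature: float = 0.0


def normalize_config(config: Optional[Any] = None) -> DemoConfig:
    """Accept a dataclass instance, a dict, or None and return a DemoConfig."""
    if config is None:
        return DemoConfig()
    if isinstance(config, DemoConfig):
        return config
    if isinstance(config, dict):
        known = {f.name for f in fields(DemoConfig)}
        unknown = sorted(set(config) - known)
        if unknown:
            raise ValueError(f"Unknown config keys: {unknown}")
        return DemoConfig(**config)
    raise TypeError(f"Unsupported config type: {type(config).__name__}")


def config_for_call(base: DemoConfig, **kwargs: Any) -> DemoConfig:
    """Return a copy of base with per-call overrides; base itself is unchanged."""
    return replace(base, **kwargs)


def has_supported_extension(audio: Union[str, Path]) -> bool:
    """Check the file extension (case-insensitive) against the format list."""
    return Path(audio).suffix.lower().lstrip(".") in SUPPORTED_FORMATS
```

Using `dataclasses.replace` for overrides leaves the stored configuration untouched between calls and rejects unknown field names with a `TypeError`, which is one reasonable way to keep per-call arguments from leaking into later transcriptions.
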