Mistral Voxtral plugin for the cjm-transcription-plugin-system library. It provides local speech-to-text transcription through vLLM, with configurable model selection and parameter control.

cjm-transcription-plugin-voxtral-vllm

Install

pip install cjm_transcription_plugin_voxtral_vllm

Project Structure

nbs/
├── meta.ipynb   # Metadata introspection for the Voxtral VLLM plugin used by cjm-ctl to generate the registration manifest.
└── plugin.ipynb # Plugin implementation for Mistral Voxtral transcription through a vLLM server

Total: 2 notebooks

Module Dependencies

graph LR
    meta[meta<br/>Metadata]
    plugin[plugin<br/>Voxtral VLLM Plugin]

    plugin --> meta

1 cross-module dependency detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Metadata (meta.ipynb)

Metadata introspection for the Voxtral VLLM plugin used by cjm-ctl to generate the registration manifest.

Import

from cjm_transcription_plugin_voxtral_vllm.meta import (
    get_plugin_metadata
)

Functions

def get_plugin_metadata() -> Dict[str, Any]: # Plugin metadata for manifest generation
    """Return metadata required to register this plugin with the PluginManager."""
    # Calculate default DB path relative to the environment
    base_path = os.path.dirname(os.path.dirname(sys.executable))
    data_dir = os.path.join(base_path, "data")
    db_path = os.path.join(data_dir, "voxtral_vllm_transcriptions.db")

    # Ensure data directory exists
    os.makedirs(data_dir, exist_ok=True)

    return {
        "name": "cjm-transcription-plugin-voxtral-vllm",
        # ... remaining fields are truncated in this listing
    }

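The default database path computed in `get_plugin_metadata` is derived from the interpreter location: two `dirname` calls on `sys.executable` give the environment root, assuming the usual layout where the interpreter lives in a `bin/` directory, and the database sits in a `data` directory beneath it. The following is a minimal sketch of that path arithmetic only. `default_db_path` is a hypothetical helper name, not part of the package, and unlike the real function it does not create the directory.

```python
import os
import sys


def default_db_path(executable: str = sys.executable) -> str:
    """Mirror the path arithmetic shown above (hypothetical helper, not part of the package)."""
    # <env>/bin/python -> <env>/bin -> <env>
    base_path = os.path.dirname(os.path.dirname(executable))
    return os.path.join(base_path, "data", "voxtral_vllm_transcriptions.db")


# Example with an illustrative POSIX-style interpreter path.
example = default_db_path("/opt/envs/voxtral/bin/python")
print(example)
```

Because the path is tied to the active environment, each virtual environment would get its own database file under this scheme.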
Voxtral VLLM Plugin (plugin.ipynb)

Plugin implementation for Mistral Voxtral transcription through a vLLM server

Import

from cjm_transcription_plugin_voxtral_vllm.plugin import (
    VLLMServer,
    VoxtralVLLMPluginConfig,
    VoxtralVLLMPlugin
)

Functions

@patch
def supports_streaming(
    self: VoxtralVLLMPlugin # The plugin instance
) -> bool: # True if streaming is supported
    "Check if this plugin supports streaming transcription."

@patch
def execute_stream(
    self: VoxtralVLLMPlugin, # The plugin instance
    audio: Union[AudioData, str, Path], # Audio data or path to audio file
    **kwargs # Additional plugin-specific parameters
) -> Generator[str, None, TranscriptionResult]: # Yields text chunks, returns final result
    "Stream transcription results chunk by chunk."

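The return annotation of `execute_stream`, `Generator[str, None, TranscriptionResult]`, means the text chunks are yielded while the final result is the generator's return value. A plain `for` loop discards that return value, so a caller who wants both needs to catch `StopIteration`. The sketch below shows the pattern with a stand-in generator. `fake_stream` and `consume` are hypothetical names, and a plain dict stands in for `TranscriptionResult`.

```python
from typing import Generator, List, Tuple


def fake_stream() -> Generator[str, None, dict]:
    """Stand-in with the same yield/return shape as execute_stream (hypothetical)."""
    yield "Hello, "
    yield "world."
    # The value after `return` travels on StopIteration.value.
    return {"text": "Hello, world."}


def consume(stream: Generator[str, None, dict]) -> Tuple[List[str], dict]:
    """Collect every yielded chunk and the generator's return value."""
    chunks: List[str] = []
    while True:
        try:
            chunks.append(next(stream))
        except StopIteration as stop:
            return chunks, stop.value


chunks, final = consume(fake_stream())
print("".join(chunks))
print(final["text"])
```

Inside another generator, `result = yield from stream` captures the same return value while re-yielding the chunks.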
Classes

class VLLMServer:
    "vLLM server manager for Voxtral models."
    
    def __init__(
            self,
            model: str = "mistralai/Voxtral-Mini-3B-2507", # Model name to serve
            port: int = 8000, # Port for the server
            host: str = "0.0.0.0", # Host address to bind to
            gpu_memory_utilization: float = 0.85, # Fraction of GPU memory to use
            log_level: str = "INFO", # Logging level (DEBUG, INFO, WARNING, ERROR)
            capture_logs: bool = True, # Whether to capture and display server logs
            **kwargs # Additional vLLM server arguments
        )
    
    def add_log_callback(
            self, 
            callback: Callable[[str], None] # Function that receives log line strings
        ) -> None: # Returns nothing
        "Add a callback function to receive each log line."
    
    def start(
            self, 
            wait_for_ready: bool = True, # Wait for server to be ready before returning
            timeout: int = 120, # Maximum seconds to wait for server readiness
            show_progress: bool = True # Show progress indicators during startup
        ) -> None: # Returns nothing
        "Start the vLLM server."
    
    def stop(self) -> None: # Returns nothing
        "Stop the vLLM server."
        if self.process and self.process.poll() is None:
            ...  # remainder of the body is truncated in this listing
    
    def restart(self) -> None: # Returns nothing
        "Restart the server."
        self.stop()
        time.sleep(2)
        self.start()
    
    def is_running(self) -> bool: # True if server is running and responsive
        "Check if server is running and responsive."
    
    def get_recent_logs(
            self, 
            n: int = 100 # Number of recent log lines to retrieve
        ) -> List[str]: # List of recent log lines
        "Get the most recent n log lines."
    
    def get_metrics_from_logs(self) -> dict: # Dictionary with performance metrics
        "Parse recent logs to extract performance metrics."
        metrics = {
            "prompt_throughput": 0.0,
            # ... remaining keys are truncated in this listing
        }
    
    def tail_logs(
            self, 
            follow: bool = True, # Continue displaying new logs as they arrive
            n: int = 10 # Number of initial lines to display
        ) -> None: # Returns nothing
        "Tail the server logs (similar to tail -f)."

@dataclass
class VoxtralVLLMPluginConfig:
    "Configuration for Voxtral VLLM transcription plugin."
    
    model_id: str = field(...)
    device: str = field(...)
    server_mode: str = field(...)
    server_url: str = field(...)
    server_port: int = field(...)
    gpu_memory_utilization: float = field(...)
    max_model_len: int = field(...)
    language: Optional[str] = field(...)
    temperature: float = field(...)
    streaming: bool = field(...)
    server_startup_timeout: int = field(...)
    auto_start_server: bool = field(...)
    capture_server_logs: bool = field(...)
    dtype: str = field(...)
    tensor_parallel_size: int = field(...)
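The `server_startup_timeout` field and the `timeout` parameter of `VLLMServer.start` both bound how long to wait for the server to become ready. How the package implements that wait is not shown in this listing. The following is only a sketch of one common approach, polling a readiness check against a monotonic deadline. `wait_until_ready` and `probe` are hypothetical names; in practice the probe might wrap `is_running()`.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],  # hypothetical readiness check, e.g. a wrapper around is_running()
    timeout: float = 120.0,     # maximum seconds to wait
    interval: float = 0.5,      # seconds to sleep between probes
) -> bool:
    """Poll `probe` until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated server that becomes responsive on the third probe.
attempts = {"count": 0}


def fake_probe() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until_ready(fake_probe, timeout=5.0, interval=0.01)
never_ready = wait_until_ready(lambda: False, timeout=0.05, interval=0.01)
print(ready, never_ready)
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline unaffected by system clock adjustments during a long model load.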

class VoxtralVLLMPlugin:
    "Mistral Voxtral transcription plugin via vLLM server."
    
    def __init__(self):
        "Initialize the Voxtral VLLM plugin with default configuration."
        self.logger = logging.getLogger(f"{__name__}.{type(self).__name__}")
        self.config: VoxtralVLLMPluginConfig = None
    
    @property
    def name(self) -> str: # The plugin name identifier
        "Get the plugin name identifier."
        return "voxtral_vllm"
    
    @property
    def version(self) -> str: # The plugin version string
        "Get the plugin version string."
        return "1.0.0"
    
    @property
    def supported_formats(self) -> List[str]: # List of supported audio formats
        "Get the list of supported audio file formats."
        return ["wav", "mp3", "flac", "m4a", "ogg", "webm", "mp4", "avi", "mov"]
    
    def get_current_config(self) -> Dict[str, Any]: # Current configuration as dictionary
        "Return current configuration state."
        if not self.config:
            ...  # remainder of the body is truncated in this listing
    
    def get_config_schema(self) -> Dict[str, Any]: # JSON Schema for configuration
        "Return JSON Schema for UI generation."
        return dataclass_to_jsonschema(VoxtralVLLMPluginConfig)
    
    @staticmethod
    def get_config_dataclass() -> VoxtralVLLMPluginConfig: # Configuration dataclass
        "Return dataclass describing the plugin's configuration options."
        return VoxtralVLLMPluginConfig
    
    def initialize(
            self,
            config: Optional[Any] = None # Configuration dataclass, dict, or None
        ) -> None:
        "Initialize or re-configure the plugin (idempotent)."
    
    def execute(
            self,
            audio: Union[AudioData, str, Path], # Audio data or path to audio file to transcribe
            **kwargs # Additional arguments to override config
        ) -> TranscriptionResult: # Transcription result with text and metadata
        "Transcribe audio using Voxtral via vLLM."
    
    def is_available(self) -> bool: # True if vLLM and dependencies are available
        "Check if vLLM and required dependencies are available."
        if not OPENAI_AVAILABLE:
            ...  # remainder of the body is truncated in this listing
    
    def cleanup(self) -> None:
        "Clean up resources."
        self.logger.info("Cleaning up Voxtral VLLM plugin")
        
        # Stop managed server if running
        if self.config and self.config.server_mode == "managed" and self.server:
            ...  # remainder of the body is truncated in this listing

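`initialize` is documented as accepting a configuration dataclass, a dict, or None, and `execute` takes `**kwargs` described as overrides of the stored configuration. The listing does not show how the plugin handles either case, so the following is a sketch of one way to do it with the standard `dataclasses` module. `DemoConfig`, `normalize_config`, and `with_overrides` are hypothetical names. The field names are borrowed from `VoxtralVLLMPluginConfig`, but the defaults are placeholders, not the package's values.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Optional


@dataclass
class DemoConfig:
    """Hypothetical stand-in config; defaults are placeholders, not the package's."""
    language: Optional[str] = None
    temperature: float = 0.0
    streaming: bool = False


def normalize_config(config: Optional[Any] = None) -> DemoConfig:
    """Accept a DemoConfig, a dict, or None and return a DemoConfig."""
    if config is None:
        return DemoConfig()
    if isinstance(config, DemoConfig):
        return config
    if isinstance(config, dict):
        known = {f.name for f in fields(DemoConfig)}
        unknown = sorted(set(config) - known)
        if unknown:
            raise ValueError(f"Unknown config keys: {unknown}")
        return DemoConfig(**config)
    raise TypeError(f"Unsupported config type: {type(config).__name__}")


def with_overrides(base: DemoConfig, **overrides: Any) -> DemoConfig:
    """Return a copy of `base` with recognized keyword overrides applied.

    Keys that are not config fields are ignored in this sketch; a real
    plugin might forward them elsewhere or reject them.
    """
    known = {f.name for f in fields(base)}
    applied = {k: v for k, v in overrides.items() if k in known}
    return replace(base, **applied)


stored = normalize_config({"streaming": True})
per_call = with_overrides(stored, language="en", unrelated_option=1)
print(stored)
print(per_call)
```

`dataclasses.replace` returns a new instance, so a per-call override does not leak into the stored configuration used by later calls.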