A flexible plugin system for audio transcription intended to make it easy to add support for multiple backends.

cjm-transcription-plugin-system

Install

pip install cjm_transcription_plugin_system

Project Structure

nbs/
├── core.ipynb             # DTOs for audio transcription with FileBackedDTO support for zero-copy transfer
└── plugin_interface.ipynb # Domain-specific plugin interface for audio transcription

Total: 2 notebooks

Module Dependencies

graph LR
    core[core<br/>Core Data Structures]
    plugin_interface[plugin_interface<br/>Transcription Plugin Interface]

    plugin_interface --> core

1 cross-module dependency detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Core Data Structures (core.ipynb)

DTOs for audio transcription with FileBackedDTO support for zero-copy transfer

Import

from cjm_transcription_plugin_system.core import (
    AudioData,
    TranscriptionResult
)

Classes

@dataclass
class AudioData:
    """
    Container for raw audio data.
    Implements FileBackedDTO for zero-copy transfer between Host and Worker processes.
    """
    
    samples: np.ndarray  # Audio sample data as numpy array
    sample_rate: int  # Sample rate in Hz (e.g., 16000, 44100)
    
    def to_temp_file(self) -> str: # Absolute path to temporary WAV file
        """Save audio to a temp file for zero-copy transfer to Worker process."""
        # Create temp file (delete=False so Worker can read it)
        tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)

        # Ensure float32 format
        audio = self.samples
        if audio.dtype != np.float32:
            ...
    
    def to_dict(self) -> Dict[str, Any]: # Serialized representation
        """Convert to dictionary for smaller payloads."""
        return {
            "samples": self.samples.tolist(),
            # ...
        }
    
    @classmethod
    def from_file(
        cls,
        filepath: str # Path to audio file
    ) -> "AudioData": # AudioData instance
        "Load audio from a file."

@dataclass
class TranscriptionResult:
    "Standardized output for all transcription plugins."
    
    text: str  # The transcribed text
    confidence: Optional[float]  # Overall confidence (0.0 to 1.0)
    segments: Optional[List[Dict[str, Any]]]  # Timestamped segments
    metadata: Dict[str, Any] = field(...)  # Additional metadata
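
The file-backed transfer described for `AudioData.to_temp_file` can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: `AudioDataSketch` and `worker_read` are hypothetical names, the samples are a plain list rather than an `np.ndarray`, and the file is written as 16-bit PCM because the stdlib `wave` module cannot write float32. The package's own implementation may differ on all of these points.

```python
import math
import struct
import tempfile
import wave
from dataclasses import dataclass
from typing import List


@dataclass
class AudioDataSketch:
    """Hypothetical stand-in for AudioData (plain list instead of np.ndarray)."""

    samples: List[float]  # Mono samples in the range [-1.0, 1.0]
    sample_rate: int      # Sample rate in Hz

    def to_temp_file(self) -> str:
        """Write the samples to a temporary WAV file and return its path."""
        # delete=False keeps the file on disk so another process can open it
        tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
        tmp.close()
        # Assumption: clip and convert to 16-bit PCM (not the float32 path in the package)
        pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in self.samples]
        with wave.open(tmp.name, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(self.sample_rate)
            wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
        return tmp.name


def worker_read(path: str) -> dict:
    """Worker side: open the file by path instead of receiving the samples."""
    with wave.open(path, "rb") as wav:
        return {"sample_rate": wav.getframerate(), "frames": wav.getnframes()}


# One second of a 440 Hz tone at 16 kHz
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
path = AudioDataSketch(samples=tone, sample_rate=16000).to_temp_file()
info = worker_read(path)
```

Only the path string crosses the process boundary in this sketch; the caller is responsible for removing the temporary file once the worker has finished with it.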

Transcription Plugin Interface (plugin_interface.ipynb)

Domain-specific plugin interface for audio transcription

Import

from cjm_transcription_plugin_system.plugin_interface import (
    TranscriptionPlugin
)

Classes

class TranscriptionPlugin(PluginInterface):
    """
    Abstract base class for all transcription plugins.
    
    Extends PluginInterface with transcription-specific requirements:
    - `supported_formats`: List of audio file extensions this plugin can handle
    - `execute`: Accepts audio path (str) or AudioData, returns TranscriptionResult
    
    NOTE: When running via RemotePluginProxy, AudioData objects are automatically
    serialized to temp files via FileBackedDTO, so the Worker receives a file path.
    """
    
    def supported_formats(self) -> List[str]: # e.g., ['wav', 'mp3', 'flac']
        """List of supported audio file extensions (without the dot)."""
        ...
    
    @abstractmethod
    def execute(
        self,
        audio: Union[AudioData, str, Path], # Audio data or file path
        **kwargs
    ) -> TranscriptionResult: # Transcription result with text, confidence, segments
        """
        Transcribe audio to text.

        When called via Proxy, AudioData is auto-converted to a file path string
        before reaching this method in the Worker process.
        """
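
To show the shape of a backend, here is a hedged, self-contained sketch of a toy plugin. It does not import the package: `ResultSketch` and `TranscriptionPluginSketch` are hypothetical stand-ins that mirror only the fields and methods listed above, and the real `TranscriptionPlugin` also inherits whatever `PluginInterface` requires, which is not shown here. The `None` and empty-dict defaults on `ResultSketch`, the segment keys (`start`, `end`, `text`), and writing `supported_formats` as a plain method are all assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List, Optional, Union


@dataclass
class ResultSketch:
    """Stand-in mirroring the documented TranscriptionResult fields."""

    text: str
    confidence: Optional[float] = None                      # Assumed default
    segments: Optional[List[Dict[str, Any]]] = None         # Assumed default
    metadata: Dict[str, Any] = field(default_factory=dict)  # Assumed default


class TranscriptionPluginSketch(ABC):
    """Hypothetical stand-in for the abstract plugin base class."""

    @abstractmethod
    def supported_formats(self) -> List[str]:
        """Audio file extensions without the dot."""

    @abstractmethod
    def execute(self, audio: Union[str, Path], **kwargs) -> ResultSketch:
        """Transcribe the audio file at the given path."""


class FilenamePlugin(TranscriptionPluginSketch):
    """Toy backend: 'transcribes' a file by echoing its file name."""

    def supported_formats(self) -> List[str]:
        return ["wav", "mp3", "flac"]

    def execute(self, audio: Union[str, Path], **kwargs) -> ResultSketch:
        path = Path(audio)
        ext = path.suffix.lstrip(".").lower()
        # Reject extensions the backend does not declare
        if ext not in self.supported_formats():
            raise ValueError(f"Unsupported format: {ext!r}")
        text = path.stem.replace("_", " ")
        return ResultSketch(
            text=text,
            confidence=None,  # this toy backend has no confidence score
            segments=[{"start": 0.0, "end": 0.0, "text": text}],
            metadata={"backend": "filename-echo", **kwargs},
        )


result = FilenamePlugin().execute("hello_world.wav", language="en")
print(result.text)  # prints: hello world
```

The sketch accepts only a path because, as the docstring above notes, a plugin running behind the proxy receives a file path string rather than an `AudioData` object. A plugin that is also called in-process would need to handle `AudioData` input as well.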

Download files

Source Distribution

cjm_transcription_plugin_system-0.0.12.tar.gz (10.3 kB)

Algorithm Hash digest
SHA256 f0ba81b819f099485498d1f1e00ab735f6054611242e54b3eded971aa1407f5a
MD5 b25657b3ed7f553397816677f182b86a
BLAKE2b-256 e5acdfee33056fee9cd2aba3051969885e791ee91adddc2ea61ab54b7aba7339

Built Distribution

cjm_transcription_plugin_system-0.0.12-py3-none-any.whl

Algorithm Hash digest
SHA256 3aad5fa7938a7f5dfa465f0e1095199502911048b4ede6e918b7c6973f9c6438
MD5 203495ef5a24d243d46bfcd399f9b188
BLAKE2b-256 7157c7a2ed57799bb5a052174b277b53f6cf8737fdc83e6e247b01ca2d2dfb20