Skip to main content

A flexible plugin system for audio transcription intended to make it easy to add support for multiple backends.

Project description

cjm-transcription-plugin-system

Install

pip install cjm_transcription_plugin_system

Project Structure

nbs/
├── core.ipynb             # DTOs for audio transcription with FileBackedDTO support for zero-copy transfer
├── plugin_interface.ipynb # Domain-specific plugin interface for audio transcription
└── storage.ipynb          # Standardized SQLite storage for transcription results with content hashing

Total: 3 notebooks

Module Dependencies

graph LR
    core[core<br/>Core Data Structures]
    plugin_interface[plugin_interface<br/>Transcription Plugin Interface]
    storage[storage<br/>Transcription Storage]

    plugin_interface --> core

1 cross-module dependencies detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Core Data Structures (core.ipynb)

DTOs for audio transcription with FileBackedDTO support for zero-copy transfer

Import

from cjm_transcription_plugin_system.core import (
    AudioData,
    TranscriptionResult
)

Classes

@dataclass
class AudioData:
    """
    Container for raw audio data.
    Implements FileBackedDTO for zero-copy transfer between Host and Worker processes.
    """
    
    samples: np.ndarray  # Audio sample data as numpy array
    sample_rate: int  # Sample rate in Hz (e.g., 16000, 44100)
    
    def to_temp_file(self) -> str: # Absolute path to temporary WAV file
            """Save audio to a temp file for zero-copy transfer to Worker process."""
            # Create temp file (delete=False so Worker can read it)
            tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
            
            # Ensure float32 format
            audio = self.samples
            if audio.dtype != np.float32
        "Save audio to a temp file for zero-copy transfer to Worker process."
    
    def to_dict(self) -> Dict[str, Any]: # Serialized representation
            """Convert to dictionary for smaller payloads."""
            return {
                "samples": self.samples.tolist(),
        "Convert to dictionary for smaller payloads."
    
    def from_file(
            cls,
            filepath: str # Path to audio file
        ) -> "AudioData": # AudioData instance
        "Load audio from a file."
@dataclass
class TranscriptionResult:
    "Standardized output for all transcription plugins."
    
    text: str  # The transcribed text
    confidence: Optional[float]  # Overall confidence (0.0 to 1.0)
    segments: Optional[List[Dict[str, Any]]]  # Timestamped segments
    metadata: Dict[str, Any] = field(...)  # Additional metadata

Transcription Plugin Interface (plugin_interface.ipynb)

Domain-specific plugin interface for audio transcription

Import

from cjm_transcription_plugin_system.plugin_interface import (
    TranscriptionPlugin
)

Classes

class TranscriptionPlugin(PluginInterface):
    """
    Abstract base class for all transcription plugins.
    
    Extends PluginInterface with transcription-specific requirements:
    - `supported_formats`: List of audio file extensions this plugin can handle
    - `execute`: Accepts audio path (str) or AudioData, returns TranscriptionResult
    
    NOTE: When running via RemotePluginProxy, AudioData objects are automatically
    serialized to temp files via FileBackedDTO, so the Worker receives a file path.
    """
    
    def supported_formats(self) -> List[str]: # e.g., ['wav', 'mp3', 'flac']
            """List of supported audio file extensions (without the dot)."""
            ...
    
        @abstractmethod
        def execute(
            self,
            audio: Union[AudioData, str, Path], # Audio data or file path
            **kwargs
        ) -> TranscriptionResult: # Transcription result with text, confidence, segments
        "List of supported audio file extensions (without the dot)."
    
    def execute(
            self,
            audio: Union[AudioData, str, Path], # Audio data or file path
            **kwargs
        ) -> TranscriptionResult: # Transcription result with text, confidence, segments
        "Transcribe audio to text.

When called via Proxy, AudioData is auto-converted to a file path string
before reaching this method in the Worker process."

Transcription Storage (storage.ipynb)

Standardized SQLite storage for transcription results with content hashing

Import

from cjm_transcription_plugin_system.storage import (
    TranscriptionRow,
    TranscriptionStorage
)

Classes

@dataclass
class TranscriptionRow:
    "A single row from the transcriptions table."
    
    job_id: str  # Unique job identifier
    audio_path: str  # Path to the source audio file
    audio_hash: str  # Hash of source audio in "algo:hexdigest" format
    text: str  # Transcribed text output
    text_hash: str  # Hash of transcribed text in "algo:hexdigest" format
    segments: Optional[List[Dict[str, Any]]]  # Timestamped segments
    metadata: Optional[Dict[str, Any]]  # Plugin metadata
    created_at: Optional[float]  # Unix timestamp
class TranscriptionStorage:
    def __init__(
        self,
        db_path: str  # Absolute path to the SQLite database file
    )
    "Standardized SQLite storage for transcription results."
    
    def __init__(
            self,
            db_path: str  # Absolute path to the SQLite database file
        )
        "Initialize storage and create table if needed."
    
    def save(
            self,
            job_id: str,        # Unique job identifier
            audio_path: str,    # Path to the source audio file
            audio_hash: str,    # Hash of source audio in "algo:hexdigest" format
            text: str,          # Transcribed text output
            text_hash: str,     # Hash of transcribed text in "algo:hexdigest" format
            segments: Optional[List[Dict[str, Any]]] = None,  # Timestamped segments
            metadata: Optional[Dict[str, Any]] = None         # Plugin metadata
        ) -> None
        "Save a transcription result to the database."
    
    def get_by_job_id(
            self,
            job_id: str  # Job identifier to look up
        ) -> Optional[TranscriptionRow]:  # Row or None if not found
        "Retrieve a transcription result by job ID."
    
    def list_jobs(
            self,
            limit: int = 100  # Maximum number of rows to return
        ) -> List[TranscriptionRow]:  # List of transcription rows
        "List transcription jobs ordered by creation time (newest first)."
    
    def verify_audio(
            self,
            job_id: str  # Job identifier to verify
        ) -> Optional[bool]:  # True if audio matches, False if tampered, None if job not found
        "Verify the source audio file still matches its stored hash."
    
    def verify_text(
            self,
            job_id: str  # Job identifier to verify
        ) -> Optional[bool]:  # True if text matches, False if tampered, None if job not found
        "Verify the transcription text still matches its stored hash."

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cjm_transcription_plugin_system-0.0.13.tar.gz (13.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file cjm_transcription_plugin_system-0.0.13.tar.gz.

File metadata

File hashes

Hashes for cjm_transcription_plugin_system-0.0.13.tar.gz
Algorithm Hash digest
SHA256 75b27e774ec1667d48e1eca5c497ceb2a56b04c761eb1905bd0a7d10507c6c88
MD5 e1263c12f93beeb5d028d5e0153c201f
BLAKE2b-256 1b78cce94a4bf9e9c103680a85fa07dae0da6695c17a75c91396d3c81dceb025

See more details on using hashes here.

File details

Details for the file cjm_transcription_plugin_system-0.0.13-py3-none-any.whl.

File metadata

File hashes

Hashes for cjm_transcription_plugin_system-0.0.13-py3-none-any.whl
Algorithm Hash digest
SHA256 9129c982d839f6b1ee644a775fc4dc7678e898b58d851e9927ae00c1a7b9ef9b
MD5 922e4743e940250ad32e350e9368ef51
BLAKE2b-256 0b2ea3516121b2afe334fe7714de4af0206655de7c9f8909960edb3acbf6f539

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page