A self-contained single-file transcription workflow for FastHTML applications.

cjm-fasthtml-workflow-transcription-single-file

Install

pip install cjm_fasthtml_workflow_transcription_single_file

Project Structure

nbs/
├── components/ (3)
│   ├── processor.ipynb  # UI component for displaying transcription in-progress state
│   ├── results.ipynb    # UI components for displaying transcription results and errors
│   └── steps.ipynb      # UI components for workflow step rendering (plugin selection, file selection, confirmation)
├── core/ (5)
│   ├── adapters.ipynb     # Adapter implementations for integrating with plugin registries
│   ├── config.ipynb       # Configuration dataclass for single-file transcription workflow
│   ├── html_ids.ipynb     # Centralized HTML ID constants for single-file transcription workflow components
│   ├── job_tracker.ipynb  # Lightweight job state tracking for transcription workflows
│   └── protocols.ipynb    # Protocol definitions for external dependencies and plugin integration
├── settings/ (2)
│   ├── components.ipynb  # UI components for workflow settings modal and forms
│   └── schemas.ipynb     # JSON schemas and utilities for workflow settings
├── storage/ (3)
│   ├── config.ipynb        # Configuration for transcription result storage
│   ├── file_storage.ipynb  # File-based storage for transcription results
│   └── protocols.ipynb     # Protocol definitions for result storage backends
└── workflow/ (3)
    ├── job_handler.ipynb  # Functions for starting transcription jobs and handling SSE streaming
    ├── routes.ipynb       # Route initialization and handlers for the single-file transcription workflow
    └── workflow.ipynb     # Main workflow class orchestrating all subsystems for single-file transcription

Total: 16 notebooks across 5 directories

Module Dependencies

graph LR
    components_processor[components.processor<br/>Processor Component]
    components_results[components.results<br/>Results Components]
    components_steps[components.steps<br/>Step Components]
    core_adapters[core.adapters<br/>Adapters]
    core_config[core.config<br/>Configuration]
    core_html_ids[core.html_ids<br/>HTML IDs]
    core_job_tracker[core.job_tracker<br/>Job Tracker]
    core_protocols[core.protocols<br/>Protocols]
    settings_components[settings.components<br/>Settings Components]
    settings_schemas[settings.schemas<br/>Settings Schemas]
    storage_config[storage.config<br/>Storage Configuration]
    storage_file_storage[storage.file_storage<br/>Result Storage]
    storage_protocols[storage.protocols<br/>Storage Protocols]
    workflow_job_handler[workflow.job_handler<br/>Job Handler]
    workflow_routes[workflow.routes<br/>Workflow Routes]
    workflow_workflow[workflow.workflow<br/>Single File Transcription Workflow]

    components_processor --> core_config
    components_processor --> core_html_ids
    components_results --> core_config
    components_results --> core_html_ids
    components_steps --> core_config
    components_steps --> core_html_ids
    components_steps --> core_protocols
    core_adapters --> core_protocols
    core_adapters --> core_config
    core_config --> core_html_ids
    core_config --> storage_config
    settings_schemas --> core_config
    settings_schemas --> storage_config
    storage_file_storage --> storage_config
    workflow_job_handler --> core_job_tracker
    workflow_job_handler --> components_processor
    workflow_job_handler --> workflow_workflow
    workflow_job_handler --> core_protocols
    workflow_job_handler --> components_results
    workflow_job_handler --> core_config
    workflow_job_handler --> core_html_ids
    workflow_job_handler --> storage_file_storage
    workflow_routes --> components_processor
    workflow_routes --> workflow_job_handler
    workflow_routes --> workflow_workflow
    workflow_routes --> components_results
    workflow_routes --> components_steps
    workflow_routes --> core_html_ids
    workflow_workflow --> core_job_tracker
    workflow_workflow --> storage_file_storage
    workflow_workflow --> components_steps
    workflow_workflow --> core_config
    workflow_workflow --> core_html_ids
    workflow_workflow --> core_adapters

34 cross-module dependencies detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Adapters (adapters.ipynb)

Adapter implementations for integrating with plugin registries

Import

from cjm_fasthtml_workflow_transcription_single_file.core.adapters import (
    PluginRegistryAdapter
)

Classes

class PluginRegistryAdapter:
    def __init__(self,
                 plugin_manager: PluginManager,  # The PluginManager instance to wrap
                 config: SingleFileWorkflowConfig, # The workflow config instance
                 category: str = "transcription"  # Plugin category to filter by
                 )
    "Adapts PluginManager to workflow's PluginRegistryProtocol."
    
    def __init__(self,
                     plugin_manager: PluginManager,  # The PluginManager instance to wrap
                     config: SingleFileWorkflowConfig, # The workflow config instance
                     category: str = "transcription"  # Plugin category to filter by
                     )
        "Initialize the adapter."
    
    def get_configured_plugins(self) -> List[PluginInfo]:  # List of PluginInfo for configured plugins
        "Get all configured transcription plugins (all discovered are considered configured)."
        metas = self._manager.get_discovered_by_category(self._category)
        return [self._meta_to_info(meta) for meta in metas]
    
    def get_all_plugins(self) -> List[PluginInfo]:  # List of PluginInfo for all discovered plugins
        "Get all discovered transcription plugins."
        metas = self._manager.get_discovered_by_category(self._category)
        return [self._meta_to_info(meta) for meta in metas]
    
    def get_plugin(self,
                       plugin_id: str  # Unique plugin identifier (name)
                       ) -> Optional[PluginInfo]:  # PluginInfo if found, None otherwise
        "Get a specific plugin by ID."
    
    def get_plugin_config(self,
                              plugin_id: str  # Unique plugin identifier
                              ) -> Dict[str, Any]:  # Configuration dictionary
        "Get the configuration for a plugin."

Settings Components (components.ipynb)

UI components for workflow settings modal and forms

Import

from cjm_fasthtml_workflow_transcription_single_file.settings.components import (
    settings_trigger_button,
    simple_settings_form,
    settings_modal
)

Functions

def settings_trigger_button(
    modal_id:str,                  # ID of the modal to trigger
    label:str="Settings",          # Button label text
    button_cls:Optional[str]=None  # Optional additional button classes
) -> FT:                           # Button element that triggers the modal
    "Create a button that opens the settings modal."
def simple_settings_form(
    directories:list,        # List of media directories
    auto_save:bool,          # Current auto-save setting
    results_directory:str,   # Current results directory
    save_url:str,            # URL to POST settings to
    target_id:str,           # Target element ID for HTMX response
    modal_id:str             # Modal ID for close button
) -> FT:                     # Simple form element
    "Create a simple settings form without full schema generation."
def settings_modal(
    modal_id:str,                    # ID for the modal element
    schema:Dict[str, Any],           # JSON schema for settings
    current_values:Dict[str, Any],   # Current settings values
    save_url:str,                    # URL to POST settings to
    target_id:str                    # Target element ID for HTMX response
) -> FT:                             # Modal dialog with settings form
    "Create the settings modal with form."

Configuration (config.ipynb)

Configuration dataclass for single-file transcription workflow

Import

from cjm_fasthtml_workflow_transcription_single_file.core.config import (
    DEFAULT_WORKFLOW_CONFIG_DIR,
    SingleFileWorkflowConfig
)

Classes

@dataclass
class SingleFileWorkflowConfig:
    "Configuration for single-file transcription workflow."
    
    workflow_id: str = 'single_file_transcription'  # Unique identifier for this workflow
    worker_type: str = 'transcription:single_file'  # Worker process type identifier
    route_prefix: str = '/single_file'  # Base URL prefix for workflow routes
    stepflow_prefix: str = '/flow'  # Sub-prefix for StepFlow routes
    media_prefix: str = '/media'  # Sub-prefix for media browser routes
    container_id: str = SingleFileHtmlIds.WORKFLOW_CONTAINER  # HTML ID for main workflow container
    show_progress: bool = True  # Show step progress indicator
    max_files_displayed: int = 50  # Maximum files to show in simple file selector
    export_formats: List[str] = field(...)  # Available export formats
    no_plugins_redirect: Optional[str]  # URL to redirect when no plugins configured
    no_files_redirect: Optional[str]  # URL to redirect when no media files found
    sse_poll_interval: float = 2.0  # Seconds between SSE status checks
    gpu_memory_threshold_percent: float = 45.0  # Max GPU memory % before blocking new jobs
    config_dir: Path = field(...)  # Directory for workflow settings
    plugin_config_dir: Path = field(...)  # Directory for plugin configs
    plugin_category: str = 'transcription'  # Plugin category for this workflow
    media: BrowserConfig = field(...)  # File browser settings for media files
    storage: StorageConfig = field(...)  # Result storage settings
    
    def get_full_stepflow_prefix(self) -> str:  # Combined route_prefix + stepflow_prefix
        "Get the full prefix for the StepFlow router."
        return f"{self.route_prefix}{self.stepflow_prefix}"
    
    def get_full_media_prefix(self) -> str:  # Combined route_prefix + media_prefix
        "Get the full prefix for the media router."
        return f"{self.route_prefix}{self.media_prefix}"
    
    @classmethod
    def from_saved_config(
            cls,
            config_dir: Optional[Path] = None,  # Directory to load config from
            **overrides  # Override specific config values
        ) -> "SingleFileWorkflowConfig":  # Configured instance with saved values merged with defaults
        "Create config by loading saved settings and merging with defaults."

Variables

DEFAULT_WORKFLOW_CONFIG_DIR
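
The prefix helpers on `SingleFileWorkflowConfig` combine `route_prefix` with a sub-prefix. The sketch below reproduces just that behaviour with a stand-in dataclass so it runs without the package installed. `PrefixConfigSketch` is a hypothetical name used for illustration only; the field names and defaults are copied from the listing above.

```python
from dataclasses import dataclass


@dataclass
class PrefixConfigSketch:
    """Hypothetical stand-in holding only the three prefix fields."""
    route_prefix: str = "/single_file"  # Base URL prefix for workflow routes
    stepflow_prefix: str = "/flow"      # Sub-prefix for StepFlow routes
    media_prefix: str = "/media"        # Sub-prefix for media browser routes

    def get_full_stepflow_prefix(self) -> str:
        # Plain concatenation: base prefix followed by the sub-prefix
        return f"{self.route_prefix}{self.stepflow_prefix}"

    def get_full_media_prefix(self) -> str:
        return f"{self.route_prefix}{self.media_prefix}"


default_cfg = PrefixConfigSketch()
custom_cfg = PrefixConfigSketch(route_prefix="/transcribe")

print(default_cfg.get_full_stepflow_prefix())  # /single_file/flow
print(custom_cfg.get_full_media_prefix())      # /transcribe/media
```

Because the helpers only concatenate, changing `route_prefix` moves both the StepFlow and the media routes together.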

Storage Configuration (config.ipynb)

Configuration for transcription result storage

Import

from cjm_fasthtml_workflow_transcription_single_file.storage.config import (
    STORAGE_CONFIG_SCHEMA,
    StorageConfig
)

Classes

@dataclass
class StorageConfig:
    "Result storage configuration."
    
    __schema_name__: ClassVar[str] = 'storage'
    __schema_title__: ClassVar[str] = 'Storage Settings'
    __schema_description__: ClassVar[str] = 'Configure transcription result storage'
    auto_save: bool = field(...)
    results_directory: str = field(...)

Variables

STORAGE_CONFIG_SCHEMA

Result Storage (file_storage.ipynb)

File-based storage for transcription results
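
As a rough illustration of what file-based result storage involves, the sketch below writes one result to a JSON file and reads it back, returning `None` when loading fails. It is not the library's implementation: the helper names, the filename scheme, and the stored keys are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def save_result_sketch(results_dir: Path, job_id: str, file_name: str, text: str) -> Path:
    """Hypothetical helper: write one result as a JSON file and return its path."""
    results_dir.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    # Assumed naming scheme; the package's own filename generation may differ
    result_file = results_dir / f"{Path(file_name).stem}_{job_id}.json"
    payload = {"job_id": job_id, "file_name": file_name, "text": text}  # assumed keys
    result_file.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return result_file


def load_result_sketch(result_file: Path) -> Optional[Dict[str, Any]]:
    """Return the stored dict, or None when the file is missing or not valid JSON."""
    try:
        return json.loads(result_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_result_sketch(Path(tmp) / "results", "job-123", "talk.mp3", "hello world")
    loaded = load_result_sketch(saved_path)
    missing = load_result_sketch(Path(tmp) / "results" / "absent.json")

print(saved_path.name)  # talk_job-123.json
print(loaded["text"])   # hello world
print(missing)          # None
```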

Import

from cjm_fasthtml_workflow_transcription_single_file.storage.file_storage import (
    ResultStorage
)

Functions

@patch
def should_auto_save(
    self: ResultStorage
) -> bool:  # True if results should be automatically saved
    "Check if auto-save is enabled."
@patch
def save(
    self: ResultStorage,
    job_id: str,  # Unique job identifier
    file_path: str,  # Path to the transcribed media file
    file_name: str,  # Name of the media file
    plugin_id: str,  # Plugin unique identifier
    plugin_name: str,  # Plugin display name
    text: str,  # The transcription text
    metadata: Optional[Dict[str, Any]] = None,  # Optional metadata from the transcription plugin
    additional_info: Optional[Dict[str, Any]] = None  # Optional additional information to store
) -> Path:  # Path to the saved JSON file
    "Save a transcription result to JSON file."
@patch
def load(
    self: ResultStorage,
    result_file: Path  # Path to the JSON result file
) -> Optional[Dict[str, Any]]:  # Dictionary containing the result data, or None if error
    "Load a transcription result from JSON file."
@patch
def list_results(
    self: ResultStorage,
    sort_by: str = "timestamp",  # Field to sort by ("timestamp", "file_name", "word_count")
    reverse: bool = True  # Sort in reverse order (newest first by default)
) -> List[Dict[str, Any]]:  # List of result dictionaries
    "List all saved transcription results."
@patch
def get_by_job_id(
    self: ResultStorage,
    job_id: str  # The job identifier to search for
) -> Optional[Dict[str, Any]]:  # Result dictionary if found, None otherwise
    "Find and load a transcription result by job ID."
@patch
def delete(
    self: ResultStorage,
    result_file: str  # Path to the result file (can be full path or filename)
) -> bool:  # True if deletion successful, False otherwise
    "Delete a transcription result file."
@patch
def update_text(
    self: ResultStorage,
    result_file: str,  # Path to the result file
    new_text: str  # New transcription text
) -> bool:  # True if update successful, False otherwise
    "Update the transcription text in a saved result."
@patch
def _generate_filename(
    self: ResultStorage,
    job_id: str,  # Unique job identifier
    file_name: str  # Original media file name
) -> str:  # Generated filename for the JSON result file
    "Generate a filename for storing transcription results."

Classes

class ResultStorage:
    def __init__(self,
                 config: StorageConfig  # Storage configuration
                 )
    "File-based storage for transcription results."
    
    def __init__(self,
                     config: StorageConfig  # Storage configuration
                     )
        "Initialize the storage."
    
    def results_directory(self) -> Path:  # Path to the results directory
        "Get the results directory, creating it if needed."

HTML IDs (html_ids.ipynb)

Centralized HTML ID constants for single-file transcription workflow components

Import

from cjm_fasthtml_workflow_transcription_single_file.core.html_ids import (
    SingleFileHtmlIds
)

Classes

class SingleFileHtmlIds(InteractionHtmlIds):
    "HTML ID constants for single-file transcription workflow."
    
    def plugin_radio(plugin_id: str  # Unique plugin identifier to generate ID for
                         ) -> str:  # HTML ID for the plugin radio button
        "Generate HTML ID for a plugin radio button."
    
    def file_radio(index: int  # File index in the selection list
                       ) -> str:  # HTML ID for the file radio button
        "Generate HTML ID for a file radio button."

Job Handler (job_handler.ipynb)

Functions for starting transcription jobs and handling SSE streaming

Import

from cjm_fasthtml_workflow_transcription_single_file.workflow.job_handler import (
    get_job_session_info,
    start_transcription_job,
    create_job_stream_handler
)

Functions

def get_job_session_info(
    job_id: str,  # Unique job identifier
    job: TranscriptionJob,  # Job object from the tracker
    plugin_manager: PluginManager,  # Plugin manager for getting plugin info
) -> tuple[Dict[str, Any], Dict[str, Any]]:  # Tuple of (file_info, plugin_info) dictionaries
    "Get file and plugin info from job object and plugin manager."
def _save_job_result_once(
    job_id: str,  # Job identifier
    job: TranscriptionJob,  # Job object
    data: Dict[str, Any],  # Transcription data containing text and metadata
    plugin_manager: PluginManager,  # Plugin manager for getting plugin info
    result_storage: ResultStorage,  # Storage for saving transcription results
) -> None
    """
    Save transcription result to disk, ensuring it's only saved once per job.
    
    Called from the SSE stream handler as a fallback. The primary save mechanism
    is the workflow's `_on_job_completed` callback called by TranscriptionJobTracker.
    """
def _create_sse_swap_message(
    content,  # HTML content to wrap
    container_id: str,  # Target container ID for the swap
):  # Div with OOB swap attributes
    "Wrap content in a Div with HTMX OOB swap for SSE messages."
async def start_transcription_job(
    state: Dict[str, Any],  # Workflow state containing plugin_id, file_path, file_name, etc.
    request,  # FastHTML request object
    workflow: SingleFileTranscriptionWorkflow,  # Workflow instance providing config and dependencies
):  # transcription_in_progress component showing job status
    "Start a transcription job and return the in-progress UI component."
def create_job_stream_handler(
    job_id: str,  # Unique job identifier
    request,  # FastHTML request object
    workflow: SingleFileTranscriptionWorkflow,  # Workflow instance providing config and dependencies
):  # Async generator for SSE streaming
    "Create an SSE stream generator for monitoring job completion."

Job Tracker (job_tracker.ipynb)

Lightweight job state tracking for transcription workflows
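
The job lifecycle in this module (pending, running, completed) can be sketched with a small in-memory stand-in. `JobSketch` and `TrackerSketch` are hypothetical names, not the package's classes, and firing the completion callback from `mark_completed` is an assumption about when the real tracker invokes it. The callback signature `(job_id, tracker)` follows the type hint documented in this section.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class JobSketch:
    """Hypothetical stand-in mirroring a few TranscriptionJob fields."""
    id: str
    plugin_name: str
    file_path: str
    file_name: str
    status: str = "pending"
    result: Optional[Dict[str, Any]] = None


class TrackerSketch:
    """Hypothetical in-memory tracker; not the library's implementation."""

    def __init__(
        self,
        on_job_completed: Optional[Callable[[str, "TrackerSketch"], None]] = None,
    ) -> None:
        self.jobs: Dict[str, JobSketch] = {}
        self._on_job_completed = on_job_completed

    def create_job(self, plugin_name: str, file_path: str, file_name: str) -> JobSketch:
        job = JobSketch(str(uuid.uuid4()), plugin_name, file_path, file_name)
        self.jobs[job.id] = job
        return job

    def mark_running(self, job_id: str) -> None:
        self.jobs[job_id].status = "running"

    def mark_completed(self, job_id: str, result: Dict[str, Any]) -> None:
        job = self.jobs[job_id]
        job.status, job.result = "completed", result
        # Assumption: the callback fires as soon as the job is marked completed
        if self._on_job_completed is not None:
            self._on_job_completed(job_id, self)

    def get_running_jobs(self) -> List[JobSketch]:
        return [job for job in self.jobs.values() if job.status == "running"]


completed_ids: List[str] = []
tracker = TrackerSketch(on_job_completed=lambda job_id, _tracker: completed_ids.append(job_id))

job = tracker.create_job("voxtral_hf", "/tmp/talk.mp3", "talk.mp3")
status_log = [job.status]
tracker.mark_running(job.id)
status_log.append(job.status)
running_count = len(tracker.get_running_jobs())
tracker.mark_completed(job.id, {"text": "hello world"})
status_log.append(job.status)

print(status_log)  # ['pending', 'running', 'completed']
```

Passing the tracker itself to the callback lets a handler look up the finished job and its result without holding a separate reference.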

Import

from cjm_fasthtml_workflow_transcription_single_file.core.job_tracker import (
    TranscriptionJob,
    TranscriptionJobTracker
)

Classes

@dataclass
class TranscriptionJob:
    "Represents a transcription job's state."
    
    id: str  # Unique job identifier (UUID)
    plugin_name: str  # Plugin name for execution
    file_path: str  # Path to the audio/video file
    file_name: str  # Display name of the file
    status: str = 'pending'  # Job status: pending, running, completed, failed, cancelled
    created_at: str = field(...)  # ISO timestamp
    started_at: Optional[str]  # When execution began
    completed_at: Optional[str]  # When job finished
    result: Optional[Dict[str, Any]]  # Transcription result data
    error: Optional[str]  # Error message if failed
    metadata: Dict[str, Any] = field(...)  # Additional job metadata
    task: Optional[asyncio.Task]  # Async task handle for cancellation
class TranscriptionJobTracker:
    def __init__(
        self,
        on_job_completed: Optional[Callable[[str, 'TranscriptionJobTracker'], None]] = None,  # Completion callback
    )
    "Lightweight job state tracker for transcription workflows."
    
    def __init__(
            self,
            on_job_completed: Optional[Callable[[str, 'TranscriptionJobTracker'], None]] = None,  # Completion callback
        )
        "Initialize the job tracker."
    
    def create_job(
            self,
            plugin_name: str,  # Name of the plugin to execute
            file_path: str,  # Path to audio/video file
            file_name: str,  # Display name of the file
            **metadata  # Additional job metadata
        ) -> TranscriptionJob:  # Created job instance
        "Create a new transcription job."
    
    def mark_running(
            self,
            job_id: str,  # Job identifier
            task: Optional[asyncio.Task] = None  # Async task handle
        ) -> None
        "Mark a job as running."
    
    def mark_completed(
            self,
            job_id: str,  # Job identifier
            result: Dict[str, Any]  # Transcription result
        ) -> None
        "Mark a job as completed with result."
    
    def mark_failed(
            self,
            job_id: str,  # Job identifier
            error: str  # Error message
        ) -> None
        "Mark a job as failed with error."
    
    async def cancel_job(
            self,
            job_id: str  # Job identifier
        ) -> bool:  # True if cancellation was successful
        "Cancel a running job."
    
    def get_job(
            self,
            job_id: str  # Job identifier
        ) -> Optional[TranscriptionJob]:  # Job instance or None
        "Get a job by ID."
    
    def get_job_result(
            self,
            job_id: str  # Job identifier
        ) -> Optional[Dict[str, Any]]:  # Result dict or None
        "Get a job's result."
    
    def get_running_jobs(self) -> List[TranscriptionJob]:  # List of running jobs
        "Get all currently running jobs."
        return [job for job in self.jobs.values() if job.status == "running"]
    
    def clear_completed(
            self,
            keep_results: bool = False  # Whether to keep results in memory
        ) -> int:  # Number of jobs cleared
        "Clear completed, failed, and cancelled jobs."

Processor Component (processor.ipynb)

UI component for displaying transcription in-progress state

Import

from cjm_fasthtml_workflow_transcription_single_file.components.processor import (
    transcription_in_progress
)

Functions

def transcription_in_progress(
    job_id: str, # Unique identifier for the transcription job
    plugin_info: Dict[str, Any], # Dictionary with plugin details (id, title, supports_streaming)
    file_info: Dict[str, Any], # Dictionary with file details (name, path, type, size_str)
    config: SingleFileWorkflowConfig, # Workflow configuration
    router: APIRouter, # Workflow router for generating route URLs
) -> FT: # FastHTML component showing progress and SSE connection
    "Render transcription in-progress view with SSE updates."

Protocols (protocols.ipynb)

Protocol definitions for external dependencies and plugin integration
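
Because the registry is described as a `runtime_checkable` protocol, any object with the right methods can stand in for a real plugin registry, for example in tests. The sketch below uses local copies of the documented shapes (`PluginInfoSketch`, `RegistryProtocolSketch`) so it runs without the package installed; `InMemoryRegistry` and the `example_option` config key are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Protocol, runtime_checkable


@dataclass
class PluginInfoSketch:
    """Local copy of the PluginInfo fields documented in this section."""
    id: str
    name: str
    title: str
    is_configured: bool
    supports_streaming: bool = False


@runtime_checkable
class RegistryProtocolSketch(Protocol):
    """Local copy of the three documented registry methods."""
    def get_configured_plugins(self) -> List[PluginInfoSketch]: ...
    def get_plugin(self, plugin_id: str) -> Optional[PluginInfoSketch]: ...
    def get_plugin_config(self, plugin_id: str) -> Dict[str, Any]: ...


class InMemoryRegistry:
    """Hypothetical registry backed by plain dicts."""

    def __init__(self, plugins: List[PluginInfoSketch], configs: Dict[str, Dict[str, Any]]) -> None:
        self._plugins = {p.id: p for p in plugins}
        self._configs = configs

    def get_configured_plugins(self) -> List[PluginInfoSketch]:
        return [p for p in self._plugins.values() if p.is_configured]

    def get_plugin(self, plugin_id: str) -> Optional[PluginInfoSketch]:
        return self._plugins.get(plugin_id)

    def get_plugin_config(self, plugin_id: str) -> Dict[str, Any]:
        # Empty dict when nothing is configured for this plugin
        return self._configs.get(plugin_id, {})


registry = InMemoryRegistry(
    plugins=[
        PluginInfoSketch("transcription:voxtral_hf", "voxtral_hf", "Voxtral HF", True),
        PluginInfoSketch("transcription:demo", "demo", "Demo", False),
    ],
    configs={"transcription:voxtral_hf": {"example_option": 1}},
)

print(isinstance(registry, RegistryProtocolSketch))              # True
print([p.title for p in registry.get_configured_plugins()])      # ['Voxtral HF']
```

Note that `isinstance` against a runtime-checkable protocol only verifies that the methods exist; it does not check their signatures or return types.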

Import

from cjm_fasthtml_workflow_transcription_single_file.core.protocols import (
    PluginInfo,
    PluginRegistryProtocol
)

Classes

@dataclass
class PluginInfo:
    "Information about a transcription plugin."
    
    id: str  # Unique plugin identifier (e.g., "transcription:voxtral_hf")
    name: str  # Plugin name (e.g., "voxtral_hf")
    title: str  # Display title (e.g., "Voxtral HF")
    is_configured: bool  # Whether the plugin has a valid configuration
    supports_streaming: bool = False  # Whether the plugin supports streaming output
@runtime_checkable
class PluginRegistryProtocol(Protocol):
    "Protocol for plugin registry access."
    
    def get_configured_plugins(self) -> List[PluginInfo]:  # List of PluginInfo for configured plugins
        "Get all configured transcription plugins."
        ...
    
    def get_plugin(self,
                       plugin_id: str  # Unique plugin identifier
                       ) -> Optional[PluginInfo]:  # PluginInfo if found, None otherwise
        "Get a specific plugin by ID."
    
    def get_plugin_config(self,
                              plugin_id: str  # Unique plugin identifier
                              ) -> Dict[str, Any]:  # Configuration dictionary, empty dict if not configured
        "Get the configuration for a plugin."

Storage Protocols (protocols.ipynb)

Protocol definitions for result storage backends
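
The storage protocol leaves identifiers and return values up to the backend. As a minimal sketch of an alternative backend, the class below keeps results in a dict and returns a generated string ID from `save`. `InMemoryResultStorage` is a hypothetical name, and the stored record layout (including the `timestamp` key) is an assumption made for this example, not the package's format.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class InMemoryResultStorage:
    """Hypothetical backend: results live in a dict keyed by a generated ID."""

    def __init__(self, auto_save: bool = True) -> None:
        self._auto_save = auto_save
        self._results: Dict[str, Dict[str, Any]] = {}

    def should_auto_save(self) -> bool:
        return self._auto_save

    def save(self, job_id: str, file_path: str, file_name: str, plugin_id: str,
             plugin_name: str, text: str,
             metadata: Optional[Dict[str, Any]] = None,
             additional_info: Optional[Dict[str, Any]] = None) -> str:
        result_id = uuid.uuid4().hex  # this backend's implementation-specific identifier
        self._results[result_id] = {
            "job_id": job_id,
            "file_path": file_path,
            "file_name": file_name,
            "plugin_id": plugin_id,
            "plugin_name": plugin_name,
            "text": text,
            "metadata": metadata or {},
            "additional_info": additional_info or {},
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        return result_id

    def load(self, result_id: Any) -> Optional[Dict[str, Any]]:
        return self._results.get(result_id)

    def list_results(self, sort_by: str = "timestamp", reverse: bool = True) -> List[Dict[str, Any]]:
        return sorted(self._results.values(), key=lambda r: r[sort_by], reverse=reverse)

    def get_by_job_id(self, job_id: str) -> Optional[Dict[str, Any]]:
        return next((r for r in self._results.values() if r["job_id"] == job_id), None)

    def delete(self, result_id: Any) -> bool:
        # True only if something was actually removed
        return self._results.pop(result_id, None) is not None


storage = InMemoryResultStorage()
first_id = storage.save("job-1", "/media/b.mp3", "b.mp3", "transcription:demo", "Demo", "first text")
second_id = storage.save("job-2", "/media/a.wav", "a.wav", "transcription:demo", "Demo", "second text")

names_ascending = [r["file_name"] for r in storage.list_results(sort_by="file_name", reverse=False)]
found = storage.get_by_job_id("job-2")
deleted_once = storage.delete(first_id)
deleted_twice = storage.delete(first_id)

print(names_ascending)              # ['a.wav', 'b.mp3']
print(deleted_once, deleted_twice)  # True False
```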

Import

from cjm_fasthtml_workflow_transcription_single_file.storage.protocols import (
    ResultStorageProtocol
)

Classes

@runtime_checkable
class ResultStorageProtocol(Protocol):
    "Protocol for transcription result storage backends."
    
    def should_auto_save(self) -> bool:  # True if results should be automatically saved
        "Check if auto-save is enabled."
        ...
    
    def save(
            self,
            job_id: str,  # Unique job identifier
            file_path: str,  # Path to the transcribed media file
            file_name: str,  # Name of the media file
            plugin_id: str,  # Plugin unique identifier
            plugin_name: str,  # Plugin display name
            text: str,  # The transcription text
            metadata: Optional[Dict[str, Any]] = None,  # Optional metadata from the transcription plugin
            additional_info: Optional[Dict[str, Any]] = None  # Optional additional information to store
        ) -> Any:  # Implementation-specific return value (e.g., Path for file storage, ID for database)
        "Save a transcription result."
    
    def load(
            self,
            result_id: Any  # Implementation-specific identifier (e.g., Path for file storage, ID for database)
        ) -> Optional[Dict[str, Any]]:  # Result dictionary or None if not found
        "Load a transcription result by its identifier."
    
    def list_results(
            self,
            sort_by: str = "timestamp",  # Field to sort by
            reverse: bool = True  # Sort in reverse order
        ) -> List[Dict[str, Any]]:  # List of result dictionaries
        "List all saved transcription results."
    
    def get_by_job_id(
            self,
            job_id: str  # The job identifier to search for
        ) -> Optional[Dict[str, Any]]:  # Result dictionary if found, None otherwise
        "Find and load a transcription result by job ID."
    
    def delete(
            self,
            result_id: Any  # Implementation-specific identifier
        ) -> bool:  # True if deletion successful, False otherwise
        "Delete a transcription result."

Results Components (results.ipynb)

UI components for displaying transcription results and errors

Import

from cjm_fasthtml_workflow_transcription_single_file.components.results import (
    transcription_results,
    transcription_error
)

Functions

def transcription_results(
    job_id: str, # Unique identifier for the transcription job
    transcription_text: str, # The transcribed text
    metadata: Dict[str, Any], # Transcription metadata from the plugin
    file_info: Dict[str, Any], # Dictionary with file details (name, path, type, size_str)
    plugin_info: Dict[str, Any], # Dictionary with plugin details (id, title, supports_streaming)
    config: SingleFileWorkflowConfig, # Workflow configuration
    router: APIRouter, # Workflow router for generating route URLs
    stepflow_router: APIRouter, # StepFlow router for generating stepflow URLs
) -> FT: # FastHTML component showing results with export options
    "Render transcription results with export options."
def transcription_error(
    error_message: str, # Description of the error that occurred
    file_info: Optional[Dict[str, Any]], # Optional dictionary with file details
    config: SingleFileWorkflowConfig, # Workflow configuration
    stepflow_router: APIRouter, # StepFlow router for generating stepflow URLs
) -> FT: # FastHTML component showing error with retry option
    "Render transcription error message."

Workflow Routes (routes.ipynb)

Route initialization and handlers for the single-file transcription workflow

Import

from cjm_fasthtml_workflow_transcription_single_file.workflow.routes import (
    init_router
)

Functions

def _handle_current_status(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
):  # Appropriate UI component based on current state
    "Return current transcription status - determines what to show."
async def _handle_cancel_job(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
    job_id: str,  # ID of the job to cancel
):  # StepFlow start view or error component
    "Cancel a running transcription job."
def _handle_reset(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
):  # StepFlow start view
    "Reset transcription workflow and return to start."
def _handle_stream_job(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
    job_id: str,  # ID of the job to monitor
):  # EventStream for SSE updates
    "SSE endpoint for monitoring job completion."
def _handle_export(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    job_id: str,  # ID of the job to export
    format: str = "txt",  # Export format (txt, srt, vtt)
):  # Response with file download
    "Export transcription in specified format."
def _handle_plugin_details(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    plugin_id: str,  # ID of the plugin to show details for
    save_url: str,  # URL for saving plugin configuration
    reset_url: str,  # URL for resetting plugin configuration
):  # Plugin details component or empty Div
    "Get plugin details for display in workflow."
async def _handle_save_plugin_config(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    plugin_id: str,  # ID of the plugin to save config for
    save_url: str,  # URL for saving plugin configuration
    reset_url: str,  # URL for resetting plugin configuration
):  # Updated config form or error alert
    "Save plugin configuration from the collapse form."
def _handle_reset_plugin_config(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    plugin_id: str,  # ID of the plugin to reset config for
    save_url: str,  # URL for saving plugin configuration
    reset_url: str,  # URL for resetting plugin configuration
):  # Updated config form with defaults or empty Div
    "Reset plugin configuration to defaults."
def _handle_media_preview(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    idx: int = 0,  # Index of the file to preview
    file_type: str = None,  # Optional filter by file type
):  # File preview modal or error Div
    "Render file preview modal for a specific file."
def _handle_refresh_media(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
):  # JSON status response
    "Refresh file browser cache."
def _handle_settings_modal(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    save_url: str,  # URL for saving settings
):  # Settings modal component
    "Render the settings modal for the workflow."
async def _handle_settings_save(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
):  # Success alert with modal close script or error alert
    "Save workflow settings."
def init_router(
    workflow: SingleFileTranscriptionWorkflow,  # The workflow instance providing access to config and dependencies
) -> APIRouter:  # Configured APIRouter with all workflow routes
    "Initialize and return the workflow's API router with all routes."
def _export_transcription(
    text: str,  # Transcription text
    format: str,  # Export format (txt, srt, vtt)
    filename: str,  # Original filename for metadata
) -> str:  # Formatted transcription string
    "Format transcription for export."

Settings Schemas (schemas.ipynb)

JSON schemas and utilities for workflow settings

Import

from cjm_fasthtml_workflow_transcription_single_file.settings.schemas import (
    WORKFLOW_SETTINGS_SCHEMA,
    WorkflowSettings
)

Functions

def from_configs(
    cls: WorkflowSettings,
    browser_config: BrowserConfig,  # BrowserConfig instance with file browser settings
    storage_config: StorageConfig,  # StorageConfig instance with result storage settings
    workflow_config: Optional[SingleFileWorkflowConfig] = None  # Optional workflow config for additional settings
) -> "WorkflowSettings":  # WorkflowSettings instance with values from configs
    "Create WorkflowSettings from runtime config objects."
@patch
def apply_to_configs(
    self: WorkflowSettings,
    browser_config: BrowserConfig,  # BrowserConfig instance to update
    storage_config: StorageConfig,  # StorageConfig instance to update
    workflow_config: Optional[SingleFileWorkflowConfig] = None  # Optional workflow config to update
) -> None
    "Apply settings to runtime config objects."
@patch
def to_dict(
    self: WorkflowSettings
) -> Dict[str, Any]:  # Dictionary of settings values
    "Convert settings to a dictionary for serialization."
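
To illustrate the round trip these three functions describe (configs to settings, settings back to configs, settings to dict), here is a minimal sketch using stand-in dataclasses. `DemoBrowserConfig`, `DemoStorageConfig`, and `DemoSettings` are hypothetical, and the mapping of fields to config objects is an assumption; the real `BrowserConfig` and `StorageConfig` may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class DemoBrowserConfig:  # Hypothetical stand-in for BrowserConfig
    directories: List[str] = field(default_factory=list)
    recursive_scan: bool = True


@dataclass
class DemoStorageConfig:  # Hypothetical stand-in for StorageConfig
    auto_save: bool = True
    results_directory: str = "transcription_results"


@dataclass
class DemoSettings:  # Hypothetical subset of WorkflowSettings fields
    directories: List[str] = field(default_factory=list)
    recursive_scan: bool = True
    auto_save: bool = True
    results_directory: str = "transcription_results"

    @classmethod
    def from_configs(cls, browser_config, storage_config) -> "DemoSettings":
        # Copy current runtime values into a flat settings object.
        return cls(
            directories=list(browser_config.directories),
            recursive_scan=browser_config.recursive_scan,
            auto_save=storage_config.auto_save,
            results_directory=storage_config.results_directory,
        )

    def apply_to_configs(self, browser_config, storage_config) -> None:
        # Push edited settings back onto the runtime config objects in place.
        browser_config.directories = list(self.directories)
        browser_config.recursive_scan = self.recursive_scan
        storage_config.auto_save = self.auto_save
        storage_config.results_directory = self.results_directory

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)


browser = DemoBrowserConfig(directories=["/media"])
storage = DemoStorageConfig()
settings = DemoSettings.from_configs(browser, storage)
settings.auto_save = False  # e.g. a value changed in the settings form
settings.apply_to_configs(browser, storage)
```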

Classes

@dataclass
class WorkflowSettings:
    "User-configurable settings for single-file transcription workflow."
    
    __schema_name__: ClassVar[str] = 'single_file_workflow'
    __schema_title__: ClassVar[str] = 'Single File Transcription Settings'
    __schema_description__: ClassVar[str] = 'Configure file scanning, storage, and workflow behavior'
    directories: List[str] = field(...)
    enabled_types: List[str] = field(...)
    recursive_scan: bool = field(...)
    items_per_page: int = field(...)
    default_view: str = field(...)
    auto_save: bool = field(...)
    results_directory: str = field(...)
    gpu_memory_threshold_percent: float = field(...)

Variables

WORKFLOW_SETTINGS_SCHEMA  # JSON schema auto-generated from the WorkflowSettings dataclass
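
How the package derives the schema is not shown in this section. As a rough idea of what generating a JSON schema from a dataclass can look like, the sketch below walks the dataclass fields and maps Python types to JSON Schema types. The function `dataclass_to_schema_sketch`, the class `DemoSchemaSettings`, its default values, and the output layout are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any, ClassVar, Dict, List, get_args, get_origin, get_type_hints

# Minimal Python-to-JSON-Schema type mapping for this sketch.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def dataclass_to_schema_sketch(cls) -> Dict[str, Any]:
    """Build a small JSON-Schema-like dict from a dataclass (illustrative only)."""
    hints = get_type_hints(cls)
    properties: Dict[str, Any] = {}
    for f in fields(cls):  # fields() skips ClassVar attributes
        hint = hints[f.name]
        if get_origin(hint) is list:
            (item_type,) = get_args(hint)
            properties[f.name] = {
                "type": "array",
                "items": {"type": _JSON_TYPES[item_type]},
            }
        else:
            properties[f.name] = {"type": _JSON_TYPES[hint]}
    return {
        "name": getattr(cls, "__schema_name__", cls.__name__),
        "title": getattr(cls, "__schema_title__", cls.__name__),
        "type": "object",
        "properties": properties,
    }


@dataclass
class DemoSchemaSettings:  # Hypothetical subset of WorkflowSettings
    __schema_name__: ClassVar[str] = "demo_workflow"
    __schema_title__: ClassVar[str] = "Demo Settings"
    directories: List[str] = field(default_factory=list)
    recursive_scan: bool = True
    items_per_page: int = 30
    gpu_memory_threshold_percent: float = 45.0


DEMO_SCHEMA = dataclass_to_schema_sketch(DemoSchemaSettings)
```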

Step Components (steps.ipynb)

UI components for workflow step rendering (plugin selection, file selection, confirmation)

Import

from cjm_fasthtml_workflow_transcription_single_file.components.steps import (
    render_plugin_config_form,
    render_plugin_details_route,
    render_plugin_selection,
    render_file_selection,
    render_confirmation
)

Functions

def _get_file_attr(
    file_path: str,  # Path to the file to look up
    files: list,  # List of file info objects to search
    attr: str,  # Attribute name to retrieve from the file
) -> str:  # Attribute value or empty string if not found
    "Get an attribute from a file by path."
def _render_plugin_details_content(
    plugin_id: str, # ID of the plugin to display details for
    plugins: List[PluginInfo], # List of available plugins
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin config
) -> Optional[FT]: # Plugin info card or None if plugin not found
    "Render details for selected plugin (info card only, no config collapse)."
def _render_plugin_details_with_config(
    plugin_id: str, # ID of the plugin to display details for
    plugins: List[PluginInfo], # List of available plugins
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin config
    plugin_manager: Optional[PluginManager], # PluginManager for config_schema access
    save_url: str, # URL for saving plugin configuration
    reset_url: str, # URL for resetting plugin configuration
) -> Optional[FT]: # Plugin details with config collapse, or None if not found
    "Render plugin details with configuration collapse for initial render."
def render_plugin_config_form(
    plugin_id: str, # ID of the plugin to render config for
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugins and config
    plugin_manager: PluginManager, # PluginManager for config_schema access
    save_url: str, # URL for saving the configuration
    reset_url: str, # URL for resetting to defaults
    alert_message: Optional[Any] = None, # Optional alert to display above the form
) -> FT: # Div containing the settings form with alert container
    "Render the plugin configuration form for the collapse content."
def _render_plugin_config_collapse(
    plugin_id: str, # ID of the plugin to render config for
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugins and config
    plugin_manager: PluginManager, # PluginManager for config_schema access
    save_url: str, # URL for saving the configuration
    reset_url: str, # URL for resetting to defaults
) -> FT: # Collapse component with plugin configuration form
    "Render a collapse component containing the plugin configuration form."
def render_plugin_details_route(
    plugin_id: str, # ID of the plugin to display details for
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugins and config
    plugin_manager: PluginManager, # PluginManager for config_schema access
    save_url: str, # URL for saving plugin configuration
    reset_url: str, # URL for resetting plugin configuration to defaults
) -> FT: # Plugin details with info card and config collapse
    "Render plugin details for HTMX route when plugin dropdown changes."
def render_plugin_selection(
    ctx: InteractionContext, # Interaction context with state and data
    config: SingleFileWorkflowConfig, # Workflow configuration
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin config
    settings_modal_url: str, # URL for the settings modal route
    plugin_details_url: str, # URL for the plugin details route
    plugin_manager: Optional[PluginManager] = None, # PluginManager for config_schema access
    save_plugin_config_url: str = "", # URL for saving plugin configuration
    reset_plugin_config_url: str = "", # URL for resetting plugin configuration
) -> FT: # Plugin selection step UI component
    "Render plugin selection step showing all discovered plugins."
def render_file_selection(
    ctx: InteractionContext,  # Interaction context with state and data
    config: SingleFileWorkflowConfig,  # Workflow configuration
    file_selection_router: APIRouter,  # Router for file selection pagination (or None)
) -> FT:  # File selection step UI component with paginated table
    "Render file selection step with paginated table view and preview capability."
def render_confirmation(
    ctx: InteractionContext, # Interaction context with state and data
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin info
) -> FT: # Confirmation step UI component showing selected plugin and file
    "Render confirmation step showing selected plugin and file."
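
The `_get_file_attr` helper listed above is described as returning an attribute value or an empty string. A minimal sketch of that lookup could look like the following. `DemoFileInfo` and `get_file_attr_sketch` are hypothetical, and matching on a `path` attribute is an assumption about the file info objects.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class DemoFileInfo:  # Hypothetical stand-in for the file browser's file info objects
    path: str
    name: str
    size_str: str


def get_file_attr_sketch(
    file_path: str,  # Path to the file to look up
    files: List[Any],  # File info objects to search
    attr: str,  # Attribute name to retrieve
) -> str:
    """Return the attribute of the first file whose path matches, else ''."""
    for file_info in files:
        if getattr(file_info, "path", None) == file_path:
            return str(getattr(file_info, attr, ""))
    return ""


files = [
    DemoFileInfo(path="/media/a.mp3", name="a.mp3", size_str="3.1 MB"),
    DemoFileInfo(path="/media/b.wav", name="b.wav", size_str="12 MB"),
]
```

Falling back to an empty string lets a rendering function interpolate the result directly without a separate missing-file check.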

Single File Transcription Workflow (workflow.ipynb)

Main workflow class orchestrating all subsystems for single-file transcription

Import

from cjm_fasthtml_workflow_transcription_single_file.workflow.workflow import (
    SingleFileTranscriptionWorkflow
)

Functions

@patch
def setup(
    self: SingleFileTranscriptionWorkflow,
    app,  # FastHTML application instance
) -> None
    "Initialize workflow with FastHTML app. Must be called after app creation."
@patch
def cleanup(
    self: SingleFileTranscriptionWorkflow,
) -> None
    "Clean up workflow resources. Mirrors PluginInterface.cleanup() for future plugin system compatibility."
@patch
def get_routers(
    self: SingleFileTranscriptionWorkflow,
) -> List[APIRouter]:  # List containing main router, stepflow router, media router, and file selection router
    "Return all routers for registration with the app."
@patch
def render_entry_point(
    self: SingleFileTranscriptionWorkflow,
    request,  # FastHTML request object
    sess,  # FastHTML session object
) -> FT:  # AsyncLoadingContainer component
    """
    Render the workflow entry point for embedding in tabs, etc.
    
    Returns an AsyncLoadingContainer that loads the current_status endpoint,
    which determines what to show (running job, workflow in progress,
    completed job, or fresh start).
    """
@patch
def _on_job_completed(
    "Workflow-specific completion handling. Auto-saves results if enabled."
@patch
def _create_preview_route_func(
    self: SingleFileTranscriptionWorkflow,
):  # Function that generates preview route URLs
    "Create a function that generates preview route URLs (with optional file_type)."
@patch
def _create_preview_url_func(
    self: SingleFileTranscriptionWorkflow,
):  # Function that generates preview URLs for file selection
    "Create a function that generates preview URLs for file selection (index only)."
@patch
def _create_step_flow(
    self: SingleFileTranscriptionWorkflow,
) -> StepFlow:  # Configured StepFlow instance
    "Create and configure the StepFlow instance."
@patch
def _create_router(
    self: SingleFileTranscriptionWorkflow,
) -> APIRouter:  # Configured APIRouter with all workflow routes
    "Create the workflow's API router with all routes."

Classes

class SingleFileTranscriptionWorkflow:
    def __init__(
        self,
        plugin_manager: PluginManager,  # PluginManager from host application
        config: Optional[SingleFileWorkflowConfig] = None,  # Explicit config (bypasses auto-loading)
        **config_overrides  # Override specific config values
    )
    """
    Self-contained single-file transcription workflow.
    
    Receives a PluginManager from the host application and creates internal
    TranscriptionJobTracker, SSEBroadcastManager, FileBrowser, ResultStorage,
    StepFlow (plugin → file → confirm wizard), and APIRouter.
    """
    
    def __init__(
            self,
            plugin_manager: PluginManager,  # PluginManager from host application
            config: Optional[SingleFileWorkflowConfig] = None,  # Explicit config (bypasses auto-loading)
            **config_overrides  # Override specific config values
        )
        "Initialize the workflow with injected PluginManager."
    
    @classmethod
    def create_and_setup(
            cls,
            app,  # FastHTML application instance
            plugin_manager: PluginManager,  # PluginManager from host application
            config: Optional[SingleFileWorkflowConfig] = None,  # Explicit config (bypasses auto-loading)
            **config_overrides  # Override specific config values
        ) -> "SingleFileTranscriptionWorkflow":  # Configured and setup workflow instance
        "Create, configure, and setup a workflow in one call."
    
    @property
    def job_tracker(self) -> TranscriptionJobTracker
        "Access to internal job tracker."
    
    @property
    def plugin_manager(self) -> PluginManager
        "Access to plugin manager."
    
    @property
    def plugin_registry(self) -> PluginRegistryAdapter
        "Access to plugin registry adapter."
    
    @property
    def file_browser(self) -> FileBrowser
        "Access to internal file browser."
    
    @property
    def result_storage(self) -> ResultStorage
        "Access to internal result storage."
    
    @property
    def router(self) -> APIRouter
        "Main workflow router."
    
    @property
    def stepflow_router(self) -> APIRouter
        "StepFlow-generated router."
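
The construct-then-setup pattern behind `create_and_setup` can be sketched with a stand-in class. `DemoConfig`, `DemoWorkflow`, and the field names `route_prefix` and `auto_save` are hypothetical. Applying `**config_overrides` on top of an explicit config is an assumption about precedence, not something this page confirms.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass
class DemoConfig:  # Hypothetical stand-in for SingleFileWorkflowConfig
    route_prefix: str = "/transcription"
    auto_save: bool = True


class DemoWorkflow:  # Hypothetical stand-in for SingleFileTranscriptionWorkflow
    def __init__(
        self,
        plugin_manager: Any,  # Injected by the host application
        config: Optional[DemoConfig] = None,  # Explicit config, if any
        **config_overrides,  # Override specific config values
    ):
        base = config if config is not None else DemoConfig()
        # replace() returns a copy, so the caller's config is not mutated.
        self.config = replace(base, **config_overrides)
        self._plugin_manager = plugin_manager
        self._app = None

    def setup(self, app: Any) -> None:
        # The real setup() wires the workflow into the FastHTML app.
        self._app = app

    @classmethod
    def create_and_setup(
        cls,
        app: Any,
        plugin_manager: Any,
        config: Optional[DemoConfig] = None,
        **config_overrides,
    ) -> "DemoWorkflow":
        workflow = cls(plugin_manager, config=config, **config_overrides)
        workflow.setup(app)
        return workflow

    @property
    def plugin_manager(self) -> Any:
        return self._plugin_manager


demo_app = object()  # Placeholder for a FastHTML app instance
wf = DemoWorkflow.create_and_setup(demo_app, plugin_manager="pm", auto_save=False)
```

With the real class, the host application would presumably follow this with a loop over `get_routers()` to register each router on the app.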
