A self-contained single-file transcription workflow for FastHTML applications.

cjm-fasthtml-workflow-transcription-single-file

Install

pip install cjm_fasthtml_workflow_transcription_single_file

Project Structure

nbs/
├── components/ (3)
│   ├── processor.ipynb  # UI component for displaying transcription in-progress state
│   ├── results.ipynb    # UI components for displaying transcription results and errors
│   └── steps.ipynb      # UI components for workflow step rendering (plugin selection, file selection, confirmation)
├── core/ (5)
│   ├── adapters.ipynb   # Adapter implementations for integrating with plugin registries
│   ├── config.ipynb     # Configuration dataclass for single-file transcription workflow
│   ├── html_ids.ipynb   # Centralized HTML ID constants for single-file transcription workflow components
│   ├── protocols.ipynb  # Protocol definitions for external dependencies and plugin integration
│   └── registry.ipynb   # Unified plugin registry for managing multiple domain-specific plugin systems with configuration persistence
├── settings/ (2)
│   ├── components.ipynb  # UI components for workflow settings modal and forms
│   └── schemas.ipynb     # JSON schemas and utilities for workflow settings
├── storage/ (3)
│   ├── config.ipynb        # Configuration for transcription result storage
│   ├── file_storage.ipynb  # File-based storage for transcription results
│   └── protocols.ipynb     # Protocol definitions for result storage backends
└── workflow/ (3)
    ├── job_handler.ipynb  # Functions for starting transcription jobs and handling SSE streaming
    ├── routes.ipynb       # Route initialization and handlers for the single-file transcription workflow
    └── workflow.ipynb     # Main workflow class orchestrating all subsystems for single-file transcription

Total: 16 notebooks across 5 directories

Module Dependencies

graph LR
    components_processor[components.processor<br/>Processor Component]
    components_results[components.results<br/>Results Components]
    components_steps[components.steps<br/>Step Components]
    core_adapters[core.adapters<br/>Adapters]
    core_config[core.config<br/>Configuration]
    core_html_ids[core.html_ids<br/>HTML IDs]
    core_protocols[core.protocols<br/>Protocols]
    core_registry[core.registry<br/>Registry]
    settings_components[settings.components<br/>Settings Components]
    settings_schemas[settings.schemas<br/>Settings Schemas]
    storage_config[storage.config<br/>Storage Configuration]
    storage_file_storage[storage.file_storage<br/>Result Storage]
    storage_protocols[storage.protocols<br/>Storage Protocols]
    workflow_job_handler[workflow.job_handler<br/>Job Handler]
    workflow_routes[workflow.routes<br/>Workflow Routes]
    workflow_workflow[workflow.workflow<br/>Single File Transcription Workflow]

    components_processor --> core_config
    components_processor --> core_html_ids
    components_results --> core_config
    components_results --> core_html_ids
    components_steps --> core_config
    components_steps --> core_protocols
    components_steps --> core_html_ids
    core_adapters --> core_protocols
    core_adapters --> core_registry
    core_config --> storage_config
    core_config --> core_html_ids
    settings_schemas --> storage_config
    settings_schemas --> core_config
    storage_file_storage --> storage_config
    workflow_job_handler --> core_config
    workflow_job_handler --> workflow_workflow
    workflow_job_handler --> components_results
    workflow_job_handler --> components_processor
    workflow_job_handler --> core_protocols
    workflow_job_handler --> core_html_ids
    workflow_job_handler --> storage_file_storage
    workflow_routes --> workflow_workflow
    workflow_routes --> components_results
    workflow_routes --> components_processor
    workflow_routes --> components_steps
    workflow_routes --> workflow_job_handler
    workflow_routes --> core_html_ids
    workflow_workflow --> storage_file_storage
    workflow_workflow --> components_steps
    workflow_workflow --> core_config
    workflow_workflow --> core_adapters
    workflow_workflow --> core_registry
    workflow_workflow --> core_html_ids

33 cross-module dependencies detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Adapters (adapters.ipynb)

Adapter implementations for integrating with plugin registries

Import

from cjm_fasthtml_workflow_transcription_single_file.core.adapters import (
    PluginRegistryAdapter,
    DefaultConfigPluginRegistryAdapter
)

Classes

class PluginRegistryAdapter:
    def __init__(self,
    "Adapts app's UnifiedPluginRegistry to workflow's PluginRegistryProtocol."
    
    def __init__(self,
        "Initialize the adapter."
    
    def get_configured_plugins(self) -> List[PluginInfo]:  # List of PluginInfo for configured plugins
        """Get all configured transcription plugins (those with saved config files)."""
        plugins = self._registry.get_plugins_by_category(self._category)
        return [
            PluginInfo(
                id=p.get_unique_id(),
                name=p.name,
                title=p.title,
                is_configured=p.is_configured,
                supports_streaming=self._check_streaming_support(p)
            )
            for p in plugins if p.is_configured
        ]
    
    def get_all_plugins(self) -> List[PluginInfo]:  # List of PluginInfo for all discovered plugins
        """Get all discovered transcription plugins (configured or not)."""
        plugins = self._registry.get_plugins_by_category(self._category)
        return [
            PluginInfo(
                id=p.get_unique_id(),
                name=p.name,
                title=p.title,
                is_configured=p.is_configured,
                supports_streaming=self._check_streaming_support(p)
            )
            for p in plugins
        ]
    
    def get_plugin(self,
                       plugin_id: str  # Unique plugin identifier
                       ) -> Optional[PluginInfo]:  # PluginInfo if found, None otherwise
        "Get a specific plugin by ID."
    
    def get_plugin_config(self,
                              plugin_id: str  # Unique plugin identifier
                              ) -> Dict[str, Any]:  # Configuration dictionary, empty dict if not configured
        "Get the configuration for a plugin."
class DefaultConfigPluginRegistryAdapter:
    def __init__(self,
                 registry: UnifiedPluginRegistry,  # The UnifiedPluginRegistry instance to wrap
                 category: str = "transcription"  # Plugin category to filter by
                 )
    "Plugin registry adapter that provides default config values for unconfigured plugins."
    
    def __init__(self,
                     registry: UnifiedPluginRegistry,  # The UnifiedPluginRegistry instance to wrap
                     category: str = "transcription"  # Plugin category to filter by
                     )
        "Initialize adapter with registry instance."
    
    def get_plugins_by_category(self,
                                    category: str  # Plugin category to filter by
                                    ) -> list:  # List of plugins in the category
        "Get all plugins in a specific category."
    
    def get_plugin(self,
                       plugin_id: str  # Unique plugin identifier
                       ):  # Plugin metadata or None
        "Get a specific plugin by ID."
    
    def load_plugin_config(self,
                               plugin_id: str  # Unique plugin identifier
                               ) -> Dict[str, Any]:  # Configuration dictionary with defaults applied
        "Load configuration for a plugin, using defaults if no saved config exists."
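
The "defaults if no saved config exists" behavior can be illustrated with a short sketch. It assumes that defaults live under the `default` key of each top-level property in the plugin's JSON Schema; the helper names `defaults_from_schema` and `load_config_with_defaults`, and the example schema, are hypothetical and not part of this package.

```python
from typing import Any, Dict


def defaults_from_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Collect top-level property defaults from a JSON Schema (hypothetical helper)."""
    props = schema.get("properties", {})
    return {name: spec["default"] for name, spec in props.items() if "default" in spec}


def load_config_with_defaults(saved: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a saved config over the schema defaults; saved values win."""
    return {**defaults_from_schema(schema), **saved}


# Illustrative schema only; real plugin schemas will differ.
example_schema = {
    "type": "object",
    "properties": {
        "model": {"type": "string", "default": "base"},
        "language": {"type": "string"},  # no default: omitted unless saved
        "beam_size": {"type": "integer", "default": 5},
    },
}

unconfigured = load_config_with_defaults({}, example_schema)
configured = load_config_with_defaults({"beam_size": 1}, example_schema)
```

With an empty saved config the result contains only the schema defaults; a saved value overrides the matching default and leaves the others in place.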

Settings Components (components.ipynb)

UI components for workflow settings modal and forms

Import

from cjm_fasthtml_workflow_transcription_single_file.settings.components import (
    settings_trigger_button,
    simple_settings_form,
    settings_modal
)

Functions

def settings_trigger_button(
    modal_id:str,                  # ID of the modal to trigger
    label:str="Settings",          # Button label text
    button_cls:Optional[str]=None  # Optional additional button classes
) -> FT:                           # Button element that triggers the modal
    "Create a button that opens the settings modal."
def simple_settings_form(
    directories:list,        # List of media directories
    auto_save:bool,          # Current auto-save setting
    results_directory:str,   # Current results directory
    save_url:str,            # URL to POST settings to
    target_id:str,           # Target element ID for HTMX response
    modal_id:str             # Modal ID for close button
) -> FT:                     # Simple form element
    "Create a simple settings form without full schema generation."
def settings_modal(
    modal_id:str,                    # ID for the modal element
    schema:Dict[str, Any],           # JSON schema for settings
    current_values:Dict[str, Any],   # Current settings values
    save_url:str,                    # URL to POST settings to
    target_id:str                    # Target element ID for HTMX response
) -> FT:                             # Modal dialog with settings form
    "Create the settings modal with form."

Configuration (config.ipynb)

Configuration dataclass for single-file transcription workflow

Import

from cjm_fasthtml_workflow_transcription_single_file.core.config import (
    DEFAULT_WORKFLOW_CONFIG_DIR,
    SingleFileWorkflowConfig
)

Classes

@dataclass
class SingleFileWorkflowConfig:
    "Configuration for single-file transcription workflow."
    
    workflow_id: str = 'single_file_transcription'  # Unique identifier for this workflow
    worker_type: str = 'transcription:single_file'  # Worker process type identifier
    route_prefix: str = '/single_file'  # Base URL prefix for workflow routes
    stepflow_prefix: str = '/flow'  # Sub-prefix for StepFlow routes
    media_prefix: str = '/media'  # Sub-prefix for media browser routes
    container_id: str = SingleFileHtmlIds.WORKFLOW_CONTAINER  # HTML ID for main workflow container
    show_progress: bool = True  # Show step progress indicator
    max_files_displayed: int = 50  # Maximum files to show in simple file selector
    export_formats: List[str] = field(...)  # Available export formats
    no_plugins_redirect: Optional[str]  # URL to redirect when no plugins configured
    no_files_redirect: Optional[str]  # URL to redirect when no media files found
    sse_poll_interval: float = 2.0  # Seconds between SSE status checks
    gpu_memory_threshold_percent: float = 45.0  # Max GPU memory % before blocking new jobs
    config_dir: Path = field(...)  # Directory for workflow settings
    plugin_config_dir: Path = field(...)  # Directory for plugin configs
    plugin_category: str = 'transcription'  # Plugin category for this workflow
    media: BrowserConfig = field(...)  # File browser settings for media files
    storage: StorageConfig = field(...)  # Result storage settings
    
    def get_full_stepflow_prefix(self) -> str:  # Combined route_prefix + stepflow_prefix
        """Get the full prefix for the StepFlow router."""
        return f"{self.route_prefix}{self.stepflow_prefix}"
    
    def get_full_media_prefix(self) -> str:  # Combined route_prefix + media_prefix
        """Get the full prefix for the media router."""
        return f"{self.route_prefix}{self.media_prefix}"
    
    @classmethod
    def from_saved_config(
        cls,
        config_dir: Optional[Path] = None,  # Directory to load config from
        **overrides  # Override specific config values
    ) -> "SingleFileWorkflowConfig":  # Configured instance with saved values merged with defaults
        "Create config by loading saved settings and merging with defaults."
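
As a quick illustration of how the route prefixes combine, here is a minimal stand-in dataclass that copies only the three prefix fields and the two prefix methods listed above. `PrefixConfig` is a hypothetical name used for this sketch; the real `SingleFileWorkflowConfig` has many more fields.

```python
from dataclasses import dataclass


@dataclass
class PrefixConfig:
    """Stand-in for the prefix fields of SingleFileWorkflowConfig (illustration only)."""

    route_prefix: str = "/single_file"
    stepflow_prefix: str = "/flow"
    media_prefix: str = "/media"

    def get_full_stepflow_prefix(self) -> str:
        # Base route prefix followed by the StepFlow sub-prefix
        return f"{self.route_prefix}{self.stepflow_prefix}"

    def get_full_media_prefix(self) -> str:
        # Base route prefix followed by the media browser sub-prefix
        return f"{self.route_prefix}{self.media_prefix}"


default_cfg = PrefixConfig()
custom_cfg = PrefixConfig(route_prefix="/transcribe")

print(default_cfg.get_full_stepflow_prefix())  # /single_file/flow
print(custom_cfg.get_full_media_prefix())      # /transcribe/media
```

Because the sub-prefixes are simply appended, changing `route_prefix` moves both the StepFlow and media routes together.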

Variables

DEFAULT_WORKFLOW_CONFIG_DIR

Storage Configuration (config.ipynb)

Configuration for transcription result storage

Import

from cjm_fasthtml_workflow_transcription_single_file.storage.config import (
    STORAGE_CONFIG_SCHEMA,
    StorageConfig
)

Classes

@dataclass
class StorageConfig:
    "Result storage configuration."
    
    __schema_name__: ClassVar[str] = 'storage'
    __schema_title__: ClassVar[str] = 'Storage Settings'
    __schema_description__: ClassVar[str] = 'Configure transcription result storage'
    auto_save: bool = field(...)
    results_directory: str = field(...)

Variables

STORAGE_CONFIG_SCHEMA

Result Storage (file_storage.ipynb)

File-based storage for transcription results

Import

from cjm_fasthtml_workflow_transcription_single_file.storage.file_storage import (
    ResultStorage
)

Functions

@patch
def should_auto_save(
    self: ResultStorage
) -> bool:  # True if results should be automatically saved
    "Check if auto-save is enabled."
@patch
def save(
    self: ResultStorage,
    job_id: str,  # Unique job identifier
    file_path: str,  # Path to the transcribed media file
    file_name: str,  # Name of the media file
    plugin_id: str,  # Plugin unique identifier
    plugin_name: str,  # Plugin display name
    text: str,  # The transcription text
    metadata: Optional[Dict[str, Any]] = None,  # Optional metadata from the transcription plugin
    additional_info: Optional[Dict[str, Any]] = None  # Optional additional information to store
) -> Path:  # Path to the saved JSON file
    "Save a transcription result to JSON file."
@patch
def load(
    self: ResultStorage,
    result_file: Path  # Path to the JSON result file
) -> Optional[Dict[str, Any]]:  # Dictionary containing the result data, or None if error
    "Load a transcription result from JSON file."
@patch
def list_results(
    self: ResultStorage,
    sort_by: str = "timestamp",  # Field to sort by ("timestamp", "file_name", "word_count")
    reverse: bool = True  # Sort in reverse order (newest first by default)
) -> List[Dict[str, Any]]:  # List of result dictionaries
    "List all saved transcription results."
@patch
def get_by_job_id(
    self: ResultStorage,
    job_id: str  # The job identifier to search for
) -> Optional[Dict[str, Any]]:  # Result dictionary if found, None otherwise
    "Find and load a transcription result by job ID."
@patch
def delete(
    self: ResultStorage,
    result_file: str  # Path to the result file (can be full path or filename)
) -> bool:  # True if deletion successful, False otherwise
    "Delete a transcription result file."
@patch
def update_text(
    self: ResultStorage,
    result_file: str,  # Path to the result file
    new_text: str  # New transcription text
) -> bool:  # True if update successful, False otherwise
    "Update the transcription text in a saved result."
@patch
def _generate_filename(
    self: ResultStorage,
    job_id: str,  # Unique job identifier
    file_name: str  # Original media file name
) -> str:  # Generated filename for the JSON result file
    "Generate a filename for storing transcription results."

Classes

class ResultStorage:
    def __init__(self,
                 config: StorageConfig  # Storage configuration
                 )
    "File-based storage for transcription results."
    
    def __init__(self,
                     config: StorageConfig  # Storage configuration
                     )
        "Initialize the storage."
    
    def results_directory(self) -> Path:  # Path to the results directory
        "Get the results directory, creating it if needed."

HTML IDs (html_ids.ipynb)

Centralized HTML ID constants for single-file transcription workflow components

Import

from cjm_fasthtml_workflow_transcription_single_file.core.html_ids import (
    SingleFileHtmlIds
)

Classes

class SingleFileHtmlIds(InteractionHtmlIds):
    "HTML ID constants for single-file transcription workflow."
    
    def plugin_radio(plugin_id: str  # Unique plugin identifier to generate ID for
                         ) -> str:  # HTML ID for the plugin radio button
        "Generate HTML ID for a plugin radio button."
    
    def file_radio(index: int  # File index in the selection list
                       ) -> str:  # HTML ID for the file radio button
        "Generate HTML ID for a file radio button."

Job Handler (job_handler.ipynb)

Functions for starting transcription jobs and handling SSE streaming

Import

from cjm_fasthtml_workflow_transcription_single_file.workflow.job_handler import (
    get_job_session_info,
    start_transcription_job,
    create_job_stream_handler
)

Functions

def get_job_session_info(
    job_id: str,  # Unique job identifier
    job,  # Job object from the manager
    plugin_registry: PluginRegistryProtocol,  # Plugin registry for getting plugin info
) -> tuple[Dict[str, Any], Dict[str, Any]]:  # Tuple of (file_info, plugin_info) dictionaries
    "Get file and plugin info from job object and plugin registry."
def _save_job_result_once(
    job_id: str,  # Job identifier
    job,  # Job object
    data: Dict[str, Any],  # Transcription data containing text and metadata
    plugin_registry: PluginRegistryProtocol,  # Plugin registry for getting plugin info
    result_storage: ResultStorage,  # Storage for saving transcription results
) -> None
    """
    Save transcription result to disk, ensuring it's only saved once per job.
    
    Called from the SSE stream handler as a fallback. The primary save mechanism
    is the workflow's `_on_job_completed` callback called by TranscriptionJobManager.
    """
def _create_sse_swap_message(
    content,  # HTML content to wrap
    container_id: str,  # Target container ID for the swap
):  # Div with OOB swap attributes
    "Wrap content in a Div with HTMX OOB swap for SSE messages."
async def start_transcription_job(
    state: Dict[str, Any],  # Workflow state containing plugin_id, file_path, file_name, etc.
    request,  # FastHTML request object
    workflow: SingleFileTranscriptionWorkflow,  # Workflow instance providing config and dependencies
):  # transcription_in_progress component showing job status
    "Start a transcription job and return the in-progress UI component."
def create_job_stream_handler(
    job_id: str,  # Unique job identifier
    request,  # FastHTML request object
    workflow: SingleFileTranscriptionWorkflow,  # Workflow instance providing config and dependencies
):  # Async generator for SSE streaming
    "Create an SSE stream generator for monitoring job completion."
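
The general shape of an SSE generator that polls for job completion can be sketched as follows. This is not the package's implementation: `format_sse`, `job_stream`, `FakeJob`, and the status strings are all assumptions made for illustration, and `poll_interval` only mirrors the role of `sse_poll_interval` from the workflow configuration.

```python
import asyncio
from typing import AsyncIterator, Callable, List


def format_sse(data: str, event: str = "message") -> str:
    """Format a single-line payload as one Server-Sent Events message."""
    return f"event: {event}\ndata: {data}\n\n"


async def job_stream(
    get_status: Callable[[], str],  # hypothetical status lookup
    poll_interval: float = 2.0,     # plays the role of sse_poll_interval
) -> AsyncIterator[str]:
    """Poll the job until it stops running, then emit one final message."""
    while True:
        status = get_status()
        if status != "running":
            yield format_sse(status, event="done")
            return
        yield format_sse(status, event="status")
        await asyncio.sleep(poll_interval)


class FakeJob:
    """Reports 'running' for the first two polls, then 'completed'."""

    def __init__(self) -> None:
        self.polls = 0

    def status(self) -> str:
        self.polls += 1
        return "running" if self.polls <= 2 else "completed"


async def collect(job: FakeJob) -> List[str]:
    # Drain the stream with a short interval so the demo finishes quickly
    return [msg async for msg in job_stream(job.status, poll_interval=0.01)]


messages = asyncio.run(collect(FakeJob()))
```

In this sketch the stream ends as soon as a non-running status is seen, so the client receives exactly one terminal event per job.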

Processor Component (processor.ipynb)

UI component for displaying transcription in-progress state

Import

from cjm_fasthtml_workflow_transcription_single_file.components.processor import (
    transcription_in_progress
)

Functions

def transcription_in_progress(
    job_id: str, # Unique identifier for the transcription job
    plugin_info: Dict[str, Any], # Dictionary with plugin details (id, title, supports_streaming)
    file_info: Dict[str, Any], # Dictionary with file details (name, path, type, size_str)
    config: SingleFileWorkflowConfig, # Workflow configuration
    router: APIRouter, # Workflow router for generating route URLs
) -> FT: # FastHTML component showing progress and SSE connection
    "Render transcription in-progress view with SSE updates."

Protocols (protocols.ipynb)

Protocol definitions for external dependencies and plugin integration

Import

from cjm_fasthtml_workflow_transcription_single_file.core.protocols import (
    PluginInfo,
    PluginRegistryProtocol
)

Classes

@dataclass
class PluginInfo:
    "Information about a transcription plugin."
    
    id: str  # Unique plugin identifier (e.g., "transcription:voxtral_hf")
    name: str  # Plugin name (e.g., "voxtral_hf")
    title: str  # Display title (e.g., "Voxtral HF")
    is_configured: bool  # Whether the plugin has a valid configuration
    supports_streaming: bool = False  # Whether the plugin supports streaming output
@runtime_checkable
class PluginRegistryProtocol(Protocol):
    "Protocol for plugin registry access."
    
    def get_configured_plugins(self) -> List[PluginInfo]:  # List of PluginInfo for configured plugins
        """Get all configured transcription plugins."""
        ...
    
    def get_plugin(self,
                       plugin_id: str  # Unique plugin identifier
                       ) -> Optional[PluginInfo]:  # PluginInfo if found, None otherwise
        "Get a specific plugin by ID."
    
    def get_plugin_config(self,
                              plugin_id: str  # Unique plugin identifier
                              ) -> Dict[str, Any]:  # Configuration dictionary, empty dict if not configured
        "Get the configuration for a plugin."
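
Since `PluginRegistryProtocol` is a runtime-checkable protocol, any object with the three methods above should satisfy it. The sketch below uses local copies of `PluginInfo` and the protocol so that it runs without the package installed; `InMemoryPluginRegistry` and the sample config values are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Protocol, runtime_checkable


@dataclass
class PluginInfo:  # local copy of the documented fields
    id: str
    name: str
    title: str
    is_configured: bool
    supports_streaming: bool = False


@runtime_checkable
class PluginRegistryProtocol(Protocol):  # local copy of the documented methods
    def get_configured_plugins(self) -> List[PluginInfo]: ...
    def get_plugin(self, plugin_id: str) -> Optional[PluginInfo]: ...
    def get_plugin_config(self, plugin_id: str) -> Dict[str, Any]: ...


class InMemoryPluginRegistry:
    """Hypothetical registry backed by plain dicts, e.g. for tests."""

    def __init__(self, plugins: List[PluginInfo], configs: Dict[str, Dict[str, Any]]) -> None:
        self._plugins = {p.id: p for p in plugins}
        self._configs = configs

    def get_configured_plugins(self) -> List[PluginInfo]:
        return [p for p in self._plugins.values() if p.is_configured]

    def get_plugin(self, plugin_id: str) -> Optional[PluginInfo]:
        return self._plugins.get(plugin_id)

    def get_plugin_config(self, plugin_id: str) -> Dict[str, Any]:
        # Empty dict when nothing has been saved for this plugin
        return self._configs.get(plugin_id, {})


registry = InMemoryPluginRegistry(
    plugins=[
        PluginInfo("transcription:voxtral_hf", "voxtral_hf", "Voxtral HF", is_configured=True),
        PluginInfo("transcription:other", "other", "Other", is_configured=False),
    ],
    configs={"transcription:voxtral_hf": {"device": "cpu"}},  # illustrative values
)

print(isinstance(registry, PluginRegistryProtocol))  # True
```

Note that a runtime `isinstance` check against a protocol only verifies that the methods exist, not that their signatures match.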

Storage Protocols (protocols.ipynb)

Protocol definitions for result storage backends

Import

from cjm_fasthtml_workflow_transcription_single_file.storage.protocols import (
    ResultStorageProtocol
)

Classes

@runtime_checkable
class ResultStorageProtocol(Protocol):
    "Protocol for transcription result storage backends."
    
    def should_auto_save(self) -> bool:  # True if results should be automatically saved
        """Check if auto-save is enabled."""
        ...
    
    def save(
            self,
            job_id: str,  # Unique job identifier
            file_path: str,  # Path to the transcribed media file
            file_name: str,  # Name of the media file
            plugin_id: str,  # Plugin unique identifier
            plugin_name: str,  # Plugin display name
            text: str,  # The transcription text
            metadata: Optional[Dict[str, Any]] = None,  # Optional metadata from the transcription plugin
            additional_info: Optional[Dict[str, Any]] = None  # Optional additional information to store
        ) -> Any:  # Implementation-specific return value (e.g., Path for file storage, ID for database)
        "Save a transcription result."
    
    def load(
            self,
            result_id: Any  # Implementation-specific identifier (e.g., Path for file storage, ID for database)
        ) -> Optional[Dict[str, Any]]:  # Result dictionary or None if not found
        "Load a transcription result by its identifier."
    
    def list_results(
            self,
            sort_by: str = "timestamp",  # Field to sort by
            reverse: bool = True  # Sort in reverse order
        ) -> List[Dict[str, Any]]:  # List of result dictionaries
        "List all saved transcription results."
    
    def get_by_job_id(
            self,
            job_id: str  # The job identifier to search for
        ) -> Optional[Dict[str, Any]]:  # Result dictionary if found, None otherwise
        "Find and load a transcription result by job ID."
    
    def delete(
            self,
            result_id: Any  # Implementation-specific identifier
        ) -> bool:  # True if deletion successful, False otherwise
        "Delete a transcription result."
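
To show what an alternative backend for this protocol might look like, here is a minimal in-memory sketch. `InMemoryResultStorage` is hypothetical, it uses the job ID as the result identifier, and the record layout (including the `timestamp` field) is an assumption rather than the JSON format written by `ResultStorage`.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class InMemoryResultStorage:
    """Hypothetical backend keeping results in a dict keyed by job ID."""

    def __init__(self, auto_save: bool = True) -> None:
        self._auto_save = auto_save
        self._results: Dict[str, Dict[str, Any]] = {}

    def should_auto_save(self) -> bool:
        return self._auto_save

    def save(self, job_id: str, file_path: str, file_name: str, plugin_id: str,
             plugin_name: str, text: str,
             metadata: Optional[Dict[str, Any]] = None,
             additional_info: Optional[Dict[str, Any]] = None) -> str:
        # Assumed record layout; not the package's on-disk JSON format.
        self._results[job_id] = {
            "job_id": job_id,
            "file_path": file_path,
            "file_name": file_name,
            "plugin_id": plugin_id,
            "plugin_name": plugin_name,
            "text": text,
            "metadata": metadata or {},
            "additional_info": additional_info or {},
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        return job_id  # result identifier for this backend

    def load(self, result_id: str) -> Optional[Dict[str, Any]]:
        return self._results.get(result_id)

    def list_results(self, sort_by: str = "timestamp", reverse: bool = True) -> List[Dict[str, Any]]:
        # Sorts on string-valued fields such as "timestamp" or "file_name"
        return sorted(self._results.values(), key=lambda r: r.get(sort_by, ""), reverse=reverse)

    def get_by_job_id(self, job_id: str) -> Optional[Dict[str, Any]]:
        return self._results.get(job_id)

    def delete(self, result_id: str) -> bool:
        return self._results.pop(result_id, None) is not None


storage = InMemoryResultStorage()
rid = storage.save("job-1", "/media/a.mp3", "a.mp3",
                   "transcription:voxtral_hf", "voxtral_hf", "hello world")
loaded = storage.load(rid)
```

A backend like this can be handy in tests, where writing JSON files to a results directory is unnecessary.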

Registry (registry.ipynb)

Unified plugin registry for managing multiple domain-specific plugin systems with configuration persistence

Import

from cjm_fasthtml_workflow_transcription_single_file.core.registry import (
    T,
    PluginMetadata,
    UnifiedPluginRegistry
)

Classes

@dataclass
class PluginMetadata:
    "Metadata describing a plugin for display and configuration management."
    
    name: str  # Internal plugin identifier
    category: str  # Plugin category string (application-defined)
    title: str  # Display title for the plugin
    config_schema: Dict[str, Any]  # JSON Schema for plugin configuration
    config_dataclass: Optional[Type]  # Configuration dataclass type (if available)
    description: Optional[str]  # Plugin description
    version: Optional[str]  # Plugin version
    is_configured: bool = False  # Whether the plugin has saved configuration
    
    def get_unique_id(self) -> str:  # String in format 'category_name'
        "Generate unique ID for this plugin."
class UnifiedPluginRegistry:
    def __init__(self, 
                 config_dir: Optional[Path] = None  # Directory for plugin configuration files (default: 'configs')
                )
    "Unified registry for multiple domain-specific plugin systems with configuration persistence."
    
    def __init__(self,
                     config_dir: Optional[Path] = None  # Directory for plugin configuration files (default: 'configs')
                    )
        "Initialize the unified plugin registry."
    
    def register_plugin_manager(
            self,
            category: str,  # Category name (e.g., "transcription", "llm")
            manager: Any,  # Domain-specific plugin manager
            display_name: Optional[str] = None,  # Display name for UI
            auto_discover: bool = True  # Automatically discover plugins?
        ) -> List[PluginMetadata]:  # List of discovered plugin metadata
        "Register a domain-specific plugin manager."
    
    def register_plugin_system(
            self,
            category: str,  # Category name (e.g., "transcription", "llm")
            plugin_interface: Type,  # Plugin interface class (e.g., TranscriptionPlugin)
            display_name: Optional[str] = None,  # Display name for UI
            auto_discover: bool = True  # Automatically discover plugins?
        ) -> List[PluginMetadata]:  # List of discovered plugin metadata
        "Create and register a plugin system in one step. This is a convenience method that creates a PluginManager with the specified interface and registers it with the registry."
    
    def get_manager(
            self,
            category: str,  # Category name
            manager_type: Optional[Type[T]] = None  # Optional type hint for IDE autocomplete
        ) -> Optional[T]:  # Plugin manager instance
        "Get plugin manager for a specific category."
    
    def get_categories(self) -> List[str]:  # Sorted list of category names
        """Get all registered categories."""
        return sorted(self._categories.keys())
    
    def get_category_display_name(self,
                                       category: str  # Category name
                                      ) -> str:  # Display name or category name if not set
        "Get display name for a category."
    
    def get_plugin(self,
                       unique_id: str  # Plugin unique identifier (format: 'category_name')
                      ) -> Optional[PluginMetadata]:  # Plugin metadata if found, None otherwise
        "Get plugin metadata by unique ID."
    
    def get_plugins_by_category(self,
                                    category: str  # Category name
                                   ) -> List[PluginMetadata]:  # List of plugin metadata for the category
        "Get all plugins in a category."
    
    def get_all_plugins(self) -> List[PluginMetadata]:  # List of all plugin metadata
        """Get all plugins across all categories."""
        return list(self._plugins.values())
    
    def get_categories_with_plugins(self) -> List[str]:  # Sorted list of categories with plugins
        """Get categories that have registered plugins."""
        categories = set(p.category for p in self._plugins.values())
        return sorted(categories)
    
    def load_plugin_config(self,
                              unique_id: str  # Plugin unique identifier
                             ) -> Dict[str, Any]:  # Configuration dictionary (empty if no config exists)
        "Load saved configuration for a plugin."
    
    def save_plugin_config(self,
                              unique_id: str,  # Plugin unique identifier
                              config: Dict[str, Any]  # Configuration dictionary to save
                             ) -> bool:  # True if save succeeded, False otherwise
        "Save configuration for a plugin."
    
    def delete_plugin_config(self,
                                unique_id: str  # Plugin unique identifier
                               ) -> bool:  # True if deletion succeeded, False otherwise
        "Delete saved configuration for a plugin."

Variables

T

Results Components (results.ipynb)

UI components for displaying transcription results and errors

Import

from cjm_fasthtml_workflow_transcription_single_file.components.results import (
    transcription_results,
    transcription_error
)

Functions

def transcription_results(
    job_id: str, # Unique identifier for the transcription job
    transcription_text: str, # The transcribed text
    metadata: Dict[str, Any], # Transcription metadata from the plugin
    file_info: Dict[str, Any], # Dictionary with file details (name, path, type, size_str)
    plugin_info: Dict[str, Any], # Dictionary with plugin details (id, title, supports_streaming)
    config: SingleFileWorkflowConfig, # Workflow configuration
    router: APIRouter, # Workflow router for generating route URLs
    stepflow_router: APIRouter, # StepFlow router for generating stepflow URLs
) -> FT: # FastHTML component showing results with export options
    "Render transcription results with export options."
def transcription_error(
    error_message: str, # Description of the error that occurred
    file_info: Optional[Dict[str, Any]], # Optional dictionary with file details
    config: SingleFileWorkflowConfig, # Workflow configuration
    stepflow_router: APIRouter, # StepFlow router for generating stepflow URLs
) -> FT: # FastHTML component showing error with retry option
    "Render transcription error message."

Workflow Routes (routes.ipynb)

Route initialization and handlers for the single-file transcription workflow

Import

from cjm_fasthtml_workflow_transcription_single_file.workflow.routes import (
    init_router
)

Functions

def _handle_current_status(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
):  # Appropriate UI component based on current state
    "Return current transcription status - determines what to show."
async def _handle_cancel_job(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
    job_id: str,  # ID of the job to cancel
):  # StepFlow start view or error component
    "Cancel a running transcription job."
def _handle_reset(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
):  # StepFlow start view
    "Reset transcription workflow and return to start."
def _handle_stream_job(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    sess,  # FastHTML session object
    job_id: str,  # ID of the job to monitor
):  # EventStream for SSE updates
    "SSE endpoint for monitoring job completion."
def _handle_export(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    job_id: str,  # ID of the job to export
    format: str = "txt",  # Export format (txt, srt, vtt)
):  # Response with file download
    "Export transcription in specified format."
def _handle_plugin_details(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    plugin_id: str,  # ID of the plugin to show details for
    save_url: str,  # URL for saving plugin configuration
    reset_url: str,  # URL for resetting plugin configuration
):  # Plugin details component or empty Div
    "Get plugin details for display in workflow."
async def _handle_save_plugin_config(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    plugin_id: str,  # ID of the plugin to save config for
    save_url: str,  # URL for saving plugin configuration
    reset_url: str,  # URL for resetting plugin configuration
):  # Updated config form or error alert
    "Save plugin configuration from the collapse form."
def _handle_reset_plugin_config(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    plugin_id: str,  # ID of the plugin to reset config for
    save_url: str,  # URL for saving plugin configuration
    reset_url: str,  # URL for resetting plugin configuration
):  # Updated config form with defaults or empty Div
    "Reset plugin configuration to defaults."
def _handle_media_preview(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    idx: int = 0,  # Index of the file to preview
    file_type: str = None,  # Optional filter by file type
):  # File preview modal or error Div
    "Render file preview modal for a specific file."
def _handle_refresh_media(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
):  # JSON status response
    "Refresh file browser cache."
def _handle_settings_modal(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
    save_url: str,  # URL for saving settings
):  # Settings modal component
    "Render the settings modal for the workflow."
async def _handle_settings_save(
    workflow: "SingleFileTranscriptionWorkflow",  # The workflow instance
    request,  # FastHTML request object
):  # Success alert with modal close script or error alert
    "Save workflow settings."
def init_router(
    workflow: SingleFileTranscriptionWorkflow,  # The workflow instance providing access to config and dependencies
) -> APIRouter:  # Configured APIRouter with all workflow routes
    "Initialize and return the workflow's API router with all routes."
def _export_transcription(
    text: str,  # Transcription text
    format: str,  # Export format (txt, srt, vtt)
    filename: str,  # Original filename for metadata
) -> str:  # Formatted transcription string
    "Format transcription for export."
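
`_export_transcription` receives only plain text, a format name, and a filename, so any subtitle output has to be produced without per-segment timing. The sketch below shows one way this could work, assuming the whole transcript is wrapped in a single cue with placeholder timestamps. The `export_text` helper and the ten-second cue are illustrative assumptions, not the package's actual behavior.

```python
def export_text(text: str, fmt: str = "txt") -> str:
    """Hypothetical helper: format plain transcript text as txt, srt, or vtt."""
    body = text.strip()
    if fmt == "srt":
        # SubRip: numeric cue index, comma before the milliseconds
        return f"1\n00:00:00,000 --> 00:00:10,000\n{body}\n"
    if fmt == "vtt":
        # WebVTT: mandatory "WEBVTT" header, dot before the milliseconds
        return f"WEBVTT\n\n00:00:00.000 --> 00:00:10.000\n{body}\n"
    if fmt == "txt":
        return body + "\n"
    raise ValueError(f"Unsupported export format: {fmt!r}")


srt_output = export_text("  hello world  ", "srt")
```

Rejecting unknown formats explicitly avoids silently serving a `.txt` payload under a subtitle file extension.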

Settings Schemas (schemas.ipynb)

JSON schemas and utilities for workflow settings

Import

from cjm_fasthtml_workflow_transcription_single_file.settings.schemas import (
    WORKFLOW_SETTINGS_SCHEMA,
    WorkflowSettings
)

Functions

def from_configs(
    cls: WorkflowSettings,
    browser_config: BrowserConfig,  # BrowserConfig instance with file browser settings
    storage_config: StorageConfig,  # StorageConfig instance with result storage settings
    workflow_config: Optional[SingleFileWorkflowConfig] = None  # Optional workflow config for additional settings
) -> "WorkflowSettings":  # WorkflowSettings instance with values from configs
    "Create WorkflowSettings from runtime config objects."
@patch
def apply_to_configs(
    self: WorkflowSettings,
    browser_config: BrowserConfig,  # BrowserConfig instance to update
    storage_config: StorageConfig,  # StorageConfig instance to update
    workflow_config: Optional[SingleFileWorkflowConfig] = None  # Optional workflow config to update
) -> None:
    "Apply settings to runtime config objects."
@patch
def to_dict(
    self: WorkflowSettings
) -> Dict[str, Any]:  # Dictionary of settings values
    "Convert settings to a dictionary for serialization."

Classes

@dataclass
class WorkflowSettings:
    "User-configurable settings for single-file transcription workflow."
    
    __schema_name__: ClassVar[str] = 'single_file_workflow'
    __schema_title__: ClassVar[str] = 'Single File Transcription Settings'
    __schema_description__: ClassVar[str] = 'Configure file scanning, storage, and workflow behavior'
    directories: List[str] = field(...)
    enabled_types: List[str] = field(...)
    recursive_scan: bool = field(...)
    items_per_page: int = field(...)
    default_view: str = field(...)
    auto_save: bool = field(...)
    results_directory: str = field(...)
    gpu_memory_threshold_percent: float = field(...)
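
`from_configs`, `apply_to_configs`, and `to_dict` together describe a round trip: runtime configs are flattened into one settings object, edited, and written back. Here is a reduced sketch of that pattern. The `BrowserCfg`, `StorageCfg`, and `MiniSettings` classes are hypothetical stand-ins, and the assignment of each field to a particular config object is an assumption.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class BrowserCfg:  # hypothetical stand-in for BrowserConfig
    directories: List[str] = field(default_factory=list)
    recursive_scan: bool = True


@dataclass
class StorageCfg:  # hypothetical stand-in for StorageConfig
    auto_save: bool = True
    results_directory: str = "results"


@dataclass
class MiniSettings:  # reduced stand-in for WorkflowSettings
    directories: List[str] = field(default_factory=list)
    recursive_scan: bool = True
    auto_save: bool = True
    results_directory: str = "results"

    @classmethod
    def from_configs(cls, browser: BrowserCfg, storage: StorageCfg) -> "MiniSettings":
        # Copy current runtime values into one flat settings object
        return cls(
            directories=list(browser.directories),
            recursive_scan=browser.recursive_scan,
            auto_save=storage.auto_save,
            results_directory=storage.results_directory,
        )

    def apply_to_configs(self, browser: BrowserCfg, storage: StorageCfg) -> None:
        # Push edited values back onto the runtime config objects in place
        browser.directories = list(self.directories)
        browser.recursive_scan = self.recursive_scan
        storage.auto_save = self.auto_save
        storage.results_directory = self.results_directory

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)


browser, storage = BrowserCfg(directories=["/media"]), StorageCfg()
settings = MiniSettings.from_configs(browser, storage)
settings.auto_save = False  # simulate a change made in the settings form
settings.apply_to_configs(browser, storage)
```

Mutating the existing config objects, rather than replacing them, means any component that already holds a reference sees the new values without being re-wired.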

Variables

WORKFLOW_SETTINGS_SCHEMA  # Schema auto-generated from the WorkflowSettings dataclass
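
Deriving a schema from a dataclass can be done by walking its fields and mapping Python types to JSON types. The sketch below illustrates the idea on a reduced, hypothetical `DemoSettings` class. The `dataclass_to_schema` function and the exact output keys are assumptions and may differ from whatever utility the package actually uses.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import Any, ClassVar, Dict, List, get_origin

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", list: "array"}


@dataclass
class DemoSettings:  # hypothetical, reduced version of WorkflowSettings
    __schema_name__: ClassVar[str] = "demo"
    __schema_title__: ClassVar[str] = "Demo Settings"
    directories: List[str] = field(default_factory=list)
    items_per_page: int = 20
    auto_save: bool = True


def dataclass_to_schema(cls) -> Dict[str, Any]:
    """Build a JSON-Schema-style dict from a dataclass (illustrative only)."""
    props: Dict[str, Any] = {}
    for f in fields(cls):  # fields() skips ClassVar attributes
        py_type = get_origin(f.type) or f.type  # List[str] -> list
        prop: Dict[str, Any] = {"type": JSON_TYPES.get(py_type, "string")}
        if f.default is not MISSING:
            prop["default"] = f.default
        elif f.default_factory is not MISSING:
            prop["default"] = f.default_factory()
        props[f.name] = prop
    return {
        "name": cls.__schema_name__,
        "title": cls.__schema_title__,
        "type": "object",
        "properties": props,
    }


schema = dataclass_to_schema(DemoSettings)
```

Keeping the schema metadata in `ClassVar` attributes keeps it out of the instance fields, so it never leaks into serialized settings.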

Step Components (steps.ipynb)

UI components for workflow step rendering (plugin selection, file selection, confirmation)

Import

from cjm_fasthtml_workflow_transcription_single_file.components.steps import (
    render_plugin_config_form,
    render_plugin_details_route,
    render_plugin_selection,
    render_file_selection,
    render_confirmation
)

Functions

def _get_file_attr(
    file_path: str,  # Path to the file to look up
    files: list,  # List of file info objects to search
    attr: str,  # Attribute name to retrieve from the file
) -> str:  # Attribute value or empty string if not found
    "Get an attribute from a file by path."
def _render_plugin_details_content(
    plugin_id: str, # ID of the plugin to display details for
    plugins: List[PluginInfo], # List of available plugins
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin config
) -> Optional[FT]: # Plugin info card or None if plugin not found
    "Render details for selected plugin (info card only, no config collapse)."
def _render_plugin_details_with_config(
    plugin_id: str, # ID of the plugin to display details for
    plugins: List[PluginInfo], # List of available plugins
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin config
    raw_plugin_registry, # UnifiedPluginRegistry for config_schema access
    save_url: str, # URL for saving plugin configuration
    reset_url: str, # URL for resetting plugin configuration
) -> Optional[FT]: # Plugin details with config collapse, or None if not found
    "Render plugin details with configuration collapse for initial render."
def render_plugin_config_form(
    plugin_id: str, # ID of the plugin to render config for
    plugin_registry, # UnifiedPluginRegistry with config_class access
    save_url: str, # URL for saving the configuration
    reset_url: str, # URL for resetting to defaults
    alert_message: Optional[Any] = None, # Optional alert to display above the form
) -> FT: # Div containing the settings form with alert container
    "Render the plugin configuration form for the collapse content."
def _render_plugin_config_collapse(
    plugin_id: str, # ID of the plugin to render config for
    plugin_registry, # UnifiedPluginRegistry with config_schema access
    save_url: str, # URL for saving the configuration
    reset_url: str, # URL for resetting to defaults
) -> FT: # Collapse component with plugin configuration form
    "Render a collapse component containing the plugin configuration form."
def render_plugin_details_route(
    plugin_id: str, # ID of the plugin to display details for
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugins and config
    raw_plugin_registry, # UnifiedPluginRegistry for config_schema access
    save_url: str, # URL for saving plugin configuration
    reset_url: str, # URL for resetting plugin configuration to defaults
) -> FT: # Plugin details with info card and config collapse
    "Render plugin details for HTMX route when plugin dropdown changes."
def render_plugin_selection(
    ctx: InteractionContext, # Interaction context with state and data
    config: SingleFileWorkflowConfig, # Workflow configuration
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin config
    settings_modal_url: str, # URL for the settings modal route
    plugin_details_url: str, # URL for the plugin details route
    raw_plugin_registry=None, # UnifiedPluginRegistry for config_schema access (optional)
    save_plugin_config_url: str = "", # URL for saving plugin configuration
    reset_plugin_config_url: str = "", # URL for resetting plugin configuration
) -> FT: # Plugin selection step UI component
    "Render plugin selection step showing all discovered plugins."
def render_file_selection(
    ctx: InteractionContext,  # Interaction context with state and data
    config: SingleFileWorkflowConfig,  # Workflow configuration
    file_selection_router: APIRouter,  # Router for file selection pagination (or None)
) -> FT:  # File selection step UI component with paginated table
    "Render file selection step with paginated table view and preview capability."
def render_confirmation(
    ctx: InteractionContext, # Interaction context with state and data
    plugin_registry: PluginRegistryProtocol, # Plugin registry adapter for getting plugin info
) -> FT: # Confirmation step UI component showing selected plugin and file
    "Render confirmation step showing selected plugin and file."
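
The `_get_file_attr` helper listed above looks up a file by path and returns one of its attributes, falling back to an empty string. A minimal sketch of that lookup follows. The `FileEntry` type is hypothetical, since the shape of the real file info objects is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileEntry:  # hypothetical file info object
    path: str
    name: str
    size_str: str


def get_file_attr(file_path: str, files: List[FileEntry], attr: str) -> str:
    """Return `attr` of the first entry whose path matches, else an empty string."""
    for entry in files:
        if entry.path == file_path:
            # An unknown attribute name also falls back to ""
            return str(getattr(entry, attr, ""))
    return ""


files = [FileEntry(path="/media/a.mp3", name="a.mp3", size_str="1.0 MB")]
selected_name = get_file_attr("/media/a.mp3", files, "name")
```

Returning an empty string rather than `None` lets the value be dropped straight into a rendered component without a guard.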

Single File Transcription Workflow (workflow.ipynb)

Main workflow class orchestrating all subsystems for single-file transcription

Import

from cjm_fasthtml_workflow_transcription_single_file.workflow.workflow import (
    SingleFileTranscriptionWorkflow
)

Functions

@patch
def setup(
    self: SingleFileTranscriptionWorkflow,
    app,  # FastHTML application instance
) -> None:
    "Initialize workflow with FastHTML app. Must be called after app creation."
@patch
def cleanup(
    self: SingleFileTranscriptionWorkflow,
) -> None:
    "Clean up workflow resources. Mirrors PluginInterface.cleanup() for future plugin system compatibility."
@patch
def _ensure_plugin_configs_exist(
    self: SingleFileTranscriptionWorkflow,
) -> None:
    """
    Ensure all discovered plugins have config files.
    
    For plugins without saved config files, creates a config file with
    default values from the plugin's schema. Required because workers
    only load plugins that have config files.
    """
@patch
def get_routers(
    self: SingleFileTranscriptionWorkflow,
) -> List[APIRouter]:  # List containing main router, stepflow router, media router, and file selection router
    "Return all routers for registration with the app."
@patch
def render_entry_point(
    self: SingleFileTranscriptionWorkflow,
    request,  # FastHTML request object
    sess,  # FastHTML session object
) -> FT:  # AsyncLoadingContainer component
    """
    Render the workflow entry point for embedding in tabs, etc.
    
    Returns an AsyncLoadingContainer that loads the current_status endpoint,
    which determines what to show (running job, workflow in progress,
    completed job, or fresh start).
    """
@patch
def _on_job_completed(
    "Workflow-specific completion handling. Auto-saves results if enabled."
@patch
def _create_preview_route_func(
    self: SingleFileTranscriptionWorkflow,
):  # Function that generates preview route URLs
    "Create a function that generates preview route URLs (with optional file_type)."
@patch
def _create_preview_url_func(
    self: SingleFileTranscriptionWorkflow,
):  # Function that generates preview URLs for file selection
    "Create a function that generates preview URLs for file selection (index only)."
@patch
def _create_step_flow(
    self: SingleFileTranscriptionWorkflow,
) -> StepFlow:  # Configured StepFlow instance
    "Create and configure the StepFlow instance."
@patch
def _create_router(
    self: SingleFileTranscriptionWorkflow,
) -> APIRouter:  # Configured APIRouter with all workflow routes
    "Create the workflow's API router with all routes."
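
`_ensure_plugin_configs_exist` is described above as writing a default config file for every plugin that lacks one, because workers only load plugins that have a config file. A rough sketch of that step follows, assuming JSON-Schema-style plugin schemas and one JSON file per plugin. The function names, the file layout, and the `demo_plugin` schema are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def defaults_from_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the `default` of each property in a JSON-Schema-style dict."""
    return {
        name: prop["default"]
        for name, prop in schema.get("properties", {}).items()
        if "default" in prop
    }


def ensure_configs_exist(schemas: Dict[str, Dict[str, Any]], config_dir: Path) -> List[str]:
    """Write a default config for each plugin lacking one; return the ids created."""
    config_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for plugin_id, schema in schemas.items():
        path = config_dir / f"{plugin_id}.json"
        if path.exists():
            continue  # never overwrite a config the user has already saved
        path.write_text(json.dumps(defaults_from_schema(schema), indent=2))
        created.append(plugin_id)
    return created


# Hypothetical plugin id and schema, for illustration only
schemas = {
    "demo_plugin": {
        "properties": {
            "model": {"type": "string", "default": "base"},
            "language": {"type": "string"},  # no default, so it is omitted
        }
    }
}
config_dir = Path(tempfile.mkdtemp())
first_run = ensure_configs_exist(schemas, config_dir)
second_run = ensure_configs_exist(schemas, config_dir)
```

Because existing files are skipped, the step is idempotent and safe to run on every startup.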

Classes

class SingleFileTranscriptionWorkflow:
    def __init__(
        self,
        config: Optional[SingleFileWorkflowConfig] = None,  # Explicit config (bypasses auto-loading)
        **config_overrides  # Override specific config values
    )
    """
    Self-contained single-file transcription workflow.
    
    Creates and manages internal UnifiedPluginRegistry, ResourceManager,
    TranscriptionJobManager, SSEBroadcastManager, FileBrowser, ResultStorage,
    StepFlow (plugin → file → confirm wizard), and APIRouter.
    """
    
    def __init__(
            self,
            config: Optional[SingleFileWorkflowConfig] = None,  # Explicit config (bypasses auto-loading)
            **config_overrides  # Override specific config values
        )
        "Initialize the workflow with auto-loaded or explicit configuration."
    
    def create_and_setup(
            cls,
            app,  # FastHTML application instance
            config: Optional[SingleFileWorkflowConfig] = None,  # Explicit config (bypasses auto-loading)
            **config_overrides  # Override specific config values
        ) -> "SingleFileTranscriptionWorkflow":  # Configured and setup workflow instance
        "Create, configure, and setup a workflow in one call."
    
    @property
    def transcription_manager(self) -> TranscriptionJobManager:
        "Access to internal transcription manager."
    
    @property
    def plugin_registry(self) -> PluginRegistryAdapter:
        "Access to plugin registry adapter."
    
    @property
    def file_browser(self) -> FileBrowser:
        "Access to internal file browser."
    
    @property
    def result_storage(self) -> ResultStorage:
        "Access to internal result storage."
    
    @property
    def router(self) -> APIRouter:
        "Main workflow router."
    
    @property
    def stepflow_router(self) -> APIRouter:
        "StepFlow-generated router."
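
The class separates construction (`__init__`) from app wiring (`setup`), and `create_and_setup` combines the two. The sketch below reproduces that lifecycle with hypothetical stand-ins so it runs without FastHTML installed. `MiniConfig`, `MiniWorkflow`, the `route_prefix` field, and the rule that keyword overrides are applied on top of an explicit config are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class MiniConfig:  # hypothetical stand-in for SingleFileWorkflowConfig
    route_prefix: str = "/transcribe"
    auto_save: bool = True


class MiniWorkflow:  # hypothetical stand-in for SingleFileTranscriptionWorkflow
    def __init__(self, config: Optional[MiniConfig] = None, **config_overrides):
        # Start from the explicit config (or defaults), then apply overrides
        self.config = replace(config or MiniConfig(), **config_overrides)
        self.app = None
        self._routers: List[str] = []

    def setup(self, app) -> None:
        # Runs after the app exists; routers are only built at this point
        self.app = app
        self._routers = [
            f"{self.config.route_prefix}/{name}" for name in ("main", "stepflow")
        ]

    @classmethod
    def create_and_setup(
        cls, app, config: Optional[MiniConfig] = None, **config_overrides
    ) -> "MiniWorkflow":
        workflow = cls(config=config, **config_overrides)
        workflow.setup(app)
        return workflow

    def get_routers(self) -> List[str]:
        return list(self._routers)


# `object()` stands in for a real application instance
workflow = MiniWorkflow.create_and_setup(app=object(), auto_save=False)
```

In this sketch a workflow that was constructed but never set up exposes no routers, which mirrors the documented requirement that `setup` be called after the app is created.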
