FastHTML source selection component for transcript decomposition workflows, with federated database browsing, drag-drop ordering, and keyboard navigation.

cjm-transcript-source-select

Install

pip install cjm_transcript_source_select

Project Structure

nbs/
├── components/ (6)
│   ├── helpers.ipynb          # Shared helper functions for the selection module
│   ├── local_files.ipynb      # Local files browser for importing external .db files
│   ├── preview_panel.ipynb    # Collapsible preview panel for displaying selected content
│   ├── selection_queue.ipynb  # Selection queue component with drag-drop reordering
│   ├── source_browser.ipynb   # Source browser components for displaying and filtering transcription sources
│   └── step_renderer.ipynb    # Phase 1 step renderer: Source Selection & Ordering with two-column layout and collapsible preview
├── routes/ (7)
│   ├── core.ipynb            # Selection step state management helpers
│   ├── filtering.ipynb       # Filtering, grouping, and keyboard navigation route handlers
│   ├── init.ipynb            # Router assembly for Phase 1 selection routes
│   ├── local_files.ipynb     # Local files browser route handlers
│   ├── queue.ipynb           # Selection queue route handlers for Phase 1
│   ├── source_browser.ipynb  # Source browser virtual collection router for Phase 1 selection
│   └── tabs.ipynb            # Tab switching route handlers
├── services/ (2)
│   ├── source.ipynb        # Source service for federated transcription queries via DuckDB
│   └── source_utils.ipynb  # Source record operations for metadata extraction, grouping, and validation
├── html_ids.ipynb  # HTML ID constants for Phase 1: Source Selection & Ordering
├── models.ipynb    # Data models and URL bundles for Phase 1: Source Selection & Ordering
└── utils.ipynb     # Display formatting and word counting utilities for the selection step

Total: 18 notebooks across 3 directories

Module Dependencies

graph LR
    components_helpers[components.helpers<br/>helpers]
    components_local_files[components.local_files<br/>local_files]
    components_preview_panel[components.preview_panel<br/>preview_panel]
    components_selection_queue[components.selection_queue<br/>selection_queue]
    components_source_browser[components.source_browser<br/>source_browser]
    components_step_renderer[components.step_renderer<br/>step_renderer]
    html_ids[html_ids<br/>html_ids]
    models[models<br/>models]
    routes_core[routes.core<br/>core]
    routes_filtering[routes.filtering<br/>filtering]
    routes_init[routes.init<br/>init]
    routes_local_files[routes.local_files<br/>local_files]
    routes_queue[routes.queue<br/>queue]
    routes_source_browser[routes.source_browser<br/>source_browser]
    routes_tabs[routes.tabs<br/>tabs]
    services_source[services.source<br/>source]
    services_source_utils[services.source_utils<br/>source_utils]
    utils[utils<br/>utils]

    components_helpers --> models
    components_local_files --> components_helpers
    components_local_files --> html_ids
    components_preview_panel --> html_ids
    components_source_browser --> services_source_utils
    components_source_browser --> utils
    components_source_browser --> html_ids
    components_step_renderer --> components_preview_panel
    components_step_renderer --> models
    components_step_renderer --> utils
    components_step_renderer --> components_source_browser
    components_step_renderer --> components_selection_queue
    components_step_renderer --> components_local_files
    components_step_renderer --> html_ids
    routes_core --> models
    routes_core --> components_step_renderer
    routes_core --> html_ids
    routes_core --> components_selection_queue
    routes_core --> services_source
    routes_filtering --> models
    routes_filtering --> routes_core
    routes_filtering --> services_source_utils
    routes_filtering --> services_source
    routes_init --> routes_core
    routes_init --> routes_queue
    routes_init --> models
    routes_init --> routes_local_files
    routes_init --> routes_source_browser
    routes_init --> routes_filtering
    routes_init --> routes_tabs
    routes_init --> services_source
    routes_local_files --> components_local_files
    routes_local_files --> models
    routes_local_files --> services_source
    routes_local_files --> routes_core
    routes_queue --> services_source_utils
    routes_queue --> routes_core
    routes_queue --> components_preview_panel
    routes_queue --> models
    routes_queue --> services_source
    routes_source_browser --> routes_core
    routes_source_browser --> components_preview_panel
    routes_source_browser --> models
    routes_source_browser --> components_source_browser
    routes_source_browser --> services_source_utils
    routes_source_browser --> services_source
    routes_source_browser --> html_ids
    routes_tabs --> routes_core
    routes_tabs --> services_source_utils
    routes_tabs --> models
    routes_tabs --> components_step_renderer
    routes_tabs --> services_source

52 cross-module dependencies detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

core (core.ipynb)

Selection step state management helpers

Import

from cjm_transcript_source_select.routes.core import (
    DEBUG_SELECTION_STATE,
    WorkflowStateStore
)

Functions

def _get_step_state(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    session_id: str  # Session identifier string
) -> Dict[str, Any]:  # Step state dictionary
    "Get the selection step state from the workflow state store."
def _find_duplicate_media_source(
    source_service: SourceService,  # Source service for lookups
    record_id: str,  # Candidate record ID
    provider_id: str,  # Candidate provider ID
    selected_sources: List[Dict[str, str]],  # Current selections
) -> Optional[Dict[str, str]]:  # Conflicting source dict or None
    "Find an already-selected source that shares the same audio file."
def _render_duplicate_flash(
    candidate_row_id: str,  # DOM element ID of the candidate row
    existing_row_id: Optional[str] = None,  # DOM element ID of the conflicting row (None if off-screen)
) -> Div:  # OOB Div with flash script
    "Render a flash animation on one or two rows to indicate duplicate rejection."
def _get_active_source_tab(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    session_id: str  # Session identifier string
) -> str:  # Active tab: "db" or "files"
    "Get the currently active source tab from workflow state."
def _build_queue_response(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for querying transcriptions
    session_id: str,  # Session identifier string
    selected_sources: List[Dict[str, str]],  # Current selected sources after mutation
    urls: SelectionUrls,  # URL bundle for rendering
    include_stats: bool = True,  # Include OOB stats swap
    include_checkbox_oobs: bool = True,  # Include OOB checkbox cells for visible rows
) -> Union[Any, Tuple]:  # Single component or tuple of components with OOB swaps
    "Build the standard response for queue-mutating handlers."
def _update_step_state(
    "Update the selection step state in the workflow state store."

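As a rough illustration of the duplicate check described by _find_duplicate_media_source, the sketch below compares media paths between a candidate and the current selections. It is not the package's implementation: the media_path_for lookup is a hypothetical stand-in for whatever the source service exposes, and the record and provider IDs are made up.

```python
from typing import Callable, Dict, List, Optional


def find_duplicate_media_source(
    media_path_for: Callable[[str, str], Optional[str]],  # Hypothetical lookup: (record_id, provider_id) -> media path
    record_id: str,  # Candidate record ID
    provider_id: str,  # Candidate provider ID
    selected_sources: List[Dict[str, str]],  # Current selections
) -> Optional[Dict[str, str]]:  # Conflicting source dict or None
    "Return the first selected source whose media path matches the candidate's."
    candidate_path = media_path_for(record_id, provider_id)
    if candidate_path is None:
        return None  # Unknown media path: nothing to compare against
    for source in selected_sources:
        # The candidate never conflicts with itself
        if source["record_id"] == record_id and source["provider_id"] == provider_id:
            continue
        if media_path_for(source["record_id"], source["provider_id"]) == candidate_path:
            return source
    return None


# Toy lookup table standing in for the source service (illustrative data only)
_MEDIA_PATHS = {
    ("job-1", "provider-a"): "/audio/a.mp3",
    ("job-2", "provider-b"): "/audio/a.mp3",
    ("job-3", "provider-a"): "/audio/b.mp3",
}


def lookup(record_id: str, provider_id: str) -> Optional[str]:
    return _MEDIA_PATHS.get((record_id, provider_id))


selected = [{"record_id": "job-1", "provider_id": "provider-a"}]
conflict = find_duplicate_media_source(lookup, "job-2", "provider-b", selected)
no_conflict = find_duplicate_media_source(lookup, "job-3", "provider-a", selected)
```

Here conflict is the already-selected job-1 entry, because both records point at the same audio file, while no_conflict is None.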
Variables

DEBUG_SELECTION_STATE = False
_rebuild_and_render_ref: list
_sync_items_ref: list
_get_checkbox_oobs_ref: list
_get_checkbox_oob_for_ref: list
_get_vc_row_id_for_ref: list
_activate_toggle_ref: list

filtering (filtering.ipynb)

Filtering, grouping, and keyboard navigation route handlers

Import

from cjm_transcript_source_select.routes.filtering import (
    init_filtering_router
)

Functions

def _handle_source_filter(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    search: str,  # Search term from input
    urls: SelectionUrls,  # URL bundle for rendering
):  # VC content wrapper (direct swap, not OOB)
    "Filter transcription sources by search term."
def _handle_grouping_change(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    grouping_mode: str,  # New grouping mode: "media_path" or "batch_id"
    urls: SelectionUrls,  # URL bundle for rendering
):  # VC content wrapper (direct swap, not OOB)
    "Change the grouping mode and re-render the VC content."
def _handle_selection_toggle_focused(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    record_id: str,  # Job ID from focused row (via hx-include)
    provider_id: str,  # Plugin name from focused row (via hx-include)
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component with OOB stats, optionally with OOB source list
    "Toggle selection of the focused row (keyboard shortcut handler)."
def _handle_keyboard_reorder(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    record_id: str,  # Record ID of item to move
    provider_id: str,  # Provider ID of item to move
    direction: str,  # Direction to move: "up" or "down"
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component, optionally with OOB source list
    "Move an item up or down in the selection queue via keyboard."
def init_filtering_router(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    prefix: str,  # Route prefix (e.g., "/workflow/selection/filtering")
    urls: SelectionUrls,  # URL bundle for rendering
) -> Tuple[APIRouter, Dict[str, Callable]]:  # (router, route_dict)
    "Initialize filtering and keyboard navigation routes."

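The keyboard reorder handler moves one queue item "up" or "down". A minimal sketch of that list operation, assuming queue items are dicts with record_id and provider_id keys and that moves past either end are ignored (the real handler's edge-case behavior is not documented here):

```python
from typing import Dict, List


def move_in_queue(
    queue: List[Dict[str, str]],  # Selected sources in order
    record_id: str,  # Record ID of item to move
    provider_id: str,  # Provider ID of item to move
    direction: str,  # "up" or "down"
) -> List[Dict[str, str]]:  # New list; the input is not mutated
    "Swap the matching item with its neighbor in the given direction."
    if direction not in ("up", "down"):
        return list(queue)
    index = next(
        (
            i
            for i, item in enumerate(queue)
            if item["record_id"] == record_id and item["provider_id"] == provider_id
        ),
        None,
    )
    if index is None:
        return list(queue)  # Item not in queue: nothing to move
    target = index - 1 if direction == "up" else index + 1
    if not 0 <= target < len(queue):
        return list(queue)  # Already at the boundary
    reordered = list(queue)
    reordered[index], reordered[target] = reordered[target], reordered[index]
    return reordered


queue = [
    {"record_id": "a", "provider_id": "p"},
    {"record_id": "b", "provider_id": "p"},
    {"record_id": "c", "provider_id": "p"},
]
moved = move_in_queue(queue, "b", "p", "up")
```

Returning a new list rather than mutating in place keeps the stored state unchanged until the caller explicitly writes the result back.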
helpers (helpers.ipynb)

Shared helper functions for the selection module

Import

from cjm_transcript_source_select.components.helpers import *

Functions

def _get_selection_state(
    ctx: InteractionContext  # Interaction context with state
) -> SelectionStepState:  # Typed selection step state
    "Get the full selection step state from context."
def _get_selected_sources(
    ctx: InteractionContext  # Interaction context with state
) -> List[SelectedSource]:  # List of selected source dicts
    "Get the list of selected sources from step state."
def _get_grouping_mode(
    ctx: InteractionContext  # Interaction context with state
) -> str:  # Grouping mode: "media_path" or "batch_id"
    "Get the current grouping mode from step state."

html_ids (html_ids.ipynb)

HTML ID constants for Phase 1: Source Selection & Ordering

Import

from cjm_transcript_source_select.html_ids import (
    SelectionHtmlIds
)

Classes

class SelectionHtmlIds:
    "HTML ID constants for Phase 1: Source Selection & Ordering."
    
    def as_selector(
            id_str:str  # The HTML ID to convert
        ) -> str:  # CSS selector with # prefix
        "Convert an ID to a CSS selector format."
    
    def source_checkbox(
            record_id:str,  # Record identifier
            provider_id:str  # Provider identifier
        ) -> str:  # HTML ID for the source checkbox
        "Generate HTML ID for a source selection checkbox."
    
    def source_row(
            record_id:str,  # Record identifier
            provider_id:str  # Provider identifier
        ) -> str:  # HTML ID for the source row
        "Generate HTML ID for a source browser row."
    
    def queue_item(
            record_id:str,  # Record identifier
            provider_id:str  # Provider identifier
        ) -> str:  # HTML ID for the queue item
        "Generate HTML ID for a queue item."

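To make the ID helpers concrete, here is a small sketch of how such generators might work. Only the "#" prefix in as_selector comes from the listing above; the "source-row-" format and the character sanitizing are assumptions for illustration, since the actual ID scheme is not shown. Sanitizing is included because a provider ID may default to a database path, which can contain characters that are awkward in CSS selectors.

```python
import re


def as_selector(id_str: str) -> str:
    "Convert an ID to a CSS selector by adding the # prefix."
    return f"#{id_str}"


def _slug(value: str) -> str:
    "Replace characters outside [A-Za-z0-9_-] with hyphens (assumed scheme)."
    return re.sub(r"[^A-Za-z0-9_-]", "-", value)


def source_row(record_id: str, provider_id: str) -> str:
    "Build a row ID from provider and record IDs (hypothetical format)."
    return f"source-row-{_slug(provider_id)}-{_slug(record_id)}"


row_id = source_row("job-1", "/data/a.db")
row_selector = as_selector(row_id)
```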
init (init.ipynb)

Router assembly for Phase 1 selection routes

Import

from cjm_transcript_source_select.routes.init import (
    init_selection_routers
)

Functions

def init_selection_routers(
    state_store: WorkflowStateStore,  # The workflow state store
    source_service: SourceService,  # The source service for queries
    workflow_id: str,  # The workflow identifier
    prefix: str,  # Base prefix for selection routes (e.g., "/workflow/selection")
) -> Tuple[List[APIRouter], SelectionUrls, Dict[str, Callable], Callable, "SourceBrowserRouterState"]
    "Initialize and return all selection routers with URL bundle."

local_files (local_files.ipynb)

Local files browser for importing external .db files

Import

from cjm_transcript_source_select.components.local_files import *

Functions

def _get_external_db_paths(
    ctx: InteractionContext  # Interaction context with state
) -> List[str]:  # List of external database paths
    "Get the list of external database paths from step state."
def _get_current_browse_path(
    ctx: InteractionContext  # Interaction context with state
) -> str:  # Current browse path
    "Get the current browse path from step state."
def _get_file_browser_state(
    step_state: Dict[str, Any],  # Selection step state dictionary
    default_path: Optional[str] = None  # Default path if no state exists
) -> BrowserState:  # BrowserState for file browser
    "Get or create BrowserState from step state."
def _create_db_browser_config() -> FileBrowserConfig:  # Configured FileBrowserConfig for .db file selection
    "Create file browser config for .db file selection."
def _render_external_sources_list(
    external_paths: List[str],  # List of added external database paths
    remove_url: str,  # URL for removing external source
    oob: bool = False,  # Whether to render as OOB swap
) -> Any:  # External sources section component (always rendered for OOB targeting)
    "Render the list of added external database sources with scrollable paths."
def _render_error_alert(
    error_message: Optional[str] = None,  # Error message to display (None = clear)
    oob: bool = False,  # Whether to render as OOB swap
) -> Any:  # Error alert container (always present for OOB targeting)
    "Render the error alert container for the local files browser."
def _render_local_files_browser(
    render_fn: Optional[Callable] = None,  # FileBrowserRouters.render callable
    external_paths: Optional[List[str]] = None,  # List of added external database paths
    remove_url: str = "",  # URL for removing external source
    error_message: Optional[str] = None,  # Error message to display
) -> Any:  # Local files browser component
    "Render the local files browser for adding external .db files."

local_files (local_files.ipynb)

Local files browser route handlers

Import

from cjm_transcript_source_select.routes.local_files import (
    init_local_files_router
)

Functions

def _get_local_files_provider() -> LocalFileSystemProvider:
    "Get or create the local files provider singleton."
def _handle_remove_external_source(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for external db ops
    sess,  # FastHTML session object
    db_path: str,  # Path to the .db file to remove
    external_db_paths_ref: List[str],  # Shared external paths list (mutated in place)
    fb_routers: FileBrowserRouters,  # File browser routers (for targeted OOB)
    remove_url: str,  # URL for remove button in external sources list
):  # Tuple of OOB elements (external sources list + checkbox cells)
    "Remove an external database source from the Added Sources list."
def init_local_files_router(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for external db ops
    prefix: str,  # Route prefix (e.g., "/workflow/selection/local_files")
    urls: SelectionUrls,  # URL bundle for rendering
) -> Tuple[List, Dict[str, Callable], Callable]:  # (routers, route_dict, render_panel_fn)
    "Initialize local files browser routes with new file browser API."

Variables

_local_files_provider: Optional[LocalFileSystemProvider] = None

models (models.ipynb)

Data models and URL bundles for Phase 1: Source Selection & Ordering

Import

from cjm_transcript_source_select.models import (
    SelectionStepState,
    SelectionUrls
)

Classes

class SelectionStepState(TypedDict):
    "State for Phase 1: Source Selection & Ordering."
@dataclass
class SelectionUrls:
    "URL bundle for Phase 1 selection route handlers and renderers."
    
    add: str = ''  # Add source to queue
    remove: str = ''  # Remove source from queue
    toggle: str = ''  # Toggle source selection (add/remove based on current state)
    reorder: str = ''  # Reorder queue items
    clear: str = ''  # Clear all from queue
    select_all: str = ''  # Select all in a group
    preview: str = ''  # Preview source content
    toggle_focused: str = ''  # Toggle focused row selection
    keyboard_reorder: str = ''  # Keyboard reorder (Shift+Up/Down)
    filter: str = ''  # Filter source list
    grouping_change: str = ''  # Change grouping mode
    browse_directory: str = ''  # Browse directory
    add_external: str = ''  # Add external .db source
    remove_external: str = ''  # Remove external .db source
    tab_switch: str = ''  # Switch source tabs
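
Every field of SelectionUrls defaults to an empty string, which suggests the bundle can be created first and filled in once the routes exist. The sketch below shows why a shared mutable dataclass works for that pattern. UrlBundle is a three-field stand-in, not the real class, and the "/queue/add" style paths are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UrlBundle:  # Stand-in for SelectionUrls with only three fields
    add: str = ""
    remove: str = ""
    filter: str = ""


def make_remove_button_renderer(urls: UrlBundle) -> Callable[[], str]:
    "Return a renderer that reads the URL at call time, not at creation time."

    def render() -> str:
        return f'<button hx-post="{urls.remove}">Remove</button>'

    return render


urls = UrlBundle()  # Created empty, before any route exists
render = make_remove_button_renderer(urls)  # Renderer captures the bundle object

prefix = "/workflow/selection"
urls.add = f"{prefix}/queue/add"  # Hypothetical route paths
urls.remove = f"{prefix}/queue/remove"

rendered = render()
```

Because the renderer holds a reference to the bundle rather than a copy of its values, URLs assigned after the renderer was created are still picked up when it runs.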

preview_panel (preview_panel.ipynb)

Collapsible preview panel for displaying selected content

Import

from cjm_transcript_source_select.components.preview_panel import *

Functions

def _render_preview_panel(
    preview_record_id: Optional[str] = None,  # Job ID being previewed
    preview_text: Optional[str] = None,  # Text content to preview
    is_open: bool = False,  # Whether the collapse should be open
) -> Any:  # Preview panel component (collapsible, full-width)
    "Render the collapsible preview panel for displaying selected content."

queue (queue.ipynb)

Selection queue route handlers for Phase 1

Import

from cjm_transcript_source_select.routes.queue import (
    init_queue_router
)

Functions

def _handle_selection_toggle(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    record_id: str,  # Job ID to toggle
    provider_id: str,  # Plugin name for the source
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component with OOB stats (no checkbox OOBs -- checkbox already correct)
    "Toggle a source's selection state (add if absent, remove if present)."
def _handle_selection_add(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    record_id: str,  # Job ID to add
    provider_id: str,  # Plugin name for the source
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component with OOB stats and visible checkbox OOBs
    "Add a source to the selection queue."
def _handle_selection_remove(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    key: str,  # Item key (record_id) to remove
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component with OOB stats and visible checkbox OOBs
    "Remove a source from the selection queue by key."
async def _handle_selection_reorder(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    urls: SelectionUrls,  # URL bundle for rendering
):  # Updated queue component
    "Reorder items in the selection queue based on SortableJS result."
def _handle_selection_clear(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component with OOB stats, optionally with OOB source list
    "Clear all items from the selection queue."
def _handle_selection_select_all(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    group_key: str,  # Group key to select all transcriptions for
    grouping_mode: str,  # Current grouping mode: "media_path" or "batch_id"
    urls: SelectionUrls,  # URL bundle for rendering
):  # Queue component with OOB stats, optionally with OOB source list
    "Select all transcriptions for a given group, skipping duplicate audio sources."
def _handle_selection_preview(
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    record_id: str,  # Job ID to preview
    provider_id: str,  # Plugin name for the source
):  # Full preview panel component (collapsible, open with content)
    "Get preview panel for a selected source."
def init_queue_router(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    prefix: str,  # Route prefix (e.g., "/workflow/selection/queue")
    urls: SelectionUrls,  # URL bundle for rendering (populated after all routers created)
) -> Tuple[APIRouter, Dict[str, Callable]]:  # (router, route_dict)
    "Initialize queue management routes."

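The toggle handler adds a source if it is absent and removes it if present. A minimal sketch of that core list logic, assuming a source is identified by its (record_id, provider_id) pair and leaving out the duplicate-audio check, state persistence, and rendering that the real handlers perform:

```python
from typing import Dict, List


def toggle_selection(
    selected: List[Dict[str, str]],  # Current selections in queue order
    record_id: str,  # Record to toggle
    provider_id: str,  # Provider that owns the record
) -> List[Dict[str, str]]:  # New selections list; the input is not mutated
    "Remove the source if it is selected, otherwise append it to the end."
    key = (record_id, provider_id)
    remaining = [
        s for s in selected if (s["record_id"], s["provider_id"]) != key
    ]
    if len(remaining) < len(selected):
        return remaining  # Was present: removed
    return selected + [{"record_id": record_id, "provider_id": provider_id}]


selected = [{"record_id": "job-1", "provider_id": "provider-a"}]
after_add = toggle_selection(selected, "job-2", "provider-a")
after_remove = toggle_selection(after_add, "job-1", "provider-a")
```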
selection_queue (selection_queue.ipynb)

Selection queue component with drag-drop reordering

Import

from cjm_transcript_source_select.components.selection_queue import (
    SD_QUEUE_PREFIX,
    SD_QUEUE_CONFIG,
    SD_QUEUE_IDS
)

Functions

def _render_queue_content(
    item: dict,  # Source dict with record_id and provider_id
    index: int,  # 0-based position in queue
) -> Any:  # Custom content for the queue item
    "Render the job ID display as custom content for each queue item."
def _render_queue_empty() -> Any:  # Empty state element
    "Render the custom empty state for the source selection queue."
def _render_selection_queue(
    selected_sources: List[Dict[str, str]],  # List of selected sources in order
    remove_url: str,  # URL for removing from queue
    reorder_url: str,  # URL for reordering queue
    clear_url: str,  # URL for clearing all
) -> Any:  # Queue panel component
    "Render the selection queue panel via cjm-fasthtml-sortable-queue."

Variables

SD_QUEUE_PREFIX = 'sd'
SD_QUEUE_CONFIG
SD_QUEUE_IDS

source (source.ipynb)

Source service for federated transcription queries via DuckDB

Import

from cjm_transcript_source_select.services.source import (
    VALID_DB_EXTENSIONS,
    TranscriptionDBProvider,
    SourceService,
    validate_and_toggle_external_db
)

Functions

def validate_and_toggle_external_db(
    source_service: SourceService,  # Source service for duplicate detection
    path: str,  # Path to the .db file
    external_paths: List[str],  # Current external database paths
    valid_extensions: List[str] = None,  # Valid file extensions (default: VALID_DB_EXTENSIONS)
) -> Tuple[List[str], Optional[str]]:  # (updated_paths, error_message or None)
    "Validate and toggle an external database path in the external paths list."

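The (updated_paths, error_message) return shape of validate_and_toggle_external_db can be illustrated with a simplified stand-in. This sketch is an assumption about the general flow, not the package's code: it skips the duplicate detection done through SourceService, and ASSUMED_EXTENSIONS is a placeholder because the contents of VALID_DB_EXTENSIONS are not shown here.

```python
from pathlib import Path
from typing import List, Optional, Tuple

ASSUMED_EXTENSIONS = [".db"]  # Placeholder; the real VALID_DB_EXTENSIONS is not shown


def toggle_external_db(
    path: str,  # Path to the database file
    external_paths: List[str],  # Current external database paths
    valid_extensions: Optional[List[str]] = None,  # Allowed suffixes
) -> Tuple[List[str], Optional[str]]:  # (updated_paths, error_message or None)
    "Remove the path if already listed; otherwise validate it and add it."
    extensions = valid_extensions or ASSUMED_EXTENSIONS
    if path in external_paths:
        # Toggle off: removal needs no validation
        return [p for p in external_paths if p != path], None
    candidate = Path(path)
    if candidate.suffix.lower() not in extensions:
        return list(external_paths), f"Unsupported file type: {candidate.suffix or '(none)'}"
    if not candidate.is_file():
        return list(external_paths), f"File not found: {path}"
    return external_paths + [path], None


paths, error = toggle_external_db("notes.txt", [])
```

Returning the error alongside the unchanged list lets the caller re-render the existing sources and show an alert in the same response.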
Classes

class TranscriptionDBProvider:
    def __init__(
        self,
        db_path: str,  # Path to SQLite database file
        name: str,  # Display name for this provider
        provider_id: Optional[str] = None  # Unique ID (defaults to db_path)
    )
    "SourceProvider for transcription SQLite databases."
    
    def __init__(
            self,
            db_path: str,  # Path to SQLite database file
            name: str,  # Display name for this provider
            provider_id: Optional[str] = None  # Unique ID (defaults to db_path)
        )
        "Initialize provider for a transcription database."
    
    def provider_id(self) -> str:  # Unique identifier
        "Unique identifier for this provider instance."
    
    @property
    def provider_name(self) -> str:  # Display name
        "Human-readable name for display."
    
    @property
    def provider_type(self) -> str:  # Provider category
        "Provider type category."
    
    @property
    def db_path(self) -> Path:  # Database file path
        "Path to the underlying database file."
    
    def is_available(self) -> bool:  # Whether database exists and is accessible
        "Check if the database file exists and is accessible."
    
    def validate_schema(self) -> Tuple[bool, str]:  # (is_valid, error_message)
        "Check if database has valid transcription schema."
    
    def query_records(
            self,
            limit: int = 100  # Maximum records to return
        ) -> List[SourceRecord]:  # List of source records
        "Query transcription records from the database."
    
    def get_source_block(
            self,
            record_id: str  # Job ID to fetch
        ) -> Optional[SourceBlock]:  # SourceBlock or None if not found
        "Fetch a specific transcription as a SourceBlock."
    
    def from_plugin(
            cls,
            meta: PluginMeta  # Plugin metadata with manifest containing db_path
        ) -> Optional["TranscriptionDBProvider"]:  # Provider or None if no valid db_path
        "Create provider from plugin metadata."
    
    def from_external_path(
            cls,
            path: str  # Path to external database file
        ) -> Optional["TranscriptionDBProvider"]:  # Provider or None if path invalid
        "Create provider from an external database path."
class SourceService:
    def __init__(
        self,
        plugin_manager: PluginManager,  # Plugin manager for discovering plugin sources
        source_categories: List[str] = None,  # Plugin categories to query (default: ['transcription'])
        external_paths: List[str] = None  # External database paths
    )
    "Service for federated access to content sources via providers."
    
    def __init__(
            self,
            plugin_manager: PluginManager,  # Plugin manager for discovering plugin sources
            source_categories: List[str] = None,  # Plugin categories to query (default: ['transcription'])
            external_paths: List[str] = None  # External database paths
        )
        "Initialize the source service."
    
    def add_provider(
            self,
            provider: SourceProvider  # Provider instance to add
        ) -> bool:  # True if added, False if ID already exists
        "Add a source provider."
    
    def remove_provider(
            self,
            provider_id: str  # ID of provider to remove
        ) -> bool:  # True if removed, False if not found
        "Remove a source provider by ID."
    
    def get_provider(
            self,
            provider_id: str  # ID of provider to get
        ) -> Optional[SourceProvider]:  # Provider or None if not found
        "Get a provider by ID."
    
    def get_providers(self) -> List[SourceProvider]:  # List of all providers
        "Get all registered providers."
    
    def get_provider_by_name(
            self,
            name: str  # Provider name to search for
        ) -> Optional[SourceProvider]:  # Provider or None if not found
        "Find a provider by its display name."
    
    def has_provider_for_path(
            self,
            path: str  # Path to check
        ) -> Tuple[bool, Optional[str]]:  # (has_duplicate, existing_provider_name)
        "Check if any provider uses the same resolved database path."
    
    def add_plugin_providers(self) -> int:  # Number of providers added
        "Discover and add providers from loaded plugins."
    
    def set_external_paths(
            self,
            paths: List[str]  # List of external database paths to set
        ) -> None
        "Set external database paths (replaces existing external providers)."
    
    def add_external_path(
            self,
            path: str  # External database path to add
        ) -> bool:  # True if added, False if already exists or invalid
        "Add an external database as a provider."
    
    def remove_external_path(
            self,
            path: str  # External database path to remove
        ) -> bool:  # True if removed, False if not found
        "Remove an external database provider."
    
    def get_external_paths(self) -> List[str]:  # List of external database paths
        "Get list of external database paths."
    
    def get_available_sources(self) -> List[Dict[str, Any]]:  # List of source info dicts (plugin providers are loaded first)
        "Get list of available sources (for UI display)."
    
    def query_transcriptions(
            self,
            provider_name: Optional[str] = None,  # Filter by provider name (None for all)
            limit: int = 100  # Maximum number of results per provider
        ) -> List[Dict[str, Any]]:  # List of transcription records
        "Query records from all providers (or a specific one)."
    
    def get_transcription_by_id(
            self,
            record_id: str,  # Record ID to fetch
            provider_id: str  # Provider ID that owns this record
        ) -> Optional[SourceBlock]:  # SourceBlock or None if not found
        "Get a specific transcription as a SourceBlock."
    
    def get_source_blocks(
            self,
            selections: List[Dict[str, str]]  # List of {record_id, provider_id} dicts
        ) -> List[SourceBlock]:  # Ordered list of SourceBlocks
        "Fetch multiple records as SourceBlocks in order."
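
The duplicate check described by has_provider_for_path can be illustrated with a minimal sketch. It assumes providers are tracked as a plain mapping of display name to database path; that mapping and the function name has_provider_for_path_sketch are hypothetical and not part of this package. The point shown is that paths are compared after resolution, so two spellings of the same file count as one.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


def has_provider_for_path_sketch(
    providers: Dict[str, str],  # Hypothetical mapping: provider name -> database path
    path: str,  # Path to check
) -> Tuple[bool, Optional[str]]:  # (has_duplicate, existing_provider_name)
    "Compare fully resolved paths so different spellings of one file match."
    target = Path(path).expanduser().resolve()
    for name, db_path in providers.items():
        if Path(db_path).expanduser().resolve() == target:
            return True, name
    return False, None


# Illustrative data only; these paths do not need to exist.
providers = {"Whisper": "/tmp/data/whisper.db"}
dup = has_provider_for_path_sketch(providers, "/tmp/data/../data/whisper.db")
miss = has_provider_for_path_sketch(providers, "/tmp/data/other.db")
```

Here dup reports the existing provider because both spellings resolve to the same location, while miss reports no match.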

Variables

VALID_DB_EXTENSIONS = [3 items]

source_browser (source_browser.ipynb)

Source browser components for displaying and filtering transcription sources

Import

from cjm_transcript_source_select.components.source_browser import (
    SOURCE_BROWSER_COLUMNS,
    SB_SYSTEM_ID,
    SourceBrowserItem,
    build_source_items,
    is_source_item_skippable,
    create_source_cell_renderer,
    render_source_empty
)

Functions

def _render_grouping_selector(
    grouping_mode: str,  # Current grouping mode: "media_path" or "batch_id"
    grouping_change_url: str,  # URL for changing grouping mode
) -> Any:  # Grouping selector component
    "Render the dropdown for selecting grouping mode."
def build_source_items(
    transcriptions: List[Dict[str, Any]],  # Available transcription records
    selected_sources: List[Dict[str, str]],  # Currently selected sources
    grouping_mode: str = "media_path",  # Grouping mode: "media_path" or "batch_id"
) -> List[SourceBrowserItem]:  # Flat list with interleaved headers and records
    "Build the items list for the source browser virtual collection."
def is_source_item_skippable(
    item: SourceBrowserItem,  # Item to check
) -> bool:  # True if item is a group header (cursor should skip)
    "Predicate for virtual collection is_skippable parameter."
def _render_header_cell(
    item: SourceBrowserItem,  # Header item
    ctx: CellRenderContext,  # Cell render context
    select_all_url: str = "",  # URL for selecting all in group
) -> Any:  # Cell content for a group header row
    "Render cell content for a group header item."
def _render_record_cell(
    item: SourceBrowserItem,  # Record item
    ctx: CellRenderContext,  # Cell render context
    toggle_url: str = "",  # URL for toggling source selection
) -> Any:  # Cell content for a data record row
    "Render cell content for a data record item."
def create_source_cell_renderer(
    toggle_url: str = "",  # URL for toggling source selection
    select_all_url: str = "",  # URL for selecting all in a group
) -> Callable:  # render_cell(item: SourceBrowserItem, ctx: CellRenderContext) -> Any
    "Create a render_cell callback for the source browser virtual collection."
def render_source_empty() -> Any:  # Empty state component
    "Render empty state when no transcription sources are available."
def _render_source_browser_vc_content(
    sb_state: Any,  # SourceBrowserRouterState from routes.source_browser
) -> Any:  # VC content wrapper (without search/grouping header)
    "Render the VC content portion of the source browser."
def _render_source_browser_vc(
    sb_state: Any,  # SourceBrowserRouterState from routes.source_browser
    filter_url: str = "",  # URL for filtering sources
    grouping_mode: str = "media_path",  # Current grouping mode
    grouping_change_url: str = "",  # URL for changing grouping mode
) -> Any:  # Source browser component with virtual collection
    "Render the full source browser panel (header + VC content)."

Classes

@dataclass
class SourceBrowserItem:
    "Item in the source browser virtual collection (header or record)."
    
    item_type: str  # "header" or "record"
    group_key: str = ''  # Group key (media_path or batch_id value)
    group_display: str = ''  # Formatted display text for group header
    group_count: int = 0  # Number of records in this group
    grouping_mode: str = ''  # Grouping mode used ("media_path" or "batch_id")
    record: Optional[Dict[str, Any]] = None  # Original transcription record dict
    is_selected: bool = False  # Whether currently in queue
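
The flat list of interleaved headers and records that build_source_items returns might look like the following sketch. ItemSketch, build_items_sketch, and is_skippable_sketch are hypothetical stand-ins rather than the package's code, only media_path grouping is covered, and the presence of a provider_id key on each transcription record is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ItemSketch:  # Hypothetical, simplified stand-in for SourceBrowserItem
    item_type: str  # "header" or "record"
    group_key: str = ""
    group_count: int = 0
    record: Optional[Dict[str, Any]] = None
    is_selected: bool = False


def build_items_sketch(
    transcriptions: List[Dict[str, Any]],  # Records with record_id, provider_id, media_path
    selected_sources: List[Dict[str, str]],  # List of {record_id, provider_id} dicts
) -> List[ItemSketch]:
    "Group records by media_path, then emit each header followed by its records."
    groups: Dict[str, List[Dict[str, Any]]] = {}
    for rec in transcriptions:
        groups.setdefault(rec.get("media_path", ""), []).append(rec)
    chosen = {(s["record_id"], s["provider_id"]) for s in selected_sources}
    items: List[ItemSketch] = []
    for key, recs in groups.items():  # dicts keep first-seen group order
        items.append(ItemSketch("header", group_key=key, group_count=len(recs)))
        for rec in recs:
            picked = (rec["record_id"], rec["provider_id"]) in chosen
            items.append(ItemSketch("record", group_key=key, record=rec, is_selected=picked))
    return items


def is_skippable_sketch(item: ItemSketch) -> bool:
    "Headers are not selectable, so a keyboard cursor would skip them."
    return item.item_type == "header"


records = [
    {"record_id": "a", "provider_id": "p1", "media_path": "/audio/one.mp3"},
    {"record_id": "b", "provider_id": "p1", "media_path": "/audio/two.mp3"},
    {"record_id": "c", "provider_id": "p2", "media_path": "/audio/one.mp3"},
]
items = build_items_sketch(records, [{"record_id": "c", "provider_id": "p2"}])
kinds = [item.item_type for item in items]
```

Keeping headers and records in one flat list lets a virtual collection treat every row uniformly and rely on a predicate such as is_source_item_skippable to move the cursor past headers.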

Variables

SOURCE_BROWSER_COLUMNS
_SB_CONTENT_ID = 'sb-content'
_SB_VC_WRAPPER_ID = 'sb-vc-wrapper'
SB_SYSTEM_ID = 'sb-collection'

source_browser (source_browser.ipynb)

Source browser virtual collection router for Phase 1 selection

Import

from cjm_transcript_source_select.routes.source_browser import (
    SourceBrowserRouterState,
    init_source_browser_router
)

Functions

def init_source_browser_router(
    source_service: SourceService,  # Source service for querying transcriptions
    urls: SelectionUrls,  # URL bundle (toggle, select_all, filter, grouping_change)
    prefix: str = "/browser",  # Route prefix for VC routes
) -> SourceBrowserRouterState:  # Router state with all VC objects and helpers
    "Initialize the source browser virtual collection router."

Classes

@dataclass
class SourceBrowserRouterState:
    "Return value from init_source_browser_router."
    
    router: APIRouter  # VC routes (nav, focus, activate, sort, viewport)
    urls: VirtualCollectionUrls  # VC URL bundle
    ids: VirtualCollectionHtmlIds  # VC HTML IDs
    btn_ids: VirtualCollectionButtonIds  # VC keyboard button IDs
    config: VirtualCollectionConfig  # VC config
    state: VirtualCollectionState  # VC state (mutable)
    items: List[SourceBrowserItem]  # Shared items list (mutable)
    render_cell: Callable  # Cell render callback
    rebuild_and_render: Callable  # (transcriptions, selected_sources, grouping_mode, content_only) -> Div
    rebuild_items: Callable  # (transcriptions, selected_sources, grouping_mode) -> None
    sync_items_selection: Callable  # (selected_sources) -> None
    get_visible_checkbox_oobs: Callable  # () -> tuple of OOB elements
    get_checkbox_oob_for: Callable  # (record_id, provider_id) -> OOB element or None
    get_vc_row_id_for: Callable  # (record_id, provider_id) -> str or None

source_utils (source_utils.ipynb)

Source record operations for metadata extraction, grouping, and validation

Import

from cjm_transcript_source_select.services.source_utils import (
    extract_batch_id,
    extract_model_name,
    group_transcriptions,
    group_transcriptions_by_audio,
    is_source_selected,
    get_selected_media_paths,
    filter_transcriptions,
    select_all_in_group,
    toggle_source_selection,
    reorder_item,
    reorder_sources,
    calculate_next_tab,
    check_audio_exists,
    validate_browse_path
)

Functions

def extract_batch_id(
    metadata: Any  # Metadata dict or JSON string
) -> str:  # Batch ID or "No Batch ID"
    "Extract batch_id from transcription metadata."
def extract_model_name(
    metadata: Any  # Metadata dict or JSON string
) -> str:  # Formatted model name for display
    "Extract and format model name from transcription metadata."
def group_transcriptions(
    transcriptions: List[Dict[str, Any]],  # List of transcription records
    group_by: str = "media_path"  # Grouping mode: "media_path" or "batch_id"
) -> Dict[str, List[Dict[str, Any]]]:  # Grouped transcriptions
    "Group transcription records by the specified field."
def group_transcriptions_by_audio(
    transcriptions: List[Dict[str, Any]]  # List of transcription records
) -> Dict[str, List[Dict[str, Any]]]:  # Grouped by media_path
    "Group transcription records by their source audio file."
def is_source_selected(
    record_id: str,  # Record ID to check
    provider_id: str,  # Provider ID to check
    selected_sources: List[Dict[str, str]]  # List of selected sources
) -> bool:  # True if source is selected
    "Check if a source is in the selected list by (record_id, provider_id) pair."
def get_selected_media_paths(
    selected_sources: List[Dict[str, str]],  # Current selections (record_id, provider_id)
    all_transcriptions: List[Dict[str, Any]],  # All available transcription records
) -> Set[str]:  # Media paths already represented in selections
    "Get the set of media_paths for currently selected sources."
def filter_transcriptions(
    transcriptions: List[Dict[str, Any]],  # List of transcription records to filter
    search_text: str,  # Search term for case-insensitive substring matching
) -> List[Dict[str, Any]]:  # Filtered transcription records
    "Filter transcriptions by substring match across record_id, media_path, and text fields."
def select_all_in_group(
    transcriptions: List[Dict[str, Any]],  # All transcription records
    group_key: str,  # Group key to match against
    grouping_mode: str,  # Grouping mode: "media_path" or "batch_id"
    selected_sources: List[Dict[str, str]],  # Current selections
    excluded_media_paths: Optional[Set[str]] = None,  # Media paths to skip (already selected)
) -> List[Dict[str, str]]:  # Updated selections with new items appended
    "Add all transcriptions matching a group key to the selection list, skipping duplicates."
def toggle_source_selection(
    record_id: str,  # Record ID to toggle
    provider_id: str,  # Provider ID for the source
    selected_sources: List[Dict[str, str]],  # Current selections
) -> List[Dict[str, str]]:  # Updated selections
    "Toggle a source in or out of the selection list by (record_id, provider_id) pair."
def reorder_item(
    selected_sources: List[Dict[str, str]],  # Current selections
    record_id: str,  # Record ID of item to move
    provider_id: str,  # Provider ID of item to move
    direction: str,  # Direction: "up" or "down"
) -> List[Dict[str, str]]:  # Reordered selections
    "Move an item up or down in the selection list by swapping with its neighbor."
def reorder_sources(
    selected_sources: List[Dict[str, str]],  # Current selections
    new_order_ids: List[str],  # Record IDs in desired order
) -> List[Dict[str, str]]:  # Reordered selections
    "Reorder sources to match the given record ID order."
def calculate_next_tab(
    direction: str,  # Direction: "prev", "next", or a direct tab name
    current_tab: str,  # Currently active tab name
    tabs: List[str],  # Available tab names in order
) -> str:  # New active tab name
    "Calculate the next tab based on direction or direct selection."
def check_audio_exists(
    media_path: str  # Path to audio file
) -> bool:  # True if file exists
    "Check if the audio file exists at the given path."
def validate_browse_path(
    path: str  # Path to validate
) -> str:  # Validated and resolved path, or home directory on error
    "Validate a browse path for security. Returns home directory on invalid input."
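
The selection helpers above operate on a list of {record_id, provider_id} dicts. The following is a minimal sketch of the toggle and neighbor-swap behavior their docstrings describe; toggle_sketch, reorder_item_sketch, and _index_of are hypothetical names, and returning a new list instead of mutating the input is an assumption rather than confirmed package behavior.

```python
from typing import Dict, List

Selection = Dict[str, str]  # {"record_id": ..., "provider_id": ...}


def _index_of(selected: List[Selection], record_id: str, provider_id: str) -> int:
    "Return the position of the (record_id, provider_id) pair, or -1 if absent."
    for i, s in enumerate(selected):
        if s["record_id"] == record_id and s["provider_id"] == provider_id:
            return i
    return -1


def toggle_sketch(record_id: str, provider_id: str, selected: List[Selection]) -> List[Selection]:
    "Remove the pair if it is present, otherwise append it to the end."
    idx = _index_of(selected, record_id, provider_id)
    if idx >= 0:
        return selected[:idx] + selected[idx + 1:]
    return selected + [{"record_id": record_id, "provider_id": provider_id}]


def reorder_item_sketch(
    selected: List[Selection], record_id: str, provider_id: str, direction: str
) -> List[Selection]:
    "Swap the item with its neighbor; leave the order alone at either boundary."
    idx = _index_of(selected, record_id, provider_id)
    swap = idx - 1 if direction == "up" else idx + 1
    if idx < 0 or not 0 <= swap < len(selected):
        return list(selected)
    out = list(selected)
    out[idx], out[swap] = out[swap], out[idx]
    return out


queue = toggle_sketch("a", "p1", [])
queue = toggle_sketch("b", "p1", queue)
moved = reorder_item_sketch(queue, "b", "p1", "up")
trimmed = toggle_sketch("a", "p1", moved)
```

Matching on the (record_id, provider_id) pair rather than record_id alone matters in a federated setup, where two providers could hold records with the same ID.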

step_renderer (step_renderer.ipynb)

Phase 1 step renderer: Source Selection & Ordering with two-column layout and collapsible preview

Import

from cjm_transcript_source_select.components.step_renderer import (
    SD_TAB_PREV_BTN,
    SD_TAB_NEXT_BTN,
    SD_PREVIEW_BTN,
    FB_SYSTEM_ID,
    render_selection_step
)

Functions

def _create_parent_keyboard_manager() -> ZoneManager:  # Parent keyboard manager for hierarchy
    "Create the parent keyboard manager with two ghost zones for column switching."
def _render_selection_keyboard_hints(
    manager: ZoneManager,  # Keyboard zone manager with actions configured
) -> Any:  # Collapsible keyboard hints component
    "Render keyboard shortcut hints in a collapsible container."
def _render_selection_stats(
    selected_sources: List[Dict[str, str]],  # Selected sources
    transcriptions: List[Dict[str, Any]],  # All transcriptions (for word count)
    oob: bool = False,  # Whether to render as OOB swap
) -> Any:  # Stats component
    "Render the selection statistics (word count and source count)."
def _render_selection_footer(
    selected_sources: List[Dict[str, str]],  # Selected sources
    transcriptions: List[Dict[str, Any]],  # All transcriptions (for word count)
) -> Any:  # Footer component
    "Render the footer with statistics and continue button."
def _render_tab_headers(
    active_tab: str,  # Currently active tab ('db' or 'files')
    tab_switch_url: str = "",  # URL for switching tabs via HTMX
    oob: bool = False,  # Whether to render as OOB swap
) -> Any:  # Tab headers container
    "Render the tab header radio inputs."
def _render_source_tabs(
    active_tab: str,  # Currently active tab ('db' or 'files')
    active_content: Any,  # Content for the currently active tab
    tab_switch_url: str = "",  # URL for switching tabs via HTMX
) -> Any:  # Tabs header + separate content container
    "Render source type tabs with a single shared content container."
def _generate_hierarchy_js(
    active_tab: str,  # Active tab: "db" or "files"
) -> Script:  # Script element with hierarchy wiring and activation logic
    "Generate JavaScript for keyboard system hierarchy and child activation."
def render_selection_step(
    sources: List[Dict[str, Any]],  # Available source plugins
    transcriptions: List[Dict[str, Any]],  # Available transcription records
    selected_sources: List[Dict[str, str]],  # Ordered selection
    grouping_mode: str,  # Grouping mode: "media_path" or "batch_id"
    active_tab: str,  # Active tab: "db" or "files"
    urls: SelectionUrls,  # URL bundle for selection routes
    render_local_files_panel: Optional[Callable] = None,  # Render fn for Files tab content
    sb_state: Any = None,  # SourceBrowserRouterState for DB tab VC rendering
) -> Any:  # FastHTML component
    "Render Phase 1: Source Selection & Ordering step with two-column layout."

Variables

SD_TAB_PREV_BTN = 'sd-tab-prev-btn'
SD_TAB_NEXT_BTN = 'sd-tab-next-btn'
SD_PREVIEW_BTN = 'sd-preview-btn'
FB_SYSTEM_ID = 'lfb-collection'
_ZONE_FOCUS_CLASSES
_VIEWPORT_FIT_CONFIG

tabs (tabs.ipynb)

Tab switching route handlers

Import

from cjm_transcript_source_select.routes.tabs import (
    init_tabs_router
)

Functions

def _handle_tab_switch(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    request,  # FastHTML request object
    sess,  # FastHTML session object
    direction: str,  # Direction: "prev", "next", "db", or "files"
    urls: SelectionUrls,  # URL bundle for rendering
    render_local_files_panel: Optional[Callable] = None,  # Render fn for Files tab
    sb_state: Any = None,  # SourceBrowserRouterState for DB tab VC rendering
):  # Tuple of inner content, OOB tab headers, and tab switch script
    "Switch between Plugin DB and Local Files tabs."
def init_tabs_router(
    state_store: WorkflowStateStore,  # The workflow state store
    workflow_id: str,  # The workflow identifier
    source_service: SourceService,  # The source service for queries
    prefix: str,  # Route prefix (e.g., "/workflow/selection/tabs")
    urls: SelectionUrls,  # URL bundle for rendering
    render_local_files_panel: Optional[Callable] = None,  # Render fn for Files tab content
    sb_state: Any = None,  # SourceBrowserRouterState for DB tab VC rendering
) -> Tuple[APIRouter, Dict[str, Callable]]:  # (router, route_dict)
    "Initialize tab switching routes."
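
The direction argument accepted by the tab routes ("prev", "next", or a tab name such as "db" or "files") could be resolved along the lines of this sketch of the calculate_next_tab contract. The function name calculate_next_tab_sketch is hypothetical, and both the wrap-around at the ends and the fallback to the current tab on unknown input are assumptions, not confirmed package behavior.

```python
from typing import List


def calculate_next_tab_sketch(direction: str, current_tab: str, tabs: List[str]) -> str:
    "Resolve a direction or direct tab name to the new active tab."
    if direction in tabs:  # Direct selection, e.g. "files"
        return direction
    if current_tab not in tabs or direction not in ("prev", "next"):
        return current_tab  # Assumed fallback: ignore unknown input
    step = 1 if direction == "next" else -1
    # Assumed behavior: wrap around at either end of the tab list
    return tabs[(tabs.index(current_tab) + step) % len(tabs)]


tabs = ["db", "files"]
after_next = calculate_next_tab_sketch("next", "db", tabs)
after_wrap = calculate_next_tab_sketch("next", "files", tabs)
direct = calculate_next_tab_sketch("files", "db", tabs)
```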

utils (utils.ipynb)

Display formatting and word counting utilities for the selection step

Import

from cjm_transcript_source_select.utils import (
    count_words,
    format_date,
    format_audio_filename
)

Functions

def count_words(
    text: str  # Text to count words in
) -> int:  # Word count
    "Count the number of whitespace-delimited words in text."
def format_date(
    created_at: str  # ISO date string, Unix timestamp, or similar
) -> str:  # Formatted date for display
    "Format a date string for human-readable display (e.g., 'Jan 20, 2026')."
def format_audio_filename(
    audio_path: str  # Full path to audio file
) -> str:  # Shortened filename for display
    "Extract and format the filename from a path."
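
As a rough illustration of what these display helpers do, the sketch below uses only the standard library. The *_sketch names are hypothetical, only ISO date strings are handled (the Unix timestamp case mentioned in the signature is omitted), and returning the input unchanged on a parse failure is an assumption.

```python
from datetime import datetime
from pathlib import PurePosixPath


def count_words_sketch(text: str) -> int:
    "Count whitespace-delimited words; str.split() collapses runs of whitespace."
    return len(text.split())


def format_date_sketch(created_at: str) -> str:
    "Format an ISO date string like 'Jan 20, 2026'; fall back to the raw input."
    try:
        return datetime.fromisoformat(created_at).strftime("%b %d, %Y")
    except ValueError:
        return created_at  # Assumed fallback for unparseable input


def format_audio_filename_sketch(audio_path: str) -> str:
    "Keep only the final path component for display."
    return PurePosixPath(audio_path).name


words = count_words_sketch("  hello   brave new\nworld ")
shown_date = format_date_sketch("2026-01-20T10:30:00")
shown_name = format_audio_filename_sketch("/data/audio/episode_01.mp3")
```

The month abbreviation produced by %b follows the active locale, so the exact text may differ outside an English or default locale.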
