FastHTML source selection component for transcript decomposition workflows, with federated database browsing, drag-drop ordering, and keyboard navigation.
cjm-transcript-source-select
Install
pip install cjm_transcript_source_select
Project Structure
nbs/
├── components/ (6)
│ ├── helpers.ipynb # Shared helper functions for the selection module
│ ├── local_files.ipynb # Local files browser for importing external .db files
│ ├── preview_panel.ipynb # Collapsible preview panel for displaying selected content
│ ├── selection_queue.ipynb # Selection queue component with drag-drop reordering
│ ├── source_browser.ipynb # Source browser components for displaying and filtering transcription sources
│ └── step_renderer.ipynb # Phase 1 step renderer: Source Selection & Ordering with two-column layout and collapsible preview
├── routes/ (7)
│ ├── core.ipynb # Selection step state management helpers
│ ├── filtering.ipynb # Filtering, grouping, and keyboard navigation route handlers
│ ├── init.ipynb # Router assembly for Phase 1 selection routes
│ ├── local_files.ipynb # Local files browser route handlers
│ ├── queue.ipynb # Selection queue route handlers for Phase 1
│ ├── source_browser.ipynb # Source browser virtual collection router for Phase 1 selection
│ └── tabs.ipynb # Tab switching route handlers
├── services/ (2)
│ ├── source.ipynb # Source service for federated transcription queries via DuckDB
│ └── source_utils.ipynb # Source record operations for metadata extraction, grouping, and validation
├── html_ids.ipynb # HTML ID constants for Phase 1: Source Selection & Ordering
├── models.ipynb # Data models and URL bundles for Phase 1: Source Selection & Ordering
└── utils.ipynb # Display formatting and word counting utilities for the selection step
Total: 18 notebooks across 3 directories
Module Dependencies
graph LR
components_helpers[components.helpers<br/>helpers]
components_local_files[components.local_files<br/>local_files]
components_preview_panel[components.preview_panel<br/>preview_panel]
components_selection_queue[components.selection_queue<br/>selection_queue]
components_source_browser[components.source_browser<br/>source_browser]
components_step_renderer[components.step_renderer<br/>step_renderer]
html_ids[html_ids<br/>html_ids]
models[models<br/>models]
routes_core[routes.core<br/>core]
routes_filtering[routes.filtering<br/>filtering]
routes_init[routes.init<br/>init]
routes_local_files[routes.local_files<br/>local_files]
routes_queue[routes.queue<br/>queue]
routes_source_browser[routes.source_browser<br/>source_browser]
routes_tabs[routes.tabs<br/>tabs]
services_source[services.source<br/>source]
services_source_utils[services.source_utils<br/>source_utils]
utils[utils<br/>utils]
components_helpers --> models
components_local_files --> components_helpers
components_local_files --> html_ids
components_preview_panel --> html_ids
components_source_browser --> services_source_utils
components_source_browser --> html_ids
components_source_browser --> utils
components_step_renderer --> components_selection_queue
components_step_renderer --> models
components_step_renderer --> components_preview_panel
components_step_renderer --> utils
components_step_renderer --> components_local_files
components_step_renderer --> html_ids
components_step_renderer --> components_source_browser
routes_core --> html_ids
routes_core --> components_selection_queue
routes_core --> services_source
routes_core --> components_step_renderer
routes_core --> models
routes_filtering --> routes_core
routes_filtering --> services_source_utils
routes_filtering --> services_source
routes_filtering --> models
routes_init --> routes_core
routes_init --> models
routes_init --> routes_queue
routes_init --> routes_filtering
routes_init --> routes_source_browser
routes_init --> routes_local_files
routes_init --> services_source
routes_init --> routes_tabs
routes_local_files --> components_local_files
routes_local_files --> routes_core
routes_local_files --> models
routes_local_files --> services_source
routes_queue --> services_source_utils
routes_queue --> routes_core
routes_queue --> components_preview_panel
routes_queue --> services_source
routes_queue --> models
routes_source_browser --> components_source_browser
routes_source_browser --> services_source_utils
routes_source_browser --> routes_core
routes_source_browser --> components_preview_panel
routes_source_browser --> html_ids
routes_source_browser --> services_source
routes_source_browser --> models
routes_tabs --> services_source
routes_tabs --> routes_core
routes_tabs --> services_source_utils
routes_tabs --> components_step_renderer
routes_tabs --> models
52 cross-module dependencies detected
CLI Reference
No CLI commands found in this project.
Module Overview
Detailed documentation for each module in the project:
core (core.ipynb)
Selection step state management helpers
Import
from cjm_transcript_source_select.routes.core import (
DEBUG_SELECTION_STATE,
WorkflowStateStore
)
Functions
def _get_step_state(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
session_id: str # Session identifier string
) -> Dict[str, Any]: # Step state dictionary
"Get the selection step state from the workflow state store."
def _find_duplicate_media_source(
source_service: SourceService, # Source service for lookups
record_id: str, # Candidate record ID
provider_id: str, # Candidate provider ID
selected_sources: List[Dict[str, str]], # Current selections
) -> Optional[Dict[str, str]]: # Conflicting source dict or None
"Find an already-selected source that shares the same audio file."
def _render_duplicate_flash(
candidate_row_id: str, # DOM element ID of the candidate row
existing_row_id: Optional[str] = None, # DOM element ID of the conflicting row (None if off-screen)
) -> Div: # OOB Div with flash script
"Render a flash animation on one or two rows to indicate duplicate rejection."
def _get_active_source_tab(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
session_id: str # Session identifier string
) -> str: # Active tab: "db" or "files"
"Get the currently active source tab from workflow state."
def _build_queue_response(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for querying transcriptions
session_id: str, # Session identifier string
selected_sources: List[Dict[str, str]], # Current selected sources after mutation
urls: SelectionUrls, # URL bundle for rendering
include_stats: bool = True, # Include OOB stats swap
include_checkbox_oobs: bool = True, # Include OOB checkbox cells for visible rows
) -> Union[Any, Tuple]: # Single component or tuple of components with OOB swaps
"Build the standard response for queue-mutating handlers."
def _update_step_state(
"Update the selection step state in the workflow state store."
Variables
DEBUG_SELECTION_STATE = False
_rebuild_and_render_ref: list
_sync_items_ref: list
_get_checkbox_oobs_ref: list
_get_checkbox_oob_for_ref: list
_get_vc_row_id_for_ref: list
_activate_toggle_ref: list
filtering (filtering.ipynb)
Filtering, grouping, and keyboard navigation route handlers
Import
from cjm_transcript_source_select.routes.filtering import (
init_filtering_router
)
Functions
def _handle_source_filter(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
search: str, # Search term from input
urls: SelectionUrls, # URL bundle for rendering
): # VC content wrapper (direct swap, not OOB)
"Filter transcription sources by search term."
def _handle_grouping_change(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
grouping_mode: str, # New grouping mode: "media_path" or "batch_id"
urls: SelectionUrls, # URL bundle for rendering
): # VC content wrapper (direct swap, not OOB)
"Change the grouping mode and re-render the VC content."
def _handle_selection_toggle_focused(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Job ID from focused row (via hx-include)
provider_id: str, # Plugin name from focused row (via hx-include)
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats, optionally with OOB source list
"Toggle selection of the focused row (keyboard shortcut handler)."
def _handle_keyboard_reorder(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Record ID of item to move
provider_id: str, # Provider ID of item to move
direction: str, # Direction to move: "up" or "down"
urls: SelectionUrls, # URL bundle for rendering
): # Queue component, optionally with OOB source list
"Move an item up or down in the selection queue via keyboard."
def init_filtering_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
prefix: str, # Route prefix (e.g., "/workflow/selection/filtering")
urls: SelectionUrls, # URL bundle for rendering
) -> Tuple[APIRouter, Dict[str, Callable]]: # (router, route_dict)
"Initialize filtering and keyboard navigation routes."
helpers (helpers.ipynb)
Shared helper functions for the selection module
Import
from cjm_transcript_source_select.components.helpers import *
Functions
def _get_selection_state(
ctx: InteractionContext # Interaction context with state
) -> SelectionStepState: # Typed selection step state
"Get the full selection step state from context."
def _get_selected_sources(
ctx: InteractionContext # Interaction context with state
) -> List[SelectedSource]: # List of selected source dicts
"Get the list of selected sources from step state."
def _get_grouping_mode(
ctx: InteractionContext # Interaction context with state
) -> str: # Grouping mode: "media_path" or "batch_id"
"Get the current grouping mode from step state."
html_ids (html_ids.ipynb)
HTML ID constants for Phase 1: Source Selection & Ordering
Import
from cjm_transcript_source_select.html_ids import (
SelectionHtmlIds
)
Classes
class SelectionHtmlIds:
"HTML ID constants for Phase 1: Source Selection & Ordering."
def as_selector(
id_str:str # The HTML ID to convert
) -> str: # CSS selector with # prefix
"Convert an ID to a CSS selector format."
def source_checkbox(
record_id:str, # Record identifier
provider_id:str # Provider identifier
) -> str: # HTML ID for the source checkbox
"Generate HTML ID for a source selection checkbox."
def source_row(
record_id:str, # Record identifier
provider_id:str # Provider identifier
) -> str: # HTML ID for the source row
"Generate HTML ID for a source browser row."
def queue_item(
record_id:str, # Record identifier
provider_id:str # Provider identifier
) -> str: # HTML ID for the queue item
"Generate HTML ID for a queue item."
init (init.ipynb)
Router assembly for Phase 1 selection routes
Import
from cjm_transcript_source_select.routes.init import (
init_selection_routers
)
Functions
def init_selection_routers(
state_store: WorkflowStateStore, # The workflow state store
source_service: SourceService, # The source service for queries
workflow_id: str, # The workflow identifier
prefix: str, # Base prefix for selection routes (e.g., "/workflow/selection")
) -> SelectionResult: # Selection router result with routers, urls, routes, and restore
"Initialize and return all selection routers with URL bundle."
local_files (local_files.ipynb)
Local files browser for importing external .db files
Import
from cjm_transcript_source_select.components.local_files import *
Functions
def _get_external_db_paths(
ctx: InteractionContext # Interaction context with state
) -> List[str]: # List of external database paths
"Get the list of external database paths from step state."
def _get_current_browse_path(
ctx: InteractionContext # Interaction context with state
) -> str: # Current browse path
"Get the current browse path from step state."
def _get_file_browser_state(
step_state: Dict[str, Any], # Selection step state dictionary
default_path: Optional[str] = None # Default path if no state exists
) -> BrowserState: # BrowserState for file browser
"Get or create BrowserState from step state."
def _create_db_browser_config() -> FileBrowserConfig: # Configured FileBrowserConfig for .db file selection
"Create file browser config for .db file selection."
def _render_external_sources_list(
external_paths: List[str], # List of added external database paths
remove_url: str, # URL for removing external source
oob: bool = False, # Whether to render as OOB swap
) -> Any: # External sources section component (always rendered for OOB targeting)
"Render the list of added external database sources with scrollable paths."
def _render_error_alert(
error_message: Optional[str] = None, # Error message to display (None = clear)
oob: bool = False, # Whether to render as OOB swap
) -> Any: # Error alert container (always present for OOB targeting)
"Render the error alert container for the local files browser."
def _render_local_files_browser(
render_fn: Optional[Callable] = None, # FileBrowserRouters.render callable
external_paths: Optional[List[str]] = None, # List of added external database paths
remove_url: str = "", # URL for removing external source
error_message: Optional[str] = None, # Error message to display
) -> Any: # Local files browser component
"Render the local files browser for adding external .db files."
local_files (local_files.ipynb)
Local files browser route handlers
Import
from cjm_transcript_source_select.routes.local_files import (
init_local_files_router
)
Functions
def _get_local_files_provider() -> LocalFileSystemProvider:
"Get or create the local files provider singleton."
def _handle_remove_external_source(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for external db ops
sess, # FastHTML session object
db_path: str, # Path to the .db file to remove
external_db_paths_ref: List[str], # Shared external paths list (mutated in place)
fb_routers: FileBrowserRouters, # File browser routers (for targeted OOB)
remove_url: str, # URL for remove button in external sources list
urls: SelectionUrls, # Full URL bundle for queue re-rendering
): # Tuple of OOB elements (external sources list + checkbox cells + queue + stats)
"Remove an external database source and clean up orphaned queue items."
def init_local_files_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for external db ops
prefix: str, # Route prefix (e.g., "/workflow/selection/local_files")
urls: SelectionUrls, # URL bundle for rendering
) -> LocalFilesResult: # Router result with routers, routes, render, restore, and reset
"Initialize local files browser routes with new file browser API."
Variables
_local_files_provider: Optional[LocalFileSystemProvider] = None
models (models.ipynb)
Data models and URL bundles for Phase 1: Source Selection & Ordering
Import
from cjm_transcript_source_select.models import (
SelectionStepState,
SelectionUrls,
LocalFilesResult,
SelectionResult
)
Functions
def _no_op_restore(session_id: str) -> None:
"Default no-op for restore_state."
def _no_op_reset() -> None:
"Default no-op for reset_state."
Classes
class SelectionStepState(TypedDict):
"State for Phase 1: Source Selection & Ordering."
@dataclass
class SelectionUrls:
"URL bundle for Phase 1 selection route handlers and renderers."
add: str = '' # Add source to queue
remove: str = '' # Remove source from queue
toggle: str = '' # Toggle source selection (add/remove based on current state)
reorder: str = '' # Reorder queue items
clear: str = '' # Clear all from queue
select_all: str = '' # Select all in a group
preview: str = '' # Preview source content
toggle_focused: str = '' # Toggle focused row selection
keyboard_reorder: str = '' # Keyboard reorder (Shift+Up/Down)
filter: str = '' # Filter source list
grouping_change: str = '' # Change grouping mode
browse_directory: str = '' # Browse directory
add_external: str = '' # Add external .db source
remove_external: str = '' # Remove external .db source
tab_switch: str = '' # Switch source tabs
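Because every field defaults to an empty string, a bundle like this can be created first and filled in once the routers exist (the `init_queue_router` signature notes that `urls` is "populated after all routers created"). The sketch below illustrates that pattern with a trimmed stand-in class; `ExampleUrls`, `make_remove_link`, and the route paths are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ExampleUrls:
    """Trimmed stand-in for SelectionUrls with three of its fields."""
    add: str = ""
    remove: str = ""
    reorder: str = ""

def make_remove_link(urls: ExampleUrls):
    """Return a callable that reads the bundle lazily, at render time."""
    return lambda record_id: f"{urls.remove}?key={record_id}"

urls = ExampleUrls()                  # Created empty
render_link = make_remove_link(urls)  # Handed to a renderer before routes exist

# Hypothetical route paths, filled in once the routers have been created
prefix = "/workflow/selection/queue"
urls.add = f"{prefix}/add"
urls.remove = f"{prefix}/remove"
urls.reorder = f"{prefix}/reorder"

link = render_link("job-1")
```

Since the renderer holds a reference to the same mutable object, it sees the URLs that were assigned after it was created.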
@dataclass
class LocalFilesResult:
"Return type from init_local_files_router."
routers: List[APIRouter] # Routers to register (custom + file browser + VC)
routes: Dict[str, Callable] # Named route handlers
render_panel: Callable # (error_message?, session_id?) -> rendered panel
restore_state: Callable = field(...) # (session_id) -> None, restore persisted state
reset_state: Callable = field(...) # () -> None, reset in-memory caches
@dataclass
class SelectionResult:
"Return type from init_selection_routers."
routers: List[APIRouter] # All selection routers to register
urls: 'SelectionUrls' = field(...) # URL bundle
routes: Dict[str, Callable] = field(...) # All named route handlers
render_local_files_panel: Optional[Callable] # Render fn for local files tab
sb_state: Any # SourceBrowserRouterState
restore_state: Callable = field(...) # (session_id) -> None, restore persisted state
reset_state: Callable = field(...) # () -> None, reset in-memory caches
preview_panel (preview_panel.ipynb)
Collapsible preview panel for displaying selected content
Import
from cjm_transcript_source_select.components.preview_panel import *
Functions
def _render_preview_panel(
preview_record_id: Optional[str] = None, # Job ID being previewed
preview_text: Optional[str] = None, # Text content to preview
is_open: bool = False, # Whether the collapse should be open
) -> Any: # Preview panel component (collapsible, full-width)
"Render the collapsible preview panel for displaying selected content."
queue (queue.ipynb)
Selection queue route handlers for Phase 1
Import
from cjm_transcript_source_select.routes.queue import (
init_queue_router
)
Functions
def _handle_selection_toggle(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Job ID to toggle
provider_id: str, # Plugin name for the source
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats (no checkbox OOBs -- checkbox already correct)
"Toggle a source's selection state (add if absent, remove if present)."
def _handle_selection_add(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
record_id: str, # Job ID to add
provider_id: str, # Plugin name for the source
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats and visible checkbox OOBs
"Add a source to the selection queue."
def _handle_selection_remove(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
key: str, # Item key (record_id) to remove
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats and visible checkbox OOBs
"Remove a source from the selection queue by key."
async def _handle_selection_reorder(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
urls: SelectionUrls, # URL bundle for rendering
): # Updated queue component
"Reorder items in the selection queue based on SortableJS result."
def _handle_selection_clear(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats, optionally with OOB source list
"Clear all items from the selection queue."
def _handle_selection_select_all(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
group_key: str, # Group key to select all transcriptions for
grouping_mode: str, # Current grouping mode: "media_path" or "batch_id"
urls: SelectionUrls, # URL bundle for rendering
): # Queue component with OOB stats, optionally with OOB source list
"Select all transcriptions for a given group, skipping duplicate audio sources."
def _handle_selection_preview(
source_service: SourceService, # The source service for queries
request, # FastHTML request object
record_id: str, # Job ID to preview
provider_id: str, # Plugin name for the source
): # Full preview panel component (collapsible, open with content)
"Get preview panel for a selected source."
def init_queue_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
prefix: str, # Route prefix (e.g., "/workflow/selection/queue")
urls: SelectionUrls, # URL bundle for rendering (populated after all routers created)
) -> Tuple[APIRouter, Dict[str, Callable]]: # (router, route_dict)
"Initialize queue management routes."
selection_queue (selection_queue.ipynb)
Selection queue component with drag-drop reordering
Import
from cjm_transcript_source_select.components.selection_queue import (
SD_QUEUE_PREFIX,
SD_QUEUE_CONFIG,
SD_QUEUE_IDS
)
Functions
def _render_queue_content(
item: dict, # Source dict with record_id and provider_id
index: int, # 0-based position in queue
) -> Any: # Custom content for the queue item
"Render the job ID display as custom content for each queue item."
def _render_queue_empty() -> Any: # Empty state element
"Render the custom empty state for the source selection queue."
def _render_selection_queue(
selected_sources: List[Dict[str, str]], # List of selected sources in order
remove_url: str, # URL for removing from queue
reorder_url: str, # URL for reordering queue
clear_url: str, # URL for clearing all
) -> Any: # Queue panel component
"Render the selection queue panel via cjm-fasthtml-sortable-queue."
Variables
SD_QUEUE_PREFIX = 'sd'
SD_QUEUE_CONFIG
SD_QUEUE_IDS
source (source.ipynb)
Source service for federated transcription queries via DuckDB
Import
from cjm_transcript_source_select.services.source import (
VALID_DB_EXTENSIONS,
TranscriptionDBProvider,
SourceService,
validate_and_toggle_external_db
)
Functions
def validate_and_toggle_external_db(
source_service: SourceService, # Source service for duplicate detection
path: str, # Path to the .db file
external_paths: List[str], # Current external database paths
valid_extensions: List[str] = None, # Valid file extensions (default: VALID_DB_EXTENSIONS)
) -> Tuple[List[str], Optional[str]]: # (updated_paths, error_message or None)
"Validate and toggle an external database path in the external paths list."
Classes
class TranscriptionDBProvider:
def __init__(
self,
db_path: str, # Path to SQLite database file
name: str, # Display name for this provider
provider_id: Optional[str] = None # Unique ID (defaults to db_path)
)
"SourceProvider for transcription SQLite databases."
def __init__(
self,
db_path: str, # Path to SQLite database file
name: str, # Display name for this provider
provider_id: Optional[str] = None # Unique ID (defaults to db_path)
)
"Initialize provider for a transcription database."
def provider_id(self) -> str: # Unique identifier
"Unique identifier for this provider instance."
@property
def provider_name(self) -> str: # Display name
"Human-readable name for display."
@property
def provider_type(self) -> str: # Provider category
"Provider type category."
@property
def db_path(self) -> Path: # Database file path
"Path to the underlying database file."
def is_available(self) -> bool: # Whether database exists and is accessible
"Check if the database file exists and is accessible."
def validate_schema(self) -> Tuple[bool, str]: # (is_valid, error_message)
"Check if database has valid transcription schema."
def query_records(
self,
limit: int = 100 # Maximum records to return
) -> List[SourceRecord]: # List of source records
"Query transcription records from the database."
def get_source_block(
self,
record_id: str # Job ID to fetch
) -> Optional[SourceBlock]: # SourceBlock or None if not found
"Fetch a specific transcription as a SourceBlock."
def from_plugin(
cls,
meta: PluginMeta # Plugin metadata with manifest containing db_path
) -> Optional["TranscriptionDBProvider"]: # Provider or None if no valid db_path
"Create provider from plugin metadata."
def from_external_path(
cls,
path: str # Path to external database file
) -> Optional["TranscriptionDBProvider"]: # Provider or None if path invalid
"Create provider from an external database path."
class SourceService:
def __init__(
self,
plugin_manager: PluginManager, # Plugin manager for discovering plugin sources
source_categories: List[str] = None, # Plugin categories to query (default: ['transcription'])
external_paths: List[str] = None # External database paths
)
"Service for federated access to content sources via providers."
def __init__(
self,
plugin_manager: PluginManager, # Plugin manager for discovering plugin sources
source_categories: List[str] = None, # Plugin categories to query (default: ['transcription'])
external_paths: List[str] = None # External database paths
)
"Initialize the source service."
def add_provider(
self,
provider: SourceProvider # Provider instance to add
) -> bool: # True if added, False if ID already exists
"Add a source provider."
def remove_provider(
self,
provider_id: str # ID of provider to remove
) -> bool: # True if removed, False if not found
"Remove a source provider by ID."
def get_provider(
self,
provider_id: str # ID of provider to get
) -> Optional[SourceProvider]: # Provider or None if not found
"Get a provider by ID."
def get_providers(self) -> List[SourceProvider]: # List of all providers
"Get all registered providers."
def get_provider_by_name(
self,
name: str # Provider name to search for
) -> Optional[SourceProvider]: # Provider or None if not found
"Find a provider by its display name."
def has_provider_for_path(
self,
path: str # Path to check
) -> Tuple[bool, Optional[str]]: # (has_duplicate, existing_provider_name)
"Check if any provider uses the same resolved database path."
def add_plugin_providers(self) -> int: # Number of providers added
"Discover and add providers from loaded plugins."
def set_external_paths(
self,
paths: List[str] # List of external database paths to set
) -> None
"Set external database paths (replaces existing external providers)."
def add_external_path(
self,
path: str # External database path to add
) -> bool: # True if added, False if already exists or invalid
"Add an external database as a provider."
def remove_external_path(
self,
path: str # External database path to remove
) -> bool: # True if removed, False if not found
"Remove an external database provider."
def get_external_paths(self) -> List[str]: # List of external database paths
"Get list of external database paths."
def get_available_sources(self) -> List[Dict[str, Any]]: # List of source info dicts
"Get list of available sources (for UI display)."
def query_transcriptions(
self,
provider_name: Optional[str] = None, # Filter by provider name (None for all)
limit: int = 100 # Maximum number of results per provider
) -> List[Dict[str, Any]]: # List of transcription records
"Query records from all providers (or a specific one)."
def get_transcription_by_id(
self,
record_id: str, # Record ID to fetch
provider_id: str # Provider ID that owns this record
) -> Optional[SourceBlock]: # SourceBlock or None if not found
"Get a specific transcription as a SourceBlock."
def get_source_blocks(
self,
selections: List[Dict[str, str]] # List of {record_id, provider_id} dicts
) -> List[SourceBlock]: # Ordered list of SourceBlocks
"Fetch multiple records as SourceBlocks in order."
Variables
VALID_DB_EXTENSIONS = [3 items]
source_browser (source_browser.ipynb)
Source browser components for displaying and filtering transcription sources
Import
from cjm_transcript_source_select.components.source_browser import (
SOURCE_BROWSER_COLUMNS,
SB_SYSTEM_ID,
SourceBrowserItem,
build_source_items,
is_source_item_skippable,
create_source_cell_renderer,
render_source_empty
)
Functions
def _render_grouping_selector(
grouping_mode: str, # Current grouping mode: "media_path" or "batch_id"
grouping_change_url: str, # URL for changing grouping mode
) -> Any: # Grouping selector component
"Render the dropdown for selecting grouping mode."
def build_source_items(
transcriptions: List[Dict[str, Any]], # Available transcription records
selected_sources: List[Dict[str, str]], # Currently selected sources
grouping_mode: str = "media_path", # Grouping mode: "media_path" or "batch_id"
) -> List[SourceBrowserItem]: # Flat list with interleaved headers and records
"Build the items list for the source browser virtual collection."
def is_source_item_skippable(
item: SourceBrowserItem, # Item to check
) -> bool: # True if item is a group header (cursor should skip)
"Predicate for virtual collection is_skippable parameter."
def _render_header_cell(
item: SourceBrowserItem, # Header item
ctx: CellRenderContext, # Cell render context
select_all_url: str = "", # URL for selecting all in group
) -> Any: # Cell content for a group header row
"Render cell content for a group header item."
def _render_record_cell(
item: SourceBrowserItem, # Record item
ctx: CellRenderContext, # Cell render context
toggle_url: str = "", # URL for toggling source selection
) -> Any: # Cell content for a data record row
"Render cell content for a data record item."
def create_source_cell_renderer(
toggle_url: str = "", # URL for toggling source selection
select_all_url: str = "", # URL for selecting all in a group
) -> Callable: # render_cell(item: SourceBrowserItem, ctx: CellRenderContext) -> Any
"Create a render_cell callback for the source browser virtual collection."
def render_source_empty() -> Any: # Empty state component
"Render empty state when no transcription sources are available."
def _render_source_browser_vc_content(
sb_state: Any, # SourceBrowserRouterState from routes.source_browser
) -> Any: # VC content wrapper (without search/grouping header)
"Render the VC content portion of the source browser."
def _render_source_browser_vc(
sb_state: Any, # SourceBrowserRouterState from routes.source_browser
filter_url: str = "", # URL for filtering sources
grouping_mode: str = "media_path", # Current grouping mode
grouping_change_url: str = "", # URL for changing grouping mode
) -> Any: # Source browser component with virtual collection
"Render the full source browser panel (header + VC content)."
Classes
@dataclass
class SourceBrowserItem:
"Item in the source browser virtual collection (header or record)."
item_type: str # "header" or "record"
group_key: str = '' # Group key (media_path or batch_id value)
group_display: str = '' # Formatted display text for group header
group_count: int = 0 # Number of records in this group
grouping_mode: str = '' # Grouping mode used ("media_path" or "batch_id")
record: Optional[Dict[str, Any]] = None # Original transcription record dict
is_selected: bool = False # Whether currently in queue
Variables
SOURCE_BROWSER_COLUMNS
_SB_CONTENT_ID = 'sb-content'
_SB_VC_WRAPPER_ID = 'sb-vc-wrapper'
SB_SYSTEM_ID = 'sb-collection'
source_browser (source_browser.ipynb)
Source browser virtual collection router for Phase 1 selection
Import
from cjm_transcript_source_select.routes.source_browser import (
SourceBrowserRouterState,
init_source_browser_router
)
Functions
def init_source_browser_router(
source_service: SourceService, # Source service for querying transcriptions
urls: SelectionUrls, # URL bundle (toggle, select_all, filter, grouping_change)
prefix: str = "/browser", # Route prefix for VC routes
) -> SourceBrowserRouterState: # Router state with all VC objects and helpers
"Initialize the source browser virtual collection router."
Classes
@dataclass
class SourceBrowserRouterState:
"Return value from init_source_browser_router."
router: APIRouter # VC routes (nav, focus, activate, sort, viewport)
urls: VirtualCollectionUrls # VC URL bundle
ids: VirtualCollectionHtmlIds # VC HTML IDs
btn_ids: VirtualCollectionButtonIds # VC keyboard button IDs
config: VirtualCollectionConfig # VC config
state: VirtualCollectionState # VC state (mutable)
items: List[SourceBrowserItem] # Shared items list (mutable)
render_cell: Callable # Cell render callback
rebuild_and_render: Callable # (transcriptions, selected_sources, grouping_mode, content_only) -> Div
rebuild_items: Callable # (transcriptions, selected_sources, grouping_mode) -> None
sync_items_selection: Callable # (selected_sources) -> None
get_visible_checkbox_oobs: Callable # () -> tuple of OOB elements
get_checkbox_oob_for: Callable # (record_id, provider_id) -> OOB element or None
get_vc_row_id_for: Callable # (record_id, provider_id) -> str or None
source_utils (source_utils.ipynb)
Source record operations for metadata extraction, grouping, and validation
Import
from cjm_transcript_source_select.services.source_utils import (
extract_batch_id,
extract_model_name,
group_transcriptions,
group_transcriptions_by_audio,
is_source_selected,
get_selected_media_paths,
filter_transcriptions,
select_all_in_group,
toggle_source_selection,
reorder_item,
reorder_sources,
calculate_next_tab,
check_audio_exists,
validate_browse_path
)
Functions
def extract_batch_id(
metadata: Any # Metadata dict or JSON string
) -> str: # Batch ID or "No Batch ID"
"Extract batch_id from transcription metadata."
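The signature accepts either a dict or a JSON string and falls back to "No Batch ID". A minimal sketch of that contract follows; `extract_batch_id_sketch` is a hypothetical name, and the assumption that the value sits under a top-level `batch_id` key is not confirmed by this listing.

```python
import json
from typing import Any

FALLBACK = "No Batch ID"


def extract_batch_id_sketch(metadata: Any) -> str:
    """Return the batch ID from a dict or JSON string, else the fallback."""
    if isinstance(metadata, str):
        try:
            metadata = json.loads(metadata)
        except json.JSONDecodeError:
            return FALLBACK
    if isinstance(metadata, dict):
        # Assumption: the ID is stored under a top-level "batch_id" key.
        value = metadata.get("batch_id")
        if value:
            return str(value)
    return FALLBACK
```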
def extract_model_name(
metadata: Any # Metadata dict or JSON string
) -> str: # Formatted model name for display
"Extract and format model name from transcription metadata."
def group_transcriptions(
transcriptions: List[Dict[str, Any]], # List of transcription records
group_by: str = "media_path" # Grouping mode: "media_path" or "batch_id"
) -> Dict[str, List[Dict[str, Any]]]: # Grouped transcriptions
"Group transcription records by the specified field."
def group_transcriptions_by_audio(
transcriptions: List[Dict[str, Any]] # List of transcription records
) -> Dict[str, List[Dict[str, Any]]]: # Grouped by media_path
"Group transcription records by their source audio file."
def is_source_selected(
record_id: str, # Record ID to check
provider_id: str, # Provider ID to check
selected_sources: List[Dict[str, str]] # List of selected sources
) -> bool: # True if source is selected
"Check if a source is in the selected list by (record_id, provider_id) pair."
def get_selected_media_paths(
selected_sources: List[Dict[str, str]], # Current selections (record_id, provider_id)
all_transcriptions: List[Dict[str, Any]], # All available transcription records
) -> Set[str]: # Media paths already represented in selections
"Get the set of media_paths for currently selected sources."
def filter_transcriptions(
transcriptions: List[Dict[str, Any]], # List of transcription records to filter
search_text: str, # Search term for case-insensitive substring matching
) -> List[Dict[str, Any]]: # Filtered transcription records
"Filter transcriptions by substring match across record_id, media_path, and text fields."
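The docstring describes a case-insensitive substring match over three fields. One way this could look is sketched below; `filter_transcriptions_sketch` is a hypothetical name, and returning every record for an empty search term is an assumption.

```python
from typing import Any, Dict, List

SEARCH_FIELDS = ("record_id", "media_path", "text")


def filter_transcriptions_sketch(
    transcriptions: List[Dict[str, Any]],
    search_text: str,
) -> List[Dict[str, Any]]:
    needle = search_text.strip().lower()
    if not needle:
        # Assumption: an empty search term keeps every record.
        return list(transcriptions)
    return [
        rec for rec in transcriptions
        # Missing or None fields are treated as empty strings.
        if any(needle in str(rec.get(field) or "").lower()
               for field in SEARCH_FIELDS)
    ]
```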
def select_all_in_group(
transcriptions: List[Dict[str, Any]], # All transcription records
group_key: str, # Group key to match against
grouping_mode: str, # Grouping mode: "media_path" or "batch_id"
selected_sources: List[Dict[str, str]], # Current selections
excluded_media_paths: Optional[Set[str]] = None, # Media paths to skip (already selected)
) -> List[Dict[str, str]]: # Updated selections with new items appended
"Add all transcriptions matching a group key to the selection list, skipping duplicates."
def toggle_source_selection(
record_id: str, # Record ID to toggle
provider_id: str, # Provider ID of the source
selected_sources: List[Dict[str, str]], # Current selections
) -> List[Dict[str, str]]: # Updated selections
"Toggle a source in or out of the selection list by (record_id, provider_id) pair."
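A minimal sketch of the toggle semantics, keyed on the `(record_id, provider_id)` pair. `toggle_sketch` is a hypothetical name; returning a new list rather than mutating the input, and appending new selections at the end, are assumptions.

```python
from typing import Dict, List


def toggle_sketch(
    record_id: str,
    provider_id: str,
    selected_sources: List[Dict[str, str]],
) -> List[Dict[str, str]]:
    key = (record_id, provider_id)
    remaining = [
        s for s in selected_sources
        if (s["record_id"], s["provider_id"]) != key
    ]
    if len(remaining) < len(selected_sources):
        return remaining  # the pair was selected, so it is removed
    # The pair was not selected, so it is appended (assumption: at the end).
    return remaining + [{"record_id": record_id, "provider_id": provider_id}]
```

Matching on the pair rather than on `record_id` alone keeps records with the same ID from different providers distinct.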
def reorder_item(
selected_sources: List[Dict[str, str]], # Current selections
record_id: str, # Record ID of item to move
provider_id: str, # Provider ID of item to move
direction: str, # Direction: "up" or "down"
) -> List[Dict[str, str]]: # Reordered selections
"Move an item up or down in the selection list by swapping with its neighbor."
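The neighbor swap can be sketched as follows. `reorder_item_sketch` is a hypothetical name; leaving the list unchanged at the boundaries, for an unknown item, or for an unrecognized direction is an assumption.

```python
from typing import Dict, List


def reorder_item_sketch(
    selected_sources: List[Dict[str, str]],
    record_id: str,
    provider_id: str,
    direction: str,
) -> List[Dict[str, str]]:
    result = list(selected_sources)  # work on a copy
    offset = {"up": -1, "down": 1}.get(direction)
    idx = next(
        (i for i, s in enumerate(result)
         if s["record_id"] == record_id and s["provider_id"] == provider_id),
        None,
    )
    if offset is None or idx is None:
        return result
    target = idx + offset
    if 0 <= target < len(result):
        # Swap the item with its neighbor in the requested direction.
        result[idx], result[target] = result[target], result[idx]
    return result
```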
def reorder_sources(
selected_sources: List[Dict[str, str]], # Current selections
new_order_ids: List[str], # Record IDs in desired order
) -> List[Dict[str, str]]: # Reordered selections
"Reorder sources to match the given record ID order."
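A sketch of reordering against a list of IDs, such as the order reported after a drag-drop. `reorder_sources_sketch` is a hypothetical name. It assumes the entries in `new_order_ids` match `record_id` values that are unique within the queue, and that any selection missing from `new_order_ids` is kept at the end; neither is confirmed by this listing.

```python
from typing import Dict, List


def reorder_sources_sketch(
    selected_sources: List[Dict[str, str]],
    new_order_ids: List[str],
) -> List[Dict[str, str]]:
    by_id = {s["record_id"]: s for s in selected_sources}
    # dict.fromkeys drops repeated IDs while preserving their first position.
    wanted = list(dict.fromkeys(new_order_ids))
    ordered = [by_id[rid] for rid in wanted if rid in by_id]
    # Assumption: selections not mentioned in new_order_ids keep their
    # relative order and are appended after the explicitly ordered ones.
    mentioned = set(wanted)
    ordered += [s for s in selected_sources if s["record_id"] not in mentioned]
    return ordered
```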
def calculate_next_tab(
direction: str, # Direction: "prev", "next", or a direct tab name
current_tab: str, # Currently active tab name
tabs: List[str], # Available tab names in order
) -> str: # New active tab name
"Calculate the next tab based on direction or direct selection."
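The `direction` argument doubles as either a relative move or a direct tab name. A minimal sketch of that dispatch follows; `calculate_next_tab_sketch` is a hypothetical name, and both the wrap-around at the ends and the fallback to the current tab for unknown input are assumptions.

```python
from typing import List


def calculate_next_tab_sketch(
    direction: str,
    current_tab: str,
    tabs: List[str],
) -> str:
    if direction in tabs:
        return direction  # direct selection by tab name
    if direction not in ("prev", "next") or current_tab not in tabs:
        return current_tab  # assumption: unknown input leaves the tab as-is
    step = -1 if direction == "prev" else 1
    # Assumption: navigation wraps around at either end of the tab list.
    return tabs[(tabs.index(current_tab) + step) % len(tabs)]
```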
def check_audio_exists(
media_path: str # Path to audio file
) -> bool: # True if file exists
"Check if the audio file exists at the given path."
def validate_browse_path(
path: str # Path to validate
) -> str: # Validated and resolved path, or home directory on error
"Validate a browse path for security. Returns home directory on invalid input."
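The listing does not say which checks the function performs. The sketch below shows one plausible reading only: resolve the path, require an existing directory, and fall back to the home directory otherwise. `validate_browse_path_sketch` is a hypothetical name, and the real function may apply different or stricter rules.

```python
from pathlib import Path


def validate_browse_path_sketch(path: str) -> str:
    home = str(Path.home())
    if not path:
        return home  # empty input is treated as invalid
    try:
        # resolve() normalizes ".." segments and follows symlinks.
        resolved = Path(path).expanduser().resolve()
    except (OSError, RuntimeError, ValueError):
        return home
    # Assumption: only existing directories are valid browse targets.
    return str(resolved) if resolved.is_dir() else home
```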
step_renderer (step_renderer.ipynb)
Phase 1 step renderer: Source Selection & Ordering with two-column layout and collapsible preview
Import
from cjm_transcript_source_select.components.step_renderer import (
SD_TAB_PREV_BTN,
SD_TAB_NEXT_BTN,
SD_PREVIEW_BTN,
FB_SYSTEM_ID,
render_selection_step
)
Functions
def _create_parent_keyboard_manager() -> ZoneManager: # Parent keyboard manager for hierarchy
"Create the parent keyboard manager with two ghost zones for column switching."
def _render_selection_stats(
selected_sources: List[Dict[str, str]], # Selected sources
transcriptions: List[Dict[str, Any]], # All transcriptions (for word count)
oob: bool = False, # Whether to render as OOB swap
) -> Any: # Stats component
"Render the selection statistics (word count and source count)."
def _render_selection_footer(
selected_sources: List[Dict[str, str]], # Selected sources
transcriptions: List[Dict[str, Any]], # All transcriptions (for word count)
) -> Any: # Footer component
"Render the footer with statistics and continue button."
def _render_tab_headers(
active_tab: str, # Currently active tab ('db' or 'files')
tab_switch_url: str = "", # URL for switching tabs via HTMX
oob: bool = False, # Whether to render as OOB swap
) -> Any: # Tab headers container
"Render the tab header radio inputs."
def _render_source_tabs(
active_tab: str, # Currently active tab ('db' or 'files')
active_content: Any, # Content for the currently active tab
tab_switch_url: str = "", # URL for switching tabs via HTMX
) -> Any: # Tabs header + separate content container
"Render source type tabs with a single shared content container."
def _generate_hierarchy_js(
active_tab: str, # Active tab: "db" or "files"
) -> Script: # Script element with hierarchy wiring and activation logic
"Generate JavaScript for keyboard system hierarchy and child activation."
def render_selection_step(
sources: List[Dict[str, Any]], # Available source plugins
transcriptions: List[Dict[str, Any]], # Available transcription records
selected_sources: List[Dict[str, str]], # Ordered selection
grouping_mode: str, # Grouping mode: "media_path" or "batch_id"
active_tab: str, # Active tab: "db" or "files"
urls: SelectionUrls, # URL bundle for selection routes
render_local_files_panel: Optional[Callable] = None, # Render fn for Files tab content
sb_state: Any = None, # SourceBrowserRouterState for DB tab VC rendering
) -> Any: # FastHTML component
"Render Phase 1: Source Selection & Ordering step with two-column layout."
Variables
SD_TAB_PREV_BTN = 'sd-tab-prev-btn'
SD_TAB_NEXT_BTN = 'sd-tab-next-btn'
SD_PREVIEW_BTN = 'sd-preview-btn'
FB_SYSTEM_ID = 'lfb-collection'
_ZONE_FOCUS_CLASSES
_VIEWPORT_FIT_CONFIG
tabs (tabs.ipynb)
Tab switching route handlers
Import
from cjm_transcript_source_select.routes.tabs import (
init_tabs_router
)
Functions
def _handle_tab_switch(
source_service: SourceService, # The source service for queries
request, # FastHTML request object
sess, # FastHTML session object
direction: str, # Direction: "prev", "next", "db", or "files"
urls: SelectionUrls, # URL bundle for rendering
current_tab_ref: List[str], # Mutable ref [current_tab] for closure-based tracking
render_local_files_panel: Optional[Callable] = None, # Render fn for Files tab
sb_state: Any = None, # SourceBrowserRouterState for DB tab VC rendering
state_store: WorkflowStateStore = None, # State store (for reading step state)
workflow_id: str = "", # Workflow ID (for reading step state)
): # Tuple of inner content, OOB tab headers, and tab switch script
"Switch between Plugin DB and Local Files tabs."
def init_tabs_router(
state_store: WorkflowStateStore, # The workflow state store
workflow_id: str, # The workflow identifier
source_service: SourceService, # The source service for queries
prefix: str, # Route prefix (e.g., "/workflow/selection/tabs")
urls: SelectionUrls, # URL bundle for rendering
render_local_files_panel: Optional[Callable] = None, # Render fn for Files tab content
sb_state: Any = None, # SourceBrowserRouterState for DB tab VC rendering
) -> Tuple[APIRouter, Dict[str, Callable]]: # (router, route_dict)
"Initialize tab switching routes."
utils (utils.ipynb)
Display formatting and word counting utilities for the selection step
Import
from cjm_transcript_source_select.utils import (
count_words,
format_date,
format_audio_filename
)
Functions
def count_words(
text: str # Text to count words in
) -> int: # Word count
"Count the number of whitespace-delimited words in text."
def format_date(
created_at: str # ISO date string, Unix timestamp, or similar
) -> str: # Formatted date for display
"Format a date string for human-readable display (e.g., 'Jan 20, 2026')."
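A sketch of how mixed inputs might be normalized to the 'Jan 20, 2026' style. `format_date_sketch` is a hypothetical name; interpreting numeric strings as Unix timestamps in UTC, omitting a leading zero on the day, and returning the raw input when parsing fails are all assumptions.

```python
from datetime import datetime, timezone


def format_date_sketch(created_at: str) -> str:
    try:
        # Numeric strings are treated as Unix timestamps (assumption: UTC).
        dt = datetime.fromtimestamp(float(created_at), tz=timezone.utc)
    except (TypeError, ValueError, OverflowError, OSError):
        try:
            dt = datetime.fromisoformat(str(created_at))
        except ValueError:
            # Assumption: unparseable input is shown unchanged.
            return str(created_at)
    # %b yields the abbreviated month name in the current locale.
    return f"{dt:%b} {dt.day}, {dt.year}"
```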
def format_audio_filename(
audio_path: str # Full path to audio file
) -> str: # Shortened filename for display
"Extract and format the filename from a path."
File details
Details for the file cjm_transcript_source_select-0.0.18.tar.gz.
File metadata
- Download URL: cjm_transcript_source_select-0.0.18.tar.gz
- Upload date:
- Size: 73.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2b7adc3c7bfd17680590f73737f49a1e2e06f65a7e8eee019ea4fcea4970ecb9 |
| MD5 | 353d180e4a3579ec198f6941dda50fce |
| BLAKE2b-256 | 3f46435bfe462a439d90e3c41669e295c7982eb82df22283c6f5aa51c28a736e |
File details
Details for the file cjm_transcript_source_select-0.0.18-py3-none-any.whl.
File metadata
- Download URL: cjm_transcript_source_select-0.0.18-py3-none-any.whl
- Upload date:
- Size: 67.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6453892b77911c82fbcc939be73f52ea9cb50e592ca1042961ad08649f6c815a |
| MD5 | b4cabbd684b49e49d6b76a2ee10a0be1 |
| BLAKE2b-256 | fb8906e40507565490b356f57f5a497d7fc823af7ac9741d393a15e46810a4df |