
A Google Gemini API plugin for the cjm-transcription-plugin-system library. It provides speech-to-text transcription with configurable model selection and parameter control.


cjm-transcription-plugin-gemini

Install

pip install cjm_transcription_plugin_gemini

Project Structure

nbs/
└── plugin.ipynb # Plugin implementation for Google Gemini API transcription

Total: 1 notebook across 1 directory

Module Dependencies

graph LR
    plugin[plugin<br/>Gemini Plugin]

No cross-module dependencies detected.

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Gemini Plugin (plugin.ipynb)

Plugin implementation for Google Gemini API transcription

Import

from cjm_transcription_plugin_gemini.plugin import (
    GeminiPluginConfig,
    GeminiPlugin
)

Functions

@patch
def _get_api_key(
    self:GeminiPlugin
) -> str:  # The API key string
    "Get API key from config or environment."
@patch
def _refresh_available_models(
    self:GeminiPlugin
) -> List[str]:  # List of available model names
    "Fetch and filter available models from Gemini API."
@patch
def _update_max_tokens_for_model(
    self:GeminiPlugin,
    model_name: str  # Model name to update tokens for
) -> None
    "Update max_output_tokens config based on the model's token limit."
@patch
def update_config(
    self:GeminiPlugin,
    config: Union[Dict[str, Any], GeminiPluginConfig]  # New configuration values
) -> None
    "Update plugin configuration, adjusting max_tokens if model changes."
@patch
def _prepare_audio(
    self:GeminiPlugin,
    audio: Union[AudioData, str, Path]  # Audio data object or path to audio file
) -> Tuple[Path, bool]:  # Tuple of (processed audio path, whether temp file was created)
    "Prepare audio file for upload."
@patch
def _upload_audio_file(
    self:GeminiPlugin,
    audio_path: Path  # Path to audio file to upload
) -> Any:  # Uploaded file object
    "Upload audio file to Gemini API."
@patch
def _delete_uploaded_file(
    self:GeminiPlugin,
    file_name: str  # Name of file to delete
) -> None
    "Delete an uploaded file from Gemini API."
@patch
def cleanup(
    self:GeminiPlugin
) -> None
    "Clean up resources."
@patch
def get_available_models(
    self:GeminiPlugin
) -> List[str]:  # List of available model names
    "Get list of available audio-capable models."
@patch
def get_model_info(
    self:GeminiPlugin,
    model_name: Optional[str] = None  # Model name to get info for, defaults to current model
) -> Dict[str, Any]:  # Dict with model information
    "Get information about a specific model including token limits."
@patch
def supports_streaming(
    self:GeminiPlugin
) -> bool:  # True if streaming is supported
    "Check if this plugin supports streaming transcription."
@patch
def execute_stream(
    self:GeminiPlugin,
    audio: Union[AudioData, str, Path],  # Audio data object or path to audio file
    **kwargs  # Additional arguments to override config
) -> Generator[str, None, TranscriptionResult]:  # Yields text chunks, returns final result
    "Stream transcription results chunk by chunk."

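`execute_stream` is annotated as `Generator[str, None, TranscriptionResult]`, so the text chunks come from iteration while the final result arrives as the generator's return value, which a plain `for` loop discards. The sketch below shows one way to capture both. `fake_stream`, `drain`, and `FinalResult` are hypothetical stand-ins so the example runs without the plugin or an API key; the real `TranscriptionResult` fields are not listed in this document and are not assumed here.

```python
from dataclasses import dataclass
from typing import Generator, List, Tuple


@dataclass
class FinalResult:
    """Hypothetical stand-in for TranscriptionResult (real fields not documented here)."""
    text: str


def fake_stream() -> Generator[str, None, FinalResult]:
    """Stand-in with the same generator shape as GeminiPlugin.execute_stream."""
    parts = ["Hello, ", "world."]
    for part in parts:
        yield part
    return FinalResult(text="".join(parts))


def drain(stream: Generator[str, None, FinalResult]) -> Tuple[List[str], FinalResult]:
    """Collect every yielded chunk and the generator's return value."""
    collected: List[str] = []
    while True:
        try:
            collected.append(next(stream))
        except StopIteration as stop:
            # The value passed to `return` inside the generator lands on stop.value.
            return collected, stop.value


chunks, final = drain(fake_stream())
print(chunks)
print(final.text)
```

With the real plugin, `drain(plugin.execute_stream(path))` should behave the same way as long as `supports_streaming()` returns `True`, though that has not been verified against the library here.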
Classes

@dataclass
class GeminiPluginConfig:
    "Configuration for Gemini transcription plugin."
    
    model: str = field(...)
    api_key: Optional[str] = field(...)
    prompt: str = field(...)
    temperature: float = field(...)
    top_p: float = field(...)
    max_output_tokens: int = field(...)
    seed: Optional[int] = field(...)
    response_mime_type: str = field(...)
    downsample_audio: bool = field(...)
    downsample_rate: int = field(...)
    downsample_channels: int = field(...)
    safety_settings: str = field(...)
    auto_refresh_models: bool = field(...)
    model_filter: List[str] = field(...)
    use_file_upload: bool = field(...)
    use_streaming: bool = field(...)
    delete_uploaded_files: bool = field(...)
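`update_config` accepts either a dict or a `GeminiPluginConfig`. One plausible way to handle both inputs is sketched below. This is an illustration of the pattern, not the plugin's actual logic: `DemoConfig` is a hypothetical stand-in with only three of the fields above, its default values are placeholders, and `merge_config` is an invented helper name.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Dict, Union


@dataclass
class DemoConfig:
    """Hypothetical stand-in for GeminiPluginConfig; defaults are placeholders."""
    model: str = "demo-model"
    temperature: float = 0.0
    use_streaming: bool = False


def merge_config(current: DemoConfig, new: Union[Dict[str, Any], DemoConfig]) -> DemoConfig:
    """Return a new config with `new` applied on top of `current`."""
    if isinstance(new, DemoConfig):
        # A full dataclass replaces the current config wholesale.
        return new
    known = {f.name for f in fields(current)}
    unknown = set(new) - known
    if unknown:
        raise ValueError(f"Unknown config keys: {sorted(unknown)}")
    # Dict keys override individual fields; everything else is kept.
    return replace(current, **new)


cfg = merge_config(DemoConfig(), {"temperature": 0.2})
print(cfg)
```

Rejecting unknown keys is a design choice of this sketch; it surfaces typos such as `temprature` early instead of silently ignoring them.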
class GeminiPlugin:
    "Google Gemini API transcription plugin."
    
    def __init__(self):
        "Initialize the Gemini plugin with default configuration."
        self.logger = logging.getLogger(f"{__name__}.{type(self).__name__}")
        self.config: GeminiPluginConfig = None
    
    def name(
            self
        ) -> str:  # Plugin name identifier
        "Return the plugin name identifier."
    
    def version(
            self
        ) -> str:  # Plugin version string
        "Return the plugin version string."
    
    def supported_formats(
            self
        ) -> List[str]:  # List of supported audio formats
        "Return list of supported audio file formats."
    
    def get_current_config(
            self
        ) -> GeminiPluginConfig:  # Current configuration dataclass
        "Return current configuration."
    
    def get_config_dataclass() -> GeminiPluginConfig: # Configuration dataclass
        "Return dataclass describing the plugin's configuration options."
        return GeminiPluginConfig
    
    def initialize(
            self,
            config: Optional[Any] = None  # Configuration dataclass, dict, or None
        ) -> None
        "Initialize the plugin with configuration."
    
    def execute(
            self,
            audio: Union[AudioData, str, Path],  # Audio data object or path to audio file
            **kwargs  # Additional arguments to override config
        ) -> TranscriptionResult:  # Transcription result object
        "Transcribe audio using Gemini."
    
    def is_available(
            self
        ) -> bool:  # True if the Gemini API is available
        "Check if Gemini API is available."

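Taken together, the methods above suggest a simple lifecycle: initialize, check availability, execute, clean up. The sketch below strings those calls together using only the signatures listed in this document. It has not been run against the library; it assumes that `initialize()` with no arguments falls back to default configuration and that an API key is reachable through the config or the environment. `transcribe_file` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Any, Union


def transcribe_file(audio_path: Union[str, Path]) -> Any:
    """Transcribe one audio file with GeminiPlugin and release resources afterwards."""
    # Imported lazily so this module still loads when the plugin is not installed.
    from cjm_transcription_plugin_gemini.plugin import GeminiPlugin

    plugin = GeminiPlugin()
    plugin.initialize()  # config defaults to None per the signature above
    try:
        if not plugin.is_available():
            raise RuntimeError("Gemini API is not available")
        # Returns a TranscriptionResult; its fields are not listed in this document.
        return plugin.execute(audio_path)
    finally:
        # Runs on success and on failure, so plugin resources are always released.
        plugin.cleanup()
```

Calling `transcribe_file("meeting.wav")` (a placeholder path) would return whatever `execute` produces. Per-call overrides could be passed through `execute`'s `**kwargs`, but the accepted keys are not documented here.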
