Google Gemini API plugin for the cjm-transcription-plugin-system library - provides speech-to-text transcription with configurable model selection and parameter control.

cjm-transcription-plugin-gemini

Install

pip install cjm_transcription_plugin_gemini

Project Structure

nbs/
└── plugin.ipynb # Plugin implementation for Google Gemini API transcription

Total: 1 notebook across 1 directory

Module Dependencies

graph LR
    plugin[plugin<br/>Gemini Plugin]

No cross-module dependencies detected.

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

Gemini Plugin (plugin.ipynb)

Plugin implementation for Google Gemini API transcription

Import

from cjm_transcription_plugin_gemini.plugin import (
    GeminiPluginConfig,
    GeminiPlugin
)
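A minimal usage sketch, assuming the lifecycle suggested by the method listings in this section (`initialize`, `is_available`, `execute`, `cleanup`). `run_transcription` and `transcribe_with_gemini` are hypothetical helpers, not part of the package, and the error message is illustrative only.

```python
from pathlib import Path
from typing import Any, Dict, Optional, Union


def run_transcription(
    plugin: Any,                              # Object exposing the GeminiPlugin methods
    audio: Union[str, Path],                  # Path to the audio file to transcribe
    config: Optional[Dict[str, Any]] = None,  # Optional configuration values
    **overrides: Any                          # Per-call overrides forwarded to execute()
) -> Any:                                     # Whatever execute() returns
    """Initialize the plugin, transcribe one file, and always clean up."""
    plugin.initialize(config)
    try:
        if not plugin.is_available():
            raise RuntimeError("Gemini API is not available.")
        return plugin.execute(audio, **overrides)
    finally:
        # Runs on success and on failure, so resources are released either way.
        plugin.cleanup()


def transcribe_with_gemini(audio: Union[str, Path]) -> Any:
    """Hypothetical entry point; needs the package installed and a valid API key."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from cjm_transcription_plugin_gemini.plugin import GeminiPlugin
    return run_transcription(GeminiPlugin(), audio)
```

Keyword arguments such as `temperature=0.0` are passed straight through to `execute`, which is documented as accepting additional arguments that override the configuration.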

Functions

@patch
def _get_api_key(
    self:GeminiPlugin
) -> str:  # The API key string
    "Get API key from config or environment."

@patch
def _refresh_available_models(
    self:GeminiPlugin
) -> List[str]:  # List of available model names
    "Fetch and filter available models from Gemini API."

@patch
def _update_max_tokens_for_model(
    self:GeminiPlugin,
    model_name: str  # Model name to update tokens for
) -> None:
    "Update max_output_tokens config based on the model's token limit."

@patch
def update_config(
    self:GeminiPlugin,
    config: Union[Dict[str, Any], GeminiPluginConfig]  # New configuration values
) -> None:
    "Update plugin configuration, adjusting max_output_tokens if the model changes."

@patch
def _prepare_audio(
    self:GeminiPlugin,
    audio: Union[AudioData, str, Path]  # Audio data object or path to audio file
) -> Tuple[Path, bool]:  # Tuple of (processed audio path, whether temp file was created)
    "Prepare audio file for upload."

@patch
def _upload_audio_file(
    self:GeminiPlugin,
    audio_path: Path  # Path to audio file to upload
) -> Any:  # Uploaded file object
    "Upload audio file to Gemini API."

@patch
def _delete_uploaded_file(
    self:GeminiPlugin,
    file_name: str  # Name of file to delete
) -> None:
    "Delete an uploaded file from Gemini API."

@patch
def cleanup(
    self:GeminiPlugin
) -> None:
    "Clean up resources."

@patch
def get_available_models(
    self:GeminiPlugin
) -> List[str]:  # List of available model names
    "Get list of available audio-capable models."

@patch
def get_model_info(
    self:GeminiPlugin,
    model_name: Optional[str] = None  # Model name to get info for, defaults to current model
) -> Dict[str, Any]:  # Dict with model information
    "Get information about a specific model including token limits."

@patch
def supports_streaming(
    self:GeminiPlugin
) -> bool:  # True if streaming is supported
    "Check if this plugin supports streaming transcription."

@patch
def execute_stream(
    self:GeminiPlugin,
    audio: Union[AudioData, str, Path],  # Audio data object or path to audio file
    **kwargs  # Additional arguments to override config
) -> Generator[str, None, TranscriptionResult]:  # Yields text chunks, returns final result
    "Stream transcription results chunk by chunk."
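
Because `execute_stream` is typed as `Generator[str, None, TranscriptionResult]`, the final result is the generator's return value, which a plain `for` loop discards. The sketch below shows one way to collect both the chunks and that return value. `drain_stream` is a hypothetical helper, not part of the package.

```python
from typing import Any, Callable, Generator, List, Tuple


def drain_stream(
    stream: Generator[str, None, Any],       # Generator such as the one returned by execute_stream()
    on_chunk: Callable[[str], None] = print  # Called with each text chunk as it arrives
) -> Tuple[str, Any]:                        # (concatenated text, final result object)
    """Consume a chunk generator and capture its return value."""
    chunks: List[str] = []
    while True:
        try:
            chunk = next(stream)
        except StopIteration as stop:
            # A generator's `return` value travels on StopIteration.value.
            return "".join(chunks), stop.value
        chunks.append(chunk)
        on_chunk(chunk)
```

With an initialized plugin, this could be called as `text, result = drain_stream(plugin.execute_stream("audio.wav"))`, ideally after checking `plugin.supports_streaming()`.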

Classes

@dataclass
class GeminiPluginConfig:
    "Configuration for Gemini transcription plugin."
    
    model: str = field(...)
    api_key: Optional[str] = field(...)
    prompt: str = field(...)
    temperature: float = field(...)
    top_p: float = field(...)
    max_output_tokens: int = field(...)
    seed: Optional[int] = field(...)
    response_mime_type: str = field(...)
    downsample_audio: bool = field(...)
    downsample_rate: int = field(...)
    downsample_channels: int = field(...)
    safety_settings: str = field(...)
    auto_refresh_models: bool = field(...)
    model_filter: List[str] = field(...)
    use_file_upload: bool = field(...)
    use_streaming: bool = field(...)
    delete_uploaded_files: bool = field(...)

class GeminiPlugin:
    "Google Gemini API transcription plugin."
    
    def __init__(self):
        "Initialize the Gemini plugin with default configuration."
        self.logger = logging.getLogger(f"{__name__}.{type(self).__name__}")
        self.config: GeminiPluginConfig = None
    
    def name(
        self
    ) -> str:  # Plugin name identifier
        "Return the plugin name identifier."
    
    def version(
        self
    ) -> str:  # Plugin version string
        "Return the plugin version string."
    
    def supported_formats(
        self
    ) -> List[str]:  # List of supported audio formats
        "Return list of supported audio file formats."
    
    def get_current_config(
        self
    ) -> GeminiPluginConfig:  # Current configuration dataclass
        "Return current configuration."
    
    def initialize(
        self,
        config: Optional[Any] = None  # Configuration dataclass, dict, or None
    ) -> None:
        "Initialize the plugin with configuration."
    
    def execute(
        self,
        audio: Union[AudioData, str, Path],  # Audio data object or path to audio file
        **kwargs  # Additional arguments to override config
    ) -> TranscriptionResult:  # Transcription result object
        "Transcribe audio using Gemini."
    
    def is_available(
        self
    ) -> bool:  # True if the Gemini API is available
        "Check if Gemini API is available."
Download files

Source Distribution

cjm_transcription_plugin_gemini-0.0.6.tar.gz (17.1 kB)

Uploaded Source

Built Distribution

cjm_transcription_plugin_gemini-0.0.6-py3-none-any.whl (16.1 kB)

Uploaded Python 3

File details

Details for the file cjm_transcription_plugin_gemini-0.0.6.tar.gz.

File hashes

Hashes for cjm_transcription_plugin_gemini-0.0.6.tar.gz

Algorithm    Hash digest
SHA256       ff90a0b8421b9e065f9d0b10443d2594c30b77abb4841fe5304db656c0b60603
MD5          3f3edcd2ebb0971e56a6c38c130adcea
BLAKE2b-256  dc512742498a717bc1576d8e9d5876179d975b7644353de1503193724f9a40be

File details

Details for the file cjm_transcription_plugin_gemini-0.0.6-py3-none-any.whl.

File hashes

Hashes for cjm_transcription_plugin_gemini-0.0.6-py3-none-any.whl

Algorithm    Hash digest
SHA256       ffe8d381ed8c18d9e69597b6bf57f716f5d54e7a12930fdefdb25c938733094b
MD5          dda4cb56d7da909962056e4fe3a146be
BLAKE2b-256  54e04cbb6c286c2b14ba3595058b14792da4432646894016249168420b342e41
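
A downloaded file can be checked against the SHA256 digests listed above using only the standard library. The sketch below is one way to do that. `sha256_matches` is a hypothetical helper, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_matches(
    path: Union[str, Path],    # Path to the downloaded file
    expected_hex: str,         # Hex digest to compare against
    chunk_size: int = 1 << 20  # Bytes read per iteration (1 MiB)
) -> bool:                     # True if the digests are equal
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For example, `sha256_matches("cjm_transcription_plugin_gemini-0.0.6.tar.gz", "ff90a0b8421b9e065f9d0b10443d2594c30b77abb4841fe5304db656c0b60603")` should return `True` for an intact copy of the source distribution.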
