
Google Gemini text and vision model wrappers for Autourgos


Autourgos Google Model Kit


Autourgos is a Python framework that helps developers build AI agents, from simple bots to advanced systems. It provides tools, autonomous tool use, memory, LLM agents, and many more features that make it easy to create intelligent agents.

The Autourgos Google Model Kit is a package that connects this framework to Google's Gemini AI models. It provides simple, ready-to-use wrappers for handling both text and vision (image-based) tasks. Features like automatic retries, input validation, and cost tracking are built-in, making it straightforward to add Google's powerful AI capabilities to your projects in a reliable way.

This package gives you two Model Wrappers for Google Gemini APIs:

  • GoogleTextModel for text generation
  • GoogleVisionModel for image + text prompts with text output

It focuses on clean API usage, validation, retries, and structured response metadata.


Why Use This Package

  • Typed Model Enums: Safer model selection using built-in enums (GOOGLE_TEXT_MODEL_NAME, etc.).
  • Consistent API: One unified class interface (invoke) across both text and vision models.
  • Streaming Support: Optional real-time streaming mode for token-by-token generation.
  • Structured Output: Access response text alongside token usage and estimated cost metadata.
  • Prompt Templates: Reusable templates with strict variable validation.
  • Advanced Capabilities: Full support for Gemini 3 thinking levels and vision media resolution tuning.
  • Resilience: Built-in retry mechanism with exponential backoff and timeout configurations.
  • Flexible Configuration: API key resolution from explicit arguments or standard environment variables.

Installation and API Key Setup

Install the package:

pip install autourgos-google-modelkit

Set the API key as an environment variable. PowerShell:

$env:GOOGLE_API_KEY = "your-api-key"

Bash:

export GOOGLE_API_KEY="your-api-key"

Text generation (GoogleTextModel)

Basic usage

from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_MODEL_NAME

llm = GoogleTextModel(
	model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO,
)

print(llm.invoke("Explain RAG in simple terms."))

Example response

RAG (Retrieval-Augmented Generation) combines search and generation.
The model first retrieves relevant knowledge, then writes an answer using that context.
This improves factual accuracy and reduces hallucinations.

Model Initialization and Configuration

Base Setup

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel()

Parameter: Model

Model selection can be done using enums or strings. Enums provide better safety and autocomplete, while strings offer flexibility.

Supported models include:

  • GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO (string: "gemini-3.1-pro")
  • GOOGLE_TEXT_MODEL_NAME.GEMINI_3_FLASH_PREVIEW (string: "gemini-3-flash-preview")
  • GOOGLE_TEXT_MODEL_NAME.GEMINI_2_5_FLASH (string: "gemini-2.5-flash")
  • GOOGLE_TEXT_MODEL_NAME.GEMINI_2_5_PRO (string: "gemini-2.5-pro")
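Conceptually, each enum member is just a named constant for one of the strings above. A minimal sketch of the idea (the class below is a hypothetical stand-in for illustration, not the package's actual enum):

```python
from enum import Enum

class TextModelName(str, Enum):
    # Hypothetical mirror of GOOGLE_TEXT_MODEL_NAME: each member
    # carries the exact model-name string the API expects.
    GEMINI_3_1_PRO = "gemini-3.1-pro"
    GEMINI_2_5_FLASH = "gemini-2.5-flash"

print(TextModelName.GEMINI_3_1_PRO.value)  # gemini-3.1-pro
```

Because the member's value is the raw string, passing the enum or the string selects the same model; the enum simply adds autocomplete and typo protection.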

Using enums (recommended):

from autourgos_google_modelkit import GOOGLE_TEXT_MODEL_NAME, GoogleTextModel
llm = GoogleTextModel(
  model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO
  )
print(llm.invoke("Explain RAG in simple terms."))

Using strings:

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  model="gemini-3.1-pro"
  )
print(llm.invoke("Explain Agentic AI in simple terms."))

Parameter: API Key

Explicitly setting the API key:

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  api_key="AIzaSy..."
  )

Relying on environment variable (recommended):

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel()
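The documented resolution order (an explicit argument wins, otherwise the GOOGLE_API_KEY environment variable is used) can be sketched as follows. This is an illustrative stand-in, not the package's internal code:

```python
import os

def resolve_api_key(explicit=None):
    # Illustrative resolution order: explicit argument first,
    # then the standard environment variable.
    key = explicit or os.environ.get("GOOGLE_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key= or set GOOGLE_API_KEY")
    return key

os.environ["GOOGLE_API_KEY"] = "env-key"
print(resolve_api_key())                # env-key
print(resolve_api_key("explicit-key"))  # explicit-key
```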

Parameter: Prompt Template

Setting a reusable prompt template with variables

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  prompt_template="Explain {topic} in simple terms."
  )
print(llm.invoke(prompt_variables={"topic": "RAG"}))
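"Strict variable validation" means every placeholder must be supplied and unknown variables are rejected. A minimal sketch of that behavior, using only the standard library (illustrative, not the package's implementation):

```python
import string

def render_template(template, variables):
    # Collect placeholder names, then enforce an exact match with
    # the supplied variables before formatting.
    fields = {f for _, f, _, _ in string.Formatter().parse(template) if f}
    missing = fields - variables.keys()
    extra = variables.keys() - fields
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    return template.format(**variables)

print(render_template("Explain {topic} in simple terms.", {"topic": "RAG"}))
# Explain RAG in simple terms.
```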

Parameter: Temperature

Temperature controls randomness in generation. Google suggests values between 0.0 and 2.0.

  • Lower values (e.g., 0.0) produce more deterministic output.
  • Moderate values (e.g., 1.0) balance coherence and creativity.
  • Higher values (e.g., 2.0) produce more creative and varied output.

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  temperature=0.7
  )

Parameter: Top P

Top-p (nucleus sampling) controls diversity by limiting token selection to a cumulative probability threshold. Valid values are between 0.0 and 1.0.

  • Lower values (e.g., 0.0) produce more deterministic output.
  • Moderate values (e.g., 0.9) allow for more diversity while maintaining coherence.
  • Higher values (e.g., 1.0) produce more diverse and creative output.

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  top_p=0.9
  )

Parameter: Top K

Top-k sampling limits token selection to the k most likely tokens. Valid values are between 1 and 40.

  • Lower values (e.g., 1) produce more deterministic output.
  • Moderate values (e.g., 20) allow for more diversity while maintaining coherence.
  • Higher values (e.g., 40) produce more diverse and creative output.

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  top_k=40
  )

Parameter: Max Tokens

Max tokens sets the maximum output token budget for the response.

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  max_tokens=1024
  )

Parameter: Thinking Level

Thinking level controls the depth of reasoning for supported Gemini models. Valid values are:

  • GOOGLE_TEXT_THINKING_LEVEL.LOW
  • GOOGLE_TEXT_THINKING_LEVEL.MEDIUM
  • GOOGLE_TEXT_THINKING_LEVEL.HIGH

from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_THINKING_LEVEL
llm = GoogleTextModel(
  thinking_level=GOOGLE_TEXT_THINKING_LEVEL.HIGH
  )

Note: Higher thinking levels may improve reasoning quality but can also increase latency and cost.

Note: The thinking_level parameter is only supported by the Gemini 3.1 Pro, Gemini 3 Flash Preview, and Gemini 3.1 Flash Lite models. Using it with an unsupported model raises a validation error.

Parameter: Structured Output

Setting structured_output=True returns a dictionary with response text and metadata instead of plain text.

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  structured_output=True
  )
result = llm.invoke("Summarize observability in one paragraph.")
print(result)

Example response:

{
  "model": "gemini-3-flash-preview",
  "response": "Observability is the ability to understand system state from outputs like logs, metrics, and traces.",
  "input_tokens": 10,
  "output_tokens": 24,
  "Total_tokens": 34,
  "Cost": "$0.00007700",
  "cost_details": {
    "value_usd": 0.000077,
    "input_rate_per_million": 0.5,
    "output_rate_per_million": 3.0
  }
}
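The estimated cost in the metadata follows per-million-token pricing: multiply each token count by its rate and divide by one million. A small check of the arithmetic against the example above (the function name is illustrative):

```python
def estimated_cost_usd(input_tokens, output_tokens,
                       input_rate_per_million, output_rate_per_million):
    # Per-million-token pricing, as shown in the cost_details metadata.
    return (input_tokens * input_rate_per_million
            + output_tokens * output_rate_per_million) / 1_000_000

cost = estimated_cost_usd(10, 24, 0.5, 3.0)
print(f"${cost:.8f}")  # $0.00007700
```

This matches the "Cost" field in the example response, so the metadata can be reproduced (and audited) from the token counts and rates alone.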

Parameter: Stream

Setting Stream=True enables real-time streaming of generated text. invoke() then returns an iterator that yields text chunks as they are generated.

from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_MODEL_NAME
llm = GoogleTextModel(
  model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO,
  Stream=True,
)
stream = llm.invoke("Write a short note on clean architecture.")
for chunk in stream:
  print(chunk, end="", flush=True)
print()

Parameter: Retries and Timeouts

The package includes built-in retry logic with exponential backoff for transient errors. You can configure the retry behavior using the following parameters:

  • max_retries: Total number of retry attempts (default: 3)
  • timeout: Request timeout in seconds (default: 30.0)
  • backoff_factor: Multiplier for calculating delay between retries (default: 1.0)

from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
  max_retries=5,
  timeout=60.0,
  backoff_factor=1.5
)
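With exponential backoff, the wait time typically grows by a power of two per attempt, scaled by backoff_factor. The schedule below assumes the common `backoff_factor * 2**attempt` formula; the package's exact formula may differ:

```python
def backoff_delays(max_retries, backoff_factor):
    # Assumed exponential-backoff schedule, for illustration only:
    # delay doubles on each attempt, scaled by backoff_factor.
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]

print(backoff_delays(5, 1.5))  # [1.5, 3.0, 6.0, 12.0, 24.0]
```

Under this assumption, the configuration above would wait at most 1.5 + 3 + 6 + 12 + 24 = 46.5 seconds across its five retries, each attempt also bounded by the 60-second timeout.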

Vision Understanding Model (GoogleVisionModel)

Basic usage

from autourgos_google_modelkit import GoogleVisionModel, GOOGLE_VISION_MODEL_NAME

vision = GoogleVisionModel(
  model=GOOGLE_VISION_MODEL_NAME.GEMINI_3_FLASH_PREVIEW
  )
response = vision.invoke(
  prompt="Describe what is visible in this image.",
  image="./sample.jpg"
  )
print(response)

Example response:

The image contains a laptop on a desk, a coffee mug, and a notebook.
The main background is a white wall with soft daylight.

Streaming mode

from autourgos_google_modelkit import GoogleVisionModel, GOOGLE_VISION_MODEL_NAME

vision = GoogleVisionModel(
	model=GOOGLE_VISION_MODEL_NAME.GEMINI_3_FLASH_PREVIEW,
	Stream=True,
)

stream = vision.invoke(
	prompt="Describe what is visible in this image.",
	image="./sample.jpg",
)
for chunk in stream:
	print(chunk, end="", flush=True)
print()

Debugging chunk boundaries (optional):

stream = vision.invoke(
	prompt="Describe what is visible in this image.",
	image="./sample.jpg",
)
for chunk in stream:
	print(repr(chunk))

Example streamed chunks (illustrative only):

'The image contains '
'a laptop on a desk, '
'a coffee mug, and a notebook.'

Note: Exact chunk boundaries are not fixed and can vary with the SDK, model, and network conditions.

Final assembled response:

The image contains a laptop on a desk, a coffee mug, and a notebook.
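Because chunk boundaries vary, downstream code should treat each chunk as an opaque fragment and simply concatenate them to recover the full response. A minimal sketch (the chunk strings are made-up stand-ins for whatever the stream actually yields):

```python
# Reassemble a streamed response from its chunks; never assume
# chunks align with words, sentences, or tokens.
chunks = ["Streaming yields ", "partial text ", "chunks."]
response = "".join(chunks)
print(response)  # Streaming yields partial text chunks.
```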

Supported Parameters for Vision Model Initialization and Configuration

  • model: Model selection using enums or strings.
  • api_key: Explicit API key or environment variable resolution.
  • prompt_template: Reusable prompt templates with variable validation.
  • temperature, top_p, top_k, max_tokens: Sampling parameters for text generation.
  • thinking_level: Reasoning depth control for supported Gemini models.
  • structured_output: Option to receive response metadata instead of plain text.
  • Stream: Enable real-time streaming of generated text chunks.
  • media_resolution: Vision input quality hint (enum values: LOW, MEDIUM, HIGH).
  • max_retries, timeout, backoff_factor: Retry and timeout configurations for API calls.

Parameter: Media Resolution

The media_resolution parameter allows you to specify the quality of the vision input. Supported enum values are:

  • GOOGLE_VISION_MEDIA_RESOLUTION.LOW
  • GOOGLE_VISION_MEDIA_RESOLUTION.MEDIUM
  • GOOGLE_VISION_MEDIA_RESOLUTION.HIGH

from autourgos_google_modelkit import GoogleVisionModel, GOOGLE_VISION_MODEL_NAME, GOOGLE_VISION_MEDIA_RESOLUTION
vision = GoogleVisionModel(
  model=GOOGLE_VISION_MODEL_NAME.GEMINI_3_FLASH_PREVIEW,
  media_resolution=GOOGLE_VISION_MEDIA_RESOLUTION.HIGH
)

Note: Higher media resolution may improve model performance on complex images but can also increase latency and cost.

Validation and Errors

The package validates prompt content, sampling parameters, retry settings, and type constraints before making API calls.

Important behavior:

  • structured_output=True is only supported when Stream=False
  • top_p must be in [0.0, 1.0]
  • temperature must be in [0.0, 2.0]
  • max_retries must be >= 1
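The constraints above amount to simple range and compatibility checks performed before any network call. An illustrative sketch of such a validator (not the package's actual code):

```python
def validate_config(temperature, top_p, max_retries, stream, structured_output):
    # Mirrors the documented constraints; raises before any API call.
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if max_retries < 1:
        raise ValueError("max_retries must be >= 1")
    if structured_output and stream:
        raise ValueError("structured_output=True requires Stream=False")

validate_config(0.7, 0.9, 3, False, True)  # passes silently
```

Failing fast locally like this keeps misconfigurations from costing a round trip (or tokens) against the API.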

Error hierarchy:

  • Text: GoogleTextModelError and specialized subclasses
  • Vision: GoogleVisionModelError and specialized subclasses

Changelog

See CHANGELOG.md for full release history.

0.1.2 (2026-04-08)

  • Bumped package version to 0.1.2.
  • Updated minimum supported Python version to >=3.11.

0.1.1 (2026-03-24)

  • Fixed thinking_level compatibility by defaulting it to None for text and vision models.
  • Added explicit validation so unsupported models fail fast with a clear local error when thinking_level is set.
  • Added regression tests for unsupported thinking-level behavior.
  • Stabilized API-key-related tests by isolating environment variables in test cases.


Credits

Developed and maintained by DevxJitin
Documented by Sonia


Contributing

Contributions are welcome! Please open issues for bugs or feature requests, and submit pull requests for improvements.
