Google Gemini text and vision model wrappers for Autourgos
Project description
Autourgos Google Model Kit
Autourgos is a Python framework that helps developers build AI agents, from simple bots to advanced systems. It provides tools, autonomous tool use, memory, LLM agents, and many more features that make it easy to create intelligent agents.
The Autourgos Google Model Kit is a package that connects this framework to Google's Gemini AI models. It provides simple, ready-to-use wrappers for handling both text and vision (image-based) tasks. Features like automatic retries, input validation, and cost tracking are built-in, making it straightforward to add Google's powerful AI capabilities to your projects in a reliable way.
This package provides two model wrappers for the Google Gemini API:
- GoogleTextModel: text generation
- GoogleVisionModel: image + text prompts with text output
It focuses on clean API usage, validation, retries, and structured response metadata.
Table of Contents
- Why Use This Package
- Installation and API Key Setup
- Text generation (GoogleTextModel)
- Basic usage
- Model Initialization and Configuration
- Base Setup
- Parameter: Model
- Parameter: API Key
- Parameter: Prompt Template
- Parameter: Temperature
- Parameter: Top P
- Parameter: Top K
- Parameter: Max Tokens
- Parameter: Thinking Level
- Parameter: Structured Output
- Parameter: Stream
- Parameter: Retries and Timeouts
- Vision generation (GoogleVisionModel)
- Validation and Errors
- Changelog
- References
- Credits
- Social Media
- Contributing
Why Use This Package
- Typed Model Enums: Safer model selection using built-in enums (GOOGLE_TEXT_MODEL_NAME, etc.).
- Consistent API: One unified class interface (invoke) across both text and vision models.
- Streaming Support: Optional real-time streaming mode for token-by-token generation.
- Structured Output: Access response text alongside token usage and estimated cost metadata.
- Prompt Templates: Reusable templates with strict variable validation.
- Advanced Capabilities: Full support for Gemini 3 thinking levels and vision media resolution tuning.
- Resilience: Built-in retry mechanism with exponential backoff and timeout configurations.
- Flexible Configuration: API key resolution from explicit arguments or standard environment variables.
Installation and API Key Setup
Install the package:
pip install autourgos-google-modelkit
Set the API key as an environment variable (PowerShell):
$env:GOOGLE_API_KEY = "your-api-key"
Set the API key as an environment variable (Bash):
export GOOGLE_API_KEY="your-api-key"
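If you prefer configuring the key from Python (for example in a notebook), you can set the same environment variable in-process before constructing a model. This sketch assumes only that the package reads GOOGLE_API_KEY from the environment, as described above:

```python
import os

# Set GOOGLE_API_KEY for the current process only (placeholder value shown).
# Any model constructed afterwards can resolve the key from the environment.
os.environ["GOOGLE_API_KEY"] = "your-api-key"
```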
Text generation (GoogleTextModel)
Basic usage
from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_MODEL_NAME
llm = GoogleTextModel(
model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO,
)
print(llm.invoke("Explain RAG in simple terms."))
Example response
RAG (Retrieval-Augmented Generation) combines search and generation.
The model first retrieves relevant knowledge, then writes an answer using that context.
This improves factual accuracy and reduces hallucinations.
Model Initialization and Configuration
Base Setup
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel()
Parameter: Model
Model selection can be done using enums or strings. Enums provide better safety and autocomplete, while strings offer flexibility.
Supported models include:
- GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO (string: "gemini-3.1-pro")
- GOOGLE_TEXT_MODEL_NAME.GEMINI_3_FLASH_PREVIEW (string: "gemini-3-flash-preview")
- GOOGLE_TEXT_MODEL_NAME.GEMINI_2_5_FLASH (string: "gemini-2.5-flash")
- GOOGLE_TEXT_MODEL_NAME.GEMINI_2_5_PRO (string: "gemini-2.5-pro")
Using enums (recommended) for better safety and autocomplete:
from autourgos_google_modelkit import GOOGLE_TEXT_MODEL_NAME, GoogleTextModel
llm = GoogleTextModel(
model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO
)
print(llm.invoke("Explain RAG in simple terms."))
Using strings:
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
model="gemini-3.1-pro"
)
print(llm.invoke("Explain Agentic AI in simple terms."))
Parameter: API Key
Explicitly setting the API key:
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
api_key="AIzaSy..."
)
Relying on environment variable (recommended):
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel()
Parameter: Prompt Template
Setting a reusable prompt template with variables:
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
prompt_template="Explain {topic} in simple terms."
)
print(llm.invoke(prompt_variables={"topic": "RAG"}))
Parameter: Temperature
Temperature controls randomness in generation. Google suggests values between 0.0 and 2.0.
- Lower values (e.g., 0.0) produce more deterministic output.
- Moderate values (e.g., 1.0) balance coherence and creativity.
- Higher values (e.g., 2.0) produce more creative and varied output.
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
temperature=0.7
)
Parameter: Top P
Top-p (nucleus sampling) controls diversity by limiting token selection to a cumulative probability threshold. Valid values are between 0.0 and 1.0.
- Lower values (e.g., 0.0) produce more deterministic output.
- Moderate values (e.g., 0.9) allow for more diversity while maintaining coherence.
- Higher values (e.g., 1.0) produce more diverse and creative output.
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
top_p=0.9
)
Parameter: Top K
Top-k sampling limits token selection to the k most likely tokens. Valid values are between 1 and 40.
- Lower values (e.g., 1) produce more deterministic output.
- Moderate values (e.g., 20) allow for more diversity while maintaining coherence.
- Higher values (e.g., 40) produce more diverse and creative output.
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
top_k=40
)
Parameter: Max Tokens
Max tokens sets the maximum output token budget for the response.
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
max_tokens=1024
)
Parameter: Thinking Level
Thinking level controls the depth of reasoning for supported Gemini models. Valid values are:
- GOOGLE_TEXT_THINKING_LEVEL.LOW
- GOOGLE_TEXT_THINKING_LEVEL.MEDIUM
- GOOGLE_TEXT_THINKING_LEVEL.HIGH
from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_THINKING_LEVEL
llm = GoogleTextModel(
thinking_level=GOOGLE_TEXT_THINKING_LEVEL.HIGH
)
Note: Higher thinking levels may improve reasoning quality but can also increase latency and cost.
Note: The thinking_level parameter is only supported by Gemini 3.1 Pro, Gemini 3 Flash Preview, and Gemini 3.1 Flash Lite models. Using it with unsupported models will raise a validation error.
Parameter: Structured Output
Setting structured_output=True returns a dictionary with response text and metadata instead of plain text.
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
structured_output=True
)
result = llm.invoke("Summarize observability in one paragraph.")
print(result)
Example response:
{
"model": "gemini-3-flash-preview",
"response": "Observability is the ability to understand system state from outputs like logs, metrics, and traces.",
"input_tokens": 10,
"output_tokens": 24,
"Total_tokens": 34,
"Cost": "$0.00007700",
"cost_details": {
"value_usd": 0.000077,
"input_rate_per_million": 0.5,
"output_rate_per_million": 3.0
}
}
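The value_usd figure in cost_details is consistent with the token counts and per-million rates shown above; the arithmetic below simply re-derives it for this example (the rates come from the sample response, not from an authoritative price list):

```python
input_tokens, output_tokens = 10, 24
input_rate, output_rate = 0.5, 3.0  # USD per 1M tokens, taken from cost_details above

# cost = tokens / 1_000_000 * rate, summed over input and output
value_usd = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
print(f"${value_usd:.8f}")  # $0.00007700, matching the Cost field
```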
Parameter: Stream
Setting Stream=True enables real-time streaming of generated text chunks. The invoke() method will return an iterator that yields text chunks as they are generated.
from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_MODEL_NAME
llm = GoogleTextModel(
model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO,
Stream=True,
)
stream = llm.invoke("Write a short note on clean architecture.")
for chunk in stream:
print(chunk, end="", flush=True)
print()
Parameter: Retries and Timeouts
The package includes built-in retry logic with exponential backoff for transient errors. You can configure the retry behavior using the following parameters:
- max_retries: Total number of retry attempts (default: 3)
- timeout: Request timeout in seconds (default: 30.0)
- backoff_factor: Multiplier for calculating delay between retries (default: 1.0)
from autourgos_google_modelkit import GoogleTextModel
llm = GoogleTextModel(
max_retries=5,
timeout=60.0,
backoff_factor=1.5
)
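As an illustration of how backoff_factor shapes the delay between retries, here is one common exponential schedule (delay = backoff_factor * 2**attempt). The package's exact formula is not documented here, so treat this purely as a sketch:

```python
def backoff_delay(attempt: int, backoff_factor: float = 1.5) -> float:
    # Hypothetical schedule: the wait doubles on each retry attempt,
    # scaled by backoff_factor (seconds).
    return backoff_factor * (2 ** attempt)

# Delays for the first four attempts with backoff_factor=1.5
print([backoff_delay(a) for a in range(4)])  # [1.5, 3.0, 6.0, 12.0]
```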
Vision Understanding Model (GoogleVisionModel)
Basic usage
from autourgos_google_modelkit import GoogleVisionModel, GOOGLE_VISION_MODEL_NAME
vision = GoogleVisionModel(
model=GOOGLE_VISION_MODEL_NAME.GEMINI_3_FLASH_PREVIEW
)
response = vision.invoke(
prompt="Describe what is visible in this image.",
image="./sample.jpg"
)
print(response)
Example response:
The image contains a laptop on a desk, a coffee mug, and a notebook.
The main background is a white wall with soft daylight.
Streaming mode
from autourgos_google_modelkit import GoogleTextModel, GOOGLE_TEXT_MODEL_NAME
llm = GoogleTextModel(
model=GOOGLE_TEXT_MODEL_NAME.GEMINI_3_1_PRO,
Stream=True,
)
stream = llm.invoke("Write a short note on clean architecture.")
for chunk in stream:
print(chunk, end="", flush=True)
print()
Debugging chunk boundaries (optional):
stream = llm.invoke("Write a short note on clean architecture.")
for chunk in stream:
print(repr(chunk))
Example streamed chunks (illustrative only):
'Clean architecture '
'separates business logic '
'from framework details.'
Note: Exact chunk boundaries are not fixed and can vary with SDK version, model, and network conditions.
Final assembled response:
Clean architecture separates business logic from framework details.
It improves testability, long-term maintainability, and replacement of external dependencies.
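If you need the complete text after streaming, the yielded chunks can simply be concatenated. The chunk list below is illustrative; in practice you would collect the chunks while iterating the object returned by invoke() with Stream=True:

```python
# Illustrative chunks, mirroring the streamed output shown above.
chunks = ["Clean architecture ", "separates business logic ", "from framework details."]

final_text = "".join(chunks)
print(final_text)  # Clean architecture separates business logic from framework details.
```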
Supported Parameters for Vision Model Initialization and Configuration
- model: Model selection using enums or strings.
- api_key: Explicit API key or environment variable resolution.
- prompt_template: Reusable prompt templates with variable validation.
- temperature, top_p, top_k, max_tokens: Sampling parameters for text generation.
- thinking_level: Reasoning depth control for supported Gemini models.
- structured_output: Option to receive response metadata instead of plain text.
- Stream: Enable real-time streaming of generated text chunks.
- media_resolution: Vision input quality hint (enum values: LOW, MEDIUM, HIGH).
- max_retries, timeout, backoff_factor: Retry and timeout configurations for API calls.
Parameter: Media Resolution
The media_resolution parameter allows you to specify the quality of the vision input. Supported enum values are:
- GOOGLE_VISION_MEDIA_RESOLUTION.LOW
- GOOGLE_VISION_MEDIA_RESOLUTION.MEDIUM
- GOOGLE_VISION_MEDIA_RESOLUTION.HIGH
from autourgos_google_modelkit import GoogleVisionModel, GOOGLE_VISION_MODEL_NAME, GOOGLE_VISION_MEDIA_RESOLUTION
vision = GoogleVisionModel(
model=GOOGLE_VISION_MODEL_NAME.GEMINI_3_FLASH_PREVIEW,
media_resolution=GOOGLE_VISION_MEDIA_RESOLUTION.HIGH
)
Note: Higher media resolution may improve model performance on complex images but can also increase latency and cost.
Validation and Errors
The package validates prompt content, sampling parameters, retry settings, and type constraints before making API calls.
Important behavior:
- structured_output=True is only supported when Stream=False
- top_p must be in [0.0, 1.0]
- temperature must be in [0.0, 2.0]
- max_retries must be >= 1
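The rules above can be expressed as plain local checks. This standalone sketch mirrors the documented constraints; it is not the package's actual validation code:

```python
def validate_config(temperature=1.0, top_p=1.0, max_retries=3,
                    structured_output=False, stream=False):
    """Mirror the documented validation rules as local checks (illustrative)."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if max_retries < 1:
        raise ValueError("max_retries must be >= 1")
    if structured_output and stream:
        raise ValueError("structured_output=True is only supported when Stream=False")
    return True
```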
Error hierarchy:
- Text: GoogleTextModelError and specialized subclasses
- Vision: GoogleVisionModelError and specialized subclasses
Changelog
See CHANGELOG.md for full release history.
0.1.2 (2026-04-08)
- Bumped package version to 0.1.2.
- Updated minimum supported Python version to >=3.11.
0.1.1 (2026-03-24)
- Fixed thinking_level compatibility by defaulting it to None for text and vision models.
- Added explicit validation so unsupported models fail fast with a clear local error when thinking_level is set.
- Added regression tests for unsupported thinking-level behavior.
- Stabilized API-key-related tests by isolating environment variables in test cases.
References
Credits
Developed and maintained by DevxJitin
Documented by Sonia
Social Media
Contributing
Contributions are welcome! Please open issues for bugs or feature requests, and submit pull requests for improvements.
File details
Details for the file autourgos_google_modelkit-0.1.2.tar.gz.
File metadata
- Download URL: autourgos_google_modelkit-0.1.2.tar.gz
- Upload date:
- Size: 27.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6bda5d0600fcb8e4cb6305ba3b7c544c5afceae32d8c7bbe4c5ea7ac02f5f97f |
| MD5 | aaee9a7f25dc6c7486edccee2ce7eede |
| BLAKE2b-256 | 7c06bed12c5064132ef141ea40674315fcd8a60115c18323c24d6498f5c5751b |
File details
Details for the file autourgos_google_modelkit-0.1.2-py3-none-any.whl.
File metadata
- Download URL: autourgos_google_modelkit-0.1.2-py3-none-any.whl
- Upload date:
- Size: 26.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 07fa3ca986cc72cb861c4c0e88b7efcab16979698a425a4e854352003fd0e901 |
| MD5 | 223cdd91f33b7df9b7a810d0c62fa80b |
| BLAKE2b-256 | 6b6167caf80a43685708f10ed8370e0d68d5d0c1722bfbe27afa548af418f5ba |