Example of a production-ready Google Gemini API wrapper with SRE features

Building an Enterprise-Ready Gemini Client

This repository provides a production-ready Python client for the Google Gemini API and serves as a worked example of building robust, enterprise-grade LLM services. It applies key SRE practices: automatic retries, multi-region failover, a circuit breaker, and integration with Google Cloud's observability suite (Cloud Monitoring and Cloud Logging).

✨ Features

✅ Automatic Retry with Exponential Backoff - Resilient API calls with configurable retry logic
✅ Multi-Region Failover - Automatic region switching on failures for high availability
✅ Circuit Breaker Pattern - Intelligent region health tracking to avoid wasting time/quota on failing regions
✅ Cloud Monitoring Integration - Custom metrics for latency, retry count, success/failure rates
✅ Structured Output - Type-safe responses with Pydantic schema validation
✅ Structured Logging - Integration with Google Cloud Logging
✅ Async Support - Full async/await support with AsyncGeminiSREClient
✅ Production Ready - Comprehensive error handling and observability
✅ File Operations - Upload and manage files with automatic deduplication

📁 Project Structure

gemini-sre-client/
├── gemini_sre/              # Main package
│   ├── client.py            # Synchronous client
│   ├── async_client.py      # Asynchronous client
│   ├── core/                # Core functionality
│   │   ├── circuit_breaker.py
│   │   ├── retry.py
│   │   ├── monitoring.py
│   │   ├── logging.py
│   │   ├── deduplication.py
│   │   └── streaming.py
│   └── proxies/             # SDK proxies
│       ├── models.py
│       ├── chats.py
│       ├── files.py
│       └── async_*.py       # Async versions
├── examples/                # 16 working examples
│   ├── basic/               # 4 basic examples
│   ├── advanced/            # 5 advanced examples
│   ├── async/               # 4 async examples
│   └── production/          # 3 production examples
├── tests/                   # Test suite
│   ├── unit/                # Unit tests
│   └── integration/         # Integration tests
├── docs/                    # Documentation
│   ├── api/                 # API reference
│   ├── architecture/        # Technical docs
│   └── development/         # Dev docs
├── README.md                # This file
├── SETUP.md                 # Setup instructions
├── setup.py                 # Package configuration
└── requirements.txt         # Dependencies

🚀 Quick Start

1. Installation

# Clone the repository
git clone <repository-url>
cd gemini-sre-client

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # macOS/Linux
# .venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

# Configure authentication
gcloud auth application-default login

2. Configuration

Copy the environment template and configure your project:

# Copy template
cp .env.example .env.local

# Edit .env.local and set your project ID.
# Alternatively, write a minimal file in one step (this overwrites the copied template):
echo "GOOGLE_CLOUD_PROJECT=your-project-id" > .env.local

See SETUP.md for detailed setup instructions.

3. Run Your First Example

# Run basic generation example
.venv/bin/python examples/basic/01_simple_generation.py

# Try streaming
.venv/bin/python examples/basic/02_streaming.py

# Explore all examples
ls examples/

📖 Usage

Basic Content Generation

import os
from dotenv import load_dotenv
from gemini_sre import GeminiSREClient

# Load environment variables
load_dotenv('.env.local', override=True)

# Initialize client
client = GeminiSREClient(
    project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
    locations=["us-central1", "europe-west1"],
    enable_monitoring=False,  # Set to True if your account has roles/monitoring.metricWriter
    enable_logging=False,     # Set to True if your account has roles/logging.logWriter
)

# Generate content
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain quantum computing in simple terms",
    request_id="example-001",
)

print(response.text)

Streaming Responses

# Stream content for real-time display
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a story about a robot learning to paint",
    request_id="stream-001",
):
    print(chunk.text, end="", flush=True)

Chat Operations

# Create chat session with context preservation
chat = client.chats.create(
    model="gemini-2.5-flash",
    request_id="chat-001",
)

# Send messages
response1 = chat.send_message("Hello! My name is Alice.")
response2 = chat.send_message("What's my name?")  # The session keeps earlier turns as context
print(response2.text)  # e.g. "Your name is Alice." (exact wording varies)

Structured Output with Pydantic

from pydantic import BaseModel, Field
from typing import List

# Define your schema
class Recipe(BaseModel):
    name: str = Field(description="Recipe name")
    ingredients: List[str] = Field(description="List of ingredients")
    steps: List[str] = Field(description="Cooking steps")
    cooking_time: int = Field(description="Time in minutes")

# Generate structured output
recipe = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Give me a recipe for chocolate chip cookies",
    config={
        "response_mime_type": "application/json",
        "response_schema": Recipe,
    },
    request_id="recipe-001",
)

# Access typed fields
print(recipe.parsed.name)
print(recipe.parsed.ingredients)

Async Operations

import asyncio
import os

from gemini_sre import AsyncGeminiSREClient

async def main():
    # Create async client
    client = AsyncGeminiSREClient(
        project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
        locations=["us-central1"],
    )

    # Async generation
    response = await client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Explain async programming",
        request_id="async-001",
    )

    print(response.text)

asyncio.run(main())

Concurrent Requests (4.47x Faster!)

import asyncio
import os

from gemini_sre import AsyncGeminiSREClient

async def make_requests():
    client = AsyncGeminiSREClient(
        project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
        locations=["us-central1"],
    )

    # Run 5 requests concurrently
    tasks = [
        client.models.generate_content(
            model="gemini-2.5-flash",
            contents=f"What is {topic}?",
            request_id=f"req-{i}",
        )
        for i, topic in enumerate(["Python", "JavaScript", "Go", "Rust", "TypeScript"])
    ]

    # Wait for all to complete
    results = await asyncio.gather(*tasks)
    return results

# Sequential: ~65s, Concurrent: ~15s (4.47x faster!)
asyncio.run(make_requests())
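
To see where the speedup comes from without calling the API, the sketch below swaps the model call for a hypothetical `fake_generate` coroutine that only sleeps. Sequential awaits add their latencies together, while `asyncio.gather` overlaps them, so the total approaches the slowest single request. The function names and the 0.05 s latency are illustrative assumptions and are not part of gemini_sre.

```python
import asyncio
import time

async def fake_generate(topic: str, latency: float = 0.05) -> str:
    """Hypothetical stand-in for an API call: waits, then returns text."""
    await asyncio.sleep(latency)
    return f"answer about {topic}"

async def run_sequential(topics):
    # Each await finishes before the next one starts, so latencies add up.
    return [await fake_generate(t) for t in topics]

async def run_concurrent(topics):
    # gather() schedules all coroutines at once and preserves input order.
    return await asyncio.gather(*(fake_generate(t) for t in topics))

def timed(coro_fn, topics):
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(topics))
    return results, time.perf_counter() - start

TOPICS = ["Python", "JavaScript", "Go", "Rust", "TypeScript"]
seq_results, seq_time = timed(run_sequential, TOPICS)
con_results, con_time = timed(run_concurrent, TOPICS)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

The ratio you observe against the real API depends on per-request latency and on any quota or rate limits, so treat a figure such as 4.47x as one measurement rather than a guarantee.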

📚 Examples

We provide 16 comprehensive examples organized by complexity:

Basic Examples (4)

01 - Simple Generation - Basic content generation
02 - Streaming - Real-time streaming responses
03 - Chat Operations - Stateful conversations
04 - Structured Output - Pydantic schema validation

Advanced Examples (5)

01 - Multi-Region - Multi-region failover
02 - Circuit Breaker - Circuit breaker pattern
03 - Custom Retry - Custom retry strategies
04 - Monitoring - Cloud Monitoring integration
05 - File Operations - File upload/management

Async Examples (4)

01 - Async Basic - Basic async operations
02 - Async Streaming - Async streaming
03 - Concurrent Requests - Parallel requests (4.47x speedup!)
04 - Async Chat - Async chat sessions

Production Examples (3)

01 - Error Handling - Comprehensive error patterns
02 - Logging Setup - Logging configuration
03 - Best Practices - Production deployment guide

See examples/README.md for detailed descriptions.

🔧 Configuration

Environment Variables

The client uses the standard Google Cloud environment variables, consistent with the google-genai SDK:

GOOGLE_CLOUD_PROJECT - Your GCP project ID (required)
GOOGLE_CLOUD_LOCATION - Default region (optional)
GEMINI_ENABLE_MONITORING - Enable metrics (optional)
GEMINI_ENABLE_LOGGING - Enable logging (optional)

Client Configuration

import os

from gemini_sre import GeminiSREClient
from gemini_sre.core import RetryConfig

# Full configuration example
client = GeminiSREClient(
    project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
    locations=["us-central1", "europe-west1", "asia-northeast1"],

    # Monitoring & Logging
    enable_monitoring=True,   # Requires roles/monitoring.metricWriter
    enable_logging=True,      # Requires roles/logging.logWriter

    # Retry Configuration
    retry_config=RetryConfig(
        max_attempts=5,
        initial_delay=1.0,
        max_delay=16.0,
        multiplier=2.0,
    ),

    # Circuit Breaker
    enable_circuit_breaker=True,
    circuit_breaker_config={
        "failure_threshold": 5,
        "success_threshold": 2,
        "timeout": 60,
    },
)
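
To make the retry parameters concrete, here is a minimal sketch of the wait schedule they would produce under plain exponential backoff. `BackoffPolicy` is a hypothetical stand-in that only mirrors the field names used above; it assumes `max_attempts` counts the first try and that no jitter is applied, neither of which is confirmed by this README.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class BackoffPolicy:
    """Hypothetical mirror of the RetryConfig fields shown above."""
    max_attempts: int = 5
    initial_delay: float = 1.0
    max_delay: float = 16.0
    multiplier: float = 2.0

def backoff_delays(policy: BackoffPolicy) -> List[float]:
    """Return the wait before each retry; the first attempt has no wait."""
    delays = []
    delay = policy.initial_delay
    for _ in range(policy.max_attempts - 1):
        delays.append(min(delay, policy.max_delay))  # cap every wait at max_delay
        delay *= policy.multiplier
    return delays

print(backoff_delays(BackoffPolicy()))  # [1.0, 2.0, 4.0, 8.0]
```

With these values the cap of 16 seconds is never reached; it would only apply from a sixth attempt onward.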

📊 Monitoring & Observability

Cloud Monitoring Metrics

The client automatically sends custom metrics to Cloud Monitoring (when enabled):

Metric	Type	Description
`gemini_sre/request/success`	COUNTER	Successful requests
`gemini_sre/request/error`	COUNTER	Failed requests
`gemini_sre/request/latency`	DISTRIBUTION	Request latency (p50, p95, p99)
`gemini_sre/request/retry_count`	GAUGE	Retry attempts per request
`gemini_sre/circuit_breaker/state`	GAUGE	Circuit breaker state

All metrics include labels: location, model, operation_type

Cloud Logging

Structured logs include:

Request ID for correlation
Latency and retry counts
Region information
Error details
Success/failure status
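
For a feel of what such an entry might look like, the sketch below emits one JSON object per log line using only the standard library. The field names (`request_id`, `region`, `latency_ms`, `retry_count`, `status`, `error`) are assumptions chosen to match the list above; they are not taken from the package's actual log schema.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render a log record as a single JSON object."""
    # Assumed field names, passed in through the `extra=` argument.
    FIELDS = ("request_id", "region", "latency_ms", "retry_count", "status", "error")

    def format(self, record: logging.LogRecord) -> str:
        entry = {"severity": record.levelname, "message": record.getMessage()}
        for name in self.FIELDS:
            if hasattr(record, name):  # include only the fields that were supplied
                entry[name] = getattr(record, name)
        return json.dumps(entry)

logger = logging.getLogger("gemini_sre_demo")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # avoid duplicate output through the root logger

logger.info(
    "generate_content finished",
    extra={"request_id": "example-001", "region": "us-central1",
           "latency_ms": 842, "retry_count": 1, "status": "success"},
)
```

Keeping the request ID in every entry is what makes it possible to correlate retries and region switches for a single call.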

View logs in Cloud Console.

🔐 IAM Permissions

Minimum Required:

roles/aiplatform.user - Vertex AI API access

Optional (for full features):

roles/monitoring.metricWriter - Cloud Monitoring metrics
roles/logging.logWriter - Cloud Logging integration

Set up IAM roles:

# Grant minimum role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="user:your-email@example.com" \
    --role="roles/aiplatform.user"

📖 Documentation

Setup Guide - Detailed installation and configuration
API Reference - Complete API documentation
Architecture Docs - Technical design and decisions
Development Guide - For contributors
Examples - 16 working code examples

🧪 Testing

# Run unit tests
pytest tests/unit/ -v

# Run integration tests (requires GCP credentials)
pytest tests/integration/ -v

# Run specific test
pytest tests/unit/test_circuit_breaker.py -v

🏗️ Multi-Region Failover

The client automatically switches regions on failures:

Primary Region - Tries configured primary (e.g., us-central1)
Failover - Switches to next region on error
Circuit Breaker - Opens circuit for failing regions (skips them)
Recovery - Tests region health after timeout
Auto-Close - Closes circuit when region recovers
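
The sequence above can be sketched as a simple loop over the configured regions. This is an illustration of the idea rather than the package's implementation: `call_with_failover`, `AllRegionsFailed`, and `flaky_call` are hypothetical names, and the set of open circuits is passed in by hand instead of being tracked automatically.

```python
from typing import Callable, Dict, Iterable, Sequence, Set

class AllRegionsFailed(Exception):
    """Raised when every candidate region was skipped or returned an error."""

def call_with_failover(
    regions: Sequence[str],
    call: Callable[[str], str],
    open_circuits: Iterable[str] = (),
) -> str:
    skipped: Set[str] = set(open_circuits)
    errors: Dict[str, Exception] = {}
    for region in regions:        # configured order, primary first
        if region in skipped:     # circuit open: do not spend time or quota here
            continue
        try:
            return call(region)
        except Exception as exc:  # a real client would catch narrower error types
            errors[region] = exc  # remember the cause, then try the next region
    raise AllRegionsFailed(f"no region succeeded: {errors}")

def flaky_call(region: str) -> str:
    """Hypothetical backend: us-central1 is down, other regions answer."""
    if region == "us-central1":
        raise ConnectionError("503 from us-central1")
    return f"ok from {region}"

print(call_with_failover(["us-central1", "europe-west1"], flaky_call))  # ok from europe-west1
```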

Circuit Breaker States:

CLOSED (✅) - Normal operation, region healthy
OPEN (🔴) - Region failing, automatically skipped
HALF_OPEN (🟡) - Testing if region recovered
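
A minimal sketch of the three states as a per-region state machine follows. It assumes `failure_threshold` counts consecutive failures, `success_threshold` counts successful probes while HALF_OPEN, and `timeout` is in seconds; the class name `RegionBreaker` and the injectable `clock` are hypothetical and exist only to make the example testable.

```python
import time
from typing import Callable

class RegionBreaker:
    """Minimal three-state breaker for one region (illustrative only)."""

    def __init__(self, failure_threshold: int = 5, success_threshold: int = 2,
                 timeout: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        # After `timeout` seconds an OPEN breaker lets probe traffic through.
        if self.state == "OPEN" and self.clock() - self.opened_at >= self.timeout:
            self.state = "HALF_OPEN"
            self.successes = 0
        return self.state != "OPEN"

    def record_success(self) -> None:
        if self.state == "HALF_OPEN":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "CLOSED"
                self.failures = 0
        else:
            self.failures = 0  # a success resets the consecutive-failure count

    def record_failure(self) -> None:
        if self.state == "HALF_OPEN":
            self._open()       # a failed probe re-opens the circuit immediately
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open()

    def _open(self) -> None:
        self.state = "OPEN"
        self.opened_at = self.clock()

# Demo with a fake clock so no real waiting is needed.
now = [0.0]
breaker = RegionBreaker(failure_threshold=2, success_threshold=1,
                        timeout=60, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
print(breaker.state)                           # OPEN
now[0] = 61.0
print(breaker.allow_request(), breaker.state)  # True HALF_OPEN
breaker.record_success()
print(breaker.state)                           # CLOSED
```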

🔍 Troubleshooting

Common Issues

Permission Denied:

# Verify authentication
gcloud auth application-default login

# Check project
gcloud config get-value project

# Verify IAM roles
gcloud projects get-iam-policy YOUR_PROJECT_ID

Module Not Found:

# Install in development mode
pip install -e .

Import Errors:

# Correct import
from gemini_sre import GeminiSREClient  # ✅ Correct

# Incorrect import
from gemini_client import GeminiClient  # ❌ Old name

See SETUP.md for more troubleshooting tips.

📦 Dependencies

Core dependencies:

google-genai>=1.42.0              # Gemini API SDK
google-cloud-monitoring>=2.27.0   # Custom metrics
google-cloud-logging>=3.12.0      # Structured logging
pydantic>=2.12.0,<3.0.0          # Schema validation
python-dotenv>=1.0.0             # Environment management

See requirements.txt for full list.

🔗 Useful Links

Google Gemini

Pydantic

Google Cloud

🤝 Contributing

Contributions are welcome! Please see our development documentation:

Fork the repository
Create a feature branch
Add tests for new features
Ensure all tests pass
Submit a pull request

See docs/development/ for contributor guidelines.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

TL;DR: You can freely use, modify, and distribute this code in commercial or non-commercial projects, provided the copyright and license notice is kept with copies of the software. The author provides no warranty and accepts no liability.

For more information about the license, including what you can and cannot do, see LICENSE_GUIDE.md.

Ready to get started? Check out SETUP.md for detailed setup instructions, or dive into examples/ to see working code!

Release history

0.1.0 - Dec 21, 2025 (this version)
0.0.1 - Oct 12, 2025

Download files

File details

Details for the file gemini_sre-0.1.0.tar.gz.

File metadata

Download URL: gemini_sre-0.1.0.tar.gz
Upload date: Dec 21, 2025
Size: 30.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for gemini_sre-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`9ed2d8d3e498e1f2dd0b1e1b3d22ffc8c8d69ec3efc4c8eb26259c4b0b351787`
MD5	`3a9dc7de08301a6f42e5ceefd9127d31`
BLAKE2b-256	`a4e52d006bd342087ad5aabc8ea48f95e40b81c9334951762dcd0934a32b7343`


File details

Details for the file gemini_sre-0.1.0-py3-none-any.whl.

File metadata

Download URL: gemini_sre-0.1.0-py3-none-any.whl
Upload date: Dec 21, 2025
Size: 37.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for gemini_sre-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ea3fc487db768342fb38ffb46430fb63ca933378dd4c67fe0776b8bb69cad64e`
MD5	`cd3ad8394d2ff0381f0a9b0e819a49f5`
BLAKE2b-256	`80f4944fc3cf81effec97dab9b44b34563cf5ae7ed622f7295f2a927bd76db7a`

