
openresponses-impl-client-google

Python client library for Google Gemini implementing the OpenResponses interface.

Overview

This package exposes a BaseResponsesClient-compatible Gemini client:

  • Non-streaming: returns ResponseResource
  • Streaming: returns AsyncIterator[ResponseStreamingEvent]
  • Request model: CreateResponseBody

It uses the Google GenAI SDK (google-genai) underneath, but normalizes requests and responses to OpenResponses models from openresponses-impl-core.

Installation

uv add openresponses-impl-client-google

Dependencies:

  • Python >=3.12
  • google-genai>=1.72.0
  • openresponses-impl-core>=0.1.0

Basic Usage

from openresponses_impl_core.models.openresponses_models import CreateResponseBody
from openresponses_impl_client_google.client.gemini_responses_client import GeminiResponsesClient


client = GeminiResponsesClient(
    model="gemini-3-flash-preview",
    google_api_key="YOUR_API_KEY",  # optional if GOOGLE_API_KEY / GEMINI_API_KEY is set
)

payload = CreateResponseBody(
    input="Hello",
    stream=False,
)

response = await client.create_response(payload=payload)
print(response.output)

Streaming:

payload = CreateResponseBody(
    input="Explain recursion briefly.",
    stream=True,
)

event_stream = await client.create_response(payload=payload)
async for event in event_stream:
    print(event.type)

Special Handling

System and developer instructions are merged

Gemini does not consume OpenResponses instructions, system messages, or developer messages in the same shape as the OpenAI Responses API.

This client merges:

  • payload.instructions
  • input items with role="system"
  • input items with role="developer"

into a single Gemini system_instruction string.
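The merge described above can be sketched roughly as follows. This is an illustrative sketch, not the client's actual internals: plain dicts stand in for the real Message items, and the "\n\n" separator is an assumption.

```python
# Hypothetical sketch of the system/developer merge; item shape and the
# "\n\n" separator are assumptions, not copied from the client's source.
def merge_system_instruction(instructions, items):
    """Collect payload.instructions plus system/developer messages into one string."""
    parts = []
    if instructions:
        parts.append(instructions)
    for item in items:
        if isinstance(item, dict) and item.get("role") in ("system", "developer"):
            parts.append(item["content"])
    return "\n\n".join(parts) if parts else None
```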

Tool follow-up requires the same client instance

Gemini tool follow-up uses function_response, which requires the original function name.
OpenResponses function_call_output only carries call_id, so this client keeps an in-memory mapping:

  • call_id -> function name

Gemini function calls may also include thought_signature. This client treats it as Gemini-specific state and keeps a second in-memory mapping:

  • call_id -> thought_signature

Implications:

  • Tool follow-up must happen on the same GeminiResponsesClient instance.
  • Stateless replay from previous_response_id is not implemented.
  • If a function_call_output arrives for an unknown call_id, the client raises ValueError.
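The bookkeeping described above can be pictured as a small registry keyed by call_id. The class and method names here are hypothetical; only the mapping shape and the ValueError behavior come from this document.

```python
# Illustrative sketch of the in-memory call_id bookkeeping; class and
# attribute names are hypothetical, not the client's real internals.
class CallIdRegistry:
    def __init__(self):
        self._names = {}  # call_id -> original function name

    def remember(self, call_id, name):
        self._names[call_id] = name

    def resolve(self, call_id):
        # Mirrors the documented behavior: an unknown call_id raises ValueError.
        if call_id not in self._names:
            raise ValueError(f"Unknown call_id: {call_id}")
        return self._names[call_id]
```

Because the registry lives on the client instance, a follow-up sent through a fresh client has no way to recover the function name.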

thought_signature is normalized and replayed for Gemini tool turns

Gemini may return thought_signature on a function_call part. This field is not part of the generic OpenResponses schema, so this client preserves it under Gemini-specific extensions:

{
  "type": "function_call",
  "call_id": "call_1",
  "name": "lookup_weather",
  "arguments": "{\"city\":\"Tokyo\"}",
  "extensions": {
    "google": {
      "thought_signature": "c2lnbmF0dXJlLTEyMw"
    }
  }
}

Special handling:

  • When Gemini returns thought_signature as bytes, it is converted to URL-safe base64 without padding before being exposed in extensions.google.thought_signature.
  • The client caches thought_signature by call_id in memory.
  • When a later OpenResponses input contains the same function_call again, the client restores the Gemini thought_signature bytes from extensions.google.thought_signature.
  • If that extension is omitted on replay, the client falls back to the cached value for the same call_id.
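The base64 normalization above can be sketched with the standard library. The function names are illustrative; the encoding itself (URL-safe base64, padding stripped) is the documented behavior, and the round trip reproduces the "c2lnbmF0dXJlLTEyMw" value from the example.

```python
import base64

# Sketch of the documented normalization: raw thought_signature bytes become
# URL-safe base64 with padding stripped; function names are illustrative.
def encode_thought_signature(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def decode_thought_signature(encoded: str) -> bytes:
    # Re-add padding before decoding; a malformed payload surfaces as
    # ValueError, matching the documented failure mode.
    padded = encoded + "=" * (-len(encoded) % 4)
    try:
        return base64.urlsafe_b64decode(padded)
    except Exception as exc:
        raise ValueError(f"Invalid thought_signature: {encoded!r}") from exc
```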

Implications:

  • Multi-turn tool flows should keep using the same GeminiResponsesClient instance.
  • If you persist tool state outside the process, persist both the OpenResponses item and extensions.google.thought_signature when present.
  • Invalid thought_signature payloads raise ValueError.

Native Gemini turn history is cached for tool follow-up

Gemini parallel tool calls are not stable if provider-native turns are flattened into independent OpenResponses items and then replayed one by one.

In particular, Gemini may emit multiple function_call parts in a single model turn, while only the first part carries thought_signature. If a later follow-up reconstructs those calls as separate Gemini turns, Gemini can reject the replay with:

  • 400 INVALID_ARGUMENT
  • Function call is missing a thought_signature

To avoid that, this client now keeps an in-memory cache of native Gemini contents history inside the GeminiResponsesClient instance.

Current behavior:

  • The first request is converted from OpenResponses input into Gemini contents as usual.
  • The native Gemini response turn (candidate.content) is appended to the same in-memory history.
  • Later follow-up requests are treated as delta input only.
  • Consecutive OpenResponses function_call items are regrouped into one Gemini ModelContent(parts=[...]).
  • Consecutive OpenResponses function_call_output items are regrouped into one Gemini UserContent(parts=[...]).
  • The actual Gemini request is built from cached native history + current delta, not from a stateless replay of flattened OpenResponses history.
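The regrouping step above can be sketched with itertools.groupby. Plain dicts stand in for the real OpenResponses items and Gemini ModelContent/UserContent types; only the grouping rule comes from this document.

```python
from itertools import groupby

# Illustrative regrouping of consecutive tool items into single Gemini turns;
# dicts stand in for the real ModelContent / UserContent types.
def regroup_tool_items(items):
    grouped = []
    for item_type, run in groupby(items, key=lambda i: i["type"]):
        run = list(run)
        if item_type == "function_call":
            grouped.append({"role": "model", "parts": run})  # one ModelContent
        elif item_type == "function_call_output":
            grouped.append({"role": "user", "parts": run})   # one UserContent
        else:
            grouped.extend(run)
    return grouped
```

Keeping parallel calls inside one model turn is what lets Gemini accept the replay even when only the first call carries thought_signature.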

Implications:

  • Multi-turn Gemini tool loops must reuse the same GeminiResponsesClient instance.
  • If your application creates a new client for every follow-up turn, Gemini tool replay may still fail.
  • previous_response_id is still not used as a stateless resume mechanism for Gemini in this package.
  • parallel_tool_calls remains an OpenResponses-level flag; the safety here comes from preserving Gemini-native turn structure, not from the flag itself.

Media input is normalized best-effort

OpenResponses input_image, input_file, and input_video are automatically converted to Gemini API format as follows:

1. Data URI format (URIs starting with data:)

# Example: data:image/png;base64,iVBORw0KGgoAAAANSUhEUg...
Message(
    role="user",
    content=[
        InputImage(image_url="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg...")
    ]
)
  • Base64-encoded data is decoded and sent as byte array
  • Uses Gemini API's types.Part.from_bytes(data=..., mime_type=...)
  • Use case: Embedding small images/videos directly in requests
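The data: URI handling above amounts to splitting out the MIME type and decoding the payload before handing bytes to types.Part.from_bytes. A minimal sketch, with an illustrative function name:

```python
import base64

# Sketch of the documented data: URI handling; the parsed pieces would feed
# types.Part.from_bytes(data=..., mime_type=...).
def parse_data_uri(uri: str):
    header, _, payload = uri.partition(",")
    # header looks like "data:image/png;base64"
    mime_type = header.removeprefix("data:").split(";")[0]
    return mime_type, base64.b64decode(payload)
```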

2. URI format (GCS, YouTube, HTTPS, etc.)

# Example: gs://bucket/video.mp4, https://example.com/image.jpg
Message(
    role="user",
    content=[
        InputVideo(video_url="gs://my-bucket/video.mp4")
    ]
)
  • URIs are sent as-is as references
  • Uses Gemini API's types.Part.from_uri(file_uri=..., mime_type=...)
  • Use case: Referencing files on GCS, YouTube videos, external URLs

3. Automatic MIME type inference

  • MIME type is automatically inferred from URI or filename extension (e.g., .mp4 -> video/mp4)
  • Falls back to application/octet-stream if inference fails
  • Warning log is emitted on fallback
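The inference and fallback described above can be sketched with the standard mimetypes module; the function name is illustrative.

```python
import logging
import mimetypes

logger = logging.getLogger(__name__)

# Sketch of the documented inference: guess from the extension, fall back to
# application/octet-stream with a warning.
def infer_mime_type(uri_or_name: str) -> str:
    mime_type, _ = mimetypes.guess_type(uri_or_name)
    if mime_type is None:
        logger.warning("Could not infer MIME type for %s; falling back", uri_or_name)
        return "application/octet-stream"
    return mime_type
```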

4. Files uploaded via Files API

# Pre-upload using Google GenAI SDK
video_file = genai_client.files.upload(file="path/to/video.mp4")
# Pass directly to input
payload = CreateResponseBody(input=[video_file, ...])
  • Uploaded File objects can be included directly in input
  • Gemini API handles them appropriately internally

Recommended method for video input (Files API)

For long or large video files, we recommend pre-uploading via Files API before using this library:

from google import genai
from openresponses_impl_core.models.openresponses_models import CreateResponseBody, Message
from openresponses_impl_client_google.client.gemini_responses_client import GeminiResponsesClient
import time

# 1. Upload video using Google GenAI SDK
genai_client = genai.Client()
video_file = genai_client.files.upload(file="path/to/video.mp4")

# 2. Wait for processing completion (videos may require processing time)
while True:
    video_file = genai_client.files.get(name=video_file.name)
    if video_file.state != "PROCESSING":
        break
    time.sleep(2)

# 3. Analyze using OpenResponses client
responses_client = GeminiResponsesClient(
    model="gemini-3-flash-preview",
    google_api_key="YOUR_API_KEY",
)

payload = CreateResponseBody(
    input=[
        video_file,  # Pass uploaded File object directly
        Message(role="user", content="Summarize this video and list key points in bullet format.")
    ],
    stream=False,
)

response = await responses_client.create_response(payload=payload)
print(response.output)

# 4. Cleanup (optional)
genai_client.files.delete(name=video_file.name)

Notes:

  • File objects uploaded via Files API can be included directly in the input array
  • Wait for video processing to complete (PROCESSING -> ACTIVE) before use
  • For small videos, you can also use data: URIs or GCS URIs

Reasoning is mapped approximately

OpenResponses reasoning settings do not map 1:1 to Gemini.

Current behavior:

  • reasoning.effort="none" -> thinking_budget=0
  • reasoning.effort="low" | "medium" | "high" -> Gemini thinking_level
  • reasoning.effort="xhigh" -> mapped to HIGH with warning
  • reasoning.summary -> ignored with warning
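The effort mapping above can be sketched as a small translation function. The exact thinking-config field names are assumptions based on google.genai, not copied from this client's source.

```python
import logging

logger = logging.getLogger(__name__)

# Sketch of the documented effort mapping; the dict keys mirror the
# thinking_budget / thinking_level fields mentioned above but are assumptions.
def map_reasoning_effort(effort: str) -> dict:
    if effort == "none":
        return {"thinking_budget": 0}
    if effort == "xhigh":
        logger.warning("reasoning.effort='xhigh' downgraded to 'high'")
        effort = "high"
    if effort in ("low", "medium", "high"):
        return {"thinking_level": effort.upper()}
    raise ValueError(f"Unsupported reasoning effort: {effort}")
```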

Response fields are partially synthesized

Gemini does not return an object identical to ResponseResource, so some fields are reconstructed from:

  • request payload
  • Gemini response metadata
  • client-generated fallback IDs and timestamps

Examples:

  • id falls back to a synthetic response ID if Gemini does not return response_id
  • tools, tool_choice, text, service_tier, and similar fields are echoed from the effective request
  • Gemini-specific metadata is stored under metadata["gemini_*"]

OpenResponses Compatibility Notes

This client is intentionally best-effort. It keeps the OpenResponses public interface, but not every field can be represented natively by Gemini.

Supported well

  • plain text input
  • user / assistant message history
  • function tools
  • Gemini built-in tools expressed as OpenAI-style flat tool objects
  • non-streaming responses
  • basic streaming text responses
  • JSON-schema-style structured output via Gemini response_json_schema

Supported with translation

  • instructions, system, developer -> merged system_instruction
  • function_call -> Gemini function_call
  • function_call_output -> Gemini function_response
  • reasoning text / thought parts -> OpenResponses reasoning
  • usage fields -> mapped from Gemini usage_metadata

Warning-and-ignore fields

These fields are preserved in the normalized ResponseResource when possible, but are not sent to Gemini as functional request controls:

  • previous_response_id
  • store
  • background
  • parallel_tool_calls
  • max_tool_calls
  • truncation
  • include
  • safety_identifier
  • prompt_cache_key

Unsupported or partially supported behavior

  • previous_response_id

    • Gemini request execution ignores it.
    • The field is kept in normalized responses for interface compatibility only.
  • generic tools

    • Gemini built-in tools are converted dynamically when tool.type matches a supported google.genai.types.Tool field with object-style configuration.
    • When at least one Gemini built-in tool is present, the client automatically sets tool_config.include_server_side_tool_invocations = true.
    • Built-in tool config follows the OpenAI-style flat request shape, for example:
      • {"type": "google_maps", "enable_widget": true}
      • {"type": "file_search", "file_search_store_names": ["fileSearchStores/STORE_ID"], "top_k": 5}
      • {"type": "code_execution"}
    • description is preserved in the echoed request, but is not sent to Gemini as executable tool config.
    • Unknown generic tool types are ignored with warning.
    • Known Gemini built-in tool types with invalid config fail fast with ValueError.
    • Built-in outputs are still normalized best-effort; dedicated loops such as computer-use action roundtrips are not yet mapped to OpenResponses-specific output items.
  • item types without Gemini equivalents

    • item_reference is ignored with warning.
    • OpenResponses input reasoning items are ignored with warning.
  • exact tool-choice fidelity

    • The client translates tool choice to Gemini function-calling config best-effort.
    • OpenResponses/Core model serialization may already be lossy for some tool_choice shapes before the Gemini client sees them.

Streaming Semantics

Streaming is normalized to a minimal OpenResponses event set.

Currently emitted event families:

  • response.created
  • response.output_item.added
  • response.content_part.added
  • response.output_text.delta
  • response.output_text.done
  • response.content_part.done
  • response.output_item.done
  • response.reasoning.delta
  • response.reasoning.done
  • response.function_call_arguments.done
  • terminal event:
    • response.completed
    • response.incomplete
    • response.failed
  • error

Notes:

  • Gemini stream chunks are merged cumulatively before event translation.
  • Delta calculation is prefix-based best-effort.
  • If Gemini emits unexpected chunk shapes, the client may emit an error event.
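The prefix-based delta calculation mentioned above can be sketched as follows; the fallback branch for non-prefix chunks is an assumption consistent with the best-effort framing.

```python
# Sketch of the documented prefix-based delta: each cumulative chunk's new
# suffix becomes a response.output_text.delta payload.
def text_delta(previous: str, cumulative: str) -> str:
    if cumulative.startswith(previous):
        return cumulative[len(previous):]
    # Non-prefix chunk shapes are unexpected; fall back to the full text
    # (assumption -- the real client may emit an error event instead).
    return cumulative
```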

Status Mapping

Gemini finish state is mapped to OpenResponses status as follows:

  • prompt blocked / no candidate -> failed
  • MAX_TOKENS -> incomplete with reason="max_tokens"
  • STOP -> completed
  • response containing function calls -> completed
  • other Gemini finish reasons -> incomplete with the Gemini reason string
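The table above can be sketched as a single mapping function; plain dicts stand in for the real ResponseResource status fields.

```python
# Sketch of the documented finish-state mapping; dicts stand in for the
# real ResponseResource fields, and the signature is illustrative.
def map_status(finish_reason, has_candidate=True, has_function_calls=False):
    if not has_candidate:
        return {"status": "failed"}          # prompt blocked / no candidate
    if has_function_calls or finish_reason == "STOP":
        return {"status": "completed"}
    if finish_reason == "MAX_TOKENS":
        return {"status": "incomplete", "reason": "max_tokens"}
    return {"status": "incomplete", "reason": finish_reason}
```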

Authentication

You can pass google_api_key= directly, or rely on the Google SDK environment variable resolution.

Common environment variables:

  • GOOGLE_API_KEY
  • GEMINI_API_KEY
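A plausible resolution order, sketched below, checks the explicit argument first and then the environment variables. The explicit-first ordering and the function name are assumptions; this document only states that both paths exist.

```python
import os

# Hypothetical sketch of key resolution: explicit argument first, then the
# common environment variables. The ordering is an assumption.
def resolve_api_key(explicit_key=None):
    return (
        explicit_key
        or os.environ.get("GOOGLE_API_KEY")
        or os.environ.get("GEMINI_API_KEY")
    )
```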

Logging Behavior

This client uses warnings for non-fatal incompatibilities.
You should expect warning logs when:

  • unsupported OpenResponses fields are provided
  • unsupported generic tool types are provided
  • unsupported item/content types are provided
  • MIME type inference falls back to application/octet-stream
  • reasoning.summary is requested
  • reasoning.effort="xhigh" is downgraded

When the logger for openresponses_impl_client_google.client.gemini_responses_client is set to DEBUG, the client also logs Gemini request and response payloads:

  • non-streaming requests before generate_content
  • non-streaming raw Gemini responses after generate_content
  • each streaming chunk from generate_content_stream
  • the final aggregated streaming payload before OpenResponses terminal events are emitted
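Enabling that payload logging takes one call; the logger name below is taken directly from this document.

```python
import logging

# Turn on the Gemini request/response payload logging described above.
logging.basicConfig(level=logging.INFO)
logging.getLogger(
    "openresponses_impl_client_google.client.gemini_responses_client"
).setLevel(logging.DEBUG)
```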

Testing

Run tests with:

UV_CACHE_DIR="$PWD/.uv_cache" uv run pytest -q

This now includes live Gemini API integration tests under test/integration_test/.

Live test behavior:

  • the test module reads GOOGLE_API_KEY or GEMINI_API_KEY from the repo-root .env file if the variables are not already exported
  • if the key is missing or invalid, pytest fails immediately during test collection
  • normal pytest -q requires network access and may incur Gemini API cost
  • the live suite covers non-stream, stream, function-call follow-up, JSON schema output, and reasoning smoke paths

Summary

Use this package when you want Gemini behind the OpenResponses interface, but keep in mind:

  • it is interface-compatible, not wire-compatible
  • several OpenResponses controls are emulated or ignored
  • tool follow-up depends on client-local in-memory state
  • streaming is normalized to a practical subset rather than a perfect Gemini-to-Responses projection
