
openresponses-impl-client-google

Python client library for Google Gemini implementing the OpenResponses interface.

Overview

This package exposes a BaseResponsesClient-compatible Gemini client:

  • Non-streaming: returns ResponseResource
  • Streaming: returns AsyncIterator[ResponseStreamingEvent]
  • Request model: CreateResponseBody

It uses the Google GenAI SDK (google-genai) underneath, but normalizes requests and responses to OpenResponses models from openresponses-impl-core.

Installation

uv add openresponses-impl-client-google

Dependencies:

  • Python >=3.12
  • google-genai>=1.72.0
  • openresponses-impl-core>=0.1.0

Basic Usage

from openresponses_impl_core.models.openresponses_models import CreateResponseBody
from openresponses_impl_client_google.client.gemini_responses_client import GeminiResponsesClient


client = GeminiResponsesClient(
    model="gemini-3-flash-preview",
    google_api_key="YOUR_API_KEY",  # optional if GOOGLE_API_KEY / GEMINI_API_KEY is set
)

payload = CreateResponseBody(
    input="Hello",
    stream=False,
)

response = await client.create_response(payload=payload)
print(response.output)

Streaming:

payload = CreateResponseBody(
    input="Explain recursion briefly.",
    stream=True,
)

event_stream = await client.create_response(payload=payload)
async for event in event_stream:
    print(event.type)

Special Handling

System and developer instructions are merged

Gemini does not consume OpenResponses instructions, system messages, or developer messages in the same shape as the OpenAI Responses API.

This client merges:

  • payload.instructions
  • input items with role="system"
  • input items with role="developer"

into a single Gemini system_instruction string.
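The merge described above can be sketched roughly as follows. This is an illustrative sketch, not the client's actual internals: plain dicts stand in for the real Message items, and the "\n\n" separator is an assumption.

```python
# Hypothetical sketch of the system/developer merge; item shape and the
# "\n\n" separator are assumptions, not copied from the client's source.
def merge_system_instruction(instructions, items):
    """Collect payload.instructions plus system/developer messages into one string."""
    parts = []
    if instructions:
        parts.append(instructions)
    for item in items:
        if isinstance(item, dict) and item.get("role") in ("system", "developer"):
            parts.append(item["content"])
    return "\n\n".join(parts) if parts else None
```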

Tool follow-up requires the same client instance

Gemini tool follow-up uses function_response, which requires the original function name.
OpenResponses function_call_output only carries call_id, so this client keeps an in-memory mapping:

  • call_id -> function name

Gemini function calls may also include thought_signature. This client treats it as Gemini-specific state and keeps a second in-memory mapping:

  • call_id -> thought_signature

Implications:

  • Tool follow-up must happen on the same GeminiResponsesClient instance.
  • Stateless replay from previous_response_id is not implemented.
  • If a function_call_output arrives for an unknown call_id, the client raises ValueError.
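The bookkeeping described above can be pictured as a small registry keyed by call_id. The class and method names here are hypothetical; only the mapping shape and the ValueError behavior come from this document.

```python
# Illustrative sketch of the in-memory call_id bookkeeping; class and
# attribute names are hypothetical, not the client's real internals.
class CallIdRegistry:
    def __init__(self):
        self._names = {}  # call_id -> original function name

    def remember(self, call_id, name):
        self._names[call_id] = name

    def resolve(self, call_id):
        # Mirrors the documented behavior: an unknown call_id raises ValueError.
        if call_id not in self._names:
            raise ValueError(f"Unknown call_id: {call_id}")
        return self._names[call_id]
```

Because the registry lives on the client instance, a follow-up sent through a fresh client has no way to recover the function name.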

thought_signature is normalized and replayed for Gemini tool turns

Gemini may return thought_signature on a function_call part. This field is not part of the generic OpenResponses schema, so this client preserves it under Gemini-specific extensions:

{
  "type": "function_call",
  "call_id": "call_1",
  "name": "lookup_weather",
  "arguments": "{\"city\":\"Tokyo\"}",
  "extensions": {
    "google": {
      "thought_signature": "c2lnbmF0dXJlLTEyMw"
    }
  }
}

Special handling:

  • When Gemini returns thought_signature as bytes, it is converted to URL-safe base64 without padding before being exposed in extensions.google.thought_signature.
  • The client caches thought_signature by call_id in memory.
  • When a later OpenResponses input contains the same function_call again, the client restores the Gemini thought_signature bytes from extensions.google.thought_signature.
  • If that extension is omitted on replay, the client falls back to the cached value for the same call_id.
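The base64 normalization above can be sketched with the standard library. The function names are illustrative; the encoding itself (URL-safe base64, padding stripped) is the documented behavior, and the round trip reproduces the "c2lnbmF0dXJlLTEyMw" value from the example.

```python
import base64

# Sketch of the documented normalization: raw thought_signature bytes become
# URL-safe base64 with padding stripped; function names are illustrative.
def encode_thought_signature(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def decode_thought_signature(encoded: str) -> bytes:
    # Re-add padding before decoding; a malformed payload surfaces as
    # ValueError, matching the documented failure mode.
    padded = encoded + "=" * (-len(encoded) % 4)
    try:
        return base64.urlsafe_b64decode(padded)
    except Exception as exc:
        raise ValueError(f"Invalid thought_signature: {encoded!r}") from exc
```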

Implications:

  • Multi-turn tool flows should keep using the same GeminiResponsesClient instance.
  • If you persist tool state outside the process, persist both the OpenResponses item and extensions.google.thought_signature when present.
  • Invalid thought_signature payloads raise ValueError.

Native Gemini turn history is cached for tool follow-up

Gemini parallel tool calls are not stable if provider-native turns are flattened into independent OpenResponses items and then replayed one by one.

In particular, Gemini may emit multiple function_call parts in a single model turn, while only the first part carries thought_signature. If a later follow-up reconstructs those calls as separate Gemini turns, Gemini can reject the replay with:

  • 400 INVALID_ARGUMENT
  • Function call is missing a thought_signature

To avoid that, this client now keeps an in-memory cache of native Gemini contents history inside the GeminiResponsesClient instance.

Current behavior:

  • The first request is converted from OpenResponses input into Gemini contents as usual.
  • The native Gemini response turn (candidate.content) is appended to the same in-memory history.
  • Later follow-up requests are treated as delta input only.
  • Consecutive OpenResponses function_call items are regrouped into one Gemini ModelContent(parts=[...]).
  • Consecutive OpenResponses function_call_output items are regrouped into one Gemini UserContent(parts=[...]).
  • The actual Gemini request is built from cached native history + current delta, not from a stateless replay of flattened OpenResponses history.
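The regrouping step above can be sketched with itertools.groupby. Plain dicts stand in for the real OpenResponses items and Gemini ModelContent/UserContent types; only the grouping rule comes from this document.

```python
from itertools import groupby

# Illustrative regrouping of consecutive tool items into single Gemini turns;
# dicts stand in for the real ModelContent / UserContent types.
def regroup_tool_items(items):
    grouped = []
    for item_type, run in groupby(items, key=lambda i: i["type"]):
        run = list(run)
        if item_type == "function_call":
            grouped.append({"role": "model", "parts": run})  # one ModelContent
        elif item_type == "function_call_output":
            grouped.append({"role": "user", "parts": run})   # one UserContent
        else:
            grouped.extend(run)
    return grouped
```

Keeping parallel calls inside one model turn is what lets Gemini accept the replay even when only the first call carries thought_signature.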

Implications:

  • Multi-turn Gemini tool loops must reuse the same GeminiResponsesClient instance.
  • If your application creates a new client for every follow-up turn, Gemini tool replay may still fail.
  • previous_response_id is still not used as a stateless resume mechanism for Gemini in this package.
  • parallel_tool_calls remains an OpenResponses-level flag; the safety here comes from preserving Gemini-native turn structure, not from the flag itself.

Media input is normalized best-effort

OpenResponses input_image, input_file, and input_video are automatically converted to Gemini API format as follows:

1. Data URI format (URIs starting with data:)

# Example: data:image/png;base64,iVBORw0KGgoAAAANSUhEUg...
Message(
    role="user",
    content=[
        InputImage(image_url="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg...")
    ]
)
  • Base64-encoded data is decoded and sent as byte array
  • Uses Gemini API's types.Part.from_bytes(data=..., mime_type=...)
  • Use case: Embedding small images/videos directly in requests
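The data: URI handling above amounts to splitting out the MIME type and decoding the payload before handing bytes to types.Part.from_bytes. A minimal sketch, with an illustrative function name:

```python
import base64

# Sketch of the documented data: URI handling; the parsed pieces would feed
# types.Part.from_bytes(data=..., mime_type=...).
def parse_data_uri(uri: str):
    header, _, payload = uri.partition(",")
    # header looks like "data:image/png;base64"
    mime_type = header.removeprefix("data:").split(";")[0]
    return mime_type, base64.b64decode(payload)
```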

2. URI format (GCS, YouTube, HTTPS, etc.)

# Example: gs://bucket/video.mp4, https://example.com/image.jpg
Message(
    role="user",
    content=[
        InputVideo(video_url="gs://my-bucket/video.mp4")
    ]
)
  • URIs are sent as-is as references
  • Uses Gemini API's types.Part.from_uri(file_uri=..., mime_type=...)
  • Use case: Referencing files on GCS, YouTube videos, external URLs

3. Automatic MIME type inference

  • MIME type is automatically inferred from URI or filename extension (e.g., .mp4 -> video/mp4)
  • Falls back to application/octet-stream if inference fails
  • Warning log is emitted on fallback
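The inference and fallback described above can be sketched with the standard mimetypes module; the function name is illustrative.

```python
import logging
import mimetypes

logger = logging.getLogger(__name__)

# Sketch of the documented inference: guess from the extension, fall back to
# application/octet-stream with a warning.
def infer_mime_type(uri_or_name: str) -> str:
    mime_type, _ = mimetypes.guess_type(uri_or_name)
    if mime_type is None:
        logger.warning("Could not infer MIME type for %s; falling back", uri_or_name)
        return "application/octet-stream"
    return mime_type
```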

4. Files uploaded via Files API

# Pre-upload using Google GenAI SDK
video_file = genai_client.files.upload(file="path/to/video.mp4")
# Pass directly to input
payload = CreateResponseBody(input=[video_file, ...])
  • Uploaded File objects can be included directly in input
  • Gemini API handles them appropriately internally

Recommended method for video input (Files API)

For long or large video files, we recommend pre-uploading via Files API before using this library:

from google import genai
from openresponses_impl_core.models.openresponses_models import CreateResponseBody, Message
from openresponses_impl_client_google.client.gemini_responses_client import GeminiResponsesClient
import time

# 1. Upload video using Google GenAI SDK
genai_client = genai.Client()
video_file = genai_client.files.upload(file="path/to/video.mp4")

# 2. Wait for processing completion (videos may require processing time)
while True:
    video_file = genai_client.files.get(name=video_file.name)
    if video_file.state != "PROCESSING":
        break
    time.sleep(2)

# 3. Analyze using OpenResponses client
responses_client = GeminiResponsesClient(
    model="gemini-3-flash-preview",
    google_api_key="YOUR_API_KEY",
)

payload = CreateResponseBody(
    input=[
        video_file,  # Pass uploaded File object directly
        Message(role="user", content="Summarize this video and list key points in bullet format.")
    ],
    stream=False,
)

response = await responses_client.create_response(payload=payload)
print(response.output)

# 4. Cleanup (optional)
genai_client.files.delete(name=video_file.name)

Notes:

  • File objects uploaded via Files API can be included directly in the input array
  • Wait for video processing to complete (PROCESSING -> ACTIVE) before use
  • For small videos, you can also use data: URIs or GCS URIs

Reasoning is mapped approximately

OpenResponses reasoning settings do not map 1:1 to Gemini.

Current behavior:

  • reasoning.effort="none" -> thinking_budget=0
  • reasoning.effort="low" | "medium" | "high" -> Gemini thinking_level
  • reasoning.effort="xhigh" -> mapped to HIGH with warning
  • reasoning.summary -> ignored with warning
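The effort mapping above can be sketched as a small translation function. The exact thinking-config field names are assumptions based on google.genai, not copied from this client's source.

```python
import logging

logger = logging.getLogger(__name__)

# Sketch of the documented effort mapping; the dict keys mirror the
# thinking_budget / thinking_level fields mentioned above but are assumptions.
def map_reasoning_effort(effort: str) -> dict:
    if effort == "none":
        return {"thinking_budget": 0}
    if effort == "xhigh":
        logger.warning("reasoning.effort='xhigh' downgraded to 'high'")
        effort = "high"
    if effort in ("low", "medium", "high"):
        return {"thinking_level": effort.upper()}
    raise ValueError(f"Unsupported reasoning effort: {effort}")
```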

Response fields are partially synthesized

Gemini does not return an object identical to ResponseResource, so some fields are reconstructed from:

  • request payload
  • Gemini response metadata
  • client-generated fallback IDs and timestamps

Examples:

  • id falls back to a synthetic response ID if Gemini does not return response_id
  • tools, tool_choice, text, service_tier, and similar fields are echoed from the effective request
  • Gemini-specific metadata is stored under metadata["gemini_*"]

OpenResponses Compatibility Notes

This client is intentionally best-effort. It keeps the OpenResponses public interface, but not every field can be represented natively by Gemini.

Supported well

  • plain text input
  • user / assistant message history
  • function tools
  • Gemini built-in tools expressed as OpenAI-style flat tool objects
  • non-streaming responses
  • basic streaming text responses
  • JSON-schema-style structured output via Gemini response_json_schema

Supported with translation

  • instructions, system, developer -> merged system_instruction
  • function_call -> Gemini function_call
  • function_call_output -> Gemini function_response
  • reasoning text / thought parts -> OpenResponses reasoning
  • usage fields -> mapped from Gemini usage_metadata

Warning-and-ignore fields

These fields are preserved in the normalized ResponseResource when possible, but are not sent to Gemini as functional request controls:

  • previous_response_id
  • store
  • background
  • parallel_tool_calls
  • max_tool_calls
  • truncation
  • include
  • safety_identifier
  • prompt_cache_key

Unsupported or partially supported behavior

  • previous_response_id

    • Gemini request execution ignores it.
    • The field is kept in normalized responses for interface compatibility only.
  • generic tools

    • Gemini built-in tools are converted dynamically when tool.type matches a supported google.genai.types.Tool field with object-style configuration.
    • When at least one Gemini built-in tool is present, the client automatically sets tool_config.include_server_side_tool_invocations = true.
    • Built-in tool config follows the OpenAI-style flat request shape, for example:
      • {"type": "google_maps", "enable_widget": true}
      • {"type": "file_search", "file_search_store_names": ["fileSearchStores/STORE_ID"], "top_k": 5}
      • {"type": "code_execution"}
    • description is preserved in the echoed request, but is not sent to Gemini as executable tool config.
    • Unknown generic tool types are ignored with warning.
    • Known Gemini built-in tool types with invalid config fail fast with ValueError.
    • Built-in outputs are still normalized best-effort; dedicated loops such as computer-use action roundtrips are not yet mapped to OpenResponses-specific output items.
  • item types without Gemini equivalents

    • item_reference is ignored with warning.
    • OpenResponses input reasoning items are ignored with warning.
  • exact tool-choice fidelity

    • The client translates tool choice to Gemini function-calling config best-effort.
    • OpenResponses/Core model serialization may already be lossy for some tool_choice shapes before the Gemini client sees them.

Streaming Semantics

Streaming is normalized to a minimal OpenResponses event set.

Currently emitted event families:

  • response.created
  • response.output_item.added
  • response.content_part.added
  • response.output_text.delta
  • response.output_text.done
  • response.content_part.done
  • response.output_item.done
  • response.reasoning.delta
  • response.reasoning.done
  • response.function_call_arguments.done
  • terminal event:
    • response.completed
    • response.incomplete
    • response.failed
  • error

Notes:

  • Gemini stream chunks are merged cumulatively before event translation.
  • Delta calculation is prefix-based best-effort.
  • If Gemini emits unexpected chunk shapes, the client may emit an error event.
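The prefix-based delta calculation mentioned above can be sketched as follows; the fallback branch for non-prefix chunks is an assumption consistent with the best-effort framing.

```python
# Sketch of the documented prefix-based delta: each cumulative chunk's new
# suffix becomes a response.output_text.delta payload.
def text_delta(previous: str, cumulative: str) -> str:
    if cumulative.startswith(previous):
        return cumulative[len(previous):]
    # Non-prefix chunk shapes are unexpected; fall back to the full text
    # (assumption -- the real client may emit an error event instead).
    return cumulative
```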

Status Mapping

Gemini finish state is mapped to OpenResponses status as follows:

  • prompt blocked / no candidate -> failed
  • MAX_TOKENS -> incomplete with reason="max_tokens"
  • STOP -> completed
  • response containing function calls -> completed
  • other Gemini finish reasons -> incomplete with the Gemini reason string
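The table above can be sketched as a single mapping function; plain dicts stand in for the real ResponseResource status fields.

```python
# Sketch of the documented finish-state mapping; dicts stand in for the
# real ResponseResource fields, and the signature is illustrative.
def map_status(finish_reason, has_candidate=True, has_function_calls=False):
    if not has_candidate:
        return {"status": "failed"}          # prompt blocked / no candidate
    if has_function_calls or finish_reason == "STOP":
        return {"status": "completed"}
    if finish_reason == "MAX_TOKENS":
        return {"status": "incomplete", "reason": "max_tokens"}
    return {"status": "incomplete", "reason": finish_reason}
```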

Authentication

You can pass google_api_key= directly, or rely on the Google SDK environment variable resolution.

Common environment variables:

  • GOOGLE_API_KEY
  • GEMINI_API_KEY
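A plausible resolution order, sketched below, checks the explicit argument first and then the environment variables. The explicit-first ordering and the function name are assumptions; this document only states that both paths exist.

```python
import os

# Hypothetical sketch of key resolution: explicit argument first, then the
# common environment variables. The ordering is an assumption.
def resolve_api_key(explicit_key=None):
    return (
        explicit_key
        or os.environ.get("GOOGLE_API_KEY")
        or os.environ.get("GEMINI_API_KEY")
    )
```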

Logging Behavior

This client uses warnings for non-fatal incompatibilities.
You should expect warning logs when:

  • unsupported OpenResponses fields are provided
  • unsupported generic tool types are provided
  • unsupported item/content types are provided
  • MIME type inference falls back to application/octet-stream
  • reasoning.summary is requested
  • reasoning.effort="xhigh" is downgraded

When the logger for openresponses_impl_client_google.client.gemini_responses_client is set to DEBUG, the client also logs Gemini request and response payloads:

  • non-streaming requests before generate_content
  • non-streaming raw Gemini responses after generate_content
  • each streaming chunk from generate_content_stream
  • the final aggregated streaming payload before OpenResponses terminal events are emitted
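Enabling that payload logging takes one call; the logger name below is taken directly from this document.

```python
import logging

# Turn on the Gemini request/response payload logging described above.
logging.basicConfig(level=logging.INFO)
logging.getLogger(
    "openresponses_impl_client_google.client.gemini_responses_client"
).setLevel(logging.DEBUG)
```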

Testing

Run tests with:

UV_CACHE_DIR="$PWD/.uv_cache" uv run pytest -q

This now includes live Gemini API integration tests under test/integration_test/.

Live test behavior:

  • the test module reads GOOGLE_API_KEY or GEMINI_API_KEY from the repo-root .env file if the variables are not already exported
  • if the key is missing or invalid, pytest fails immediately during test collection
  • normal pytest -q requires network access and may incur Gemini API cost
  • the live suite covers non-stream, stream, function-call follow-up, JSON schema output, and reasoning smoke paths

Summary

Use this package when you want Gemini behind the OpenResponses interface, but keep in mind:

  • it is interface-compatible, not wire-compatible
  • several OpenResponses controls are emulated or ignored
  • tool follow-up depends on client-local in-memory state
  • streaming is normalized to a practical subset rather than a perfect Gemini-to-Responses projection
