Drop-in, voice-tuned OpenAI async client (HTTP/2 + warm connection reuse).

hopper-client

A drop-in, voice-tuned replacement for the OpenAI client. It exposes the same interface as the OpenAI client but ships a transport configuration tuned so that a voice agent stops paying a fresh TCP + TLS handshake on every turn.

Voice agents call an LLM once per conversational turn. With the stock OpenAI client (HTTP/1.1, 5-second keep-alive), the connection is constantly torn down and rebuilt between turns: conversational pauses exceed the keep-alive window, so the idle HTTP/1.1 connection has already been closed by the time the next turn starts. Each rebuild costs a TCP + TLS handshake, which adds hundreds of milliseconds to every turn. hopper-client fixes this with a single warm connection that is reused across turns and interruptions.

# before
from openai import AsyncOpenAI
client = AsyncOpenAI(base_url="https://api.withhopper.com/v1", api_key="sk-...")

# after — same constructor, same methods
from hopper import AsyncHopper as AsyncOpenAI
client = AsyncOpenAI(base_url="https://api.withhopper.com/v1", api_key="sk-...")
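
The keep-alive argument above can be checked with a small model. The sketch below is illustrative only and is not part of hopper-client: count_handshakes and handshake_cost_ms are hypothetical helpers, the gap values are made up, and the cost estimate counts network round trips only (one for TCP, one for a full TLS 1.3 handshake), ignoring DNS lookups, certificate validation, and server time.

```python
# Hypothetical model of connection reuse across conversational turns.
# Not part of hopper-client; names and numbers are illustrative.

def count_handshakes(gaps_s, keepalive_expiry_s):
    """Count TCP + TLS handshakes over one conversation.

    gaps_s holds the idle time (seconds) before each turn after the first.
    The first turn always opens a connection. A later turn needs a new one
    only if the connection sat idle longer than the keep-alive expiry.
    """
    handshakes = 1  # the first turn always pays for a fresh connection
    for gap in gaps_s:
        if gap > keepalive_expiry_s:
            handshakes += 1
    return handshakes


def handshake_cost_ms(rtt_ms, tls_round_trips=1):
    """Rough handshake cost: 1 RTT for TCP plus the TLS round trips.

    A full TLS 1.3 handshake takes 1 round trip; TLS 1.2 takes 2.
    """
    return rtt_ms * (1 + tls_round_trips)


# Made-up pauses between turns, in seconds.
gaps = [8.0, 12.0, 3.0, 20.0, 6.5, 9.0]

stock = count_handshakes(gaps, keepalive_expiry_s=5.0)    # 5 s keep-alive
warm = count_handshakes(gaps, keepalive_expiry_s=300.0)   # 300 s keep-alive
print(f"handshakes with 5 s expiry: {stock}, with 300 s expiry: {warm}")
print(f"estimated cost per handshake at 80 ms RTT: {handshake_cost_ms(80)} ms")
```

With these example gaps, only the 3-second pause is short enough for the 5-second window, so nearly every turn reconnects, while the longer expiry keeps one connection for the whole conversation.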

Install

pip install hopper-client  
# or, from source:
pip install -e .

This pulls in openai and httpx[http2] as dependencies.

Parameters & reasoning

AsyncHopper has the same constructor as AsyncOpenAI: every OpenAI parameter is accepted and passed straight through. The only difference is the default transport: when you don't supply your own http_client, Hopper builds one tuned for voice. To override anything, pass your own http_client=.
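
As a rough illustration of passing your own http_client, the sketch below builds an httpx.AsyncClient from the Hopper default values documented in this section, with room for overrides. VOICE_LIMITS, VOICE_TIMEOUTS, merged_limits, and build_http_client are hypothetical names; how hopper-client assembles its transport internally is not shown in this README, so treat this as an assumption rather than the library's implementation.

```python
# Sketch only: the values mirror the "Hopper default" settings in this README.
# All names below are hypothetical and not part of hopper-client's API.

VOICE_LIMITS = {
    "max_connections": 100,
    "max_keepalive_connections": 20,
    "keepalive_expiry": 300.0,  # seconds an idle connection is kept
}
VOICE_TIMEOUTS = {
    "timeout": 60.0,  # applies to read, write, and pool
    "connect": 3.0,   # fail a bad connection quickly
}


def merged_limits(**overrides):
    """Return the pool limits with caller overrides applied."""
    unknown = set(overrides) - set(VOICE_LIMITS)
    if unknown:
        raise TypeError(f"unknown limit(s): {sorted(unknown)}")
    return {**VOICE_LIMITS, **overrides}


def build_http_client(**limit_overrides):
    """Build an HTTP/2 httpx.AsyncClient using the settings above."""
    import httpx  # imported lazily; HTTP/2 needs the httpx[http2] extra

    return httpx.AsyncClient(
        http2=True,
        limits=httpx.Limits(**merged_limits(**limit_overrides)),
        timeout=httpx.Timeout(**VOICE_TIMEOUTS),
    )


# Example: shorten the idle window to stay under a 120 s load-balancer timeout.
print(merged_limits(keepalive_expiry=110.0))
```

The resulting client would then be handed to the constructor, for example AsyncHopper(base_url=..., api_key=..., http_client=build_http_client(keepalive_expiry=110.0)). The same http_client argument is accepted by AsyncOpenAI.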

| Setting | OpenAI default | Hopper default | Why |
|---|---|---|---|
| http2 | False | True | One connection multiplexes many streams. Cancelling a stream (barge-in) sends RST_STREAM and keeps the connection warm; under HTTP/1.1 a half-read connection can't be reused and is closed. This is the core fix. |
| keepalive_expiry | 5.0s | 300s | httpx reaps idle connections after this long. Voice turns are seconds apart (TTS playing, the user speaking); a 5s window means the next turn re-handshakes. 300s spans normal conversational gaps. Coordinate with your server/LB idle timeout: if the server closes the socket first, the warm pool fills with dead connections. |
| max_keepalive_connections | 100 | 20 | With HTTP/2 a single connection carries many streams, so the pool stays small. A small pool also bounds the blast radius of TCP head-of-line blocking on lossy links. Raise it if one process drives many concurrent sessions. |
| max_connections | 1000 | 100 | The client shouldn't be the bottleneck; server-side admission control (429) is the real concurrency limit. With HTTP/2 you rarely approach this. |
| connect timeout | 5.0s | 3.0s | Fail a bad connection fast so the caller can retry/hedge instead of stalling a live turn. |
| read / write / pool timeout | 600s | 60s | read is the per-chunk gap (it bounds both TTFT and inter-token stalls), not a total cap. 60s is generous but far tighter than 600s. Override per-request for a strict turn budget. |
| max_retries | 2 | 2 (unchanged) | Left at the default for now. Note that retries add latency silently; voice deployments may prefer 0 or 1 plus explicit hedging. |
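
The "explicit hedging" mentioned for max_retries can be sketched with plain asyncio. This is one possible pattern, not something hopper-client is documented to provide: hedged and fake_turn are hypothetical names, and the timings are made up. The idea is to start a second attempt if the first has not finished within a short budget, return whichever finishes first, and cancel the other.

```python
import asyncio


async def hedged(make_attempt, hedge_after_s):
    """Run make_attempt(); if it is still pending after hedge_after_s,
    start a second attempt and return the first result to arrive.
    The slower attempt is cancelled."""
    primary = asyncio.create_task(make_attempt())
    done, _ = await asyncio.wait({primary}, timeout=hedge_after_s)
    if done:
        return primary.result()

    backup = asyncio.create_task(make_attempt())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()
    # Let the cancelled task finish unwinding before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    winner = primary if primary in done else backup
    return winner.result()


async def _demo():
    calls = 0

    async def fake_turn():
        # Stand-in for an LLM call: the first attempt stalls, the second is fast.
        nonlocal calls
        calls += 1
        await asyncio.sleep(1.0 if calls == 1 else 0.01)
        return f"attempt {calls}"

    return await hedged(fake_turn, hedge_after_s=0.05)


result = asyncio.run(_demo())
print(result)
```

In a real agent, make_attempt would wrap the actual request, for instance a call made through client.with_options(max_retries=0) so that silent retries do not stack on top of the hedge. Hedging sends duplicate requests for slow turns, so it trades extra server load for lower tail latency.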

Results

Measured with scripts/voice_sim.py against a live Qwen/Qwen3.6-35B-A3B endpoint, comparing vanilla AsyncOpenAI with AsyncHopper on the same model and target. Vanilla opens one connection per call; Hopper reuses a single warm connection throughout.

| Scenario | Calls | Vanilla conns | Hopper conns | Vanilla TTFT (median) | Hopper TTFT (median) |
|---|---|---|---|---|---|
| gaps: conversational pauses between turns | 7 | 7 | 1 | 510 ms | 197 ms |
| bargein: stream cancelled after first token | 9 | 9 | 1 | 512 ms | 193 ms |
| concurrent: 20 sessions × 3 turns | 61 | 61 | 1 | 600 ms | 319 ms |
| steady: back-to-back turns, no gaps (control) | 9 | 9 | 1 | 514 ms | 193 ms |
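
To read the table as savings per turn, the short calculation below subtracts the Hopper median TTFT from the vanilla one for each scenario. The numbers are copied from the results above; ttft_savings is a hypothetical helper added here for illustration.

```python
# Median TTFT in milliseconds from the results table: (vanilla, hopper).
RESULTS = {
    "gaps": (510, 197),
    "bargein": (512, 193),
    "concurrent": (600, 319),
    "steady": (514, 193),
}


def ttft_savings(results):
    """Return {scenario: (ms saved, percent reduction)} per scenario."""
    savings = {}
    for name, (vanilla_ms, hopper_ms) in results.items():
        saved_ms = vanilla_ms - hopper_ms
        savings[name] = (saved_ms, round(100 * saved_ms / vanilla_ms, 1))
    return savings


for name, (saved_ms, pct) in ttft_savings(RESULTS).items():
    print(f"{name:<10} saved {saved_ms} ms ({pct}% lower median TTFT)")
```

The per-turn saving is roughly 280 to 320 ms in every scenario, which is consistent with the cost of the handshake that the vanilla client repeats on each call.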

Download files

Source Distribution

hopper_client-0.1.0.tar.gz (8.8 kB)

Uploaded Source

Built Distribution

hopper_client-0.1.0-py3-none-any.whl (4.8 kB)

Uploaded Python 3

File details

Details for the file hopper_client-0.1.0.tar.gz.

File metadata

  • Download URL: hopper_client-0.1.0.tar.gz
  • Upload date:
  • Size: 8.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for hopper_client-0.1.0.tar.gz
Algorithm Hash digest
SHA256 8e6ef49aba87660e93abb23c85bdade9c0eddf6e2bafc000056ef3314f63b532
MD5 9c04a48460ec039a71e9241f7ad103e5
BLAKE2b-256 b14af3f438eecf68f65f2aa238161b2fa159657c470d524ec39a11c99d6362fc

Provenance

The following attestation bundles were made for hopper_client-0.1.0.tar.gz:

Publisher: publish.yml on jashwanth-12/hopper-client

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file hopper_client-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: hopper_client-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 4.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for hopper_client-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b209c99565a896cbcf5442b782294e062f862716a69f77b40334e43f1545012a
MD5 2d651fb1911420502b5a0c33ddd4cbdc
BLAKE2b-256 dcc1216ef2675536ab8a1527fcd6689ab435f0b334352776cd1ae0dfd9561651

Provenance

The following attestation bundles were made for hopper_client-0.1.0-py3-none-any.whl:

Publisher: publish.yml on jashwanth-12/hopper-client
