Drop-in, voice-tuned OpenAI async client (HTTP/2 + warm connection reuse).
hopper-client
A drop-in, voice-tuned replacement for the OpenAI client. It exposes the same interface as the OpenAI client but ships a tuned transport configuration, so a voice agent stops paying a fresh TCP + TLS handshake on every turn.
Voice agents call an LLM once per conversational turn. With the
stock OpenAI client (HTTP/1.1, 5-second keep-alive), the connection is constantly
torn down and rebuilt between turns: conversational pauses exceed the keep-alive
window, so the idle HTTP/1.1 connection is closed before it can be reused. Each rebuild costs a TCP + TLS handshake, which adds hundreds of milliseconds to every turn. hopper-client
fixes this with a single warm connection that is reused across turns and interruptions.
# before
from openai import AsyncOpenAI
client = AsyncOpenAI(base_url="https://api.withhopper.com/v1", api_key="sk-...")
# after — same constructor, same methods
from hopper import AsyncHopper as AsyncOpenAI
client = AsyncOpenAI(base_url="https://api.withhopper.com/v1", api_key="sk-...")
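To make the keep-alive argument concrete, here is a toy model of connection reuse. It is a simplified sketch, not how httpx is implemented: it assumes an idle connection is reusable only when the pause before the next turn is shorter than the keep-alive window, and the pause lengths are hypothetical rather than measured.

```python
def count_handshakes(gaps_s, keepalive_expiry_s):
    """Count connections opened across a sequence of turns.

    gaps_s[i] is the idle time in seconds before turn i + 1; the first
    turn always opens a connection. Simplified model: a gap at or beyond
    the keep-alive window forces a fresh TCP + TLS handshake.
    """
    handshakes = 1  # the first turn always connects
    for gap in gaps_s:
        if gap >= keepalive_expiry_s:
            handshakes += 1
    return handshakes


# Hypothetical pauses (seconds) between six conversational turns.
gaps = [8.0, 2.0, 12.0, 6.0, 30.0]

print(count_handshakes(gaps, 5.0))    # 5: only the 2 s pause reuses the connection
print(count_handshakes(gaps, 300.0))  # 1: every pause fits inside the window
```

Under this model, widening the window from 5 s to 300 s is what turns "one handshake per turn" into "one handshake per conversation".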
Install
pip install hopper-client
# or, from source:
pip install -e .
This pulls in openai and httpx[http2].
Parameters & reasoning
AsyncHopper has the same constructor as AsyncOpenAI, and every OpenAI
parameter is accepted and passed straight through. The only difference is the
default transport: when you don't supply your own http_client, Hopper builds
one tuned for voice. To override anything, pass your own http_client=.
| Setting | OpenAI default | Hopper default | Why |
|---|---|---|---|
| http2 | False | True | One connection multiplexes many streams. Cancelling a stream (barge-in) sends RST_STREAM and keeps the connection warm; under HTTP/1.1 a half-read connection can't be reused and is closed. This is the core fix. |
| keepalive_expiry | 5.0s | 300s | httpx reaps idle connections after this long. Voice turns are seconds apart (TTS playing, the user speaking); a 5s window means the next turn re-handshakes. 300s spans normal conversational gaps. Coordinate with your server/LB idle timeout: if the server closes the socket first, the warm pool fills with dead connections. |
| max_keepalive_connections | 100 | 20 | With HTTP/2 a single connection carries many streams, so the pool stays small. A small pool also bounds TCP head-of-line-blocking blast radius on lossy links. Raise it if one process drives many concurrent sessions. |
| max_connections | 1000 | 100 | The client shouldn't be the bottleneck; server-side admission control (429) is the real concurrency limit. With HTTP/2 you rarely approach this. |
| connect timeout | 5.0s | 3.0s | Fail a bad connection fast so the caller can retry/hedge instead of stalling a live turn. |
| read / write / pool timeout | 600s | 60s | read is the per-chunk gap (it bounds both TTFT and inter-token stalls), not a total cap. 60s is generous but far tighter than 600s. Override per-request for a strict turn budget. |
| max_retries | 2 | 2 (unchanged) | Left at the default for now. Note retries add latency silently; voice deployments may prefer 0–1 plus explicit hedging. |
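For readers who want to see these settings as code, the sketch below builds an httpx client from the "Hopper default" column. It is an illustration under stated assumptions, not Hopper's actual source: the names `VOICE_LIMITS`, `VOICE_TIMEOUT`, and `build_voice_http_client` are hypothetical, and only the numeric values come from the table.

```python
# Values copied from the "Hopper default" column of the table.
VOICE_LIMITS = {
    "max_connections": 100,
    "max_keepalive_connections": 20,
    "keepalive_expiry": 300.0,  # seconds an idle connection stays pooled
}
VOICE_TIMEOUT = {"connect": 3.0, "read": 60.0, "write": 60.0, "pool": 60.0}


def build_voice_http_client():
    """Return an httpx.AsyncClient configured with the values above.

    Hypothetical helper, not part of hopper-client. httpx is imported
    lazily; http2=True requires the httpx[http2] extra to be installed.
    """
    import httpx

    return httpx.AsyncClient(
        http2=True,
        limits=httpx.Limits(**VOICE_LIMITS),
        timeout=httpx.Timeout(**VOICE_TIMEOUT),
    )
```

A client built this way can be passed as `http_client=build_voice_http_client()` to the constructor. The same route works for deviating from the defaults, for example a shorter `keepalive_expiry` to stay under a load balancer's idle timeout.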
Results
Measured with scripts/voice_sim.py against a live Qwen/Qwen3.6-35B-A3B
endpoint, comparing vanilla AsyncOpenAI with AsyncHopper on the same model and
target. Vanilla opens one connection per
call; Hopper reuses a single warm connection throughout.
| Scenario | Calls | Vanilla conns | Hopper conns | Vanilla TTFT (median) | Hopper TTFT (median) |
|---|---|---|---|---|---|
| gaps — conversational pauses between turns | 7 | 7 | 1 | 510 ms | 197 ms |
| bargein — stream cancelled after first token | 9 | 9 | 1 | 512 ms | 193 ms |
| concurrent — 20 sessions × 3 turns | 61 | 61 | 1 | 600 ms | 319 ms |
| steady — back-to-back turns, no gaps (control) | 9 | 9 | 1 | 514 ms | 193 ms |
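The bargein row cancels each stream after its first token. The sketch below shows one way that pattern might look with the OpenAI-compatible streaming interface. It is not the code in scripts/voice_sim.py; the function name `first_token_then_cancel` is hypothetical, and it assumes `client` exposes the AsyncOpenAI interface (which AsyncHopper is described as doing).

```python
async def first_token_then_cancel(client, model, messages):
    """Stream a chat completion, return the first text delta, then cancel.

    `client` is assumed to expose the AsyncOpenAI interface.
    Returns None if the stream ends without producing any text.
    """
    stream = await client.chat.completions.create(
        model=model, messages=messages, stream=True
    )
    try:
        async for chunk in stream:
            # Some chunks carry no choices or an empty delta; skip those.
            if chunk.choices and chunk.choices[0].delta.content:
                return chunk.choices[0].delta.content
        return None
    finally:
        # Barge-in: stop reading and release the stream early.
        await stream.close()
```

Call it with `asyncio.run(first_token_then_cancel(client, model, messages))`. According to the settings table, closing the stream early under HTTP/2 should cancel only that stream and leave the connection warm for the next turn, whereas under HTTP/1.1 the half-read connection is closed.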