Adaptive hedged request library for Python. Learns per-host latency via DDSketch, fires backup requests at estimated p90, caps hedge rate with token bucket.

hedge-python

Python port of bhope/hedge — adaptive hedged requests for tail-latency optimisation.

hedge-python learns per-host latency distributions with DDSketch, races a backup request when the primary exceeds its estimated p90, and caps the hedge rate with a token bucket to prevent load amplification during outages. Zero configuration required. First-class support for httpx, aiohttp, and gRPC (unary + server-streaming).

Inspired by Dean & Barroso, The Tail at Scale (CACM 2013).

Why hedging?

A small fraction of slow responses dominates user-perceived latency. Hedging sends a duplicate request once the primary has been outstanding longer than expected (here, longer than the estimated p90). Whichever finishes first wins; the other is cancelled.

Results on a benchmark with 5% straggler requests (10× slower). Latencies are in milliseconds; Overhead is the extra backend traffic caused by hedges:

Multi-framework benchmark

Framework	Configuration	p50	p90	p95	p99	p999	Overhead
httpx	No hedging	5.8	10.3	12.2	51.3	78.3	0.0%
httpx	Adaptive (hedge)	6.2	10.5	12.1	18.8	22.2	7.0%
aiohttp	No hedging	6.3	10.7	13.0	52.4	79.0	0.0%
aiohttp	Adaptive (hedge)	6.5	11.3	13.8	20.5	25.1	4.6%
grpc	No hedging	6.5	10.8	12.7	59.9	82.0	0.0%
grpc	Adaptive (hedge)	6.9	11.6	13.7	20.4	23.5	5.6%

Across all three frameworks, p99 latency drops by 60–66% at the cost of ~5–7% extra backend traffic. Reproduce with make bench-multi && make bench-plot.
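The quoted reduction can be re-derived from the table. The short check below copies the p99 values from the rows above and computes the relative drop per framework; the function and variable names are only illustrative.

```python
# p99 latencies in ms, copied from the benchmark table: (no hedging, adaptive).
P99_MS = {
    "httpx": (51.3, 18.8),
    "aiohttp": (52.4, 20.5),
    "grpc": (59.9, 20.4),
}


def p99_reduction(baseline_ms: float, hedged_ms: float) -> float:
    """Fractional drop in p99 latency relative to the unhedged baseline."""
    return 1.0 - hedged_ms / baseline_ms


reductions = {name: p99_reduction(*pair) for name, pair in P99_MS.items()}
for name, value in reductions.items():
    print(f"{name}: p99 reduced by {value:.1%}")
```

All three values fall inside the 60–66% range stated above.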

Quick Start

# Install with your preferred framework
pip install hedge-python[httpx]
pip install hedge-python[aiohttp]
pip install hedge-python[grpc]
pip install hedge-python[all]   # all frameworks

httpx

import asyncio
import httpx
from hedge import HedgeConfig
from hedge.transport import HedgedHttpxTransport

async def main():
    transport = HedgedHttpxTransport(config=HedgeConfig())
    async with httpx.AsyncClient(transport=transport) as client:
        resp = await client.get("https://api.example.com/data")
        print(resp.status_code)

asyncio.run(main())

aiohttp

import asyncio
from hedge import HedgeConfig
from hedge.transport import HedgedAiohttpSession

async def main():
    async with HedgedAiohttpSession(config=HedgeConfig()) as session:
        resp = await session.get("https://api.example.com/data")
        data = await resp.json()
        print(data)

asyncio.run(main())

gRPC (Unary)

import grpc.aio
from hedge import HedgeConfig
from hedge.interceptor import HedgedUnaryInterceptor

async def make_channel():
    return grpc.aio.insecure_channel(
        "localhost:50051",
        interceptors=[HedgedUnaryInterceptor(config=HedgeConfig(estimated_rps=500))],
    )

gRPC (Server Streaming — LLM inference, log tailing, …)

import grpc.aio
from hedge import HedgeConfig
from hedge.interceptor import HedgedServerStreamInterceptor

async def make_channel():
    return grpc.aio.insecure_channel(
        "localhost:50051",
        interceptors=[HedgedServerStreamInterceptor(config=HedgeConfig())],
    )

For server streaming, the hedge signal is time-to-first-message (TTFM): if the primary stream doesn't yield its first chunk within the estimated p90, a backup stream is started. Whichever yields first wins and continues streaming; the loser is cancelled at the wire level.
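The TTFM race can be sketched with plain asyncio async generators. This is a minimal illustration under stated assumptions, not the interceptor's actual code: `hedged_stream`, `open_stream`, and `demo` are hypothetical names, streams are modelled as async generators rather than gRPC calls, and every stream is assumed to yield at least one message.

```python
import asyncio
from typing import AsyncGenerator, Callable


async def _first(stream: AsyncGenerator[str, None]) -> str:
    """Await the first message of a stream."""
    return await stream.__anext__()


async def hedged_stream(
    open_stream: Callable[[], AsyncGenerator[str, None]],
    hedge_delay: float,
) -> AsyncGenerator[str, None]:
    """Race two streams on time-to-first-message, then follow the winner."""
    primary = open_stream()
    first_primary = asyncio.create_task(_first(primary))
    done, _ = await asyncio.wait({first_primary}, timeout=hedge_delay)
    winner, first_task = primary, first_primary
    if not done:
        # Primary missed the deadline: open a backup stream and race both.
        backup = open_stream()
        first_backup = asyncio.create_task(_first(backup))
        done, pending = await asyncio.wait(
            {first_primary, first_backup}, return_when=asyncio.FIRST_COMPLETED
        )
        if first_primary not in done:
            winner, first_task = backup, first_backup
        loser = backup if winner is primary else primary
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
        await loser.aclose()  # release the losing stream
    yield first_task.result()
    async for message in winner:
        yield message


async def demo() -> list[str]:
    opened = 0

    def open_stream() -> AsyncGenerator[str, None]:
        nonlocal opened
        opened += 1
        tag = f"s{opened}"
        delay = 0.2 if opened == 1 else 0.01  # the first stream is a straggler

        async def gen() -> AsyncGenerator[str, None]:
            await asyncio.sleep(delay)
            for i in range(3):
                yield f"{tag}-{i}"

        return gen()

    return [m async for m in hedged_stream(open_stream, hedge_delay=0.05)]
```

Running `asyncio.run(demo())` returns the three messages of the second stream, because the first one misses the 50 ms deadline.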

Runnable examples for each framework live in examples/ — the gRPC ones are fully self-contained (they spin up a local server with simulated stragglers so you can see hedging in action without any external dependency). See examples/README.md for the index.

How It Works

1. DDSketch quantile estimator

Each target host gets a WindowedSketch — a pair of DDSketches that rotate every 30 seconds. DDSketch uses logarithmic bucket mapping to provide relative-error guarantees: any quantile estimate is within ±1% of the true value, regardless of the underlying distribution.
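To make the logarithmic bucket mapping concrete, here is a minimal sketch of the scheme described in the DDSketch paper. It is illustrative only: `TinyDDSketch` is a hypothetical class, not the library's `_ddsketch.py`, and it handles only positive values with no window rotation or bucket limit.

```python
import math


class TinyDDSketch:
    """Minimal DDSketch-style quantile estimator for positive values."""

    def __init__(self, relative_accuracy: float = 0.01) -> None:
        # gamma is chosen so that each bucket spans a (1 +/- accuracy) band.
        self.gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
        self.buckets: dict = {}
        self.count = 0

    def add(self, value: float) -> None:
        # Bucket i covers the interval (gamma**(i - 1), gamma**i].
        index = math.ceil(math.log(value, self.gamma))
        self.buckets[index] = self.buckets.get(index, 0) + 1
        self.count += 1

    def quantile(self, q: float) -> float:
        if self.count == 0:
            raise ValueError("empty sketch")
        rank = q * (self.count - 1)
        seen = 0
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # This representative value is within the configured relative
                # accuracy of every value the bucket can contain.
                return 2 * self.gamma**index / (self.gamma + 1)
        raise AssertionError("unreachable: cumulative count covers every rank")


sketch = TinyDDSketch(relative_accuracy=0.01)
for latency_ms in range(1, 1001):
    sketch.add(float(latency_ms))
estimated_p90 = sketch.quantile(0.90)
```

With latencies 1..1000 ms the value at the p90 rank is 900 ms, and `estimated_p90` lands within about 1% of it while the sketch stores only bucket counters.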

2. Adaptive trigger

On each request, the transport queries the sketch for the configured percentile (default p90). If the primary hasn't responded by that deadline, a backup request is fired. Whichever response arrives first is returned; the loser is cancelled (including the underlying gRPC Call for streams).

              ┌─ primary  ─────────── ✓ (fast) ──→ return
request ──────┤
              └─ hedge fires after p90 ─── ✗ (cancelled)
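The race in the diagram can be sketched in a few lines of asyncio. This is a simplified illustration, assuming a hypothetical zero-argument `send` coroutine function and a precomputed `hedge_delay`; the real transports also consult the sketch and the token bucket, and decide how to treat errors.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def hedged_call(send: Callable[[], Awaitable[T]], hedge_delay: float) -> T:
    """Return the first completed attempt, firing one backup after hedge_delay."""
    primary = asyncio.ensure_future(send())
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        return primary.result()  # fast path: no hedge needed
    backup = asyncio.ensure_future(send())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the loser is cancelled
    await asyncio.gather(*pending, return_exceptions=True)
    winner = primary if primary in done else backup
    return winner.result()  # re-raises if the first finisher failed


async def demo() -> str:
    attempts = 0

    async def send() -> str:
        nonlocal attempts
        attempts += 1
        n = attempts
        await asyncio.sleep(0.2 if n == 1 else 0.01)  # attempt 1 is a straggler
        return f"attempt-{n}"

    return await hedged_call(send, hedge_delay=0.05)
```

In `demo`, the primary takes 200 ms, so the backup fired at 50 ms wins and the primary is cancelled.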

3. Token bucket budget

Hedges are rate-limited by a token bucket that refills at estimated_rps × budget_percent / 100 tokens per second. During genuine outages the bucket drains and hedging stops automatically — preventing the load-doubling spiral that would deepen the incident.
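A token bucket with this refill rate might look like the sketch below. It is not the library's `_token_bucket.py`: the class is hypothetical, it is not thread-safe, and the burst capacity is assumed to equal one second of refill. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket; one token is spent per hedge request."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        """Spend one token if available; never blocks."""
        now = self._clock()
        refill = (now - self._last) * self.rate
        self._tokens = min(self.capacity, self._tokens + refill)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False  # budget exhausted: skip the hedge


# With the defaults from the configuration table:
estimated_rps, budget_percent = 100.0, 10.0
hedges_per_second = estimated_rps * budget_percent / 100  # 10 tokens per second
bucket = TokenBucket(rate=hedges_per_second, capacity=hedges_per_second)
```

When the backend is slow across the board, every request wants a hedge, the bucket empties within a burst, and further hedges are refused until tokens trickle back in.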

gRPC implementation note

The gRPC intercept_unary_unary continuation returns a Call object almost immediately; the real RTT is spent in the subsequent await call. We wrap both steps in a single asyncio task so the hedge timer reflects true end-to-end RPC latency. Cancelling a loser invokes call.cancel() first (notifying the server) then task.cancel() (cleaning up the coroutine).
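The two-step pattern can be illustrated without a gRPC server. In the sketch below, `FakeCall` is a hypothetical stand-in for a `grpc.aio` call object (awaitable, with a `cancel()` method), and the continuation takes a single argument for brevity, whereas a real interceptor continuation receives the client call details and the request.

```python
import asyncio
from typing import Awaitable, Callable, List


class FakeCall:
    """Stand-in for a grpc.aio call: awaitable and cancellable."""

    def __init__(self, latency_s: float, reply: str) -> None:
        self.latency_s = latency_s
        self.reply = reply
        self.cancelled = False

    def cancel(self) -> bool:
        self.cancelled = True  # a real call would notify the server here
        return True

    async def _wait(self) -> str:
        await asyncio.sleep(self.latency_s)
        return self.reply

    def __await__(self):
        return self._wait().__await__()


async def run_rpc(
    continuation: Callable[[float], Awaitable[FakeCall]],
    request: float,
    calls: List[FakeCall],
) -> str:
    """Both steps in one coroutine, so one task spans the whole RPC."""
    call = await continuation(request)  # returns almost immediately
    calls.append(call)                  # keep a handle for cancellation
    return await call                   # the real round trip happens here


async def cancel_loser(task: "asyncio.Task[str]", calls: List[FakeCall]) -> None:
    for call in calls:
        call.cancel()  # first: cancel the RPC itself
    task.cancel()      # then: clean up the wrapping task
    await asyncio.gather(task, return_exceptions=True)


async def demo():
    async def continuation(latency_s: float) -> FakeCall:
        return FakeCall(latency_s, f"reply after {latency_s}s")

    slow_calls: List[FakeCall] = []
    fast_calls: List[FakeCall] = []
    slow = asyncio.create_task(run_rpc(continuation, 0.2, slow_calls))
    fast = asyncio.create_task(run_rpc(continuation, 0.01, fast_calls))
    await asyncio.wait({slow, fast}, return_when=asyncio.FIRST_COMPLETED)
    await cancel_loser(slow, slow_calls)
    return fast.result(), slow_calls[0].cancelled, slow.cancelled()
```

Because the task wraps both awaits, a timer measured against the task covers the full round trip rather than just the cheap continuation step.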

Configuration

All knobs live on HedgeConfig:

Parameter	Type	Default	Description
`percentile`	`float`	`0.90`	Sketch quantile used as hedge trigger
`max_hedges`	`int`	`1`	Maximum concurrent hedge requests per call
`budget_percent`	`float`	`10.0`	Max hedge rate as percent of total traffic
`estimated_rps`	`float`	`100.0`	Expected requests per second; sets token bucket capacity
`min_delay`	`float`	`0.001`	Floor on the hedge delay in seconds
`warmup_requests`	`int`	`20`	Number of initial requests using fixed delay
`warmup_delay`	`float`	`0.01`	Fixed hedge delay during warmup in seconds
`window_duration`	`float`	`30.0`	Sketch window rotation interval in seconds
`stats`	`Stats \| None`	`None`	Inject a custom `Stats` for observability

Tip — estimated_rps: pick a value close to your real RPS so the token bucket capacity (rps × budget_percent / 100) is meaningful. If unsure, start at the default 100.0 and watch hedge_rate / budget_exhausted in the stats snapshot.
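One plausible way the delay-related knobs in the table combine is sketched below. This is an assumption drawn from the parameter descriptions, not the library's scheduler code; `hedge_delay` and its `observed` and `estimated_quantile` arguments are hypothetical names.

```python
def hedge_delay(
    observed: int,
    estimated_quantile: float,
    *,
    warmup_requests: int = 20,
    warmup_delay: float = 0.01,
    min_delay: float = 0.001,
) -> float:
    """Seconds to wait before firing a hedge (assumed combination rule)."""
    if observed < warmup_requests:
        # Too few samples to trust the sketch: use the fixed warmup delay.
        return warmup_delay
    # Otherwise use the sketch estimate, but never go below the floor.
    return max(min_delay, estimated_quantile)


print(hedge_delay(observed=5, estimated_quantile=0.2))      # still warming up
print(hedge_delay(observed=50, estimated_quantile=0.012))   # sketch estimate
print(hedge_delay(observed=50, estimated_quantile=0.0002))  # clamped to floor
```

The floor matters for very fast backends, where an unclamped p90 of a fraction of a millisecond would hedge almost every request.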

Observability

from hedge import HedgeConfig, Stats
from hedge.transport import HedgedHttpxTransport

stats = Stats()
transport = HedgedHttpxTransport(config=HedgeConfig(stats=stats))

# ... after running some traffic ...
snap = stats.snapshot()
print(f"total={snap.total_requests} hedged={snap.hedged_requests}")
print(f"hedge_wins={snap.hedge_wins} primary_wins={snap.primary_wins}")
print(f"budget_exhausted={snap.budget_exhausted}")
print(f"hedge_rate={stats.hedge_rate():.2%}")

Stats is fully thread-safe and can be shared across multiple transports/interceptors to aggregate metrics.

Benchmarks & charts

Two benchmark suites and a plotting target ship with the project:

Command	What it does	Output
`make bench-compare`	httpx only: No hedging vs Static 10ms vs Static 50ms vs Adaptive	`benchmark/results.csv`
`make bench-multi`	httpx vs aiohttp vs gRPC, No hedging vs Adaptive	`benchmark/results_multi.csv`
`make bench-plot`	Render both CSVs into charts	`eval.png`, `eval_multi_framework.png`

Each suite runs 500 requests against a simulated lognormal latency (mean=5ms, stddev=2ms) with 5% straggler probability (10× spike).
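A latency source with these properties could be simulated as follows. This is a sketch, not the project's benchmark harness: it assumes that mean and stddev describe the lognormal distribution itself (not the underlying normal) and that a straggler multiplies the sampled latency by 10. The function name and `seed` parameter are illustrative.

```python
import math
import random
from typing import List


def simulate_latencies_ms(
    n: int = 500,
    mean_ms: float = 5.0,
    stddev_ms: float = 2.0,
    straggler_prob: float = 0.05,
    straggler_factor: float = 10.0,
    seed: int = 0,
) -> List[float]:
    """Sample n lognormal latencies (ms) with occasional straggler spikes."""
    rng = random.Random(seed)
    # Convert the desired mean/stddev into the parameters of the underlying
    # normal distribution: sigma^2 = ln(1 + s^2 / m^2), mu = ln(m) - sigma^2 / 2.
    sigma_sq = math.log(1.0 + (stddev_ms / mean_ms) ** 2)
    mu = math.log(mean_ms) - sigma_sq / 2.0
    sigma = math.sqrt(sigma_sq)
    samples = []
    for _ in range(n):
        latency = rng.lognormvariate(mu, sigma)
        if rng.random() < straggler_prob:
            latency *= straggler_factor  # simulated straggler
        samples.append(latency)
    return samples


latencies = simulate_latencies_ms()
print(f"max={max(latencies):.1f} ms over {len(latencies)} requests")
```

With 5% stragglers at 10× latency, the p99 of such a sample is dominated by the spikes, which is the regime where hedging at p90 pays off.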

Development

# Install uv (if not already)
curl -LsSf https://astral.sh/uv/install.sh | sh

make install            # install all extras with uv
make lint               # ruff check
make typecheck          # mypy
make test               # all tests
make test-unit          # unit tests only
make test-integration   # integration tests (requires httpx / aiohttp / grpcio)
make coverage           # coverage report
make bench-multi        # multi-framework benchmark
make bench-plot         # render charts
make ci                 # lint + typecheck + test + coverage

Testing

Unit tests (tests/unit/): DDSketch, token bucket, scheduler, stats, options, lazy import shims, gRPC interceptor branches (with fake continuations).
Integration tests (tests/integration/): real httpx transport, real aiohttp session, real local gRPC server with .proto + generated pb2.
Benchmarks (tests/benchmark/): DDSketch microbench, token bucket microbench, four-config comparison, three-framework comparison.

Current coverage: 97% (122 tests, ~7 seconds).

Project Structure

hedge-python/
├── src/hedge/
│   ├── __init__.py          # Public API
│   ├── _options.py          # HedgeConfig dataclass
│   ├── _stats.py            # Thread-safe Stats + StatsSnapshot
│   ├── sketch/
│   │   ├── _ddsketch.py     # DDSketch quantile estimator
│   │   └── _windowed.py     # Sliding-window DDSketch pair
│   ├── budget/
│   │   └── _token_bucket.py # Token bucket rate limiter
│   ├── transport/
│   │   ├── _base.py         # Shared HedgeScheduler logic
│   │   ├── _httpx.py        # httpx AsyncBaseTransport adapter
│   │   └── _aiohttp.py      # aiohttp session wrapper
│   └── interceptor/
│       └── _grpc.py         # gRPC unary + server-stream interceptors
├── tests/
│   ├── unit/                # 7 unit-test files
│   ├── integration/
│   │   ├── proto/           # .proto + generated pb2 / pb2_grpc
│   │   ├── test_httpx_transport.py
│   │   ├── test_aiohttp_session.py
│   │   └── test_grpc_interceptor.py
│   └── benchmark/
│       ├── test_bench_ddsketch.py
│       ├── test_bench_token_bucket.py
│       ├── test_bench_hedge_comparison.py    # httpx 4-config
│       └── test_bench_multi_framework.py     # 3-framework comparison
├── benchmark/
│   ├── plot.py              # CSV → matplotlib charts
│   ├── results.csv          # produced by bench-compare
│   └── results_multi.csv    # produced by bench-multi
├── eval.png                 # single-framework chart
├── eval_multi_framework.png # cross-framework chart
├── pyproject.toml
├── Makefile
└── .github/workflows/ci.yml

References

Jeffrey Dean and Luiz André Barroso. "The Tail at Scale." Communications of the ACM, 56(2):74–80, February 2013.
Charles Masson, Jee E. Rim, and Homin K. Lee. "DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees." Proceedings of the VLDB Endowment, 12(12):2195–2205, 2019.

Changelog

See CHANGELOG.md for the full release history.

License

hedge-python is released under the MIT License.

Project details

This version

0.1.0

Apr 23, 2026

Download files

Source Distribution

hedge_python-0.1.0.tar.gz (445.1 kB), uploaded Apr 23, 2026 via twine/6.1.0 CPython/3.13.12 using Trusted Publishing.

Algorithm	Hash digest
SHA256	`7d7ed373836d76bb54d684331ad86c9a5d8eb947f7db1bd895bae712311ae680`
MD5	`ef2528dcb33176ea81cb4cebeb1ac99f`
BLAKE2b-256	`4a00e99af93944b704e59ff0b9cb26a21fd9dfb82ab1709a09a3d465a68b5bb5`

Built Distribution

hedge_python-0.1.0-py3-none-any.whl (22.0 kB, Python 3), uploaded Apr 23, 2026 via twine/6.1.0 CPython/3.13.12 using Trusted Publishing.

Algorithm	Hash digest
SHA256	`7c65972b563671e19017c83a85049dcf3f01928671f148fc3a163ed303392ec5`
MD5	`08567a3a40a702007c6778d909f413cd`
BLAKE2b-256	`5df7599178eb585f3b1d6b755dc823349ed12634f493d5f965fac588fc209376`

Provenance

Attestation bundles for both files were published by release.yml on sunhailin-Leo/hedge-python. Each file's subject digest is its SHA256 hash listed above.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Sigstore transparency entry: 1364573616 (source distribution), 1364573703 (wheel)
- Sigstore integration time: Apr 23, 2026
Source repository:
- Permalink: sunhailin-Leo/hedge-python@ee9d79837d12c397f1455cd0832cd3611a63008a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sunhailin-Leo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ee9d79837d12c397f1455cd0832cd3611a63008a
- Trigger Event: push
