Demand-aware model routing for LiteLLM — route to models with accelerating adoption.
litellm-wzrd-momentum
Demand-aware model routing for LiteLLM.
Most LLM routers optimize for cost, latency, and quality. This adds a fourth axis: which models are gaining real-world traction right now?
The signal comes from WZRD — live velocity tracking across HuggingFace downloads, GitHub stars, and OpenRouter routing volume. Updated every 5 minutes.
Install
pip install litellm-wzrd-momentum
Quick start
from litellm import Router
from wzrd_momentum_strategy import register
router = Router(model_list=[
    {"model_name": "qwen-9b", "litellm_params": {"model": "openrouter/qwen/qwen-3.5-9b"}},
    {"model_name": "qwen-35b", "litellm_params": {"model": "openrouter/qwen/qwen-3.5-35b-a3b"}},
    {"model_name": "llama-70b", "litellm_params": {"model": "openrouter/meta-llama/llama-3.3-70b-instruct"}},
])
register(router, alias_map={
    "qwen-9b": ["Qwen/Qwen3.5-9B"],
    "qwen-35b": ["Qwen/Qwen3.5-35B-A3B"],
    "llama-70b": ["meta-llama/Llama-3.3-70B-Instruct"],
})
# Every call now routes via momentum (sync)
response = router.completion(
    model="qwen-9b",
    messages=[{"role": "user", "content": "Hello"}],
)
# Or async
# response = await router.acompletion(model="qwen-9b", messages=[...])
How it works
- On each routing decision, fetches WZRD momentum signals (cached for 5 minutes)
- Scores each deployment: trend + momentum × 0.3 + delta × 0.25, weighted by confidence
- Returns the highest-scoring deployment to LiteLLM
- LiteLLM handles retries, fallbacks, and provider errors as normal
If WZRD is unreachable, the strategy returns the first deployment, so a WZRD outage does not break your inference pipeline.
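The selection and fallback steps above can be sketched as follows. This is a minimal illustration, not the package's actual code: the function name `pick_deployment`, the `scores` mapping keyed by `model_name`, and the choice to treat unmatched deployments as neutral (0.0) are all assumptions made for the example.

```python
from typing import Mapping, Optional, Sequence


def pick_deployment(
    deployments: Sequence[dict],
    scores: Optional[Mapping[str, float]],
) -> dict:
    """Return the highest-scoring deployment, or the first one if no signal exists.

    `scores` is a hypothetical mapping of model_name -> momentum score;
    pass None (or an empty mapping) when WZRD could not be reached.
    """
    if not deployments:
        raise ValueError("no deployments to choose from")
    if not scores:
        # No usable signal: fall back to deployment order (first candidate).
        return deployments[0]
    # Assumption: a deployment with no matched signal counts as neutral (0.0).
    # max() keeps the earliest deployment on ties, preserving configured order.
    return max(deployments, key=lambda d: scores.get(d["model_name"], 0.0))
```

Because ties resolve to the earliest entry, the order of `model_list` still acts as a tie-breaker in this sketch.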
Behavior defaults
- cache_ttl=300 seconds (5 minutes)
- confidence policy:
  - normal: full signal weight (eligible for proactive routing)
  - low: half signal weight (observe-first posture)
  - insufficient: zero signal weight (observe-only; no proactive push)
- fallback policy: if WZRD is down or payload contract drifts, route by deployment order (first candidate)
- contract guard: requires contract_version (or legacy signal_version) and model-level fields (model, trend, score, confidence)
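A contract guard of this kind might look like the sketch below. The field names come from the list above, but the top-level `models` key and the function name `payload_is_valid` are assumptions for illustration; the real payload layout may differ.

```python
REQUIRED_MODEL_FIELDS = ("model", "trend", "score", "confidence")


def payload_is_valid(payload: object) -> bool:
    """Return True only if the payload carries a version marker and complete model entries."""
    if not isinstance(payload, dict):
        return False
    # Accept the current field name or the legacy one.
    if "contract_version" not in payload and "signal_version" not in payload:
        return False
    # Assumption: model entries live under a top-level "models" list.
    models = payload.get("models")
    if not isinstance(models, list):
        return False
    return all(
        isinstance(entry, dict) and all(field in entry for field in REQUIRED_MODEL_FIELDS)
        for entry in models
    )
```

When a check like this fails, the caller would treat the signal as unavailable and route by deployment order, as the fallback policy describes.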
Score table
| Trend | Score | Signal |
|---|---|---|
| surging | +3.0 | Downloads/stars growing >50% day-over-day |
| accelerating | +2.0 | Growing 10-50% day-over-day |
| stable | 0.0 | Flat or <10% growth |
| decelerating | -1.0 | Slowing 5-30% day-over-day |
| cooling | -2.0 | Dropping >30% day-over-day |
Confidence scaling: normal = full weight, low = 50%, insufficient = 0% (new models with <3 days of data).
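As a worked illustration of the table and the confidence scaling, here is a small sketch. It assumes "weighted by confidence" means multiplying the whole sum by the confidence weight, and that an unknown trend or confidence label contributes nothing; the package may combine these terms differently. The name `momentum_score` is hypothetical.

```python
# Trend scores and confidence weights taken from the table and note above.
TREND_SCORES = {
    "surging": 3.0,
    "accelerating": 2.0,
    "stable": 0.0,
    "decelerating": -1.0,
    "cooling": -2.0,
}
CONFIDENCE_WEIGHTS = {"normal": 1.0, "low": 0.5, "insufficient": 0.0}


def momentum_score(
    trend: str,
    momentum: float = 0.0,
    delta: float = 0.0,
    confidence: str = "normal",
) -> float:
    """Compute trend + momentum * 0.3 + delta * 0.25, scaled by confidence."""
    # Assumption: unknown trend labels are treated as neutral.
    base = TREND_SCORES.get(trend, 0.0) + momentum * 0.3 + delta * 0.25
    # Assumption: unknown confidence labels get zero weight.
    return base * CONFIDENCE_WEIGHTS.get(confidence, 0.0)
```

Under these assumptions a surging model with low confidence scores 1.5, which is below an accelerating model with normal confidence (2.0), so confidence can change the ranking.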
Alias mapping
WZRD tracks models by HuggingFace/GitHub name (Qwen/Qwen3.5-9B).
LiteLLM uses provider-specific names (openrouter/qwen/qwen-3.5-9b).
The alias_map bridges them explicitly. Without it, the strategy auto-matches
by extracting slugs from litellm_params.model — works for most cases, but
explicit mapping is more reliable.
register(router, alias_map={
"qwen-9b": ["Qwen/Qwen3.5-9B", "Qwen/Qwen3-9B"], # multiple variants
"llama-70b": ["meta-llama/Llama-3.3-70B-Instruct"],
})
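To make the two matching paths concrete, here is a rough sketch of explicit-alias lookup with a slug-based fallback. The normalization rule (take the last path segment, lowercase it, drop non-alphanumeric characters) is an assumption chosen so that the example names line up; it is not necessarily how the package auto-matches. The helper names are hypothetical.

```python
import re
from typing import Mapping, Optional, Sequence, Set


def normalize(name: str) -> str:
    """Reduce a model identifier to a comparable slug."""
    slug = name.rsplit("/", 1)[-1]  # drop provider/org prefixes
    return re.sub(r"[^a-z0-9]", "", slug.lower())


def candidate_slugs(
    model_name: str,
    litellm_model: str,
    alias_map: Optional[Mapping[str, Sequence[str]]] = None,
) -> Set[str]:
    """Prefer explicit aliases; otherwise derive a slug from litellm_params.model."""
    if alias_map and model_name in alias_map:
        return {normalize(alias) for alias in alias_map[model_name]}
    return {normalize(litellm_model)}


def is_match(wzrd_model: str, slugs: Set[str]) -> bool:
    """Check whether a WZRD-tracked model name corresponds to the deployment."""
    return normalize(wzrd_model) in slugs
```

With this rule, `openrouter/qwen/qwen-3.5-9b` and `Qwen/Qwen3.5-9B` both reduce to the same slug. Aggressive normalization like this can also collide across unrelated models, which is one reason an explicit alias_map is the safer choice.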
Proxy integration
LiteLLM's proxy doesn't support custom strategies via YAML config. For proxy deployments, create a wrapper script:
# wzrd_proxy.py
import litellm
from litellm import Router
from wzrd_momentum_strategy import register
# Your normal proxy config
router = Router(model_list=[...])
register(router, alias_map={...})
# Start proxy with the patched router
from litellm.proxy.proxy_server import app
Or use the pre-router pattern from integrations/litellm-wzrd-router/ which
works as middleware before any LiteLLM call (SDK or proxy).
Manual setup
If you prefer explicit control over the register() convenience:
from wzrd_momentum_strategy import WZRDMomentumStrategy
strategy = WZRDMomentumStrategy(
    router,
    wzrd_url="https://api.twzrd.xyz/v1/signals/momentum",
    alias_map={"qwen-9b": ["Qwen/Qwen3.5-9B"]},
    cache_ttl=300,
)
router.set_custom_routing_strategy(strategy)
API
The momentum data comes from a public, free, no-auth endpoint:
GET https://api.twzrd.xyz/v1/signals/momentum
GET https://api.twzrd.xyz/v1/signals/momentum?platform=huggingface&trending=true
Returns trend classification, score, confidence, action, capabilities, and platform for 48+ tracked AI models.
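If you want to read the endpoint directly, a minimal standard-library sketch with a 5-minute cache could look like this. The URL is the one listed above; the `CachedSignals` class, its parameters, and the choice to return None on any fetch error are assumptions for illustration, and the sketch makes no claim about the response layout beyond it being JSON.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Optional

WZRD_URL = "https://api.twzrd.xyz/v1/signals/momentum"


def fetch_signals(url: str = WZRD_URL, timeout: float = 5.0) -> Any:
    """Fetch and parse the momentum payload (no authentication required)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


class CachedSignals:
    """Hypothetical TTL cache around a fetch function."""

    def __init__(
        self,
        fetch: Callable[[], Any] = fetch_signals,
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[Any] = None
        self._fetched_at = 0.0

    def get(self) -> Optional[Any]:
        """Return the cached payload, refreshing it once the TTL has elapsed."""
        now = self._clock()
        if self._value is not None and now - self._fetched_at < self._ttl:
            return self._value
        try:
            self._value = self._fetch()
            self._fetched_at = now
            return self._value
        except Exception:
            # Assumption: any failure means "no signal", so the caller can fall back.
            return None
```

Injecting `fetch` and `clock` keeps the cache testable without network access; in normal use `CachedSignals().get()` would hit the public endpoint at most once per TTL window.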
Expected output (live sample)
For a candidate set like qwen-9b, nemotron-120b, llama-70b, expected behavior is:
- route to nemotron-120b when it is surging
- deprioritize qwen-9b when decelerating
- deprioritize llama-70b when cooling
The exact winner changes as momentum updates, but routing should follow trend and confidence consistently.
v0.1.0 release notes
- Added LiteLLM CustomRoutingStrategyBase plugin with one-line registration helper
- Added trend + momentum + delta scoring with confidence weighting
- Added explicit alias map matching and automatic fallback matching from provider model slugs
- Added contract guard for WZRD payload shape (signal_version + required model fields)
- Added graceful degradation fallback to first deployment when WZRD is unavailable
- Added test suite coverage for scoring order, confidence behavior, matching paths, async routing, caching behavior, register helper, and payload contract guard
License
MIT
File details
Details for the file litellm_wzrd_momentum-0.1.1.tar.gz.
File metadata
- Download URL: litellm_wzrd_momentum-0.1.1.tar.gz
- Upload date:
- Size: 12.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c921a9e58bed2a1e09116aec659de0c3b89c1ef09cb525aa606f37154684d4f9 |
| MD5 | 8c87a277c3b5865210797b6b1802b2a7 |
| BLAKE2b-256 | edf8a6189586b1a1f513cfb96612738ed8d29896aa6d6689e19f26b6534fc9aa |
File details
Details for the file litellm_wzrd_momentum-0.1.1-py3-none-any.whl.
File metadata
- Download URL: litellm_wzrd_momentum-0.1.1-py3-none-any.whl
- Upload date:
- Size: 9.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 660786142eff8fb6a5dfd5d4216678ed23092a912fce8d66fe63026b4b8f6bbf |
| MD5 | 972bc7b18cf040362a84497cce44d48f |
| BLAKE2b-256 | e70a6d6bc58749291057789f552a8a5588f19195a4c9b5dbcd227ef83ee7951b |