Demand-aware model routing for LiteLLM — route to models with accelerating adoption.

litellm-wzrd-momentum

Demand-aware model routing for LiteLLM.

Every LLM router optimizes on cost, latency, and quality. This adds a fourth axis: which models are gaining real-world traction right now?

The signal comes from WZRD — live velocity tracking across HuggingFace downloads, GitHub stars, and OpenRouter routing volume. Updated every 5 minutes.

Install

pip install litellm-wzrd-momentum

Quick start

from litellm import Router
from wzrd_momentum_strategy import register

router = Router(model_list=[
    {"model_name": "qwen-9b",  "litellm_params": {"model": "openrouter/qwen/qwen-3.5-9b"}},
    {"model_name": "qwen-35b", "litellm_params": {"model": "openrouter/qwen/qwen-3.5-35b-a3b"}},
    {"model_name": "llama-70b","litellm_params": {"model": "openrouter/meta-llama/llama-3.3-70b-instruct"}},
])

register(router, alias_map={
    "qwen-9b":  ["Qwen/Qwen3.5-9B"],
    "qwen-35b": ["Qwen/Qwen3.5-35B-A3B"],
    "llama-70b": ["meta-llama/Llama-3.3-70B-Instruct"],
})

# Every call now routes via momentum (sync)
response = router.completion(
    model="qwen-9b",
    messages=[{"role": "user", "content": "Hello"}],
)

# Or async
# response = await router.acompletion(model="qwen-9b", messages=[...])

How it works

1. On each routing decision, fetches WZRD momentum signals (cached 5 min)
2. Scores each deployment: trend + momentum × 0.3 + delta × 0.25, weighted by confidence
3. Returns the highest-scoring deployment to LiteLLM
4. LiteLLM handles retries, fallbacks, and provider errors as normal

If WZRD is unreachable, the strategy returns the first deployment, so a WZRD outage does not interrupt routing.
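
The caching and fallback behavior above can be sketched as follows. This is a minimal illustration, not the package's implementation: `SignalCache`, `pick_deployment`, and the injected `fetch` callable are hypothetical names, and the signals are assumed to be a plain mapping from deployment name to score.

```python
import time
from typing import Callable, Dict, List, Optional


class SignalCache:
    """Hypothetical TTL cache around a signal fetcher (not the package's class)."""

    def __init__(self, fetch: Callable[[], Dict[str, float]], ttl: float = 300.0):
        self._fetch = fetch
        self._ttl = ttl
        self._value: Optional[Dict[str, float]] = None
        self._fetched_at = 0.0

    def get(self) -> Optional[Dict[str, float]]:
        now = time.monotonic()
        if self._value is not None and now - self._fetched_at < self._ttl:
            return self._value  # still fresh: no network call
        try:
            self._value = self._fetch()
            self._fetched_at = now
        except Exception:
            return None  # fetch failed: report "no data" to the caller
        return self._value


def pick_deployment(deployments: List[str], cache: SignalCache) -> str:
    """Return the best-scoring deployment, or the first one if no signals exist."""
    scores = cache.get()
    if not scores:
        return deployments[0]  # fallback: deployment order
    return max(deployments, key=lambda name: scores.get(name, 0.0))
```

Injecting `fetch` keeps the network call out of the routing path in tests; a failing fetch simply degrades to deployment order.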

Behavior defaults

cache_ttl=300 seconds (5 minutes)
confidence policy:
- normal: full signal weight (eligible for proactive routing)
- low: half signal weight (observe-first posture)
- insufficient: zero signal weight (observe-only; no proactive push)
fallback policy: if WZRD is down or payload contract drifts, route by deployment order (first candidate)
contract guard: requires contract_version (or legacy signal_version) and model-level fields (model, trend, score, confidence)
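
A contract guard along these lines could look like the sketch below. The required field names come from the list above; the top-level `models` key and the `validate_payload` function are assumptions made for illustration, since the actual payload layout is not documented here.

```python
from typing import Any, Dict, List

# Model-level fields named in the contract guard description.
REQUIRED_MODEL_FIELDS = ("model", "trend", "score", "confidence")


def validate_payload(payload: Any) -> List[Dict[str, Any]]:
    """Return the model entries if the payload passes the guard, else raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError("payload is not a JSON object")
    # Accept the current version key or the legacy one.
    if "contract_version" not in payload and "signal_version" not in payload:
        raise ValueError("missing contract_version / signal_version")
    models = payload.get("models")  # assumed key name, not confirmed by the source
    if not isinstance(models, list):
        raise ValueError("missing models list")
    for entry in models:
        if not isinstance(entry, dict):
            raise ValueError("model entry is not an object")
        missing = [f for f in REQUIRED_MODEL_FIELDS if f not in entry]
        if missing:
            raise ValueError(f"model entry missing fields: {missing}")
    return models
```

A caller would catch `ValueError` and fall back to deployment order, matching the fallback policy.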

Score table

Trend	Score	Signal
surging	+3.0	Downloads/stars growing >50% day-over-day
accelerating	+2.0	Growing 10-50% day-over-day
stable	0.0	Flat or <10% growth
decelerating	-1.0	Slowing 5-30% day-over-day
cooling	-2.0	Dropping >30% day-over-day

Confidence scaling: normal = full weight, low = 50%, insufficient = 0% (new models with <3 days of data).
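
The score table and confidence scaling combine roughly as in this sketch. The trend values and weights are taken from the tables above; treating confidence as a multiplier on the whole sum is an assumption, and `momentum_score` with its `momentum` and `delta` inputs is a hypothetical function, not the package's API.

```python
# Trend scores from the score table.
TREND_SCORES = {
    "surging": 3.0,
    "accelerating": 2.0,
    "stable": 0.0,
    "decelerating": -1.0,
    "cooling": -2.0,
}

# Confidence scaling: normal = full weight, low = 50%, insufficient = 0%.
CONFIDENCE_WEIGHTS = {"normal": 1.0, "low": 0.5, "insufficient": 0.0}


def momentum_score(trend: str, momentum: float, delta: float, confidence: str) -> float:
    """trend + momentum * 0.3 + delta * 0.25, scaled by the confidence weight."""
    raw = TREND_SCORES.get(trend, 0.0) + momentum * 0.3 + delta * 0.25
    # Unknown confidence labels are treated as "insufficient" (assumption).
    return raw * CONFIDENCE_WEIGHTS.get(confidence, 0.0)
```

Under this reading, a surging model with low confidence scores below an accelerating model with normal confidence, which is the observe-first posture described for low-confidence signals.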

Alias mapping

WZRD tracks models by HuggingFace/GitHub name (Qwen/Qwen3.5-9B). LiteLLM uses provider-specific names (openrouter/qwen/qwen-3.5-9b).

The alias_map bridges them explicitly. Without it, the strategy auto-matches by extracting slugs from litellm_params.model — works for most cases, but explicit mapping is more reliable.

register(router, alias_map={
    "qwen-9b": ["Qwen/Qwen3.5-9B", "Qwen/Qwen3-9B"],  # multiple variants
    "llama-70b": ["meta-llama/Llama-3.3-70B-Instruct"],
})
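
To see why automatic matching works for most names but not all, here is one way slug matching could be done. The normalization rule (last path segment, lowercased, non-alphanumerics dropped) and the names `normalize` and `match_wzrd_name` are assumptions for illustration; the package's actual matching logic may differ.

```python
import re
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Keep the last path segment, lowercase it, and drop non-alphanumerics."""
    slug = name.rsplit("/", 1)[-1]
    return re.sub(r"[^a-z0-9]", "", slug.lower())


def match_wzrd_name(
    model_name: str,
    litellm_model: str,
    wzrd_names: List[str],
    alias_map: Optional[Dict[str, List[str]]] = None,
) -> Optional[str]:
    """Resolve a deployment to a WZRD-tracked name, or None if nothing matches."""
    # 1. Explicit aliases win when present.
    for alias in (alias_map or {}).get(model_name, []):
        if alias in wzrd_names:
            return alias
    # 2. Otherwise compare normalized slugs.
    target = normalize(litellm_model)
    for candidate in wzrd_names:
        if normalize(candidate) == target:
            return candidate
    return None
```

With this rule, openrouter/qwen/qwen-3.5-9b and Qwen/Qwen3.5-9B normalize to the same slug, while a provider name that abbreviates or reorders the model name would not match and needs an explicit alias.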

Proxy integration

LiteLLM's proxy doesn't support custom strategies via YAML config. For proxy deployments, create a wrapper script:

# wzrd_proxy.py
import litellm
from litellm import Router
from wzrd_momentum_strategy import register

# Your normal proxy config ("..." is a placeholder: fill in your own model_list and alias_map)
router = Router(model_list=[...])
register(router, alias_map={...})

# Start proxy with the patched router
from litellm.proxy.proxy_server import app

Alternatively, use the pre-router pattern from integrations/litellm-wzrd-router/, which works as middleware before any LiteLLM call (SDK or proxy).

Manual setup

If you prefer explicit control instead of the register() convenience helper:

from wzrd_momentum_strategy import WZRDMomentumStrategy

# `router` is an existing litellm.Router instance (see Quick start)
strategy = WZRDMomentumStrategy(
    router,
    wzrd_url="https://api.twzrd.xyz/v1/signals/momentum",
    alias_map={"qwen-9b": ["Qwen/Qwen3.5-9B"]},
    cache_ttl=300,
)
router.set_custom_routing_strategy(strategy)

API

The momentum data comes from a public, free, no-auth endpoint:

GET https://api.twzrd.xyz/v1/signals/momentum
GET https://api.twzrd.xyz/v1/signals/momentum?platform=huggingface&trending=true

Returns trend classification, score, confidence, action, capabilities, and platform for 48+ tracked AI models.
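
The endpoint can be queried with the standard library alone. The sketch below only uses the URL and query parameters shown above; `build_url` and `fetch_signals` are hypothetical helper names, and the response is returned as parsed JSON without assuming its layout.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.twzrd.xyz/v1/signals/momentum"


def build_url(platform: Optional[str] = None, trending: bool = False) -> str:
    """Build the endpoint URL, adding only the filters that were requested."""
    params = {}
    if platform is not None:
        params["platform"] = platform
    if trending:
        params["trending"] = "true"
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


def fetch_signals(url: str, timeout: float = 5.0) -> Any:
    """GET the URL and return the decoded JSON body (no auth required)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network request):
# signals = fetch_signals(build_url(platform="huggingface", trending=True))
```

A short timeout matters here because the fetch sits on the routing path whenever the cache has expired.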

Expected output (live sample)

For a candidate set like qwen-9b, nemotron-120b, llama-70b, expected behavior is:

route to nemotron-120b when it is surging
deprioritize qwen-9b when decelerating
deprioritize llama-70b when cooling

The exact winner changes as momentum updates, but routing should follow trend and confidence consistently.

v0.1.0 release notes

Added LiteLLM CustomRoutingStrategyBase plugin with one-line registration helper
Added trend + momentum + delta scoring with confidence weighting
Added explicit alias map matching and automatic fallback matching from provider model slugs
Added contract guard for WZRD payload shape (signal_version + required model fields)
Added graceful degradation fallback to first deployment when WZRD is unavailable
Added test suite coverage for scoring order, confidence behavior, matching paths, async routing, caching behavior, register helper, and payload contract guard

License

MIT

Release history

0.3.1: Apr 12, 2026
0.3.0: Apr 3, 2026
0.2.2: Mar 24, 2026
0.2.1: Mar 24, 2026
0.2.0: Mar 24, 2026
0.1.2: Mar 23, 2026
0.1.1: Mar 23, 2026 (this version)
0.1.0: Mar 21, 2026

Download files

Source Distribution

litellm_wzrd_momentum-0.1.1.tar.gz (12.8 kB)

Uploaded Mar 23, 2026 Source

Built Distribution

litellm_wzrd_momentum-0.1.1-py3-none-any.whl (9.8 kB)

Uploaded Mar 23, 2026 Python 3

File details

Details for the file litellm_wzrd_momentum-0.1.1.tar.gz.

File metadata

Download URL: litellm_wzrd_momentum-0.1.1.tar.gz
Upload date: Mar 23, 2026
Size: 12.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for litellm_wzrd_momentum-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`c921a9e58bed2a1e09116aec659de0c3b89c1ef09cb525aa606f37154684d4f9`
MD5	`8c87a277c3b5865210797b6b1802b2a7`
BLAKE2b-256	`edf8a6189586b1a1f513cfb96612738ed8d29896aa6d6689e19f26b6534fc9aa`

File details

Details for the file litellm_wzrd_momentum-0.1.1-py3-none-any.whl.

File metadata

Download URL: litellm_wzrd_momentum-0.1.1-py3-none-any.whl
Upload date: Mar 23, 2026
Size: 9.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for litellm_wzrd_momentum-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`660786142eff8fb6a5dfd5d4216678ed23092a912fce8d66fe63026b4b8f6bbf`
MD5	`972bc7b18cf040362a84497cce44d48f`
BLAKE2b-256	`e70a6d6bc58749291057789f552a8a5588f19195a4c9b5dbcd227ef83ee7951b`
