A deterministic prompt-complexity router: score a prompt's structure and recommend a local or cloud model — offline, reproducible, no model call.
A deterministic prompt-complexity router. Hand it a prompt, get back a reproducible structural complexity score and a recommendation:
route this prompt to your local model, or to the cloud model?
It is a standalone tool. It calls no model, needs no API key, makes no network request, and has zero dependency on RAC — it is pure text scanning plus a threshold. The recommendation is a fact you act on; Wayfinder stops there, and the caller runs inference.
Quickstart (gateway)
Put Wayfinder in front of your models — your app keeps using the OpenAI API, you
just change base_url. Pilot-facing one-pager: EXPLAINER.md.
- Describe your two models in wayfinder-router.toml:
[routing]
threshold = 0.5  # below -> local, at/above -> cloud
[gateway.models.local]
base_url = "http://localhost:11434/v1"
model = "llama3.2"
[gateway.models.cloud]
base_url = "https://api.openai.com/v1"
model = "gpt-4o"
api_key_env = "OPENAI_API_KEY"  # key read from this env var, never stored
- Run the gateway:
pip install "wayfinder-router[gateway]"
export OPENAI_API_KEY=sk-...
wayfinder-router serve --port 8088
- Point your existing client at it, with no code change:
client = openai.OpenAI(base_url="http://localhost:8088/v1", api_key="unused")
client.chat.completions.create(model="auto", messages=[{"role": "user", "content": "..."}])
Easy prompts go to local, hard ones to cloud; each response carries
x-wayfinder-router-model and x-wayfinder-router-score so you can see the routing.
Need to steer one request? A client can pin it (model="cloud" /
prefer-local) or move the cut per call (an X-Wayfinder-Threshold header) —
see Steer a single request.
Check it's working (the headers show where each request went):
curl -s localhost:8088/healthz # {"status":"ok","models":["cloud","local"]}
curl -s -D - -o /dev/null http://localhost:8088/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"auto","messages":[{"role":"user","content":"hi"}]}' \
| grep -i x-wayfinder-router
# x-wayfinder-router-model: local
# x-wayfinder-router-score: 0.00
Try it in 30 seconds, no backends. wayfinder-router serve --dry-run answers /v1/chat/completions with the routing decision (model, score, mode) instead of calling an upstream; point a client at it to feel the routing before wiring real models.
What's next: wayfinder-router route prompt.md --explain shows why a prompt scored where it did; wayfinder-router ui opens the tuning console; then collect a handful of local-vs-hosted judgments and wayfinder-router calibrate to fit the cut to your traffic. The default threshold is only a starting point: the score is a structural proxy (length, headings, lists, code), not a verdict on semantic difficulty, so a short hard prompt scores low. Calibration is what makes it yours.
Where Wayfinder sits
Wayfinder ships no end-user interface — it is middleware that sits behind
whatever OpenAI-compatible client you already use. You point that client's
base_url at the gateway once; from then on Wayfinder is invisible, and the
same interface serves a request whether it routes local or hosted:
You (a chat app / IDE / agent / your own code)
|  one OpenAI-compatible request: base_url -> the gateway
v
Wayfinder gateway -- scores the prompt, picks local vs cloud --+
|                                                              |
| local                                                 hosted |
v                                                              v
Ollama / LM Studio / vLLM     OpenAI / Together / any hosted API
(an OpenAI-compatible /v1)    (an OpenAI-compatible /v1)
|                                                              |
+---------------- response flows back up ----------------------+
v
You -- same client, same response, plus the x-wayfinder-router-* headers
- The interface in front is yours to choose — a chat GUI (e.g. Open WebUI, LibreChat), an IDE assistant that allows a custom endpoint (Cursor, Continue), an agent framework (LangChain, LlamaIndex), or your own app on the OpenAI SDK. Want a turnkey chat window? Put Open WebUI in front and point it at the gateway.
- Local and hosted are backends, not UIs. The "local model" is a server (Ollama, LM Studio, vLLM, llama.cpp) exposing an OpenAI-compatible /v1; the hosted one is the same shape. Wayfinder forwards to whichever it picked, and the completion returns through the same client, so the user never switches UIs and usually never knows which model answered (the response headers say, if you care).
- The wayfinder-router ui console is not this chat surface: it is the operator's tuning view (score a prompt, calibrate, edit config), never the path production traffic takes.
Why deterministic
The obvious way to route by complexity is to ask a model how complex the prompt
is — an LLM-as-judge router. That is non-deterministic, costs a model call to
decide whether to make a model call, and cannot be reproduced or tested.
Wayfinder takes the opposite stance: it scores structure — length, headings,
instruction steps, links, code blocks, tables — combines the signals into a
bounded 0.0–1.0 score, and compares that to a threshold you control. Same
prompt and same threshold always give the same answer.
The score is a structural proxy, not a verdict on difficulty: whether it tracks "this prompt needs the cloud model" is your calibration, which is exactly why the threshold is yours to set.
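The idea of a bounded structural score can be sketched in a few lines. This is an illustration only: the feature set, the ceilings, the weights, and the names extract_features, toy_score, and recommend are assumptions made for this example, not Wayfinder's shipped scorer or values.

```python
import re

# Hypothetical ceilings and weights, chosen only for illustration.
CEILINGS = {"word_count": 400, "heading_count": 8, "list_item_count": 12, "code_block_count": 4}
WEIGHTS = {"word_count": 4.0, "heading_count": 2.0, "list_item_count": 2.5, "code_block_count": 3.0}

FENCE = "`" * 3  # a Markdown code-fence marker, built indirectly to keep this example readable


def extract_features(text: str) -> dict[str, int]:
    """Count a few structural signals with plain text scanning (no model call)."""
    lines = text.splitlines()
    return {
        "word_count": len(text.split()),
        "heading_count": sum(1 for ln in lines if re.match(r"#{1,6}\s", ln)),
        "list_item_count": sum(1 for ln in lines if re.match(r"\s*([-*+]|\d+\.)\s", ln)),
        # Each fenced block has an opening and a closing marker line.
        "code_block_count": sum(1 for ln in lines if ln.lstrip().startswith(FENCE)) // 2,
    }


def toy_score(text: str) -> float:
    """Weighted mean of per-feature levels, each capped at 1.0, so the result stays in [0, 1]."""
    features = extract_features(text)
    weighted = sum(
        WEIGHTS[name] * min(features[name] / CEILINGS[name], 1.0) for name in WEIGHTS
    )
    return weighted / sum(WEIGHTS.values())


def recommend(text: str, threshold: float = 0.5) -> str:
    """Below the threshold -> local; at or above -> cloud."""
    return "cloud" if toy_score(text) >= threshold else "local"
```

Because every step is counting and arithmetic, the same text and threshold give the same recommendation on every run, which is the property the section argues for.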
How it compares
Most LLM routers decide by calling a model — a trained classifier (RouteLLM), an LLM-as-judge, or a hosted service (NotDiamond, Martian, OpenRouter's auto-router) — which adds latency, cost, and non-determinism to the routing decision itself. Wayfinder is the one that decides by scanning text structure: offline, in microseconds, with no model call, and calibrated on your own traffic.
| router | decides by | model call to decide? | offline / self-host | calibrate on your data |
|---|---|---|---|---|
| Wayfinder | deterministic structural score | no | yes | yes |
| RouteLLM | trained classifier (preference data) | yes | yes | retrain |
| NotDiamond / Martian | learned, hosted | yes | no | via platform |
| OpenRouter (Auto) | hosted auto-router | yes | no | — |
| LiteLLM | provider proxy (not complexity-routed) | no | yes | n/a |
The claim is precise — not "best accuracy", but the only offline, zero-model-call,
calibrate-on-your-data, self-hosted structural router. The trade-off is honest too: a
structural proxy can't tell a short-but-hard prompt from a short-easy one, so it won't
match a semantic router there. The reproducible benchmark
(make benchmark) reports the full cost-quality curve, honest baselines (a tuned length
heuristic is competitive on short prompts), and where Wayfinder sits versus an oracle —
on a small illustrative set you can swap for RouterBench
or RouterArena (WF-ADR-0015).
Run it (offline, no install)
cd wayfinder-router
echo "Summarise this paragraph in one sentence." | python -m wayfinder_router.cli route -
make route PROMPT=path/to/prompt.md
Recommended Model: local
Complexity Score: 0.00 (mode: tiered)
Tiers:
>= 0.00 local <-
>= 0.50 cloud
Contributing Features:
Word Count: 6
...
JSON for machine consumers (an agent reads this and routes to its own model):
wayfinder-router route prompt.md --json
{
"schema_version": "2",
"score": 0.66,
"recommendation": "cloud",
"mode": "tiered",
"features": { "word_count": 545, "heading_count": 12, "...": 0 },
"tiers": [{ "min_score": 0.0, "model": "local" }, { "min_score": 0.5, "model": "cloud" }]
}
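A machine consumer might read that payload roughly as follows. The field names come from the sample above; the function name pick_model and the choice to reject an unknown schema_version are assumptions for this sketch, not documented behaviour.

```python
import json

# A trimmed copy of the sample payload shown above.
SAMPLE = """{
  "schema_version": "2",
  "score": 0.66,
  "recommendation": "cloud",
  "mode": "tiered",
  "tiers": [{"min_score": 0.0, "model": "local"}, {"min_score": 0.5, "model": "cloud"}]
}"""


def pick_model(route_json: str, supported_schema: str = "2") -> str:
    """Return the recommended model name from a route --json payload."""
    payload = json.loads(route_json)
    version = payload.get("schema_version")
    if version != supported_schema:
        # Assumption: failing loudly is safer than guessing at a changed schema.
        raise ValueError(f"unsupported schema_version: {version!r}")
    return payload["recommendation"]
```

The caller then runs inference against whichever of its own models the returned name maps to.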
Install
pip install "wayfinder-router[gateway]" # route traffic through the OpenAI-compatible gateway (the common case)
pip install wayfinder-router # core only: scorer + CLI + Python API, zero deps (you route in your own code)
pip install "wayfinder-router[ui]" # add the local calibration/explain/configure UI
pip install "wayfinder-router[all]" # gateway + UI together
Configure routing
Wayfinder reads its own config — never RAC's .rac/. Drop a wayfinder-router.toml
anywhere at or above where you run it. Three modes, in precedence order
(classifier > tiers > threshold); weights (the scalar-score weights) apply to
any of them.
Binary (the default) — one cut:
[routing]
threshold = 0.6
weights = { word_count = 4.0, list_item_count = 2.5 }
--threshold N overrides it for one run; WAYFINDER_ROUTER_THRESHOLD overrides via the
environment.
Tiered (WF-ADR-0002) — ordered score bands route to any number of models:
[[routing.tiers]]
min_score = 0.0
model = "llama-3b"
[[routing.tiers]]
min_score = 0.3
model = "llama-70b"
[[routing.tiers]]
min_score = 0.6
model = "claude-cloud"
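Read as a rule, the bands above say: take the tier with the highest min_score that the score still reaches. A minimal sketch of that lookup, assuming the at-or-above boundary used elsewhere in this document (select_tier is a hypothetical name, not part of the package API):

```python
TIERS = [
    {"min_score": 0.0, "model": "llama-3b"},
    {"min_score": 0.3, "model": "llama-70b"},
    {"min_score": 0.6, "model": "claude-cloud"},
]


def select_tier(score: float, tiers: list[dict] = TIERS) -> str:
    """Pick the tier with the highest min_score that the score reaches (score >= min_score)."""
    eligible = [t for t in tiers if score >= t["min_score"]]
    if not eligible:
        raise ValueError("no tier covers this score; the lowest min_score should be 0.0")
    return max(eligible, key=lambda t: t["min_score"])["model"]
```

With these bands, a score of 0.59 selects llama-70b and a score of 0.6 selects claude-cloud.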
Classifier (WF-ADR-0003) — a fitted multinomial-logistic model; argmax over
per-model linear scores. Usually produced by calibrate, not hand-written.
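The argmax step can be illustrated with made-up parameters. Everything in this sketch is hypothetical: the two feature levels, the weights and biases, and the tie-break rule. A real fragment would come from calibrate.

```python
# Hypothetical fitted parameters: one weight vector and one bias per model.
# Feature order (illustrative): [word_count_level, code_block_level], each in [0, 1].
PARAMS = {
    "local": {"weights": [-2.0, -1.0], "bias": 1.0},
    "cloud": {"weights": [2.0, 1.5], "bias": -0.5},
}


def linear_score(levels: list[float], weights: list[float], bias: float) -> float:
    """bias + dot(weights, levels)."""
    return bias + sum(w * x for w, x in zip(weights, levels))


def classify(levels: list[float]) -> str:
    """Argmax over per-model linear scores.

    Softmax is monotone, so the model with the largest linear score is also the
    one with the largest probability. Ties go to the alphabetically first name
    (an assumption of this sketch) so the result stays deterministic.
    """
    scores = {m: linear_score(levels, p["weights"], p["bias"]) for m, p in PARAMS.items()}
    return min(scores, key=lambda m: (-scores[m], m))
```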
Calibrate from data
The cut is a proxy; calibrate it against your traffic. wayfinder-router calibrate
reads a labeled JSONL dataset ({"text": ..., "label": ...}) and emits a config
fragment — offline, deterministic, and it never calls a model (labels come from
your own oracle):
wayfinder-router calibrate data.jsonl --mode threshold # sweep the binary cut
wayfinder-router calibrate data.jsonl --mode tiers # ordinal multi-model
wayfinder-router calibrate data.jsonl --mode classifier --out wayfinder-router.toml
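What a threshold sweep does can be sketched as follows, assuming each labeled example has already been reduced to a (score, label) pair. The function name, the candidate set, and the tie-break (keep the lowest cut) are choices made for this illustration and may differ from what calibrate does.

```python
def sweep_threshold(scored, low="local", high="cloud"):
    """Return (best_threshold, accuracy) over candidate cuts taken from the observed scores.

    scored is a list of (score, label) pairs; a score at or above the cut predicts `high`.
    """
    if not scored:
        raise ValueError("need at least one labeled example")
    candidates = sorted({0.0, 1.0} | {s for s, _ in scored})
    best_cut, best_acc = candidates[0], -1.0
    for cut in candidates:
        correct = sum(1 for s, label in scored if (high if s >= cut else low) == label)
        acc = correct / len(scored)
        if acc > best_acc:  # strict comparison: ties keep the lowest cut
            best_cut, best_acc = cut, acc
    return best_cut, best_acc
```

Scanning candidates in sorted order with a strict comparison means the same dataset always yields the same cut.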
The emitted fragment drops straight into wayfinder-router.toml; the summary (accuracy,
chosen breakpoints) is printed to stderr. The classifier is fit by deterministic
L2-regularized Newton/IRLS — pure Python, converging in a handful of iterations.
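To make the Newton/IRLS idea concrete, here is a deliberately reduced sketch: a binary logistic model with one feature and a bias, rather than the multinomial model described above. The function name, the default penalty, and the decision to penalize the bias (which keeps the 2x2 Hessian positive definite) are assumptions of this example.

```python
import math


def fit_logistic_newton(xs, ys, l2=1.0, iterations=25, tol=1e-10):
    """Fit p(y=1|x) = sigmoid(w*x + b) by Newton's method with an L2 penalty on w and b."""
    w, b = 0.0, 0.0
    for _ in range(iterations):
        # Gradient and Hessian of the penalized negative log-likelihood.
        gw, gb = l2 * w, l2 * b
        hww, hwb, hbb = l2, 0.0, l2
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            r = p * (1.0 - p)  # the IRLS weight for this example
            gw += (p - y) * x
            gb += p - y
            hww += r * x * x
            hwb += r * x
            hbb += r
        # Solve the 2x2 system H * step = g in closed form.
        det = hww * hbb - hwb * hwb
        dw = (hbb * gw - hwb * gb) / det
        db = (hww * gb - hwb * gw) / det
        w, b = w - dw, b - db
        if abs(dw) + abs(db) < tol:
            break
    return w, b
```

There is no random initialization or sampling anywhere in the loop, so repeated fits on the same data return identical parameters.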
Route with your own key (gateway)
To actually route — score the prompt, then call the chosen model with your own
key — run the OpenAI-compatible gateway (WF-ADR-0004). Your existing client points
its base_url at Wayfinder; no application code changes.
# wayfinder-router.toml — map each routed model name to an upstream + a key env var.
[routing]
threshold = 0.6
[gateway.models.local]
base_url = "http://localhost:11434/v1"
model = "llama3.2"
[gateway.models.cloud]
base_url = "https://api.example.com/v1"
model = "big-model"
api_key_env = "EXAMPLE_API_KEY" # the *name* of the env var; the secret is never in this file
pip install "wayfinder-router[gateway]"
export EXAMPLE_API_KEY=... # read at request time, only inside the gateway
wayfinder-router serve --port 8088
import openai
client = openai.OpenAI(base_url="http://localhost:8088/v1", api_key="unused")
client.chat.completions.create(model="auto", messages=[{"role": "user", "content": "..."}])
# Wayfinder scores the prompt, forwards to local or cloud, and returns the response.
# Response headers carry x-wayfinder-router-model and x-wayfinder-router-score.
The gateway is the only part that touches keys or the network; the scorer,
config, and calibrator stay pure, offline, and deterministic. Keys are read from
the environment at request time and never enter wayfinder-router.toml or the scored
path.
Steer a single request (override)
The deployment's wayfinder-router.toml sets the default boundary, but a client
can override the decision for one request — no application change, plain
OpenAI-compatible transport (WF-ADR-0011). An override only changes where the
request is forwarded; the prompt is still scored deterministically, and no
override adds a model call.
- The model field is a routing directive. auto (or any ordinary model id) lets Wayfinder decide; a configured endpoint name (local, cloud, …) pins the request to that endpoint; prefer-local / prefer-hosted pin to the low / high end of your router (prefer-cloud still works as an alias of prefer-hosted).
- An X-Wayfinder-Threshold header re-cuts the decision for that request: a number in 0.0–1.0, reusing your configured weights (binary routers only).
# Pin one call to cloud regardless of score:
client.chat.completions.create(model="cloud", messages=[...])
# Or move the cut for one call (keep model="auto"):
client.chat.completions.create(
model="auto", messages=[...], extra_headers={"X-Wayfinder-Threshold": "0.8"}
)
Each response adds x-wayfinder-router-mode (scored / pinned /
threshold-override) alongside the x-wayfinder-router-model / -score headers,
so you can see which channel decided the route.
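The override rules for a binary router can be summarized in a small decision function. This is a sketch, not the gateway's code: the name resolve_route, the precedence of a pin over the threshold header, and reporting the prefer-* aliases as pinned are all assumptions.

```python
def resolve_route(model_field, score, endpoints, default_threshold, header_threshold=None):
    """Return (endpoint, mode) for one request in a binary router.

    endpoints is ordered low end first, for example ["local", "cloud"].
    """
    low, high = endpoints[0], endpoints[-1]
    aliases = {"prefer-local": low, "prefer-hosted": high, "prefer-cloud": high}
    if model_field in endpoints:
        return model_field, "pinned"
    if model_field in aliases:
        return aliases[model_field], "pinned"
    if header_threshold is not None:
        cut = float(header_threshold)
        if not 0.0 <= cut <= 1.0:
            raise ValueError("X-Wayfinder-Threshold must be a number in 0.0-1.0")
        return (high if score >= cut else low), "threshold-override"
    # "auto" or any ordinary model id: fall through to the configured cut.
    return (high if score >= default_threshold else low), "scored"
```

Note that the score is an input in every branch: an override changes only where the request goes, never how the prompt is scored.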
Use it from a chat UI (no fork)
Because the model field is a routing directive (above), any OpenAI-compatible
chat UI can drive routing with no code change: the app's normal model dropdown
becomes a per-conversation routing-mode picker (auto / prefer-local /
prefer-hosted / a pinned endpoint). The gateway advertises these over
GET /v1/models, so a UI discovers them automatically — no hand-written list.
- LibreChat: copy examples/librechat.yaml and examples/docker-compose.override.yml into your LibreChat checkout and docker compose up; pick the "Wayfinder" endpoint.
- Open WebUI: add an OpenAI connection pointing at the gateway; it auto-discovers the routing options from /v1/models.
See examples/ for both recipes. A live per-conversation threshold
slider is the one thing a stock UI can't express — that's what the wayfinder-chat
fork adds (WF-ADR-0010); this is the no-fork path that proves it out first.
Seeing routing (is it working?)
Wayfinder's control surface is distributed across the tools you already run, so it's easy to not notice it working. Four places show or steer routing:
- The model dropdown in your client is the routing-mode picker (auto / prefer-local / prefer-hosted / a pinned endpoint), auto-populated from GET /v1/models.
- Response headers (x-wayfinder-router-model / -score / -mode / -request-id) say where each request went and why.
- X-Wayfinder-Debug: true (opt-in) surfaces the decision in the response body (a wayfinder object), for clients that render JSON but hide headers. The default response stays byte-clean.
- A read-only dashboard at http://localhost:8088/router (JSON at /router/recent) shows the last decisions, a per-model count, and scores at a glance: metadata only, never prompt text. It's distinct from the wayfinder-router ui operator console, which is off the traffic path (WF-ADR-0014).
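For code that wants the routing headers as data, a small helper can pull them out of any header mapping. The helper name routing_headers is hypothetical; the header names are the ones listed above.

```python
PREFIX = "x-wayfinder-router-"


def routing_headers(headers) -> dict[str, str]:
    """Collect the x-wayfinder-router-* headers, case-insensitively, keyed by their suffix."""
    found = {}
    for name, value in dict(headers).items():
        lowered = name.lower()
        if lowered.startswith(PREFIX):
            found[lowered[len(PREFIX):]] = value
    return found
```

With the OpenAI Python SDK, the headers of a call should be reachable through client.chat.completions.with_raw_response.create(...), whose result exposes .headers and .parse(); treat that as a pointer to check against your SDK version rather than a tested recipe.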
The threshold header is the one fine control no stock client exposes; a
per-conversation slider is what the wayfinder-chat fork (WF-ADR-0010) adds.
Learn from feedback (onboarding)
Don't guess the cut — learn it from your own judgment of local vs hosted output (WF-ADR-0006). The loop is: collect judgments → calibrate → route automatically.
Bootstrap with A/B onboarding. For each sample prompt, wayfinder-router onboard runs
both arms and asks which was good enough; the answer is a label:
wayfinder-router onboard prompts.jsonl --arms local,cloud --calibrate > wayfinder-router.toml
The A/B comparison and the prompt go to stderr; --calibrate prints the resulting
config to stdout. Each judgment appends a {"text", "label"} line to a feedback
log — which is the calibrate dataset, so the log turns straight into a config.
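The log format is simple enough to produce or inspect by hand. A minimal sketch of writing and reading {"text", "label"} lines, with hypothetical helper names and a caller-chosen file path (the real log location is not specified here):

```python
import json
from pathlib import Path


def append_judgment(log_path: Path, text: str, label: str) -> None:
    """Append one {"text", "label"} record as a single JSON line."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"text": text, "label": label}) + "\n")


def load_dataset(log_path: Path) -> list[dict]:
    """Read the log back as a list of records, skipping blank lines."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each line is an independent JSON object, appending never rewrites earlier judgments, and the file can be passed to calibrate as-is.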
Keep it honest with steady-state feedback. Once routing automatically, record which model was actually good enough; the label feeds the next recalibration:
curl localhost:8088/v1/feedback -d '{"text": "...", "label": "cloud"}'
Recalibrate on a schedule (WF-ADR-0007). Re-fit the routing config from the
log — run it from cron / a k8s CronJob, or click "Recalibrate & save" in the UI's
Onboard tab. It rewrites only the [routing] section and preserves your
[gateway] endpoints; a running gateway hot-reloads the new config with no
restart:
wayfinder-router recalibrate # log → calibrate → write wayfinder-router.toml
wayfinder-router recalibrate --min-labels 50 # no-op until you have enough signal
The judging runs models, so it lives in the gateway/invocation layer (BYO key); the deterministic core is untouched and the label log carries no secrets.
Deploy & integrate (WF-ADR-0008)
Wayfinder doesn't only work from the CLI — the CLI, onboarding, and UI are the operator/bootstrap surfaces. In production, prompts flow through the gateway (transparent) or the library (in-process); routing happens where prompts already are, not by re-typing them.
Run the gateway as a service (sidecar or standalone):
docker build -t wayfinder-router . && docker run -p 8088:8088 -v "$PWD/data:/data" wayfinder-router
# or: docker compose up gateway (see docker-compose.example.yml)
Point your existing client at it — no app code change. Anything that speaks
the OpenAI API takes a base_url:
client = openai.OpenAI(base_url="http://localhost:8088/v1", api_key="unused")
The same base_url works for agent frameworks (LangChain/LlamaIndex), IDE
assistants that allow a custom endpoint (Cursor, Continue), or a gateway like
LiteLLM. Wayfinder scores each incoming prompt and forwards to the chosen model
with your key.
Wire feedback from the host surface. Your app/IDE/chat decides how to show a 👍/👎 and posts the judgment; Wayfinder records it and the next recalibration learns from it:
fetch("http://localhost:8088/v1/feedback", {
method: "POST",
body: JSON.stringify({ text: prompt, label: wasGoodEnough ? "local" : "cloud" }),
});
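The same call from Python, using only the standard library, might look like this. It builds the request without sending it. The Content-Type header and the helper name are assumptions; the bearer token applies only when the gateway is run with WAYFINDER_ROUTER_FEEDBACK_TOKEN set.

```python
import json
import urllib.request
from typing import Optional


def build_feedback_request(
    base_url: str, text: str, label: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/feedback carrying one judgment."""
    headers = {"Content-Type": "application/json"}  # assumption: the gateway accepts JSON bodies
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"text": text, "label": label}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/feedback", data=body, headers=headers, method="POST"
    )
```

Sending it is then a matter of passing the request to urllib.request.urlopen with a timeout of your choosing.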
Schedule recalibration with cron / a k8s CronJob (or docker compose run --rm recalibrate); the gateway hot-reloads the result. Keys always come from the
environment (each model's api_key_env) — never the image or the config file.
Production behaviour (WF-ADR-0013). The gateway forwards asynchronously and
streams: a request with stream: true is relayed back as Server-Sent-Events so
chat clients render tokens progressively. An upstream timeout or connection failure
returns an OpenAI-shaped wayfinder_router_upstream_error (not a bare 500), every
response carries an x-wayfinder-router-request-id for tracing, and routing decisions
and config-reload failures are logged. Tunables (env or flags):
- WAYFINDER_ROUTER_TIMEOUT / serve --timeout: upstream timeout in seconds (default 60).
- WAYFINDER_ROUTER_FEEDBACK_TOKEN: when set, /v1/feedback requires Authorization: Bearer <token> (otherwise the label log is an open write).
- serve --dry-run: return routing decisions without calling any upstream.
- GET /healthz reports degraded and lists missing_keys when a configured api_key_env is unset.
- GET /router is a read-only dashboard of recent decisions; X-Wayfinder-Debug: true surfaces the decision in the response body (WF-ADR-0014).
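The health check described above amounts to comparing the configured api_key_env names against the environment. A sketch of that comparison, where the function name and the exact shape of the degraded response are assumptions modelled on the {"status":"ok","models":[...]} example from the quickstart:

```python
import os


def health_report(models: dict[str, dict], environ=None) -> dict:
    """Report "degraded" with missing_keys when a configured api_key_env is unset."""
    env = os.environ if environ is None else environ
    missing = sorted(
        cfg["api_key_env"]
        for cfg in models.values()
        if "api_key_env" in cfg and not env.get(cfg["api_key_env"])
    )
    report = {"status": "degraded" if missing else "ok", "models": sorted(models)}
    if missing:
        # Only the variable names are reported, never any key values.
        report["missing_keys"] = missing
    return report
```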
Explain & tune
To see why a prompt routed where, ask for the per-feature breakdown — each feature's value, its normalized level, its weight, and its share of the score:
wayfinder-router route prompt.md --explain
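The four columns of that breakdown relate to each other as in the sketch below. The normalization (value over a ceiling, capped at 1.0) and the weighted-mean combination are assumptions for illustration, as is the function name explain; they show the shape of the output rather than Wayfinder's exact arithmetic.

```python
def explain(features: dict, weights: dict, ceilings: dict):
    """Return (score, rows); each row has value, normalized level, weight, and share of the score."""
    total_weight = sum(weights.values())
    rows = []
    for name, weight in weights.items():
        value = features.get(name, 0)
        level = min(value / ceilings[name], 1.0)
        rows.append({
            "name": name,
            "value": value,
            "level": level,
            "weight": weight,
            "contribution": weight * level / total_weight,
        })
    score = sum(r["contribution"] for r in rows)
    for r in rows:
        # Shares sum to 1.0 whenever the score is non-zero.
        r["share"] = r["contribution"] / score if score else 0.0
    return score, rows
```

For example, 200 words against a ceiling of 400 (weight 4.0) and 8 headings against a ceiling of 8 (weight 2.0) each contribute one third, giving a score of two thirds with equal shares.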
For interactive tuning there's a local web UI (WF-ADR-0005) with four tabs:
- Explain: paste a prompt; see the score, tier ladder, and contribution bars, and drag a threshold slider to watch routing change live.
- Calibrate: paste a labeled JSONL dataset; run a mode; see accuracy, the threshold-sweep curve, and the resulting config fragment, then send it to Configure.
- Configure: edit wayfinder-router.toml with live validation (the real loaders) and save.
- Onboard: A/B a local vs hosted model on sample prompts in the browser, judge each, record labels, then calibrate from the log (needs [gateway] too, for the model calls).
pip install "wayfinder-router[ui]"
wayfinder-router ui --port 8099 # then open http://localhost:8099
The UI is a thin consumer of the same pure functions. Apart from the Onboard tab, which runs models through [gateway], it never calls a model, and no secret ever appears in it (a gateway model names an api_key_env; the key lives in the environment).
Python API
from wayfinder_router import score_complexity, RoutingConfig, explain_score
result = score_complexity(prompt_text, config=RoutingConfig.binary(threshold=0.7))
print(result.recommendation, result.score, result.features)
for fc in explain_score(result.features, RoutingConfig().weights):
print(fc.name, fc.contribution)
Heritage
Wayfinder began as the rac route exploration inside
requirements-as-code, and
its scoring shape is inspired by RAC's deterministic classification.py
(points / ceiling). It was split out because routing is a runtime inference
concern, divergent from RAC/Lore's recorded-knowledge product line — a prompt
router should not require installing a requirements-as-code engine. The shipped
tool shares no runtime code with RAC; see decisions/WF-ADR-0001.
Repository layout
wayfinder-router/
wayfinder_router/ the package: complexity scorer, tiers + classifier, own config
loader + writer, offline calibration (Newton/IRLS), explain, the
feedback log + onboarding harness, recalibration, CLI, and the
optional OpenAI-compatible gateway and local UI (impure layers,
behind their extras)
tests/ scorer, config, calibration, explain, feedback, onboard,
recalibrate, CLI, gateway, and UI coverage
decisions/ ADRs grounding the tool's own choices (dogfooded)
Dockerfile, docker-compose.example.yml deploy the gateway as a service
Test
pip install -e ".[dev]"   # quoted so shells like zsh don't expand the brackets; or: pip install pytest
make test