
Lex-Router

An adaptive, domain-agnostic query router for RAG systems.

Most RAG systems run the same heavy retrieval pipeline for every query. Lex-Router fixes this by analyzing the statistical distribution of pilot retrieval scores to dynamically route queries to the optimal strategy. It works across any domain (legal, medical, financial, etc.) because routing decisions come from the retrieval landscape itself, not hardcoded vocabularies.

Built on the multi-signal pilot framework from the lexLegal RAG Pipeline.


Installation

pip install lex-router

Zero dependencies. Written entirely using the Python Standard Library (math, re, dataclasses).


Core Features

  • Domain Agnostic: Doesn't rely on keyword lists or LLM calls. Routes based on score variance, entropy, and statistical confidence.
  • Score Normalization: Built-in support for multiple embedding score types (COSINE, L2, DOT_PRODUCT, LOGITS, COSINE_DISTANCE, RAW).
  • Auto-Calibration: Automatically tunes internal thresholds to your specific embedding model to prevent "threshold drift".
  • Architecture Aware: Gracefully handles Hybrid (Dense+Sparse), Dense-Only (e.g., Pinecone), and Sparse-Only (e.g., Elasticsearch) systems.
  • Ultra-Low Latency: Sub-millisecond routing overhead, plus a configurable fast-path bypass for trivial queries.
  • Production Safe: Robust sanitization against NaN/Inf inputs, mathematical overflows, and database timeouts.

Use Cases

Lex-Router is designed to handle edge cases across diverse RAG architectures:

  1. Complex Domain Search (Legal/Medical): If a user asks about "the MAE clause regarding pandemic events", the router detects high term rarity and high score variance, routing it to a broad_expand strategy (deep vector search + graph traversal).
  2. Ultra-Low Latency Voice Assistants: If a user asks "What is this?", the router hits the fast_path_max_tokens bypass, skipping the database pilot search entirely and routing instantly to save latency.
  3. Legacy Enterprise Search: If your company only uses a keyword-based Elasticsearch database, Lex-Router's is_sparse_only mode will dynamically disable dense variance checks and route based purely on BM25 score margins and entropy.
  4. Machine Learning Transition: If you want to train a custom ML model to route queries, use the log_file parameter. The router will log a rich 11-signal feature vector for every query, giving you perfect training data for an XGBoost or Random Forest model later.

How It Works

The router runs a fast pilot search (e.g., the top 10 results) through your retrieval backend and extracts an 11-signal statistical feature vector. The key signals include:

  • max_dense, bm25_max: Maximum retrieval strength for the vector and keyword searches
  • mean_dense, std_dense: Central tendency and variance of the vector scores
  • top1_top5_margin: Confidence gap between the top result and the rest
  • dense_bm25_overlap: Agreement between the dense and sparse retrievers
  • entropy: Chaos in the score distribution
  • unique_doc_count: Document diversity
  • query_rarity, max_term_rarity: IDF-based term specificity
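To show the kind of statistics involved, two of these signals can be sketched in plain Python (an illustrative reconstruction, not the library's actual implementation):

```python
import math

def top1_top5_margin(scores):
    """Gap between the best score and the mean of the next few results."""
    s = sorted(scores, reverse=True)
    runners_up = s[1:5]
    return s[0] - sum(runners_up) / len(runners_up)

def score_entropy(scores):
    """Shannon entropy of the normalized score distribution.

    A flat landscape (no clear winner) gives high entropy;
    one dominant score gives low entropy.
    """
    total = sum(scores)
    probs = [s / total for s in scores]
    return -sum(p * math.log(p) for p in probs if p > 0)

confident = [0.95, 0.40, 0.38, 0.35, 0.33]  # one clear winner
flat = [0.52, 0.51, 0.50, 0.49, 0.48]       # ambiguous landscape

print(top1_top5_margin(confident))                      # large gap (0.585)
print(score_entropy(confident) < score_entropy(flat))   # True
```

A confident distribution produces a wide margin and low entropy, pushing the router toward narrow_precise; a flat one does the opposite.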

These signals map to 4 retrieval strategies. The router returns a RouteDecision object containing a suggested pool size and boolean feature flags that your downstream RAG pipeline can choose to implement:

  • narrow_precise: Triggered by a rare anchor term or a very confident top-1. Suggested pool: 50. Flags: use_bm25=True (skips HyDE/Graph).
  • normal: Balanced signals (standard behavior). Suggested pool: 100. Flags: use_hyde=True, use_bm25=True, use_xref=True.
  • broad_expand: Generic terms, high variance, many unique docs. Suggested pool: 200. Flags: all features enabled.
  • reject_or_fallback: Both retrievers returned garbage scores. Suggested pool: 0. Flags: fallback_enabled=True.

Note: Lex-Router has zero dependencies. It does not execute HyDE or Graph traversals itself. It simply analyzes the pilot scores and returns a configuration object with boolean flags (e.g., decision.use_hyde = True). Your application logic decides what to do with those flags.
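As a concrete (and entirely hypothetical) sketch of that division of labor, a downstream pipeline might dispatch on the flags like this, using a stand-in dataclass in place of the real RouteDecision:

```python
from dataclasses import dataclass

# Stand-in carrying the same flag names as RouteDecision; in real use,
# `decision` would come from router.route(query).
@dataclass
class Decision:
    route: str
    pool: int
    use_hyde: bool = False
    use_bm25: bool = False
    use_xref: bool = False

def run_pipeline(decision, query):
    """Hypothetical dispatch: each flag toggles one retrieval stage."""
    queries = [query]
    if decision.use_hyde:
        queries.append(f"HYDE({query})")   # your query-expansion step
    stages = [f"vector_search(pool={decision.pool})"]
    if decision.use_bm25:
        stages.append("bm25_search()")     # keyword retrieval
    if decision.use_xref:
        stages.append("graph_traversal()") # multi-hop expansion
    return queries, stages

decision = Decision(route="normal", pool=100, use_hyde=True, use_bm25=True)
queries, stages = run_pipeline(decision, "termination provisions")
print(stages)  # ['vector_search(pool=100)', 'bm25_search()']
```

The router only decides; your code owns the actual retrieval calls.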


Quick Start

Option A: Connect Your Pipeline via Adapter

Implement the RetrievalAdapter interface. You only need to override the search methods your system supports.

from lex_router import AdaptiveRouter, RetrievalAdapter

class MyVectorAdapter(RetrievalAdapter):
    def __init__(self, vector_db):
        self.db = vector_db

    # Implement only the search methods your backend supports; this adapter is dense-only.
    def dense_search(self, query, k=10):
        results = self.db.search(query, top_k=k)
        scores = [r.score for r in results]
        doc_ids = [r.id for r in results]
        return scores, doc_ids

# Plug it in
router = AdaptiveRouter(adapter=MyVectorAdapter(my_db))

# Route the query
decision = router.route("What are the termination provisions?")
print(decision.route)  # e.g., 'normal'
print(decision.pool)   # e.g., 100

Option B: Pass Raw Scores (No Adapter)

If you already have retrieval scores from your pipeline, pass them directly:

from lex_router import AdaptiveRouter, ScoreType

router = AdaptiveRouter(
    dense_score_type=ScoreType.COSINE,
    sparse_score_type=ScoreType.RAW
)

decision = router.route_from_scores(
    dense_scores=[0.92, 0.87, 0.61, 0.45, 0.32],
    sparse_scores=[12.4, 8.1, 5.3],
    dense_docs=["doc_A", "doc_A", "doc_B", "doc_C", "doc_D"],
    sparse_docs=["doc_A", "doc_C", "doc_B"],
    query="Does the MAE clause exclude pandemic events?",
)

print(decision.route)               # 'narrow_precise'
print(decision.metadata['reason'])  # 'confident_top1'

Advanced Usage

Handling Different Embedders (ScoreType)

Different vector databases and embedding models return scores on completely different mathematical scales. Lex-Router normalizes these to a standard [0, 1] scale internally.

When initializing the router, declare your score types:

from lex_router import ScoreType

router = AdaptiveRouter(
    dense_score_type=ScoreType.L2,              # Converts [0, inf] -> [1, 0]
    sparse_score_type=ScoreType.RAW             # Leaves BM25 scores unbounded
)

Supported types: COSINE, COSINE_DISTANCE, L2, DOT_PRODUCT, LOGITS (for cross-encoders), RAW, and CUSTOM.
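The exact normalizers are internal to the library, but the idea can be sketched with common conventions (the formulas below are illustrative assumptions, not Lex-Router's actual code):

```python
import math

def normalize(score, score_type):
    """Map a raw retrieval score onto [0, 1], higher = more similar.

    Conventional mappings, shown for illustration only.
    """
    if score_type == "COSINE":            # similarity in [-1, 1]
        return (score + 1.0) / 2.0
    if score_type == "COSINE_DISTANCE":   # distance in [0, 2] -> similarity
        return 1.0 - score / 2.0
    if score_type == "L2":                # distance in [0, inf) -> (0, 1]
        return 1.0 / (1.0 + score)
    if score_type == "LOGITS":            # unbounded logit -> (0, 1) via sigmoid
        return 1.0 / (1.0 + math.exp(-score))
    return score                          # RAW: leave untouched

print(normalize(0.0, "L2"))       # 1.0: zero distance means identical vectors
print(normalize(-4.0, "LOGITS"))  # near 0: a confident "not relevant"
```

Whatever the exact functions, the point is that downstream thresholds compare scores on one shared scale.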

Auto-Calibrating Thresholds

Default routing thresholds (t_high=0.75, t_low=0.35) are optimized for specific models (like BGE-M3). If you use a different embedder (like OpenAI text-embedding-3), your score distribution will shift. Lex-Router can calibrate itself to your model:

# Provide a baseline sample of typical retrieval scores (e.g., from 50 queries)
router = AdaptiveRouter.auto_calibrate(
    baseline_dense_scores=[
        [0.82, 0.71, 0.60], 
        [0.44, 0.42, 0.41], 
        # ... more score batches
    ],
    dense_score_type=ScoreType.COSINE
)
# The router automatically calculates optimal percentiles for thresholding.
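The percentile idea can be sketched in plain Python (a hypothetical reconstruction; the percentiles Lex-Router actually uses are internal):

```python
def calibrate_thresholds(score_batches, high_pct=75, low_pct=25):
    """Derive t_high / t_low from the empirical distribution of top-1 scores.

    Hypothetical sketch: take the best score of each pilot batch and use
    percentiles of that sample as the confidence cutoffs.
    """
    top1 = sorted(max(batch) for batch in score_batches)

    def percentile(p):
        # Nearest-rank percentile over the sorted top-1 sample.
        idx = round(p / 100 * (len(top1) - 1))
        return top1[idx]

    return percentile(high_pct), percentile(low_pct)

batches = [[0.82, 0.71, 0.60], [0.44, 0.42, 0.41], [0.91, 0.55], [0.63, 0.31]]
t_high, t_low = calibrate_thresholds(batches)
print(t_high, t_low)  # 0.82 0.63 for this sample
```

The effect is that "high confidence" is defined relative to what your embedder typically produces, rather than as a fixed constant.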

Ultra-Low Latency Bypass

For real-time voice applications or chat, you can bypass pilot retrieval entirely for very short queries:

from lex_router import RouterConfig

# Any query <= 3 words skips pilot search and goes straight to narrow_precise
cfg = RouterConfig(fast_path_max_tokens=3)
router = AdaptiveRouter(config=cfg, adapter=my_adapter)

Post-Reranking Fallback

After your cross-encoder reranks the results, check if confidence is still too low:

# 1. Get the initial route decision
decision = router.route("some query")

# 2. Run your retrieval pipeline using decision.pool...
# 3. Rerank the retrieved results using a Cross-Encoder...

# 4. Check if the top reranker score is suspiciously low, or if the original 
#    pilot signals had high entropy and low dense/sparse overlap.
signals = decision.metadata.get('signals')
if router.should_fallback(top_reranker_score=0.21, signals=signals):
    fallback = router.get_fallback_config(current_pool=decision.pool)
    # Re-run your retrieval pipeline with fallback.pool (doubled) and all features enabled.

Custom Routing Logic & Thresholds

If you want to use Lex-Router's signal extraction but need entirely different routing logic, you can easily extend it.

1. Custom Thresholds: Pass a RouterConfig to change the default cutoffs.

from lex_router import RouterConfig, AdaptiveRouter

# Require extremely high confidence for the narrow route
cfg = RouterConfig(t_high=0.90, margin_high=0.25)
router = AdaptiveRouter(config=cfg)

2. Custom Routes (Subclassing): Override the _classify method to define your own routing architecture. You get the robust 11-signal vector for free:

from lex_router import AdaptiveRouter, RouteDecision

class MyCustomRouter(AdaptiveRouter):
    def _classify(self, sig):
        # Create a brand new route based on the signals
        if sig.max_dense > 0.95 and sig.unique_doc_count == 1:
            return RouteDecision(route='hyper_precise', pool=10)
        
        # Or fall back to the default logic
        return super()._classify(sig)

Safe Logging for ML Training

Lex-Router is designed to transition from heuristics to machine learning. You can log every decision, along with its 11-feature signal vector, to train an ML model later. Use the router as a context manager to ensure safe resource handling:

with AdaptiveRouter(adapter=my_adapter, log_file="routing_decisions.jsonl") as router:
    decision = router.route("query one")
    decision = router.route("query two")
    
    router.print_summary() # Prints route distribution percentages
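Once the log exists, turning it into a feature matrix is a small parsing job. The field names below (route, signals) are assumptions about the JSONL layout, so check them against your actual log file:

```python
import json

# Two hypothetical log lines, standing in for routing_decisions.jsonl.
log_lines = [
    '{"query": "q1", "route": "normal", "signals": {"max_dense": 0.7, "entropy": 1.2}}',
    '{"query": "q2", "route": "narrow_precise", "signals": {"max_dense": 0.95, "entropy": 0.4}}',
]

X, y = [], []
for line in log_lines:
    record = json.loads(line)
    sig = record["signals"]
    X.append([sig["max_dense"], sig["entropy"]])  # feature vector
    y.append(record["route"])                     # label to predict

print(X, y)  # ready to hand to e.g. an XGBoost or Random Forest classifier
```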

API & Parameters Reference

AdaptiveRouter Methods & Parameters

  • Parameters:
    • adapter (RetrievalAdapter): Your custom DB connection.
    • config (RouterConfig): Tuning parameters and thresholds.
    • pilot_k (int): Number of pilot results to retrieve. Default is 10.
    • dense_score_type / sparse_score_type (ScoreType): Math scale of scores (ScoreType.COSINE, RAW, etc.).
    • custom_dense_normalizer / custom_sparse_normalizer (Callable): Pass your own function if using ScoreType.CUSTOM.
    • log_file (str): Path to output a JSONL log of all decisions.
  • Methods:
    • route(query): Auto-routes using the connected adapter.
    • route_from_scores(...): Routes manually using provided raw score lists.
    • auto_calibrate(...): Generates optimal thresholds from a baseline score sample.
    • get_signals(query): Returns the 11-signal vector without making a routing decision.
    • should_fallback(...) & get_fallback_config(...): Safety net functions post-reranking.
    • print_summary() & route_summary(): Outputs routing statistics.
    • simplify_query(query): Static utility method that strips filler words but retains proper nouns.
    • close(): Manually closes the log file (not needed if using the with context manager).
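The intent behind simplify_query can be illustrated with a toy re-implementation (purely hypothetical; the library's stop-word list and heuristics will differ):

```python
import re

FILLER = {"what", "is", "the", "a", "an", "of", "in", "about", "tell", "me"}

def simplify_query(query):
    """Toy sketch: drop filler words, but keep capitalized tokens
    (treated as proper nouns) except at sentence start."""
    tokens = re.findall(r"[A-Za-z0-9']+", query)
    kept = [
        t for i, t in enumerate(tokens)
        if t.lower() not in FILLER or (i > 0 and t[0].isupper())
    ]
    return " ".join(kept)

print(simplify_query("What is the MAE clause about"))  # "MAE clause"
print(simplify_query("tell me about Pfizer"))          # "Pfizer"
```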

RouterConfig

Data class containing all thresholds and boundaries. Pass this to AdaptiveRouter(config=...) to override defaults.

  • Confidence Thresholds: t_high, t_low, margin_high, std_high.
  • Diversity Thresholds: diversity_high (threshold for unique_doc_count).
  • Fallback Thresholds: reranker_threshold, entropy_threshold, overlap_threshold.
  • Rarity Thresholds: rarity_anchor, rarity_generic, rarity_reject.
  • Pools & Top-K: narrow_pool, narrow_k, normal_pool, normal_k, broad_pool, broad_k, fallback_max_pool, fallback_k.
  • fast_path_max_tokens: If > 0, queries with length <= this value bypass the DB entirely for speed.
  • min_pilot_results: Degrades to normal route if fewer pilot results are returned.

RouteDecision

The output of router.route().

  • route (str): The strategy chosen (e.g., 'narrow_precise').
  • pool / top_k (int): Suggested retrieval depths.
  • use_hyde / use_bm25 / use_xref / use_span_refine (bool): Suggested feature toggles. These are conceptual indicators for your downstream pipeline. use_hyde suggests Query Expansion, use_bm25 suggests Keyword Search, and use_xref suggests Graph/Multi-hop routing. Map these to your pipeline's equivalent features.
  • xref_hops (int): Suggested depth for graph/cross-reference traversal (e.g., 1 for normal, 2 for fallback).
  • fallback_enabled (bool): Boolean indicating if fallback logic is permitted.
  • metadata (dict): Contains the exact reason for the route, the raw 11-signal vector (signals), and the error string if an adapter crashed.

compute_signals()

Standalone function to extract the PilotSignals vector directly from raw score arrays if you don't want to use the full AdaptiveRouter class.


License

MIT License. See LICENSE.

