Skip to main content

Quick LLM routing using Embeddings

Project description

LLM Router

A fast semantic router using Embeddings.

LLM Router lets you define "routes" - sets of sentences or keywords with similar semantics. You can then use these routes to route a user's input to the appropriate LLM (or other model) for a response, leading to faster inference and better results. It's a faster and more efficient alternative to using a single LLM for all responses or letting a single LLM choose function calls.

It can work entirely locally using sentence_transformers or using OpenAI's API.

Important: results WILL vary between openai and the sbert/sentence-transformer model you pick. You'll need to experiment to find the best model for your use case, as well as adjusting the threshold for your needs.

But why?

This code was created as part of Clio AI (my most recent startup) - the idea then was to allow selecting the best "agent" to handle a certain conversation based on the semantics of what the user is saying, which allowed us to use "lesser" models like GPT-3.5 instead of GPT-4 to respond with the appropriate set of system prompt, function calls, etc. In some cases, it allows you to skip LLMs altogether, saving time and $$$.

Usage

Install with:

pip install llm-router

To use the OpenAI API, you'll need an API Key and the openai pip package (pip install openai).

To use a Sentence Transformers model, you'll need the sentence_transformers pip package (pip install sentence_transformers). No API key is required, model weights will be downloaded from huggingface on first use.

By default, the Router will always match one of the routes. If you'd like to allow the user to say something that doesn't match any of the routes, you can set the threshold value when initializing the Router. This will return None if no routes match with a certain percentage (0 to 1.0).

Define routes in code and use the router like so:

from llm_router import Router, Route
from llm_router.chroma import SentenceTransformer

router = Router(
    engine=SentenceTransformersEngine(
        model_name='all-distilroberta-v1',
        threshold=0.3,
    ),
    routes=[
        Route(
            name='upscale',
            sentences=[
                'upscale the image',
                'increase resolution',
                'I want a larger image',
                'increase the pixel count',
                'increase the size of the image',
                'increase the resolution of the image',
                'I think this is too small',
            ]
        ),
        Route(
            name='edit',
            sentences=[
                'edit image',
                'rotate image',
                'flip image',
                'resize image',
                'adjust contrast',
                'adjust saturation',
                'adjust brightness',
                'change colors',
                'color balance',
                'change image format',
                'change dimensions',
                'change size',
                'crop image',
            ]
        )
    ]
)


user_message = 'I want to increase the resolution of the image, is that possible?'

if router.match(user_message) == 'upscale':
    handle_upscale()
elif router.match(user_message) == 'edit':
    handle_edit()
else:
    print('Sorry, I don\'t understand.')

References

  • Pretrained Sentence Transformer models - all-mpnet-base-v2 is recommended for most use casesm but you can also use all-distilroberta-v1 for a smaller model that's faster. I haven't tested other models yet.

Future enhancements (PRs welcome!)

  • Support for other embedding APIs (cohere, vertexai, etc)
  • Storing weights to prevent reprocessing every time
  • Add a finetuning script to create routes from a dataset and refine sentences

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm-router-0.1.1.tar.gz (5.1 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page