Production-focused Python library for intelligent LLM routing and multi-provider management

JustLLMs

A production-ready Python library focused on intelligent LLM routing and multi-provider management.

Why JustLLMs?

Managing multiple LLM providers is complex. You need to handle different APIs, optimize costs, and ensure reliability. JustLLMs addresses these challenges with a unified interface that automatically routes each request to the best provider based on your criteria, whether that's cost, speed, or quality. By default, JustLLMs uses intelligent cluster-based routing (beta), powered by machine learning, to optimize for all three factors simultaneously.

Installation

pip install justllms

Package size: ~113KB | Lines of code: ~4.3K | Dependencies: Production-focused

Quick Start

from justllms import JustLLM

# Initialize with your API keys
client = JustLLM({
    "providers": {
        "openai": {"api_key": "your-openai-key"},
        "google": {"api_key": "your-google-key"},
        "anthropic": {"api_key": "your-anthropic-key"}
    }
})

# Simple completion - automatically routes to best provider
response = client.completion.create(
    messages=[{"role": "user", "content": "Explain quantum computing briefly"}]
)
print(response.content)

Core Features

Multi-Provider Support

Connect to all major LLM providers with a single, consistent interface:

  • OpenAI (GPT-5, GPT-4, etc.)
  • Google (Gemini 2.5, Gemini 1.5 models)
  • Anthropic (Claude 4, Claude 3.5 models)
  • Azure OpenAI (with deployment mapping)
  • xAI Grok, DeepSeek
  • Ollama (local Llama/Mistral/Phi models hosted on your machine)

# Switch between providers seamlessly
client = JustLLM({
    "providers": {
        "openai": {"api_key": "your-key"},
        "google": {"api_key": "your-key"},
        "anthropic": {"api_key": "your-key"},
        "ollama": {"base_url": "http://localhost:11434"}
    }
})

# Same interface for every provider; here the provider and model are forced explicitly
response1 = client.completion.create(
    messages=[{"role": "user", "content": "Explain AI"}],
    provider="openai",  # Force specific provider
    model="gpt-5"
)

Ollama runs locally and requires no API key. Set OLLAMA_API_BASE (defaults to http://localhost:11434) and JustLLMs automatically discovers every installed model via the Ollama /api/tags endpoint.
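
As an illustration of the discovery step, the sketch below queries the Ollama /api/tags endpoint directly and extracts the model names. It is not JustLLMs' internal code: the helper names parse_model_names and list_ollama_models are hypothetical, and the sample payload only mirrors the general shape of Ollama's response.

```python
import json
from urllib.request import urlopen


def parse_model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models are installed (needs network access)."""
    with urlopen(f"{base_url.rstrip('/')}/api/tags", timeout=5) as resp:
        return parse_model_names(json.load(resp))


# Offline example: a hand-written payload shaped like Ollama's response.
sample = {"models": [{"name": "llama3:latest"}, {"name": "mistral:7b"}]}
print(parse_model_names(sample))
```

Calling list_ollama_models() against a local server should return the same names that `ollama list` shows.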

Intelligent Routing

Intelligent routing is the feature that sets JustLLMs apart. Instead of choosing models manually, let the routing engine automatically select the optimal provider and model for each request based on your priorities.

Available Strategies

🆕 Cluster-Based Routing (Beta) - AI-Powered Query Analysis

Our most advanced routing strategy uses machine learning to analyze query semantics and route each query to the optimal model based on its similarity to training data. It achieves a +7% accuracy improvement and a -27% cost reduction compared to single-model approaches.

# Cluster-based routing (recommended for production)
client = JustLLM({
    "providers": {...},
    "routing": {"strategy": "cluster"}
})

Based on research from "Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing" (the AvengersPro framework).

How Cluster Routing Works

  1. Query Analysis: Your request is embedded using Qwen3-Embedding-0.6B
  2. Cluster Matching: Finds the most similar cluster from pre-trained data
  3. Model Selection: Routes to the best-performing model for that cluster
  4. Fallback: Falls back to configured fallback provider/model or first available if cluster routing is unavailable

Result: Up to 60% cost reduction while improving accuracy, with automatic failover to backup providers.
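
The four steps above can be sketched as a nearest-centroid lookup. This is a toy illustration under stated assumptions, not the library's implementation: the three-dimensional vectors stand in for real Qwen3-Embedding-0.6B embeddings, and the centroids, the cluster names, and the cluster-to-model table are all hypothetical values.

```python
import math

# Hypothetical cluster centroids (real embeddings have far more dimensions).
CENTROIDS = {
    "code": [1.0, 0.0, 0.0],
    "math": [0.0, 1.0, 0.0],
    "chat": [0.0, 0.0, 1.0],
}

# Hypothetical "best model per cluster" table; the pairs are illustrative only.
BEST_MODEL = {
    "code": ("openai", "gpt-5"),
    "math": ("google", "gemini-2.5-pro"),
    "chat": ("google", "gemini-2.5-flash"),
}

FALLBACK = ("azure_openai", "gpt-3.5-turbo")


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(embedding, fallback=FALLBACK):
    """Return (provider, model) for the cluster closest to the query embedding."""
    if embedding is None:  # embedding step unavailable: use the fallback
        return fallback
    cluster = max(CENTROIDS, key=lambda name: cosine(embedding, CENTROIDS[name]))
    return BEST_MODEL[cluster]


print(route([0.9, 0.1, 0.0]))
print(route(None))
```

In this sketch a missing embedding models the "cluster routing is unavailable" case from step 4.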

Side-by-Side Model Comparison

Compare multiple LLM providers and models simultaneously with our interactive SXS (Side-by-Side) comparison tool. Perfect for evaluating model performance, testing prompts, and making informed decisions about which models to use.

Features

  • Interactive CLI: Select providers and models using checkbox interface
  • Parallel Execution: All models run simultaneously for fair comparison
  • Real-time Results: Live display with loading animation until all models complete
  • Comprehensive Metrics: Compare latency, token usage, response quality and costs across models
  • Multiple Providers: Test OpenAI, Google, Anthropic, xAI, DeepSeek models side-by-side
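
To make the parallel-execution idea concrete, here is a minimal sketch that runs several model calls concurrently and records a latency for each. It is not the SXS tool's code: timed_call, compare, and the stub functions are hypothetical stand-ins for real provider calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(name, fn, prompt):
    """Run one model call and capture its status, latency, and output."""
    start = time.perf_counter()
    try:
        text, status = fn(prompt), "success"
    except Exception as exc:  # a failing provider must not abort the comparison
        text, status = str(exc), "error"
    return {
        "model": name,
        "status": status,
        "latency_s": time.perf_counter() - start,
        "text": text,
    }


def compare(models, prompt):
    """Submit every model at once, then collect results in submission order."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = [pool.submit(timed_call, n, fn, prompt) for n, fn in models.items()]
        return [f.result() for f in futures]


# Stub "models" standing in for real provider calls.
def stub_fast(prompt):
    return f"fast answer to: {prompt}"


def stub_slow(prompt):
    time.sleep(0.05)
    return f"slow answer to: {prompt}"


def stub_broken(prompt):
    raise RuntimeError("provider unavailable")


results = compare(
    {"stub/fast": stub_fast, "stub/slow": stub_slow, "stub/broken": stub_broken},
    "Hello",
)
for r in results:
    print(f"{r['model']:<12} {r['status']:<8} {r['latency_s']:.3f}s")
```

Threads suit this workload because each call spends most of its time waiting on network I/O.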

Usage

# Run the interactive SXS comparison
justllms sxs

The tool will guide you through:

  1. Provider Selection: Choose which LLM providers to compare
  2. Model Selection: Pick specific models from each provider
  3. Prompt Input: Enter your test prompt
  4. Real-time Comparison: View all responses and metrics simultaneously

Example Output

================================================================================
Prompt: Which programming language is better for beginners: Python or JavaScript?
================================================================================

┌─ openai/gpt-5 ──────────────────────────────────────────────────────────────┐
│ Python is generally better for beginners due to its clean, readable syntax  │
│ that resembles natural language. It has fewer confusing concepts like       │
│ hoisting or prototypes, excellent learning resources, and is widely used    │
│ in education. Python's "batteries included" philosophy means beginners can  │
│ accomplish tasks without learning complex setups, making it ideal for       │
│ building confidence early in programming.                                   │
└─────────────────────────────────────────────────────────────────────────────┘

┌─ google/gemini-2.5-pro ─────────────────────────────────────────────────────┐
│ JavaScript has advantages for beginners because it runs everywhere - in     │
│ browsers, servers, and mobile apps. You can see immediate visual results    │
│ when building web pages, which is motivating. The job market heavily favors │
│ JavaScript developers, and modern frameworks make it powerful. While syntax │
│ can be tricky, the instant feedback and versatility make JavaScript a       │
│ practical first language for aspiring developers.                           │
└─────────────────────────────────────────────────────────────────────────────┘

================================================================================
Metrics Summary:

| Model                   |  Status   | Latency (s) | Tokens | Cost ($) |
|-------------------------|-----------|-------------|--------|----------|
| openai/gpt-5            | ✓ Success |        5.69 |    715 |   0.0000 |
| google/gemini-2.5-pro   | ✓ Success |        8.50 |    868 |   0.0003 |

🏆 Comparison with Alternatives

| Feature                 | JustLLMs                     | LangChain              | LiteLLM                | OpenAI SDK        |
|-------------------------|------------------------------|------------------------|------------------------|-------------------|
| Package Size            | Minimal                      | ~50MB                  | ~5MB                   | ~1MB              |
| Setup Complexity        | Simple config                | Complex chains         | Medium                 | Simple            |
| Multi-Provider          | ✅ 7+ providers              | ✅ Many integrations   | ✅ 100+ providers      | ❌ OpenAI only    |
| Intelligent Routing     | ✅ ML-powered cluster routing | ❌ Manual only         | ⚠️ Basic routing       | ❌ None           |
| Side-by-Side Comparison | ✅ Interactive CLI tool      | ❌ None                | ❌ None                | ❌ None           |
| Cost Optimization       | ✅ Automatic routing         | ❌ Manual optimization | ⚠️ Basic cost tracking | ❌ None           |
| Production Ready        | ✅ Out of the box            | ⚠️ Requires setup      | ✅ Minimal setup       | ⚠️ Basic features |

Provider-Specific Parameters

JustLLMs supports common generation parameters across all providers, plus provider-specific configurations:

Common Parameters (All Providers)

Most of these parameters work across OpenAI, Gemini, Anthropic, and other providers; the comments mark the ones that apply to a single provider only:

response = client.completion.create(
    messages=[{"role": "user", "content": "Hello"}],
    # Common parameters
    temperature=0.7,        # 0.0-2.0: Controls randomness
    top_p=0.9,             # 0.0-1.0: Nucleus sampling
    top_k=40,              # Integer: Top-k sampling (Gemini only)
    max_tokens=1024,       # Maximum tokens to generate
    stop=["END"],          # Stop sequence(s)
    n=1,                   # Number of completions (OpenAI only)
    presence_penalty=0.1,  # -2.0 to 2.0: Penalize new topics
    frequency_penalty=0.2  # -2.0 to 2.0: Penalize repetition
)

Gemini-Specific Parameters

Use generation_config for Gemini-only features:

response = client.completion.create(
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    provider="google",
    model="gemini-2.5-flash",
    # Common parameters
    temperature=0.7,
    top_k=40,
    max_tokens=1024,
    # Gemini-specific configuration
    generation_config={
        "candidateCount": 2,                    # Generate multiple responses
        "responseMimeType": "application/json", # JSON output
        "responseSchema": {...},                # Structured output schema
        "thinkingConfig": {                     # Control thinking budget
            "thinkingBudget": 100               # 0-24000 tokens
        }
    }
)

# Access multiple candidates when candidateCount > 1
print(f"Candidate 1: {response.choices[0].message.content}")
print(f"Candidate 2: {response.choices[1].message.content}")

Notes:

  • Common parameters (temperature, top_k, etc.) should be set at the top level. The generation_config dict is for Gemini-exclusive features.
  • If a parameter is specified in both places, the top-level value takes precedence.
  • When candidateCount > 1, all candidates are returned in response.choices[] with proper indices.
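
The precedence rule in these notes can be illustrated with a small merge helper. This is a sketch of the described behavior, not the library's code: merge_generation_config and the TOP_LEVEL_TO_GEMINI table are hypothetical, with the camelCase names chosen to match the Gemini REST field names.

```python
# Hypothetical mapping from top-level parameter names to Gemini field names.
TOP_LEVEL_TO_GEMINI = {
    "temperature": "temperature",
    "top_p": "topP",
    "top_k": "topK",
    "max_tokens": "maxOutputTokens",
    "stop": "stopSequences",
}


def merge_generation_config(top_level, generation_config=None):
    """Combine both sources; a top-level value overrides generation_config."""
    merged = dict(generation_config or {})  # copy so the caller's dict is untouched
    for name, gemini_name in TOP_LEVEL_TO_GEMINI.items():
        if top_level.get(name) is not None:
            merged[gemini_name] = top_level[name]
    return merged


merged = merge_generation_config(
    {"temperature": 0.7, "top_k": 40},
    {"temperature": 0.1, "candidateCount": 2},
)
print(merged)
```

Here temperature ends up as 0.7 because the top-level value wins, while candidateCount passes through unchanged.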

OpenAI-Specific Parameters

OpenAI parameters are passed directly:

response = client.completion.create(
    messages=[{"role": "user", "content": "Hello"}],
    provider="openai",
    model="gpt-4o",
    # Common parameters
    temperature=0.7,
    max_tokens=100,
    n=1,
    presence_penalty=0.1,
    frequency_penalty=0.2
)

Note: top_k is not supported by OpenAI and will be silently ignored. Use generation_config only with Gemini.
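
One way to picture the "silently ignored" behavior is a per-provider filter applied before the request is sent. This is only a sketch under assumptions: the UNSUPPORTED table and filter_params are hypothetical, and the table encodes nothing beyond the restrictions stated in this document (top_k and generation_config are Gemini-only, n is OpenAI-only).

```python
# Hypothetical table of parameters each provider does not accept,
# derived only from the notes in this document.
UNSUPPORTED = {
    "openai": {"top_k", "generation_config"},
    "google": {"n"},
}


def filter_params(provider, params):
    """Return a copy of params without the keys the provider does not support."""
    dropped = UNSUPPORTED.get(provider, set())
    return {key: value for key, value in params.items() if key not in dropped}


request = {"temperature": 0.7, "top_k": 40, "n": 1}
print(filter_params("openai", request))
print(filter_params("google", request))
```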

Production Configuration

For production deployments:

import os

from justllms import JustLLM

production_config = {
    "providers": {
        "azure_openai": {
            "api_key": os.getenv("AZURE_OPENAI_KEY"),
            "endpoint": os.getenv("AZURE_OPENAI_ENDPOINT"),
            "resource_name": "my-enterprise-resource",
            "deployment_mapping": {
                "gpt-4": "my-gpt4-deployment",
                "gpt-3.5-turbo": "my-gpt35-deployment"
            }
        },
        "anthropic": {"api_key": os.getenv("ANTHROPIC_KEY")},
        "google": {"api_key": os.getenv("GOOGLE_KEY")},
        "ollama": {
            "base_url": os.getenv("OLLAMA_API_BASE", "http://localhost:11434"),
            "enabled": True,
        }
    },
    "routing": {
        "strategy": "cluster",  # Use intelligent cluster-based routing
        "fallback_provider": "azure_openai",
        "fallback_model": "gpt-3.5-turbo"
    }
}

client = JustLLM(production_config)
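
Because this configuration reads its credentials from the environment, an unset variable yields None rather than an immediate error. A small pre-flight check, sketched below, can surface that early. The helper missing_env_vars is hypothetical and not part of JustLLMs; the variable names are the ones used in the configuration above.

```python
import os

# Environment variables read by the production configuration above.
REQUIRED_ENV_VARS = [
    "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "ANTHROPIC_KEY",
    "GOOGLE_KEY",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

OLLAMA_API_BASE is left out of the check because the configuration already supplies a default for it.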
