LLMux-Optimizer

Automatically find cheaper LLM alternatives while maintaining performance.

Quick Start

import llmux

# Find the cheapest model that maintains your accuracy requirements
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="your_data.jsonl",
    min_accuracy=0.9
)

print(f"Best model: {result['model']}")
print(f"Cost savings: {result['cost_savings']:.1%}")
print(f"Accuracy: {result['accuracy']:.1%}")

Installation

pip install llmux-optimizer

Why LLMux?

  • One-liner optimization - Just specify baseline and dataset
  • Real cost savings - Average 73% reduction in LLM costs
  • Multiple providers - Tests 18+ models across OpenAI, Anthropic, Google, Meta, Mistral, and more
  • Smart stopping - Skips smaller models when larger ones fail (saves API calls)
  • Production ready - Used by companies processing millions of requests

Features

Simple API

import llmux

# Basic usage
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="data.jsonl"
)

# With custom parameters
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="data.jsonl",
    prompt="Classify the sentiment as positive, negative, or neutral",
    task="classification",
    min_accuracy=0.85,
    sample_size=0.2  # Test on 20% of data for speed
)
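
The sample_size parameter trades evaluation confidence for speed. How LLMux draws its sample is not documented here, so the following is only a sketch of one plausible approach: a seeded random subset. The helper sample_rows and its seed argument are hypothetical and not part of the llmux API.

```python
import random

def sample_rows(rows, sample_size, seed=0):
    """Return a reproducible random subset holding `sample_size` of `rows`.

    Hypothetical helper for illustration; not part of llmux.
    """
    if not 0.0 < sample_size <= 1.0:
        raise ValueError("sample_size must be in (0.0, 1.0]")
    # Keep at least one row, but never ask for more rows than exist.
    k = min(len(rows), max(1, round(len(rows) * sample_size)))
    return random.Random(seed).sample(rows, k)

rows = [{"input": f"text {i}", "label": "positive"} for i in range(50)]
subset = sample_rows(rows, 0.2)
print(len(subset))  # 10
```

Fixing the seed means repeated runs score every candidate model on the same rows, which keeps accuracy figures comparable across models.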

Supported Tasks

  • Classification - Sentiment analysis, intent detection, categorization
  • Extraction - Named entity recognition, information extraction
  • Generation - Text completion, summarization, translation
  • Binary - Yes/no, true/false decisions

Model Universe

LLMux tests models from a curated set, including:

  • OpenAI (GPT-4, GPT-3.5)
  • Anthropic (Claude 3 Haiku, Sonnet)
  • Google (Gemini Pro, Flash)
  • Meta (Llama 3.1 8B, 70B)
  • Mistral (7B, Mixtral, Large)
  • And more...

Examples

Classification Task

import llmux

# Sentiment analysis
examples = [
    {"input": "This product is amazing!", "ground_truth": "positive"},
    {"input": "Terrible service", "ground_truth": "negative"},
    {"input": "It's okay", "ground_truth": "neutral"}
]

result = llmux.optimize_cost(
    baseline="gpt-4",
    examples=examples,
    task="classification",
    options=["positive", "negative", "neutral"]
)

Banking Intent Classification

import llmux

# Prepare dataset (one-time); requires prepare_banking77.py to be importable
from prepare_banking77 import prepare_banking77_dataset
prepare_banking77_dataset()

# Find optimal model
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="data/banking77_test.jsonl",
    prompt="Classify the banking customer query into one of 77 intent categories",
    task="classification",
    min_accuracy=0.8
)

Cost Comparison

Typical savings on standard benchmarks:

Dataset   | Baseline | Best Alternative | Cost Savings | Accuracy
IMDB      | GPT-4    | Llama-3.1-8B     | 96.3%        | 95.2%
AG News   | GPT-4    | Mistral-7B       | 94.7%        | 93.8%
Banking77 | GPT-4    | GPT-3.5-turbo    | 89.2%        | 91.4%

Advanced Usage

Custom Evaluation

from llmux import Evaluator, Provider

# Use specific provider
provider = Provider.get_provider("openrouter", model="meta-llama/llama-3.1-8b")
evaluator = Evaluator(provider)

# Run evaluation
accuracy, results = evaluator.evaluate(
    dataset="test_data.jsonl",
    system_prompt="You are a helpful assistant"
)

Smart Stopping

Smart stopping is applied automatically: if a larger model in a family (e.g., Llama-70B) fails to meet the accuracy requirement, the smaller models in that family (e.g., Llama-8B) are skipped to save API calls.
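
The idea can be sketched in a few lines. This is a minimal illustration, not LLMux's actual implementation: the function name, the families mapping, and the accuracy values are all made up for the example, and evaluate stands in for a real API-backed evaluation.

```python
def evaluate_with_smart_stopping(families, evaluate, min_accuracy):
    """Evaluate each family from its largest model to its smallest.

    `families` maps a family name to model names ordered largest first.
    `evaluate` is a callable returning accuracy in [0, 1] for a model name.
    Once a model misses `min_accuracy`, the remaining (smaller) models in
    that family are skipped on the assumption that they would do no better.
    """
    passed, skipped = {}, []
    for models in families.values():
        for i, model in enumerate(models):
            accuracy = evaluate(model)
            if accuracy < min_accuracy:
                skipped.extend(models[i + 1:])
                break
            passed[model] = accuracy
    return passed, skipped

# Made-up accuracies standing in for real evaluations.
fake_scores = {"llama-70b": 0.82, "llama-8b": 0.75,
               "mixtral": 0.93, "mistral-7b": 0.91}
families = {"llama": ["llama-70b", "llama-8b"],
            "mistral": ["mixtral", "mistral-7b"]}
calls = []

def fake_evaluate(model):
    calls.append(model)  # record which models would cost an API run
    return fake_scores[model]

passed, skipped = evaluate_with_smart_stopping(families, fake_evaluate, 0.9)
print(passed)   # {'mixtral': 0.93, 'mistral-7b': 0.91}
print(skipped)  # ['llama-8b']
```

The heuristic assumes accuracy does not increase as models in a family get smaller. A small model that happens to suit a task better than its larger sibling would be missed.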

Dataset Format

LLMux expects a JSONL file with one JSON object per line, each with input and label fields:

{"input": "Example text", "label": "category"}
{"input": "Another example", "label": "other_category"}

Or pass a list to the examples parameter directly. Note that these dicts use ground_truth rather than label:

examples = [
    {"input": "text", "ground_truth": "label"},
    ...
]
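
Because the two input forms use different key names for the expected answer, it can help to convert between them. The sketch below writes a small JSONL file with the standard library and reads it back in the examples shape. The helpers write_jsonl and jsonl_to_examples are hypothetical and not part of llmux.

```python
import json
import os
import tempfile

def write_jsonl(path, rows):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")

def jsonl_to_examples(path):
    """Read {"input", "label"} rows and return {"input", "ground_truth"} dicts."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            row = json.loads(line)
            examples.append({"input": row["input"], "ground_truth": row["label"]})
    return examples

rows = [
    {"input": "Example text", "label": "category"},
    {"input": "Another example", "label": "other_category"},
]
path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
write_jsonl(path, rows)
examples = jsonl_to_examples(path)
print(examples[0])  # {'input': 'Example text', 'ground_truth': 'category'}
```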

API Reference

optimize_cost()

Main function to find the best cost-optimized model.

Parameters:

  • baseline (str): Reference model whose cost the alternatives must beat (e.g., "gpt-4")
  • dataset (str): Path to JSONL dataset file (not needed when examples is given)
  • prompt (str, optional): System prompt for the task
  • task (str, optional): Task type ("classification", "extraction", "generation", "binary")
  • min_accuracy (float, optional): Minimum acceptable accuracy (default: 0.9)
  • sample_size (float, optional): Fraction of the dataset to use (0.0-1.0)
  • options (list, optional): Valid output options for classification
  • examples (list, optional): Direct examples instead of dataset file

Returns:

  • Dictionary with:
    • model: Best model found
    • accuracy: Achieved accuracy
    • cost_savings: Fraction of cost saved vs baseline (format with :.1% to display as a percentage)
    • cost_per_million: Cost per million tokens
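
To make the returned fields concrete, here is a rough sketch of how such a result could be assembled: keep the candidates that meet min_accuracy, take the cheapest, and express its cost relative to the baseline. This assumes cost_savings is computed as 1 - candidate cost / baseline cost, which the documentation above does not confirm. The function pick_cheapest, the model names, and the prices are all made up.

```python
def pick_cheapest(candidates, baseline_cost, min_accuracy=0.9):
    """Return a result dict for the cheapest candidate meeting min_accuracy.

    Returns None when no candidate is accurate enough.
    Hypothetical helper; not the llmux implementation.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    eligible = [c for c in candidates if c["accuracy"] >= min_accuracy]
    if not eligible:
        return None
    best = min(eligible, key=lambda c: c["cost_per_million"])
    return {
        "model": best["model"],
        "accuracy": best["accuracy"],
        # Assumed definition: share of the baseline cost that is avoided.
        "cost_savings": 1.0 - best["cost_per_million"] / baseline_cost,
        "cost_per_million": best["cost_per_million"],
    }

# Made-up accuracies and prices (USD per million tokens).
candidates = [
    {"model": "model-a", "accuracy": 0.95, "cost_per_million": 3.0},
    {"model": "model-b", "accuracy": 0.88, "cost_per_million": 0.5},
    {"model": "model-c", "accuracy": 0.92, "cost_per_million": 1.5},
]
result = pick_cheapest(candidates, baseline_cost=30.0, min_accuracy=0.9)
print(f"{result['model']}: {result['cost_savings']:.1%}")  # model-c: 95.0%
```

Here model-b is the cheapest overall but is excluded for missing the accuracy floor, so model-c wins.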

Requirements

  • Python 3.8+
  • OpenRouter API key (set as OPENROUTER_API_KEY environment variable)
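
A missing key is easier to diagnose before any evaluation starts. The check below is a small sketch using only the standard library. The helper require_api_key is hypothetical, and how llmux itself reports a missing key is not documented here.

```python
import os

def require_api_key(env=os.environ, name="OPENROUTER_API_KEY"):
    """Return the API key from `env`, or raise a clear error if it is absent."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before calling llmux")
    return key

# Demonstrated with a plain dict so the example runs without a real key.
print(require_api_key({"OPENROUTER_API_KEY": "sk-example"}))  # sk-example
```

In a real script, calling require_api_key() with no arguments reads from the process environment.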

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.

Citation

If you use LLMux in your research, please cite:

@software{llmux2024,
  title = {LLMux: Automatic LLM Cost Optimization},
  author = {Ahuja, Mihir},
  year = {2024},
  url = {https://github.com/mihirahuja/llmux}
}
