Automatically find cheaper LLM alternatives while maintaining performance

LLMux-Optimizer

Automatically find cheaper LLM alternatives while maintaining performance.

Quick Start

import llmux

# Find the cheapest model that maintains your accuracy requirements
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="your_data.jsonl",
    min_accuracy=0.9
)

print(f"Best model: {result['model']}")
print(f"Cost savings: {result['cost_savings']:.1%}")
print(f"Accuracy: {result['accuracy']:.1%}")

Installation

pip install llmux-optimizer

Why LLMux?

One-liner optimization - Just specify baseline and dataset
Real cost savings - Average 73% reduction in LLM costs
Multiple providers - Tests 18+ models across OpenAI, Anthropic, Google, Meta, Mistral, and more
Smart stopping - Skips smaller models when larger ones fail (saves API calls)
Production ready - Used by companies processing millions of requests

Features

Simple API

# Basic usage
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="data.jsonl"
)

# With custom parameters
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="data.jsonl",
    prompt="Classify the sentiment as positive, negative, or neutral",
    task="classification",
    min_accuracy=0.85,
    sample_size=0.2  # Test on 20% of data for speed
)
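
The sample_size parameter above is described only as the share of the data to test on. How LLMux draws that subset is not documented in this README; the following is a minimal sketch of one plausible approach, a seeded random sample. The helper name sample_fraction and its seed argument are hypothetical and not part of the llmux API.

```python
import random


def sample_fraction(rows, fraction, seed=0):
    """Return a reproducible random subset holding roughly `fraction` of rows.

    Hypothetical helper for illustration; not part of llmux.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0.0, 1.0]")
    if not rows:
        return []
    # Always keep at least one row so there is something to evaluate.
    k = max(1, round(len(rows) * fraction))
    rng = random.Random(seed)  # local RNG, leaves the global random state untouched
    return rng.sample(rows, k)


rows = [{"input": f"text {i}", "label": "positive"} for i in range(50)]
subset = sample_fraction(rows, 0.2)
print(len(subset))  # 10 of the 50 rows
```

A fixed seed keeps repeated runs comparable, which matters when the same subset is used to score several candidate models.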

Supported Tasks

Classification - Sentiment analysis, intent detection, categorization
Extraction - Named entity recognition, information extraction
Generation - Text completion, summarization, translation
Binary - Yes/no, true/false decisions

Model Universe

LLMux tests models from a curated set, including:

OpenAI (GPT-4, GPT-3.5)
Anthropic (Claude 3 Haiku, Sonnet)
Google (Gemini Pro, Flash)
Meta (Llama 3.1 8B, 70B)
Mistral (7B, Mixtral, Large)
And more...

Examples

Classification Task

import llmux

# Sentiment analysis
examples = [
    {"input": "This product is amazing!", "ground_truth": "positive"},
    {"input": "Terrible service", "ground_truth": "negative"},
    {"input": "It's okay", "ground_truth": "neutral"}
]

result = llmux.optimize_cost(
    baseline="gpt-4",
    examples=examples,
    task="classification",
    options=["positive", "negative", "neutral"]
)

Banking Intent Classification

import llmux

# Prepare dataset (one-time). prepare_banking77 is assumed to be a helper
# script available on the import path.
from prepare_banking77 import prepare_banking77_dataset
prepare_banking77_dataset()

# Find optimal model
result = llmux.optimize_cost(
    baseline="gpt-4",
    dataset="data/banking77_test.jsonl",
    prompt="Classify the banking customer query into one of 77 intent categories",
    task="classification",
    min_accuracy=0.8
)

Cost Comparison

Typical savings on standard benchmarks:

Dataset	Baseline	Best Alternative	Cost Savings	Accuracy
IMDB	GPT-4	Llama-3.1-8B	96.3%	95.2%
AG News	GPT-4	Mistral-7B	94.7%	93.8%
Banking77	GPT-4	GPT-3.5-turbo	89.2%	91.4%

Advanced Usage

Custom Evaluation

from llmux import Evaluator, Provider

# Use specific provider
provider = Provider.get_provider("openrouter", model="meta-llama/llama-3.1-8b")
evaluator = Evaluator(provider)

# Run evaluation
accuracy, results = evaluator.evaluate(
    dataset="test_data.jsonl",
    system_prompt="You are a helpful assistant"
)
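
The evaluate call above returns an accuracy together with per-example results, but the scoring rule is not spelled out in this README. As a rough illustration, classification accuracy is often computed as the share of predictions that exactly match the label after normalizing case and surrounding whitespace. The function below is a hypothetical stand-in for that idea, not the Evaluator implementation.

```python
def exact_match_accuracy(predictions, labels):
    """Share of predictions equal to their label, ignoring case and outer whitespace."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0, []
    results = [
        {"prediction": p, "label": y, "correct": p.strip().lower() == y.strip().lower()}
        for p, y in zip(predictions, labels)
    ]
    accuracy = sum(r["correct"] for r in results) / len(results)
    return accuracy, results


accuracy, results = exact_match_accuracy(
    ["Positive", "negative ", "neutral", "positive"],
    ["positive", "negative", "negative", "positive"],
)
print(f"Accuracy: {accuracy:.1%}")  # 3 of 4 correct -> 75.0%
```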

Smart Stopping

LLMux applies smart stopping automatically: if a larger model in a family (e.g., Llama-70B) fails to meet the accuracy requirement, the smaller models in that family (e.g., Llama-8B) are skipped to save API calls.
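
The skipping rule can be sketched in a few lines. This is an illustration of the idea only, assuming each family is listed from largest to smallest model; the function name, the family layout, and the accuracy numbers are all made up for the example and do not reflect LLMux internals.

```python
def evaluate_with_smart_stopping(families, evaluate, min_accuracy):
    """Evaluate each family from largest to smallest model.

    Once a model falls below min_accuracy, the smaller models that follow
    it in the same family are skipped without being called.
    """
    passed, skipped = {}, []
    for models in families.values():
        for i, model in enumerate(models):
            accuracy = evaluate(model)
            if accuracy < min_accuracy:
                skipped.extend(models[i + 1:])
                break
            passed[model] = accuracy
    return passed, skipped


# Made-up accuracies standing in for real API evaluations.
fake_scores = {"llama-70b": 0.82, "llama-8b": 0.70, "mistral-large": 0.93, "mistral-7b": 0.91}
calls = []


def fake_evaluate(model):
    calls.append(model)  # record which models were actually evaluated
    return fake_scores[model]


families = {
    "llama": ["llama-70b", "llama-8b"],  # ordered largest first
    "mistral": ["mistral-large", "mistral-7b"],
}
passed, skipped = evaluate_with_smart_stopping(families, fake_evaluate, min_accuracy=0.9)
print(passed)   # {'mistral-large': 0.93, 'mistral-7b': 0.91}
print(skipped)  # ['llama-8b']
```

In this run only three of the four models are evaluated, since llama-8b is never called after llama-70b misses the threshold.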

Dataset Format

LLMux expects a JSONL file with one JSON object per line, each containing input and label fields:

{"input": "Example text", "label": "category"}
{"input": "Another example", "label": "other_category"}

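Alternatively, pass a list of dicts through the examples parameter. Note that the examples in this README use a ground_truth key here rather than label: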

examples = [
    {"input": "text", "ground_truth": "label"},
    ...
]
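
To move between the two shapes above, a small converter can write in-memory examples out as a JSONL file and read it back with basic validation. This is a sketch using only the standard library; write_jsonl and read_jsonl are hypothetical helper names, and accepting either ground_truth or label as the target key is an assumption made for convenience.

```python
import json
import os
import tempfile


def write_jsonl(examples, path):
    """Write examples as JSONL rows with `input` and `label` fields."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            # Accept either key for the target, as both appear in this README.
            label = ex["label"] if "label" in ex else ex["ground_truth"]
            f.write(json.dumps({"input": ex["input"], "label": label}) + "\n")


def read_jsonl(path):
    """Read a JSONL file, skipping blank lines and checking required fields."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue
            row = json.loads(line)
            missing = {"input", "label"} - set(row)
            if missing:
                raise ValueError(f"line {line_no}: missing {sorted(missing)}")
            rows.append(row)
    return rows


examples = [
    {"input": "This product is amazing!", "ground_truth": "positive"},
    {"input": "Terrible service", "ground_truth": "negative"},
]
path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
write_jsonl(examples, path)
rows = read_jsonl(path)
print(rows[0])  # {'input': 'This product is amazing!', 'label': 'positive'}
```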

API Reference

optimize_cost()

Main function to find the best cost-optimized model.

Parameters:

baseline (str): Reference model that candidates are compared against (e.g., "gpt-4")
dataset (str): Path to JSONL dataset file (the examples above omit it when examples is given)
prompt (str, optional): System prompt for the task
task (str, optional): Task type ("classification", "extraction", "generation", "binary")
min_accuracy (float): Minimum acceptable accuracy (default: 0.9)
sample_size (float, optional): Fraction of the dataset to use (0.0-1.0)
options (list, optional): Valid output options for classification
examples (list, optional): List of example dicts passed directly instead of a dataset file

Returns:

Dictionary with:
- model: Best model found
- accuracy: Achieved accuracy
- cost_savings: Cost saved vs baseline (the Quick Start formats it with :.1%, which implies a fraction such as 0.73)
- cost_per_million: Cost per million tokens
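
As a rough picture of how these fields relate, the sketch below picks the cheapest candidate that meets the accuracy floor and derives cost_savings as one minus the ratio of its cost to the baseline cost. The function pick_cheapest, the candidate records, and every number in them are illustrative assumptions; the README does not state how LLMux computes these values.

```python
def pick_cheapest(candidates, baseline_cost_per_million, min_accuracy=0.9):
    """Return the cheapest candidate meeting min_accuracy, or None if none does."""
    eligible = [c for c in candidates if c["accuracy"] >= min_accuracy]
    if not eligible:
        return None
    best = min(eligible, key=lambda c: c["cost_per_million"])
    return {
        "model": best["model"],
        "accuracy": best["accuracy"],
        # Fraction saved relative to the baseline, e.g. 0.95 for 95%.
        "cost_savings": 1 - best["cost_per_million"] / baseline_cost_per_million,
        "cost_per_million": best["cost_per_million"],
    }


# Illustrative numbers only; not real prices or measured accuracies.
candidates = [
    {"model": "model-a", "accuracy": 0.95, "cost_per_million": 5.0},
    {"model": "model-b", "accuracy": 0.92, "cost_per_million": 1.0},
    {"model": "model-c", "accuracy": 0.80, "cost_per_million": 0.2},
]
result = pick_cheapest(candidates, baseline_cost_per_million=20.0)
print(f"Best model: {result['model']}")               # model-b
print(f"Cost savings: {result['cost_savings']:.1%}")  # 95.0%
```

Here model-c is cheapest overall but falls below the 0.9 floor, so model-b is chosen.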

Requirements

Python 3.8+
OpenRouter API key (set as OPENROUTER_API_KEY environment variable)
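
A quick pre-flight check for these two requirements might look like the following. It uses only the standard library; check_requirements is a hypothetical helper, and the key value in the usage lines is a placeholder.

```python
import os
import sys


def check_requirements(env=None, version=None):
    """Return a list of unmet requirements (empty when everything is in place)."""
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if tuple(version[:2]) < (3, 8):
        problems.append("Python 3.8+ is required")
    if not env.get("OPENROUTER_API_KEY"):
        problems.append("OPENROUTER_API_KEY is not set")
    return problems


# Passing env and version explicitly makes the check easy to try out.
print(check_requirements(env={"OPENROUTER_API_KEY": "placeholder"}, version=(3, 10)))  # []
print(check_requirements(env={}, version=(3, 7)))
```

Calling check_requirements() with no arguments inspects the running interpreter and the real environment.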

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.

Citation

If you use LLMux in your research, please cite:

@software{llmux2024,
  title = {LLMux: Automatic LLM Cost Optimization},
  author = {Ahuja, Mihir},
  year = {2024},
  url = {https://github.com/mihirahuja/llmux}
}

Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Project details

This version

0.1.0

Aug 10, 2025

Download files

Source Distribution

llmux_optimizer-0.1.0.tar.gz (21.7 kB view details)

Uploaded Aug 10, 2025 Source

Built Distribution

llmux_optimizer-0.1.0-py3-none-any.whl (19.3 kB view details)

Uploaded Aug 10, 2025 Python 3

File details

Details for the file llmux_optimizer-0.1.0.tar.gz.

File metadata

Download URL: llmux_optimizer-0.1.0.tar.gz
Upload date: Aug 10, 2025
Size: 21.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.11

File hashes

Hashes for llmux_optimizer-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`7154ddb0f13700b57759361da736269615c53596a54bacd19302dd6c3c3d294a`
MD5	`ea3f023add21c191270ea99f55d9d95f`
BLAKE2b-256	`1df9af2940d39f723a83b71a7e920ac9d87db7eaa2d9986bf5ab927d9be6f450`

File details

Details for the file llmux_optimizer-0.1.0-py3-none-any.whl.

File metadata

Download URL: llmux_optimizer-0.1.0-py3-none-any.whl
Upload date: Aug 10, 2025
Size: 19.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.11

File hashes

Hashes for llmux_optimizer-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7d8abfe7403ef001727a06ff49d49f4e1860f63a22fd6c32f549c952c2bfa960`
MD5	`996e92c257625b61890c4e9431c3f567`
BLAKE2b-256	`bb8b0dd8512b2bede7dc1a990bc56b9009d349f6576d090b42b03ce0f1f6c072`
