Haystack integration for WhichModel — cost-aware LLM model selection

Project description

whichmodel-haystack

Haystack integration for WhichModel — cost-aware LLM model selection for your pipelines.

Installation

pip install whichmodel-haystack

Quick Start

from haystack_integrations.components.routers.whichmodel import WhichModelRouter

router = WhichModelRouter()
result = router.run(task_type="code_generation", complexity="high")

print(result["model_id"])       # e.g. "anthropic/claude-sonnet-4"
print(result["provider"])       # e.g. "anthropic"
print(result["confidence"])     # "high", "medium", or "low"

No API key required. The component calls the public WhichModel MCP server at https://whichmodel.dev/mcp.

Usage in a Pipeline

Use WhichModelRouter to dynamically select the best model before generating:

from haystack import Pipeline
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack_integrations.components.routers.whichmodel import WhichModelRouter

# Get the best model for the task
router = WhichModelRouter()
result = router.run(
    task_type="code_generation",
    complexity="high",
    estimated_input_tokens=2000,
    estimated_output_tokens=1000,
    budget_per_call=0.01,
    requirements={"tool_calling": True},
)

# Report the recommended model and its estimated cost
print(f"Using {result['model_id']} (confidence: {result['confidence']})")
print(f"Estimated cost: ${result['recommendation']['cost_estimate_usd']:.6f}")
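Before generating, it can be useful to check the estimated cost against the budget. The sketch below is plain Python and relies only on the recommendation["cost_estimate_usd"] field printed above. The helper name within_budget and the sample dictionary are hypothetical illustrations, not part of the package.

```python
def within_budget(result: dict, budget_per_call: float) -> bool:
    """Return True if the recommended model's estimated cost fits the budget."""
    # Assumes the result carries recommendation["cost_estimate_usd"],
    # as accessed in the example above.
    cost = result["recommendation"]["cost_estimate_usd"]
    return cost <= budget_per_call


# Hypothetical result, shaped like the fields printed in the example above.
sample_result = {
    "model_id": "anthropic/claude-sonnet-4",
    "confidence": "high",
    "recommendation": {"cost_estimate_usd": 0.0042},
}

print(within_budget(sample_result, budget_per_call=0.01))   # True
print(within_budget(sample_result, budget_per_call=0.001))  # False
```

With a real router, the result of router.run(...) would take the place of sample_result.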

Parameters

Init Parameters

Parameter           Type   Default                     Description
mcp_endpoint        str    https://whichmodel.dev/mcp  WhichModel MCP server URL
timeout             float  30.0                        HTTP request timeout in seconds
default_task_type   str    None                        Default task type for run()
default_complexity  str    "medium"                    Default complexity level

Run Parameters

Parameter                Type   Description
task_type                str    Task type: chat, code_generation, code_review, summarisation, translation, data_extraction, tool_calling, creative_writing, research, classification, embedding, vision, reasoning
complexity               str    "low", "medium", or "high"
estimated_input_tokens   int    Expected input size in tokens
estimated_output_tokens  int    Expected output size in tokens
budget_per_call          float  Max USD per call
requirements             dict   Capability requirements: tool_calling, json_output, streaming, context_window_min, providers_include, providers_exclude
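Since task_type and complexity accept a fixed set of values, a typo can be caught locally before any request is made. The following is a minimal sketch that uses only the values listed in the table above. The helper check_run_args is hypothetical, and whether the component performs its own validation is not stated here.

```python
# Allowed values, copied from the run parameter table above.
TASK_TYPES = {
    "chat", "code_generation", "code_review", "summarisation", "translation",
    "data_extraction", "tool_calling", "creative_writing", "research",
    "classification", "embedding", "vision", "reasoning",
}
COMPLEXITY_LEVELS = {"low", "medium", "high"}


def check_run_args(task_type: str, complexity: str = "medium") -> dict:
    """Validate the arguments locally and return them as keyword arguments."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"Unknown task_type: {task_type!r}")
    if complexity not in COMPLEXITY_LEVELS:
        raise ValueError(f"Unknown complexity: {complexity!r}")
    return {"task_type": task_type, "complexity": complexity}


kwargs = check_run_args("code_review", "low")
print(kwargs)  # {'task_type': 'code_review', 'complexity': 'low'}
```

The returned dictionary could then be passed on as router.run(**kwargs).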

Output

Key             Type  Description
model_id        str   Recommended model ID (e.g. anthropic/claude-sonnet-4)
provider        str   Provider name
recommendation  dict  Full recommendation with score, reasoning, pricing
alternative     dict  Alternative model from different provider/tier
budget_model    dict  Cheapest viable option
confidence      str   "high", "medium", or "low"
data_freshness  str   When pricing data was last updated
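The confidence field can be used to decide whether to follow the recommendation at all. The sketch below is one possible policy, not something the package prescribes. It reads only the top-level model_id and confidence keys from the table above. The helper pick_model, the fallback ID "my-default-model", and the sample dictionaries are hypothetical.

```python
# Ordering of the three confidence labels documented above.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def pick_model(result: dict, fallback_model_id: str,
               min_confidence: str = "medium") -> str:
    """Return the recommended model only if its confidence meets the threshold."""
    if CONFIDENCE_RANK[result["confidence"]] >= CONFIDENCE_RANK[min_confidence]:
        return result["model_id"]
    return fallback_model_id


# Hypothetical results containing only the keys this helper reads.
confident = {"model_id": "anthropic/claude-sonnet-4", "confidence": "high"}
uncertain = {"model_id": "anthropic/claude-sonnet-4", "confidence": "low"}

print(pick_model(confident, "my-default-model"))  # anthropic/claude-sonnet-4
print(pick_model(uncertain, "my-default-model"))  # my-default-model
```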

License

Apache-2.0

Download files

Source Distribution

whichmodel_haystack-0.1.0.tar.gz (5.7 kB)

Uploaded Source

Built Distribution

whichmodel_haystack-0.1.0-py3-none-any.whl (7.6 kB)

Uploaded Python 3

File details

Details for the file whichmodel_haystack-0.1.0.tar.gz.

File metadata

  • Download URL: whichmodel_haystack-0.1.0.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for whichmodel_haystack-0.1.0.tar.gz
Algorithm Hash digest
SHA256 463afff5435cb1cd8565cf260388f10615acd3080505375a50f30b43d1290a69
MD5 6b0248ecd31499f72bbb2fc4c268483e
BLAKE2b-256 1c314b50f1fc4a4a928ca234d2fa4c507321610dfdad6f8fcbea512c207e6400
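A downloaded file can be checked against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch: the helper names sha256_of and matches_expected are hypothetical, and the path passed in is wherever the source distribution was saved.

```python
import hashlib

# SHA256 digest of whichmodel_haystack-0.1.0.tar.gz, as listed above.
EXPECTED_SHA256 = "463afff5435cb1cd8565cf260388f10615acd3080505375a50f30b43d1290a69"


def sha256_of(path) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, matches_expected("whichmodel_haystack-0.1.0.tar.gz") should return True for an unmodified copy of the source distribution in the current directory.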

File details

Details for the file whichmodel_haystack-0.1.0-py3-none-any.whl.

File hashes

Hashes for whichmodel_haystack-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4ccb7475454870ae1f61e0a7a814779c7679495ef0f9f73beb2db29ab451af91
MD5 8a9eed6101b177120cf9ff58e3927dd8
BLAKE2b-256 af8f0f577e46bc077bdccdfbe6d34d0096560cf1e4d10eba95a9a167cd17f0f5
