
MCP server for LLM inference performance prediction — Traction Layer AI


Inference Predictor MCP Server

MCP server for Inference Predictor by Traction Layer AI. It predicts LLM inference performance (time to first token (TTFT), throughput, and cost) for any Hardware x Model x Runtime configuration, directly from Claude.

Installation

pip install inference-predictor-mcp

Claude Desktop Configuration

Add the following to your Claude Desktop config (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "inference-predictor": {
      "command": "python",
      "args": ["-m", "inference_predictor_mcp"]
    }
  }
}

With a Pro API key, which enables the Pro tier tools (compare, optimize, batch sweep, visualize, SKU listing), pass the key through the KPG_API_KEY environment variable:

{
  "mcpServers": {
    "inference-predictor": {
      "command": "python",
      "args": ["-m", "inference_predictor_mcp"],
      "env": {
        "KPG_API_KEY": "your-api-key-here"
      }
    }
  }
}
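
If you would rather not edit the JSON by hand, the entry above can be merged into an existing config programmatically. The following is a minimal sketch, assuming the macOS config path quoted in this section. The helper names `add_server_entry` and `update_config_file` are hypothetical and are not part of this package.

```python
import json
from pathlib import Path

# macOS location quoted in this README; other platforms use a different path.
CONFIG_PATH = Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"


def add_server_entry(config, api_key=None):
    """Return a copy of `config` with the inference-predictor entry added.

    Hypothetical helper: it only builds the JSON structure shown above.
    """
    entry = {"command": "python", "args": ["-m", "inference_predictor_mcp"]}
    if api_key:
        # Pro tier: hand the key to the server through its environment.
        entry["env"] = {"KPG_API_KEY": api_key}
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep other servers intact
    servers["inference-predictor"] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path=CONFIG_PATH, api_key=None):
    """Read the config if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_server_entry(existing, api_key), indent=2))
```

Calling `update_config_file()` writes the free-tier entry, and `update_config_file(api_key="your-api-key-here")` writes the Pro variant. Other servers already listed under `mcpServers` are preserved.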

Available Tools

Free Tier (no API key required)

| Tool | Description |
|------|-------------|
| predict_performance | Predict TTFT, throughput, cost for a single hardware config |
| check_hardware_compatibility | Check which GPUs can fit a model's weights |
| explain_model | Generate educational architecture explainer |
| list_models | List all 18 registered models with parameters |
| health_check | Check API health and version |

Pro Tier (API key required)

| Tool | Description |
|------|-------------|
| compare_configs | Compare vLLM, SGLang, TensorRT-LLM side-by-side |
| find_optimal_hardware | Search for cheapest/fastest hardware config |
| batch_size_sweep | Sweep batch sizes to find optimal throughput |
| visualize_kpg | Generate interactive Kernel Pipeline Graph |
| list_hardware_skus | List AWS GPU instance SKUs with pricing |
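
The split between the two tiers can be summarized in a few lines of Python. This is only an illustrative sketch: the tool names come from the lists above, but the function `available_tools` is hypothetical, and the README does not say whether the server hides Pro tools or rejects the calls when no key is set.

```python
import os
from typing import List, Optional

FREE_TOOLS = [
    "predict_performance",
    "check_hardware_compatibility",
    "explain_model",
    "list_models",
    "health_check",
]

PRO_TOOLS = [
    "compare_configs",
    "find_optimal_hardware",
    "batch_size_sweep",
    "visualize_kpg",
    "list_hardware_skus",
]


def available_tools(api_key: Optional[str]) -> List[str]:
    """Return the tool names usable with the given key (hypothetical helper).

    An empty or missing key is treated as the free tier.
    """
    if api_key:
        return FREE_TOOLS + PRO_TOOLS
    return list(FREE_TOOLS)


if __name__ == "__main__":
    # KPG_API_KEY is the variable this README uses for the Pro key.
    for name in available_tools(os.environ.get("KPG_API_KEY")):
        print(name)
```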

Get a Pro API Key

Visit predictor.tractionlayer.ai to obtain a Pro API key.

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| KPG_API_BASE_URL | Production API Gateway | Override for self-hosted deployments |
| KPG_API_KEY | (none) | Pro tier API key |
| KPG_TIMEOUT | 30 | HTTP timeout in seconds |
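
To make the defaults concrete, here is a minimal sketch of how a client could read these three variables. The `Settings` class and `load_settings` function are assumptions for illustration and may not match the package's internals. The production gateway URL is not given in this README, so the sketch uses `None` to mean "use the built-in default".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    base_url: Optional[str]  # None: fall back to the built-in production gateway
    api_key: Optional[str]   # None: free tier only
    timeout: float           # HTTP timeout in seconds


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping (hypothetical helper).

    Unset or empty variables fall back to the documented defaults.
    """
    return Settings(
        base_url=env.get("KPG_API_BASE_URL") or None,
        api_key=env.get("KPG_API_KEY") or None,
        timeout=float(env.get("KPG_TIMEOUT") or 30),
    )
```

Passing the mapping as a parameter keeps the function easy to exercise without touching the real process environment, for example `load_settings({"KPG_TIMEOUT": "5"})`.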

Development

If you're working on both this package and the main KPG_Predictor repo in the same virtualenv, you may hit a starlette version conflict between the MCP SDK (requires >=1.0.0) and FastAPI (requires <0.49.0). To resolve it, install MCP first, then explicitly pin starlette:

pip install -e mcp/
pip install "starlette<0.49.0,>=0.40.0"

End users who only pip install inference-predictor-mcp don't encounter this.
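
To confirm that the pin took effect in a shared virtualenv, a quick check along these lines may help. It is a sketch that assumes the range from the pin command above (>=0.40.0, <0.49.0). The function names are hypothetical, and the version parsing is deliberately simple: it ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(v):
    """Parse the leading numeric part of a version into a 3-tuple of ints."""
    m = re.match(r"\d+(?:\.\d+)*", v)
    nums = [int(p) for p in m.group(0).split(".")] if m else []
    nums = nums[:3]
    while len(nums) < 3:
        nums.append(0)  # treat "0.40" as "0.40.0"
    return tuple(nums)


def in_pinned_range(v, lower="0.40.0", upper="0.49.0"):
    """True if lower <= v < upper, comparing numeric release tuples."""
    return parse_release(lower) <= parse_release(v) < parse_release(upper)


def installed_starlette_ok():
    """Check the installed starlette; returns None if it is not installed."""
    try:
        return in_pinned_range(version("starlette"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("starlette within pinned range:", installed_starlette_ok())
```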

Documentation

Full Inference Predictor documentation: docs/cli.md
