MCP server for LLM inference performance prediction — Traction Layer AI

Inference Predictor MCP Server

MCP server for Inference Predictor by Traction Layer AI. It predicts LLM inference performance (time to first token (TTFT), throughput, and cost) for any Hardware × Model × Runtime configuration, directly from Claude.

Installation

pip install inference-predictor-mcp

Claude Desktop Configuration

Add the server to your Claude Desktop config (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "inference-predictor": {
      "command": "python",
      "args": ["-m", "inference_predictor_mcp"]
    }
  }
}

With a Pro API key, which enables the Pro tier tools (compare, optimize, batch-size sweep, visualize, SKU listing):

{
  "mcpServers": {
    "inference-predictor": {
      "command": "python",
      "args": ["-m", "inference_predictor_mcp"],
      "env": {
        "KPG_API_KEY": "your-api-key-here"
      }
    }
  }
}
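The two entries above differ only in the optional `env` block. As a minimal sketch, assuming you want to add the server to an existing config without disturbing other entries, the merge could look like the following. The helper names `build_server_entry` and `merge_server` and the `other-server` entry are hypothetical and not part of this package; only the keys `mcpServers`, `command`, `args`, `env`, and `KPG_API_KEY` come from the configuration shown above.

```python
import json
from typing import Optional


def build_server_entry(api_key: Optional[str] = None) -> dict:
    """Return the server entry shown above; include env only when a key is given."""
    entry = {"command": "python", "args": ["-m", "inference_predictor_mcp"]}
    if api_key:
        entry["env"] = {"KPG_API_KEY": api_key}
    return entry


def merge_server(config: dict, api_key: Optional[str] = None) -> dict:
    """Return a copy of config with the inference-predictor server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["inference-predictor"] = build_server_entry(api_key)
    merged["mcpServers"] = servers
    return merged


# Hypothetical pre-existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(merge_server(existing, api_key="your-api-key-here"), indent=2))
```

The functions return new dictionaries rather than mutating their input, so writing the result back to the config file remains an explicit step.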

Available Tools

Free Tier (no API key required)

Tool	Description
`predict_performance`	Predict TTFT, throughput, and cost for a single hardware config
`check_hardware_compatibility`	Check which GPUs can fit a model's weights
`explain_model`	Generate an educational explainer of a model's architecture
`list_models`	List all 18 registered models with parameters
`health_check`	Check API health and version
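Under the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request; Claude Desktop does this for you. The sketch below only illustrates the message shape for `health_check`. It assumes `health_check` takes no arguments, which the table above does not confirm, and the helper name `tool_call_request` is hypothetical.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Assumption: health_check needs no arguments.
print(tool_call_request(1, "health_check", {}))
```

Argument names for the other tools are not documented here, so they are left out rather than guessed.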

Pro Tier (API key required)

Tool	Description
`compare_configs`	Compare vLLM, SGLang, and TensorRT-LLM side by side
`find_optimal_hardware`	Search for cheapest/fastest hardware config
`batch_size_sweep`	Sweep batch sizes to find optimal throughput
`visualize_kpg`	Generate interactive Kernel Pipeline Graph
`list_hardware_skus`	List AWS GPU instance SKUs with pricing
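The two tables can be summarized as a simple lookup keyed on whether `KPG_API_KEY` is set. This is an illustrative sketch only: how the server actually enforces tiers is not documented here, and the function name `available_tools` is hypothetical. The tool names are copied from the tables above.

```python
import os
from typing import List, Mapping, Optional

FREE_TOOLS = [
    "predict_performance",
    "check_hardware_compatibility",
    "explain_model",
    "list_models",
    "health_check",
]
PRO_TOOLS = [
    "compare_configs",
    "find_optimal_hardware",
    "batch_size_sweep",
    "visualize_kpg",
    "list_hardware_skus",
]


def available_tools(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the tool names one would expect to be usable for this environment."""
    env = os.environ if env is None else env
    tools = list(FREE_TOOLS)
    if env.get("KPG_API_KEY"):  # assumption: a non-empty key unlocks the Pro tier
        tools += PRO_TOOLS
    return tools


print(available_tools({}))
```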

Get a Pro API Key

Visit predictor.tractionlayer.ai to obtain a Pro API key.

Environment Variables

Variable	Default	Description
`KPG_API_BASE_URL`	Production API Gateway	Override for self-hosted deployments
`KPG_API_KEY`	(none)	Pro tier API key
`KPG_TIMEOUT`	30	HTTP timeout in seconds
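As a rough sketch of how these variables and their defaults fit together, the following reads them into a small settings object. The `Settings` class and `load_settings` function are hypothetical and do not describe the package's internals. The production base URL is not stated above, so it is represented as `None` rather than guessed.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    base_url: Optional[str]  # None stands for the built-in production default
    api_key: Optional[str]   # None means free tier only
    timeout: float           # HTTP timeout in seconds


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        base_url=env.get("KPG_API_BASE_URL") or None,
        api_key=env.get("KPG_API_KEY") or None,
        timeout=float(env.get("KPG_TIMEOUT") or "30"),
    )


print(load_settings({"KPG_TIMEOUT": "5"}))
```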

Development

If you're working on both this package and the main KPG_Predictor repo in the same virtualenv, you may hit a starlette version conflict between the MCP SDK (requires >=1.0.0) and FastAPI (requires <0.49.0). To resolve it, install the MCP package first, then pin starlette explicitly:

pip install -e mcp/
pip install "starlette<0.49.0,>=0.40.0"

End users who only pip install inference-predictor-mcp don't encounter this.
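To confirm that the pin took effect, you could check the installed starlette version against the range from the command above. This is a minimal sketch using a simplified numeric comparison, not a full PEP 440 implementation; the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version such as '0.47.2'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies_pin(version: str,
                  low: Tuple[int, ...] = (0, 40, 0),
                  high: Tuple[int, ...] = (0, 49, 0)) -> bool:
    """Return True when low <= version < high (the pin shown above)."""
    v = parse_version(version)
    v = v + (0,) * (3 - len(v))  # pad '0.40' to (0, 40, 0)
    return low <= v < high


def installed_starlette_ok() -> Optional[bool]:
    """Check the installed starlette; None means it is not installed."""
    try:
        return satisfies_pin(metadata.version("starlette"))
    except metadata.PackageNotFoundError:
        return None


print(installed_starlette_ok())
```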

Documentation

Full Inference Predictor documentation: docs/cli.md

Version 0.1.0, released May 15, 2026

Download files

Source Distribution

inference_predictor_mcp-0.1.0.tar.gz (6.7 kB)

Uploaded May 15, 2026 Source

Built Distribution

inference_predictor_mcp-0.1.0-py3-none-any.whl (7.5 kB)

Uploaded May 15, 2026 Python 3

File details

Details for the file inference_predictor_mcp-0.1.0.tar.gz.

File metadata

Download URL: inference_predictor_mcp-0.1.0.tar.gz
Upload date: May 15, 2026
Size: 6.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for inference_predictor_mcp-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`323da3b0d3c37618e4d14bef9dbd71f91a141759487a7710f499a63bcc3da359`
MD5	`e2012220b8ee01fb0b22f4fd092c8072`
BLAKE2b-256	`6df3473392c0c644d713b47a809ba222f1111236b82b19bba591f53dcb4f1b20`
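A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the function names and the assumption that the file sits in the current directory are illustrative, and the expected digest is copied from the table.

```python
import hashlib

# SHA256 digest listed above for inference_predictor_mcp-0.1.0.tar.gz
EXPECTED_SHA256 = "323da3b0d3c37618e4d14bef9dbd71f91a141759487a7710f499a63bcc3da359"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("inference_predictor_mcp-0.1.0.tar.gz")` should return `True` for an intact download of the source distribution.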

File details

Details for the file inference_predictor_mcp-0.1.0-py3-none-any.whl.

File metadata

Download URL: inference_predictor_mcp-0.1.0-py3-none-any.whl
Upload date: May 15, 2026
Size: 7.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for inference_predictor_mcp-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7436d889fabc20dcaaed64703d7fa2ef82a6ccd6dcfa95b86517139eaa99e7f8`
MD5	`8b47d4732d98f9c667850ff656fcf657`
BLAKE2b-256	`82405e866f8681d8def6dfbb0681e12f6851de22d1c552f3ad3403b737240861`
