MCP server for accessing Ollama models on a remote GPU server via Model Context Protocol

ollama-almasrv-mcp

MCP (Model Context Protocol) server for accessing Ollama models on a remote GPU server.

Designed for ALMASRV (NVIDIA RTX PRO 6000, 96 GB VRAM), but works with any Ollama instance behind a compatible HTTP gateway.

Features

  • 6 MCP tools: chat, think, embed, similarity, models, health
  • 10 models: 4 local (llama3.3:70b, qwen3:32b, llama3.2:3b, mxbai-embed-large) + 6 cloud
  • 1024-dim embeddings compatible with SQL Server 2025 VECTOR(1024)
  • Thinking models with reasoning traces (qwen3:32b, kimi-k2-thinking, glm-5, kimi-k2.5, minimax-m2.5)
  • Configurable endpoints via environment variables
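The 1024-dimension figure matters when vectors are stored in a fixed-width column such as VECTOR(1024), because a vector of any other length will not fit. Below is a minimal sketch of a pre-insert check, assuming the embedding arrives as a plain list of numbers. The helper name `to_vector_literal` is hypothetical and not part of this package. It serializes the vector as a JSON array string, a common text form for vector columns; confirm the accepted input format against your database version.

```python
import json
import math

EXPECTED_DIM = 1024  # matches the VECTOR(1024) column width from the feature list


def to_vector_literal(embedding, expected_dim=EXPECTED_DIM):
    """Validate an embedding and return it as a JSON array string.

    Hypothetical helper for illustration; not part of ollama-almasrv-mcp.
    """
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(embedding)}"
        )
    values = [float(x) for x in embedding]
    # Reject NaN and infinity, which cannot be represented in strict JSON.
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return json.dumps(values)
```

Checking the length before the insert turns a database-side type error into a clear Python exception at the call site.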

Quick Install

pip install ollama-almasrv-mcp

Setup with Claude Code

# Add to Claude Code (user-level, available everywhere)
claude mcp add --scope user ollama-almasrv -- ollama-almasrv-mcp

# Or with custom server URLs
claude mcp add --scope user \
  -e ALMASRV_GATEWAY_URL=http://your-server:8030 \
  -e ALMASRV_EMBED_URL=http://your-server:8031 \
  ollama-almasrv -- ollama-almasrv-mcp

Available Tools

Tool               Description
ollama_chat        Chat with any Ollama model (default: qwen3:32b)
ollama_think       Chat with thinking models that return reasoning traces
ollama_embed       Generate 1024-dim embedding vectors
ollama_similarity  Calculate cosine similarity between two texts
ollama_models      List all available models
ollama_health      Check gateway and embedding service health
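
ollama_similarity reports the cosine similarity between two texts. The sketch below shows that metric applied to two already-computed vectors in plain Python. How the embedding service computes it internally is not documented here, so treat this as an illustration of the formula rather than the server's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result lies between -1 and 1: vectors pointing the same way score 1, orthogonal vectors score 0, and opposite vectors score -1.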

Configuration

Environment Variable  Default                     Description
ALMASRV_GATEWAY_URL   http://192.168.50.78:8030   Ollama gateway (chat/think/models)
ALMASRV_EMBED_URL     http://192.168.50.78:8031   Embedding service (embed/similarity)
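
A minimal sketch of how the two variables could be resolved, with the documented defaults as fallbacks. The function name `load_endpoints` is hypothetical, and the package's own lookup code may differ.

```python
import os

# Defaults copied from the configuration table.
DEFAULT_GATEWAY_URL = "http://192.168.50.78:8030"
DEFAULT_EMBED_URL = "http://192.168.50.78:8031"


def load_endpoints(env=None):
    """Return (gateway_url, embed_url), preferring environment overrides.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    # Strip trailing slashes so endpoint paths can be appended uniformly.
    gateway = env.get("ALMASRV_GATEWAY_URL", DEFAULT_GATEWAY_URL).rstrip("/")
    embed = env.get("ALMASRV_EMBED_URL", DEFAULT_EMBED_URL).rstrip("/")
    return gateway, embed
```

Because the defaults point at a private address, most installations outside the original network will need both variables set.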

Requirements

  • Python >= 3.10
  • A running Ollama instance with a compatible HTTP gateway
  • Gateway endpoints: /chat, /models, /health
  • Embedding endpoints: /embeddings, /similarity, /health
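
Before registering the server, it can help to confirm that both services answer on /health. The sketch below uses only the standard library and assumes that /health responds to a plain GET and that any 2xx status means healthy. The response body format is not documented here, so it is returned as raw text. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def endpoint_url(base_url, path):
    """Join a base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def check_health(base_url, timeout=5.0):
    """Return (ok, detail) for GET <base_url>/health.

    Assumption: a 2xx status means the service is healthy.
    """
    url = endpoint_url(base_url, "/health")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return 200 <= resp.status < 300, body
    except (urllib.error.URLError, OSError) as exc:
        # Covers connection refusal, DNS failure, timeouts, and HTTP errors.
        return False, str(exc)
```

For example, `check_health("http://your-server:8030")` probes the gateway and `check_health("http://your-server:8031")` probes the embedding service.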

License

MIT
