Model Context Protocol server that proxies local Ollama to MCP clients like Windsurf and VS Code
Ollama MCP Server (Python)
Supercharge your AI assistant with local LLM access
A Python implementation of the MCP (Model Context Protocol) server that exposes Ollama SDK functionality as MCP tools, enabling seamless integration between your local LLM models and MCP-compatible applications like Windsurf and VS Code.
This is a Python port of the TypeScript ollama-mcp project.
Features • Installation • Available Tools • Configuration • Windsurf Integration • Usage • Development
Example of usage
Type in the chat window:
- MCP Tool: ollama / ollama_chat. Use model llava and tell me a bed time story
- MCP Tool: ollama / ollama_chat. Use model gpt-oss and tell me a bed time story
Features
- Ollama Cloud Support - Full integration with Ollama's cloud platform
- 8 Comprehensive Tools - Full access to Ollama's SDK functionality
- Hot-Swap Architecture - Automatic tool discovery with zero-config
- Type-Safe - Built with Pydantic models and type hints
- High Test Coverage - Comprehensive test suite (planned)
- Minimal Dependencies - Lightweight and fast
- Drop-in Integration - Works with Windsurf, VS Code, and other MCP clients
- Web Search & Fetch - Real-time web search and content extraction via Ollama Cloud (planned)
- Hybrid Mode - Use local and cloud models seamlessly in one server
Why Python?
This Python implementation provides the same functionality as the TypeScript version but with:
- Python Native: No Node.js dependencies required
- Poetry Package Management: Modern Python dependency management
- Async/Await: Native Python async support
- Pydantic Models: Robust data validation and type safety
- Poetry Scripts: Easy installation and execution
Installation
Prerequisites
- Python 3.10+
- Poetry (for development)
- Ollama running locally
Quick Install with Poetry
# Clone the repository
git clone <repository-url>
cd mcp-ollama-python
# Install dependencies
py -m poetry install
# Run the server manually. This is only needed for testing with scripts; once configured, Windsurf or VS Code starts the server itself.
py -m poetry run mcp-ollama-python
Manual Installation
# Install Poetry if you don't have it
curl -sSL https://install.python-poetry.org | python3 -
# Clone and install
git clone <repository-url>
cd mcp-ollama-python
poetry install
# Run the server manually. This is only needed for testing with scripts; once configured, Windsurf or VS Code starts the server itself.
poetry run mcp-ollama-python
Generate a Windows executable only if you specifically need one; otherwise, skip this step.
poetry run pyinstaller mcp-ollama-python.spec --clean --distpath bin
Available Tools
Model Management
| Tool | Description |
|---|---|
| ollama_list | List all available local models |
| ollama_show | Get detailed information about a specific model |
| ollama_pull | Download models from Ollama library |
| ollama_delete | Remove models from local storage |
Model Operations
| Tool | Description |
|---|---|
| ollama_ps | List currently running models |
| ollama_generate | Generate text completions |
| ollama_chat | Interactive chat with models (supports tools/functions) |
| ollama_embed | Generate embeddings for text |
Web Tools (Ollama Cloud - Planned)
| Tool | Description |
|---|---|
| ollama_web_search | Search the web with customizable result limits |
| ollama_web_fetch | Fetch and parse web page content |
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| OLLAMA_HOST | http://127.0.0.1:11434 | Ollama server endpoint |
| OLLAMA_API_KEY | - | API key for Ollama Cloud (when implemented) |
Custom Ollama Host
export OLLAMA_HOST="http://localhost:11434"
py -m poetry run mcp-ollama-python
Ollama Cloud Configuration (Planned)
export OLLAMA_HOST="https://ollama.com"
export OLLAMA_API_KEY="your-ollama-cloud-api-key"
py -m poetry run mcp-ollama-python
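As a rough illustration of how these two variables could be consumed, the sketch below resolves the host (falling back to the documented default) and builds request headers. The function name `resolve_ollama_settings` is hypothetical, and sending the key as a bearer token is an assumption, not something this README confirms about the server.

```python
import os
from typing import Mapping

DEFAULT_OLLAMA_HOST = "http://127.0.0.1:11434"  # default listed in the table above


def resolve_ollama_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the host and HTTP headers implied by the environment (hypothetical helper)."""
    # Fall back to the default when the variable is unset or empty.
    host = (env.get("OLLAMA_HOST") or DEFAULT_OLLAMA_HOST).rstrip("/")
    headers = {}
    api_key = env.get("OLLAMA_API_KEY")
    if api_key:
        # Assumption: the API key is sent as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return {"host": host, "headers": headers}
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.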
MCP Model Configuration
The server exposes local Ollama models through MCP. Configure available models in mcp.json:
mcp-ollama-python/mcp.json
{
  "capabilities": {
    "models": [
      {
        "name": "gpt-oss",
        "provider": "ollama",
        "description": "Local Ollama GPT-OSS model served through MCP",
        "maxTokens": 4096
      }
    ]
  }
}
Model Configuration Options:
- name: Model identifier used by MCP clients
- provider: Always "ollama" for this server
- description: Human-readable model description
- maxTokens: Maximum context window size
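To catch typos in `mcp.json` before a client loads it, the options above can be checked with a small script. This is a minimal sketch using only the standard library; `ModelEntry` and `parse_models` are hypothetical names, and the project itself may validate this file differently (for example with Pydantic).

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    provider: str
    description: str
    max_tokens: int


def parse_models(raw_json: str) -> list[ModelEntry]:
    """Parse the capabilities.models list and check the documented options."""
    data = json.loads(raw_json)
    entries = []
    for item in data.get("capabilities", {}).get("models", []):
        # The provider is always "ollama" for this server.
        if item.get("provider") != "ollama":
            raise ValueError(f"unexpected provider for {item.get('name')!r}")
        max_tokens = item.get("maxTokens")
        if not isinstance(max_tokens, int) or max_tokens <= 0:
            raise ValueError(f"maxTokens must be a positive integer for {item.get('name')!r}")
        entries.append(
            ModelEntry(
                name=item["name"],
                provider=item["provider"],
                description=item.get("description", ""),
                max_tokens=max_tokens,
            )
        )
    return entries
```

Reading the file with `Path("mcp.json").read_text()` and passing the result to `parse_models` would raise a `ValueError` on the first malformed entry.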
You can add multiple models to expose different Ollama models through MCP:
{
  "capabilities": {
    "models": [
      {
        "name": "gpt-oss",
        "provider": "ollama",
        "description": "Local Ollama GPT-OSS model",
        "maxTokens": 4096
      },
      {
        "name": "llama3.2",
        "provider": "ollama",
        "description": "Llama 3.2 model for general tasks",
        "maxTokens": 8192
      },
      {
        "name": "codellama",
        "provider": "ollama",
        "description": "Code Llama for programming tasks",
        "maxTokens": 16384
      }
    ]
  }
}
Windsurf Integration
Windsurf is an AI-powered code editor that supports MCP servers. This section provides complete setup instructions for integrating the Ollama MCP server with Windsurf.
Step 1: Configure MCP Server
Add the Ollama MCP server to your Windsurf MCP configuration:
%USERPROFILE%\.codeium\windsurf\mcp_config.json (Windows)
~/.codeium/windsurf/mcp_config.json (macOS/Linux)
{
  "mcpServers": {
    "ollama": {
      "command": "py",
      "args": ["-m", "mcp_ollama_python"],
      "disabled": false,
      "env": {}
    },
    "git": {
      "command": "py",
      "args": ["-m", "mcp_server_git"],
      "disabled": true,
      "env": {}
    }
  }
}
Windsurf Tools setup file: .windsurf\workflows\tools.md
---
description: Quick reference for Windsurf MCP tools (mcp-ollama)
auto_execution_mode: 2
---
# MCP Tools (mcp-ollama)
Available tools exposed by the local `mcp-ollama-python` server:
- **ollama_chat** - Interactive chat with models (multi-turn, tool-calling, structured outputs)
- **ollama_list** - List installed models
- **ollama_show** - Show details for a specific model
- **ollama_generate** - Single-prompt text generation
- **ollama_pull** - Pull a model from a registry
- **ollama_delete** - Delete a local model
- **ollama_ps** - List running models
- **ollama_embed** - Create embeddings for input text
- **ollama_execute** - Execute a system command via the server (utility/test)
## How to list tools in Windsurf
1) Open the command palette and run: `MCP: List Tools`
2) Or run the MCP tool via the chat with: `/tools`
## Notes
- Server: local Ollama via `mcp-ollama-python`
- Formats: most tools accept `format` = `json` (default) or `markdown`
Configuration Options:
- command: Python interpreter command (py, python, or python3)
- args: Module execution arguments
- disabled: Set to false to enable the server
- env: Environment variables (e.g., OLLAMA_HOST)
Alternative Configuration (with Poetry):
{
  "mcpServers": {
    "ollama": {
      "command": "py",
      "args": ["-m", "poetry", "run", "mcp-ollama-python"],
      "cwd": "d:/path/to/mcp-ollama-python",
      "disabled": false,
      "env": {}
    }
  }
}
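If you would rather not edit `mcp_config.json` by hand, the entry can be merged in with a short script. The sketch below is illustrative only: `add_mcp_server` and `update_config_file` are hypothetical helpers, and they write just the four keys described under Configuration Options.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with the named server entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))  # copy so the input is not mutated
    servers[name] = {
        "command": command,
        "args": list(args),
        "disabled": False,
        "env": {},
    }
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, name: str, command: str, args: list[str]) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_mcp_server(config, name, command, args)
    path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

For example, `update_config_file(Path.home() / ".codeium" / "windsurf" / "mcp_config.json", "ollama", "py", ["-m", "mcp_ollama_python"])` would add the first configuration shown in Step 1 while leaving other servers in place.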
Step 2: Configure Default Model Behavior
Set Windsurf to prefer your local MCP server over cloud models:
%USERPROFILE%\.codeium\windsurf\settings.json (Windows)
~/.codeium/windsurf/settings.json (macOS/Linux)
{
  "defaultModelBehavior": "prefer-mcp",
  "preferredMcpModel": {
    "server": "ollama",
    "model": "gpt-oss"
  }
}
Settings Explanation:
- defaultModelBehavior: Set to "prefer-mcp" to prioritize MCP models
- preferredMcpModel.server: Name of the MCP server (must match mcp_config.json)
- preferredMcpModel.model: Model name from your mcp.json configuration
Step 3: Create Windsurf Instructions
Create custom instructions to ensure Windsurf uses your local model:
%USERPROFILE%\.codeium\windsurf\instructions.md (Windows)
~/.codeium/windsurf/instructions.md (macOS/Linux)
Always use my local MCP server named "ollama" with the model "gpt-oss" for all reasoning, coding, and problem-solving tasks unless I explicitly request another model.
Prefer the MCP server over any cloud or paid model.
Step 4: Verify Installation
- Restart Windsurf to load the new configuration (press Ctrl+Shift+P, search for "Developer: Reload Window", then press Enter)
- Check MCP Status: Look for the Ollama MCP server in Windsurf's MCP panel
- Test Connection: Try a simple query to verify the model responds
- Check Logs: Review Windsurf logs if connection issues occur
Troubleshooting
Server Not Appearing:
- Verify mcp_config.json syntax is valid JSON
- Ensure disabled is set to false
- Check that Python and the module are in your PATH
- Restart Windsurf completely
Model Not Available:
- Verify the model name in settings.json matches mcp.json
- Ensure Ollama is running (ollama serve)
- Check that the model is pulled (ollama pull gpt-oss)
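The last two checks can also be scripted. The sketch below queries Ollama's `GET /api/tags` endpoint, which lists locally available models, and looks for the model name in the reply. The helper names `fetch_tags` and `model_is_pulled` are hypothetical, and matching an untagged name against any tag is a simplifying assumption.

```python
import json
import urllib.request


def fetch_tags(host: str = "http://127.0.0.1:11434", timeout: float = 5.0) -> dict:
    """Call GET /api/tags on the Ollama server and return the decoded JSON."""
    url = f"{host.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def model_is_pulled(tags: dict, model: str) -> bool:
    """True if `model` appears in an /api/tags payload.

    A name with an explicit tag ("llama3.2:3b") must match exactly;
    a bare name ("gpt-oss") matches any tag of that model.
    """
    names = [entry.get("name", "") for entry in tags.get("models", [])]
    if ":" in model:
        return model in names
    return any(name.split(":", 1)[0] == model for name in names)
```

Calling `model_is_pulled(fetch_tags(), "gpt-oss")` raises a `urllib.error.URLError` when Ollama is not running and returns `False` when the server is up but the model has not been pulled, which separates the two failure cases.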
Connection Errors:
- Verify OLLAMA_HOST environment variable if using custom host
- Check Ollama server logs for errors
- Ensure no firewall blocking localhost connections
Usage Examples
VS Code Integration
Add to your VS Code MCP settings:
{
  "mcpServers": {
    "ollama": {
      "command": "py",
      "args": ["-m", "mcp_ollama_python"],
      "disabled": false
    }
  }
}
Chat with a Model
# MCP clients can invoke:
{
  "tool": "ollama_chat",
  "arguments": {
    "model": "llama3.2:latest",
    "messages": [
      { "role": "user", "content": "Explain quantum computing" }
    ]
  }
}
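The JSON above is a simplified view of the call. On the wire, MCP clients wrap tool invocations in a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request; `build_tool_call` is a hypothetical helper, and a real client library would also handle the transport and the initialization handshake.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing request ids


def build_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


message = build_tool_call(
    "ollama_chat",
    {
        "model": "llama3.2:latest",
        "messages": [{"role": "user", "content": "Explain quantum computing"}],
    },
)
```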
Generate Embeddings
{
  "tool": "ollama_embed",
  "arguments": {
    "model": "nomic-embed-text",
    "input": ["Hello world", "Embeddings are great"]
  }
}
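A common next step with embeddings is comparing two texts by cosine similarity. The sketch below assumes the tool's result has already been parsed into plain lists of floats, one per input string; the exact response shape is not documented here, so that parsing step is left out.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

Values near 1 indicate that the two inputs point in nearly the same direction in embedding space, and values near 0 indicate little similarity.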
Architecture
This server uses a hot-swap autoloader pattern:
src/
├── main.py                # Entry point (82 lines)
├── server.py              # MCP server creation
├── autoloader.py          # Dynamic tool discovery
├── ollama_client.py       # Ollama HTTP client
├── types.py               # Type definitions
├── response_formatter.py  # Response formatting
└── tools/                 # Tool implementations
    ├── chat.py            # Each exports tool_definition
    ├── generate.py
    └── ...
Key Benefits:
- Add new tools by dropping files in src/tools/
- Zero server code changes required
- Each tool is independently testable
- 100% function coverage on all tools (planned)
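The discovery idea can be illustrated with a few lines of `importlib`. This is a simplified sketch, not the project's `autoloader.py`: it loads each file by path, so tool files in this version cannot use relative imports, and `discover_tools` is a hypothetical name.

```python
import importlib.util
from pathlib import Path


def discover_tools(tools_dir: Path) -> dict:
    """Load every public .py file in tools_dir and collect its `tool_definition`."""
    registry = {}
    for path in sorted(tools_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(f"discovered_tool_{path.stem}", path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the file's top-level code
        definition = getattr(module, "tool_definition", None)
        if definition is not None:
            registry[path.stem] = definition
    return registry
```

Files without a `tool_definition` attribute are ignored, which is what lets a new tool be added by dropping in a file without touching the server code.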
Development
Setup
# Clone repository
git clone <repository-url>
cd mcp-ollama-python
# Install dependencies
py -m poetry install
# Run in development mode
py -m poetry run mcp-ollama-python
# Run tests (when implemented)
py -m poetry run pytest
Adding a New Tool
- Create src/tools/your_tool.py:
from typing import Dict, Any

from ..types import ToolDefinition, ResponseFormat, OllamaClient
from ..response_formatter import format_response


async def your_tool_handler(
    ollama: OllamaClient, args: Dict[str, Any], format: ResponseFormat
) -> str:
    """Your tool implementation"""
    # Implementation here
    return format_response({"result": "success"}, format)


# Tool definition picked up by the autoloader
tool_definition = ToolDefinition(
    name="ollama_your_tool",
    description="Your tool description",
    input_schema={
        "type": "object",
        "properties": {
            "param": {"type": "string"}
        },
        "required": ["param"]
    },
)
- Done! The autoloader discovers it automatically.
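To try a handler of this shape outside the server, it can be driven directly with `asyncio`. The sketch below is self-contained and therefore uses a hypothetical stand-in for `format_response`; the project's real formatter, `ResponseFormat` type, and `OllamaClient` are not reproduced here, and the markdown layout is an assumption.

```python
import asyncio
import json
from typing import Any, Dict


def format_response(data: Dict[str, Any], fmt: str = "json") -> str:
    """Hypothetical stand-in for the project's response formatter."""
    if fmt == "markdown":
        return "\n".join(f"- **{key}**: {value}" for key, value in data.items())
    return json.dumps(data, indent=2)


async def your_tool_handler(ollama: Any, args: Dict[str, Any], fmt: str) -> str:
    """Same shape as the handler above; echoes the required `param` argument."""
    return format_response({"result": "success", "param": args["param"]}, fmt)


def run_handler(args: Dict[str, Any], fmt: str = "json") -> str:
    """Run the async handler to completion from synchronous code."""
    # This stub never touches Ollama, so None stands in for the client.
    return asyncio.run(your_tool_handler(None, args, fmt))
```

Calling `run_handler({"param": "hello"}, "markdown")` returns a two-item bullet list, while the default format returns indented JSON.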
Note on Windsurf IDE
Ask Codeium to use the local MCP Ollama server with the gpt-oss model for LLM tasks. It will create a memory of this preference, which looks like this:
Always use local MCP Ollama with gpt-oss model for LLM tasks
#llm_preference #mcp_ollama #gpt-oss #workflow #user_preference
When the user requests LLM-based tasks (explanations, code generation, analysis, etc.), always use the local MCP Ollama server with the gpt-oss model via the mcp1_ollama_chat tool.
Important guidelines:
- Use mcp1_ollama_chat tool with model parameter set to "gpt-oss"
- Format responses in markdown for better readability (format: "markdown")
- Communicate with the model in English unless the user explicitly requests another language
- Do NOT create separate Python scripts to interact with Ollama - use the MCP tools directly
- The local Ollama server runs at http://127.0.0.1:11434 (default OLLAMA_HOST)
Example usage:
mcp1_ollama_chat(
    model="gpt-oss",
    messages=[{"role": "user", "content": "Your prompt here"}],
    format="markdown"
)
This applies to tasks like:
- Code explanations
- Documentation generation
- Technical analysis
- Q&A about code or concepts
- Any other LLM-powered assistance
Project context: mcp-ollama-python (C:\myCode\gitHub\mcp-ollama-python)
Contributing
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Write tests - We maintain comprehensive test coverage
- Commit with clear messages (git commit -m 'Add amazing feature')
- Push to your branch (git push origin feature/amazing-feature)
- Open a Pull Request
Code Quality Standards
- All new tools must export tool_definition
- Maintain comprehensive test coverage
- Follow existing Python patterns
- Use Pydantic schemas for input validation
License
This project is licensed under the MIT License.
See LICENSE for details.
Related Projects
- ollama-mcp (TypeScript) - Original TypeScript implementation
- Ollama - Get up and running with large language models locally
- Model Context Protocol - Open standard for AI assistant integration
- Windsurf - AI-powered code editor with MCP support
- Cline - VS Code AI assistant
Made with ❤️ using Python, Poetry, and Ollama