An MCP server that offloads cheap work from your cloud LLM agent to a local Ollama model.

These details have not been verified by PyPI

Project links

Project description

ollama-handoff

An MCP server that offloads cheap work from your cloud LLM agent to a local Ollama model.

Your frontier model (Claude, GPT, etc.) is brilliant and metered. A lot of the work it gets handed — summarizing a log, drafting a commit message, pulling every URL out of a file, a quick first-pass code review — doesn't need frontier reasoning at all. ollama-handoff exposes your local Ollama instance as a handful of purpose-built MCP tools, so your agent can route that work to a model on your own GPU — at zero cloud cost — and spend its (paid) reasoning budget on the things that actually need it.

This isn't a generic "wrap the Ollama API" server. Each tool ships with a baked-in system prompt and a description written for the calling agent, so the agent knows when to hand off and gets a tuned result back without re-stating instructions every call.

Why you'd want this

💸 Spend less. Routine offloads run locally and bill nothing.
⚡ Keep the big model focused. Summaries, extractions, and drafts don't eat its context or your budget.
🧠 Tuned, not raw. summarize_local, code_review_local, draft_commit_message_local, and extract_local come with reviewer/summarizer/extractor system prompts already dialed in.
🔌 Drop-in. One MCP registration; works with Claude Code, Claude Desktop, Cursor, and any MCP client.
🪶 Tiny & auditable. Two dependencies (mcp, httpx), fully typed, unit-tested, no telemetry.

Requirements

Ollama running locally (ollama serve) with at least one model pulled, e.g. ollama pull qwen2.5-coder:14b.
Python 3.11+ (or just uvx, which manages it for you).

Install

The fastest path is uv — no manual venv needed:

uvx ollama-handoff          # run directly
# or
pip install ollama-handoff  # then run: ollama-handoff

Claude Code

claude mcp add ollama-handoff -- uvx ollama-handoff

Claude Desktop / Cursor (`mcp` config block)

{
  "mcpServers": {
    "ollama-handoff": {
      "command": "uvx",
      "args": ["ollama-handoff"],
      "env": {
        "OLLAMA_DEFAULT_MODEL": "qwen2.5-coder:14b"
      }
    }
  }
}

Tools

Tool	What it does	When the agent should reach for it
`ask_local`	One-shot prompt to the local model	Any handoff that doesn't need frontier reasoning
`chat_local`	Multi-turn local chat	Handoffs needing more than one turn of context
`summarize_local`	Structured summary (headline + bullets)	Long files, logs, transcripts, docs
`code_review_local`	Quick first-pass review of a diff/code	Cheap pre-filter before a deep review
`draft_commit_message_local`	Conventional commit message from a diff	Routine commits
`extract_local`	Pull structured items from unstructured text	URLs, function names, error codes, TODOs
`list_models`	List locally available Ollama models	Discovery / choosing a model
`server_info`	Report the effective configuration	Debugging setup

Configuration

All configuration is via environment variables set in your MCP registration:

Variable	Default	Description
`OLLAMA_URL`	`http://localhost:11434`	Base URL of the Ollama server
`OLLAMA_DEFAULT_MODEL`	`qwen2.5-coder:14b`	Default model for handoffs
`OLLAMA_NUM_CTX`	`32768`	Context window in tokens
`OLLAMA_KEEP_ALIVE`	`30m`	How long to keep the model resident in VRAM
`OLLAMA_TIMEOUT_S`	`600`	Per-request timeout, seconds

Example

Once registered, you don't call the tools yourself — your agent does. A typical exchange:

You: Summarize the errors in build.log and draft a commit for the staged fix.

Agent: (calls summarize_local(build.log, focus="errors and stack traces") and draft_commit_message_local(git diff --staged) — both run on your GPU, nothing billed) → returns the summary + commit message.

Development

git clone https://github.com/Michael-WhiteCapData/ollama-handoff
cd ollama-handoff
uv pip install -e ".[dev]"
ruff check .
pytest          # tests use httpx.MockTransport — no running Ollama required

See CONTRIBUTING.md. Contributions welcome — especially new specialized handoff tools.

License

MIT © Michael Tierney

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.1

Jun 21, 2026

0.1.0

Jun 21, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ollama_handoff-0.1.1.tar.gz (12.2 kB view details)

Uploaded Jun 21, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

ollama_handoff-0.1.1-py3-none-any.whl (10.6 kB view details)

Uploaded Jun 21, 2026 Python 3

File details

Details for the file ollama_handoff-0.1.1.tar.gz.

File metadata

Download URL: ollama_handoff-0.1.1.tar.gz
Upload date: Jun 21, 2026
Size: 12.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"26.04","id":"resolute","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for ollama_handoff-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`2cdf9406a870f31d972fd27e8dc0e47c67d6614ea1f155cb887b8ae26c653bc6`
MD5	`e93a173932a8a8108f04055db23a5483`
BLAKE2b-256	`419b277ddf44d43f6da6a41e4e42a794e98c2da3d35858611fe7b75fdf263562`

See more details on using hashes here.

File details

Details for the file ollama_handoff-0.1.1-py3-none-any.whl.

File metadata

Download URL: ollama_handoff-0.1.1-py3-none-any.whl
Upload date: Jun 21, 2026
Size: 10.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"26.04","id":"resolute","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for ollama_handoff-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`694b531f7207da2a44509b63bc2023ebafb7de15f5cc2e44044ac2630307c3a7`
MD5	`315ae4bc58d35a3177d5f084144d7692`
BLAKE2b-256	`55ee8ef608f08ce830bae79af0712be584e6460c353af0deef93af769271f032`

See more details on using hashes here.

ollama-handoff 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

ollama-handoff

Why you'd want this

Requirements

Install

Claude Code

Claude Desktop / Cursor (`mcp` config block)

Tools

Configuration

Example

Development

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

ollama-handoff 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

ollama-handoff

Why you'd want this

Requirements

Install

Claude Code

Claude Desktop / Cursor (mcp config block)

Tools

Configuration

Example

Development

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Claude Desktop / Cursor (`mcp` config block)