
ocp-router

Hybrid local/cloud model routing layer for Open Context Protocol.

Scores each request for complexity and dispatches it to the right model tier — local model for simple tasks, paid provider for complex reasoning. Vendor-neutral: works with any backend that implements the ModelBackend protocol.


Installation

pip install ocp-router                     # core — Ollama local backend included
pip install "ocp-router[anthropic]"        # + Anthropic Claude paid backend
pip install "ocp-router[openai]"           # + OpenAI paid backend

Requires Python 3.11+ and a running Ollama instance for local model support.


Quick start

# One-time: install Ollama and pull a model
brew install ollama          # macOS — see ollama.com for other platforms
ollama pull llama3.2
ollama serve

import asyncio
from ocp_router import make_router

async def main():
    router = make_router()   # reads all config from env vars

    # Simple request — handled locally, no paid API call
    result = await router.route("explain the verify_token function")
    print(result.route_to)                    # "local"
    print(result.classify.complexity_score)   # 0.0
    print(result.model)                       # "llama3.2"
    print(result.text)

    # Complex request — escalated to paid provider
    result = await router.route(
        "review security vulnerabilities across all API endpoints"
    )
    print(result.route_to)                    # "paid"
    print(result.classify.complexity_score)   # 0.55
    print(result.classify.signals)            # ["security-sensitive"]
    print(result.model)                       # "claude-sonnet-4-6"
    print(result.text)

asyncio.run(main())

How routing works

Every request passes through the TaskClassifier before reaching any model. The classifier scores complexity from 0.0 (trivial) to 1.0 (maximum) using five deterministic heuristic layers — no model required, runs in microseconds.

Request prompt
      │
      ▼
┌─────────────────────────────────────────────┐
│  TaskClassifier                             │
│                                             │
│  1. Token length      (tiktoken cl100k)     │
│  2. Code block size   (fenced ``` blocks)   │
│  3. Complex signals   security +0.55        │
│                       architecture +0.55    │
│                       migration +0.55       │
│                       deadlock +0.40        │
│                       multi-file +0.35      │
│                       refactor +0.25  ...   │
│  4. Simple signals    explain -0.10         │
│                       summarise -0.10       │
│                       search -0.10    ...   │
│  5. File references   3-4 files +0.10       │
│                       5+ files  +0.20       │
│                                             │
│  score = clamp(sum, 0.0, 1.0)              │
└──────────────┬──────────────────────────────┘
               │
       ┌───────┴────────┐
  score < 0.5      score ≥ 0.5
       │                │
       ▼                ▼
  Local model      Paid provider
  (Ollama)         (Claude / GPT-4 / any)
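
For intuition, here is a deliberately simplified, self-contained sketch of layers 3–5 and the final clamp. It is not the package's implementation — the real TaskClassifier also weighs token length via tiktoken and fenced code-block size, and its signal detection is richer than the substring matching used here:

import re

# Signal weights copied from the diagram above.
COMPLEX = {"security": 0.55, "architecture": 0.55, "migration": 0.55,
           "deadlock": 0.40, "refactor": 0.25}
SIMPLE = {"explain": -0.10, "summarise": -0.10, "search": -0.10}

def toy_score(prompt: str) -> float:
    text = prompt.lower()
    score = sum(w for kw, w in {**COMPLEX, **SIMPLE}.items() if kw in text)
    # File references: 3-4 files +0.10, 5+ files +0.20.
    n_files = len(re.findall(r"\S+\.\w{1,4}\b", text))
    if n_files >= 5:
        score += 0.20
    elif n_files >= 3:
        score += 0.10
    return max(0.0, min(1.0, score))   # score = clamp(sum, 0.0, 1.0)

print(toy_score("review security vulnerabilities"))   # 0.55 → paid
print(toy_score("explain this function"))             # 0.0 (clamped) → local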

What goes where

Request                               Score   Route
"explain this function"               0.00    local
"what does add() do?"                 0.00    local
"find all usages of db.connect"       0.00    local
"summarise the last session"          0.00    local
"refactor the login function"         0.25    local
────────────── threshold (default 0.5) ──────────────
"refactor auth across all files"      0.80    paid
"review security vulnerabilities"     0.55    paid
"design the payment architecture"     0.55    paid
"debug this production deadlock"      0.60    paid
"migrate the database schema"         0.55    paid

Threshold is configurable via OCP_ROUTE_THRESHOLD.
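
To preview a routing decision without calling any model, the classifier can be run on its own. The sketch below assumes TaskClassifier exposes a classify(prompt) method returning the ClassifyResult described in the next section — a guess inferred from the router's trace, not a documented signature:

from ocp_router import TaskClassifier

classifier = TaskClassifier()
# Assumed method name — verify against the package before relying on it.
decision = classifier.classify("refactor auth across all files")
print(decision.complexity_score, decision.route_to)   # 0.8 paid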


RouteResult — what you get back

Every router.route() call returns a RouteResult with the answer and a full trace of the routing decision:

from dataclasses import dataclass

@dataclass
class RouteResult:
    text: str                  # the model's response
    route_to: str              # "local" or "paid"
    classify: ClassifyResult   # full classification trace
    model: str                 # exact model identifier used
    prompt_tokens: int
    completion_tokens: int
    duration_ms: float

@dataclass
class ClassifyResult:
    complexity_score: float    # 0.0 – 1.0
    task_type: str             # "explain" | "refactor" | "debug" | "architect" | ...
    signals: list[str]         # which heuristics fired
    route_to: str              # "local" or "paid"
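
Because every call returns its own trace, per-route accounting takes only a few lines. This sketch uses nothing beyond the documented make_router() factory and the fields defined above:

import asyncio
from collections import Counter
from ocp_router import make_router

async def main():
    router = make_router()
    stats: Counter = Counter()
    for prompt in ["explain this function", "migrate the database schema"]:
        result = await router.route(prompt)
        stats[result.route_to] += 1
        if result.route_to == "paid":
            stats["paid_tokens"] += result.prompt_tokens + result.completion_tokens
    print(stats)   # e.g. Counter({'local': 1, 'paid': 1, 'paid_tokens': ...})

asyncio.run(main())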

Configuration

All options are set via environment variables — no code changes needed.

Local backend

Variable            Default                  Description
OCP_LOCAL_BACKEND   ollama                   Backend type. ollama is the only built-in option.
OCP_LOCAL_MODEL     llama3.2                 Model name passed to Ollama.
OCP_OLLAMA_URL      http://localhost:11434   Ollama base URL. Point to a remote GPU box if needed.
OCP_LOCAL_TIMEOUT   60                       Inference timeout in seconds.

Paid backend

Variable              Default             Description
OCP_PAID_BACKEND      anthropic           Backend type: anthropic or openai.
OCP_PAID_MODEL        claude-sonnet-4-6   Model identifier for the paid provider.
OCP_PAID_MAX_TOKENS   4096                Max tokens for paid responses.
ANTHROPIC_API_KEY     (required)          API key for the Anthropic backend.
OPENAI_API_KEY        (required)          API key for the OpenAI backend.

Router

Variable              Default   Description
OCP_ROUTE_THRESHOLD   0.5       Complexity score at or above which requests go to paid.

# Example: Mistral locally, GPT-4o for complex tasks, stricter threshold
OCP_LOCAL_MODEL=mistral \
OCP_PAID_BACKEND=openai \
OCP_PAID_MODEL=gpt-4o \
OCP_ROUTE_THRESHOLD=0.6 \
python my_agent.py
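
The same knobs can also be set in-process, assuming make_router() reads the environment when called rather than at import time; just set them before the factory runs:

import os

# Must be set before make_router() reads the environment.
os.environ["OCP_LOCAL_MODEL"] = "mistral"
os.environ["OCP_PAID_BACKEND"] = "openai"
os.environ["OCP_PAID_MODEL"] = "gpt-4o"
os.environ["OCP_ROUTE_THRESHOLD"] = "0.6"

from ocp_router import make_router

router = make_router()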

IDE integration (Claude Code, Cursor, Windsurf)

Add the routing env vars to your .mcp.json — OCP handles the rest:

{
  "mcpServers": {
    "ocp": {
      "command": "uvx",
      "args": ["ocp-server"],
      "env": {
        "OCP_DB_PATH": "${workspaceFolder}/.ocp.db",
        "OCP_LOCAL_MODEL": "llama3.2",
        "OCP_OLLAMA_URL": "http://localhost:11434",
        "OCP_PAID_BACKEND": "anthropic",
        "OCP_ROUTE_THRESHOLD": "0.5"
      }
    }
  }
}

Simple tasks (explain, search, summarise) are answered locally by Ollama. Complex requests (security, architecture, multi-file refactor) escalate to your paid provider. Your IDE workflow is unchanged.


Supported local models

Any model available in Ollama works. Recommended starting points:

Model       Size   Good for
llama3.2    3B     Classification, summarisation, simple Q&A
phi4-mini   3.8B   Code explanation, short answers
mistral     7B     Context compression, draft generation
codellama   7B     Code-specific tasks

ollama pull llama3.2

Bring your own backend

Both the local and paid slots accept any object that implements the ModelBackend protocol — one property and two async methods, no base class required:

from ocp_router import OCPRouter, TaskClassifier
from ocp_router.backends.base import GenerateRequest, GenerateResponse

class MyVLLMBackend:
    @property
    def model(self) -> str:
        return "mistral-7b-instruct"

    async def is_available(self) -> bool:
        return True   # check your endpoint

    async def generate(self, request: GenerateRequest) -> GenerateResponse:
        # call your inference endpoint
        ...
        return GenerateResponse(
            text="...",
            model=self.model,
            prompt_tokens=0,
            completion_tokens=0,
            duration_ms=0.0,
        )

# Plug in directly — no factory change needed
router = OCPRouter(
    local=MyVLLMBackend(),
    paid=MyVLLMBackend(),   # or any other backend
    classifier=TaskClassifier(),
)

This is the intended extension point. ocp-router ships OllamaBackend, AnthropicBackend, and OpenAIBackend as convenience implementations — not as the only options.
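
As one hypothetical way to flesh out the elided bodies above, here is a backend targeting a vLLM server's OpenAI-compatible API. The endpoint URL, the max_tokens value, and the request.prompt attribute are assumptions for illustration, not part of the documented GenerateRequest contract:

import time
import httpx

from ocp_router.backends.base import GenerateRequest, GenerateResponse

VLLM_URL = "http://localhost:8000"   # hypothetical vLLM server

class MyVLLMBackend:
    @property
    def model(self) -> str:
        return "mistral-7b-instruct"

    async def is_available(self) -> bool:
        # Lightweight reachability probe against vLLM's /health endpoint.
        try:
            async with httpx.AsyncClient(timeout=5) as client:
                return (await client.get(f"{VLLM_URL}/health")).status_code == 200
        except httpx.HTTPError:
            return False

    async def generate(self, request: GenerateRequest) -> GenerateResponse:
        # Assumes GenerateRequest carries a `prompt` attribute and that the
        # server speaks vLLM's OpenAI-compatible completions API.
        start = time.perf_counter()
        async with httpx.AsyncClient(timeout=60) as client:
            resp = await client.post(
                f"{VLLM_URL}/v1/completions",
                json={"model": self.model,
                      "prompt": request.prompt,
                      "max_tokens": 1024},
            )
            resp.raise_for_status()
            data = resp.json()
        return GenerateResponse(
            text=data["choices"][0]["text"],
            model=self.model,
            prompt_tokens=data["usage"]["prompt_tokens"],
            completion_tokens=data["usage"]["completion_tokens"],
            duration_ms=(time.perf_counter() - start) * 1000.0,
        )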


Running tests

# Unit tests — no Ollama or API keys required
pytest packages/ocp-router/tests/ -k "not integration" -v

# Integration test — requires: ollama serve + ollama pull llama3.2
pytest packages/ocp-router/tests/ -m integration -v

What's next

  • ocp.prompt.prepare — local SLM compresses and optimises prompts before they reach the paid provider, reducing token usage and improving answer quality
