jMRI — token-efficient context retrieval for MCP agents. Python SDK client and reference server.

Project description

mcp-retrieval-spec

The jMRI (jMunch Retrieval Interface) specification — an open interface standard for token-efficient context retrieval in MCP servers.


What Is jMRI?

Agents that read whole files to answer specific questions waste 99% of their token budget. jMRI is a minimal interface for MCP servers that do retrieval right: index once, search by intent, retrieve exactly what you need.

Four operations. One response envelope. Two compliance levels.

The problem in numbers: answering a typical question by naively reading the relevant FastAPI source costs ~42,000 tokens; getting the same answer via jMRI retrieval costs ~480. At $3/1M tokens, that's $0.126 vs. $0.0014 per query. Across millions of queries, the savings are material.
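The per-query arithmetic is easy to verify (a quick sketch using the figures above; the $3/1M rate is the pricing assumption stated in the text):

```python
# Per-query cost at $3 per 1M input tokens, using the figures quoted above.
PRICE_PER_TOKEN = 3 / 1_000_000

naive_cost = 42_000 * PRICE_PER_TOKEN   # naive full-read of the relevant files
jmri_cost = 480 * PRICE_PER_TOKEN       # jMRI retrieval of the same answer

print(f"${naive_cost:.3f} vs ${jmri_cost:.4f} per query")  # $0.126 vs $0.0014 per query
```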

The jMunch tools have saved 12.4 billion tokens across user sessions as of March 2026. This spec is the formal definition of what they do.


How It Works

Agent
  │
  ├─ discover()    → What knowledge sources are available?
  ├─ search(query) → Which symbols/sections are relevant? (IDs + summaries only)
  ├─ retrieve(id)  → Give me the exact source for this ID.
  └─ metadata(id?) → What would naive reading have cost?

Every response includes a _meta block with tokens_saved and total_tokens_saved. Agents can see exactly what they're saving on every call.


Spec

SPEC.md

The full jMRI v1.0 specification. Apache 2.0. Implement it however you want.


Reference Implementations

The spec is open. The best implementations are commercial.

Implementation   Domain                            Stars   Install
jCodeMunch       Code (30+ languages)              900+    uvx jcodemunch-mcp
jDocMunch        Docs (MD, RST, HTML, notebooks)   45+     uvx jdocmunch-mcp

Both implement jMRI-Full. Licenses available at https://j.gravelle.us/jCodeMunch/


Quick Start

Using the Python SDK

from sdk.python.mri_client import MRIClient

client = MRIClient()  # connects to local jcodemunch-mcp

# List available repos
sources = client.discover()

# Search
results = client.search("database session dependency", repo="fastapi/fastapi")
for r in results:
    print(r["id"], r["summary"])

# Retrieve
symbol = client.retrieve(results[0]["id"], repo="fastapi/fastapi")
print(symbol["source"])
print(f"Tokens saved: {symbol['_meta']['tokens_saved']:,}")

Claude Code Integration

Add to your ~/.claude.json:

{
  "mcpServers": {
    "jcodemunch-mcp": {
      "command": "uvx",
      "args": ["jcodemunch-mcp"]
    },
    "jdocmunch-mcp": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}

See examples/claude-code/ for full setup.

Cursor Integration

See examples/cursor/.


Repo Structure

mcp-retrieval-spec/
├── SPEC.md                    # The jMRI specification (Apache 2.0)
├── CHANGELOG.md               # Spec version history
├── LICENSE                    # Spec: Apache 2.0. Reference impls: commercial.
├── reference/
│   ├── server.py              # Minimal jMRI-compliant server
│   └── config.example.json    # Sample configuration
├── sdk/
│   ├── python/mri_client.py   # Python client helper (Apache 2.0)
│   └── typescript/mri-client.ts
├── examples/
│   ├── claude-code/           # Claude Code integration
│   ├── cursor/                # Cursor integration
│   └── generic-agent/         # Minimal jMRI agent
└── benchmark/                 # munch-benchmark suite

Licensing

Component          License
SPEC.md            Apache 2.0 — implement freely
SDK clients        Apache 2.0 — use freely
Reference server   Requires jMunch license for commercial use
Benchmark suite    Apache 2.0

This is the Stripe model: the API spec is open and well-documented; the best implementation is commercial.


Benchmark

munch-benchmark

Clone and run in under 5 minutes. Compares Naive, Chunk RAG, and jMRI on FastAPI and Flask. Results are honest: if RAG beats jMRI on a metric, it's reported.

Real numbers on FastAPI (950K naive tokens):

Method                   Avg Tokens   Cost/Query   Precision
Naive (read all files)   949,904      $2.85        100%
Chunk RAG                330,372      $0.99        74%
jMRI                     480          $0.0014      96%

1,979x fewer tokens than naive. Higher precision than RAG.
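The headline ratio follows directly from the table above (a quick arithmetic check):

```python
# Reproduce the 1,979x figure from the benchmark table above.
naive_tokens = 949_904
jmri_tokens = 480

ratio = naive_tokens / jmri_tokens
print(f"{ratio:,.0f}x fewer tokens")  # 1,979x fewer tokens
```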


Contributing

The spec is intentionally minimal. PRs that extend the core interface require strong justification. PRs that improve examples, fix errors, or add language-specific SDK clients are welcome.

Open an issue before proposing spec changes.


Download files

Download the file for your platform.

Source Distributions

No source distribution files available for this release.

Built Distribution


jmri_sdk-1.0.0-py3-none-any.whl (10.6 kB, Python 3)

File details

Details for the file jmri_sdk-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: jmri_sdk-1.0.0-py3-none-any.whl
  • Size: 10.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for jmri_sdk-1.0.0-py3-none-any.whl

Algorithm     Hash digest
SHA256        d1bfb93684a770cb473b7cfda57635af6be5465d50757e4ace25d05c08ff46d3
MD5           ee0ff3e091203d4679220fe9d127ddcb
BLAKE2b-256   3d25d14986ed0c0489af18f8b1f6b83053515ca9e67421680aa0d8dad44bac52

