MCP server that compresses OCR-heavy PDFs into dense packed images for AI agent workflows.

Maintainers

Chrboebel

Project description

Optical Context MCP

Compress OCR-heavy PDFs into dense packed images so agents can work with long visual documents.

Optical Context MCP is built for one specific job: turning large, visually structured PDFs into a smaller set of retrievable packed images for agent workflows.

It reads a local PDF, runs OCR with Mistral, recomposes the extracted text and figures into dense PNGs, and exposes those artifacts over MCP for batch retrieval.

What It Does

reads a local PDF from the MCP host machine
extracts page markdown and embedded images with Mistral OCR
packs that content into dense PNGs that preserve visual grouping
stores a manifest and job artifacts for follow-up retrieval
lets an agent pull only the packed images it needs

Where It Fits

Use it for:

operating manuals
scanned handbooks
product catalogs
PDF slide decks
visually structured OCR-heavy documents

Skip it for:

tiny PDFs
clean text-native PDFs where normal extraction is enough
workflows that require exact page-faithful rendering
cases where OCR cost is not justified

Example Result

The image below shows a real local validation run on a public research paper with dense text, figures, charts, and page-level visual structure. The packed image on the right consolidates the seven source pages shown on the left.

Figure: side-by-side comparison of the original pages (left) and the generated packed output (right).

Example local run facts from the generated manifest:

source paper pages: 22
previewed source page range: 15 to 21
extracted images: 30
packed output images: 6
example packed image size: 986x1084
example packed image file size: 536,697 bytes

This example shows the intended workflow: take a long, visually structured PDF and compress it into a smaller set of retrievable packed images that still preserve the visual structure of the source.
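As a rough check on the figures above, the following sketch works out the page-to-image ratio and the size of the example image. All numbers are copied from the manifest facts listed above; the one assumption, flagged in the code, is that the six packed images together cover all 22 source pages.

```python
# Figures copied from the example manifest facts listed above.
source_pages = 22
packed_images = 6
preview_first, preview_last = 15, 21
image_width, image_height = 986, 1084
image_bytes = 536_697

# The previewed range is inclusive on both ends, hence the +1.
preview_pages = preview_last - preview_first + 1

# Assumption: the six packed images together cover all 22 source pages.
pages_per_image = source_pages / packed_images

pixels = image_width * image_height
bytes_per_pixel = image_bytes / pixels

print(f"pages in preview: {preview_pages}")                          # 7
print(f"average pages per packed image: {pages_per_image:.2f}")      # 3.67
print(f"example image: {pixels:,} px, {image_bytes / 1024:.1f} KiB")  # 1,068,824 px, 524.1 KiB
print(f"bytes per pixel: {bytes_per_pixel:.2f}")                     # 0.50
```

The average is only an average: the previewed packed image holds seven pages, so the remaining images must hold fewer.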

Install

python -m pip install optical-context-mcp

Run without installing:

uvx optical-context-mcp

MISTRAL_API_KEY is required for compress_pdf.

For pinned shared setups:

uvx --from optical-context-mcp==0.1.3 optical-context-mcp
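Since compress_pdf needs MISTRAL_API_KEY, a small pre-flight check can catch a missing key before the server is launched. The sketch below assumes the key is read from the process environment, which is the usual convention but is not stated here; the helper name `missing_env_vars` is made up for illustration.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    # Assumption: the server reads its key from the process environment.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(["MISTRAL_API_KEY"])
if missing:
    print("missing environment variables: " + ", ".join(missing))
else:
    print("MISTRAL_API_KEY is set")
```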

Run

Default transport is stdio:

optical-context-mcp

Claude Code

claude mcp add -s project optical-context -- uvx optical-context-mcp
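For reference, the sketch below prints the kind of project-level entry that the command above is expected to produce. The file name `.mcp.json` and the `mcpServers` layout are assumptions based on Claude Code's project-scope configuration format; the file Claude Code actually writes may contain additional fields.

```python
import json

# Server name and launch command taken from the `claude mcp add` line above.
server_name = "optical-context"
entry = {"command": "uvx", "args": ["optical-context-mcp"]}

# Assumption: project-scoped servers are stored under a top-level
# "mcpServers" key in a `.mcp.json` file at the project root.
config = {"mcpServers": {server_name: entry}}

print(json.dumps(config, indent=2))
```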

Typical use:

call compress_pdf
inspect the returned manifest
fetch packed images with get_packed_images
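The three steps above can be sketched as a small driver that fetches packed images in batches. Only the tool names come from this README. The argument names (`pdf_path`, `job_id`, `indices`) and the manifest fields (`job_id`, `packed_image_count`, `images`) are hypothetical placeholders, and `call_tool` stands in for whatever MCP client is in use.

```python
from typing import Any, Callable

# A caller-supplied function: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def fetch_all_packed_images(call_tool: ToolCaller, pdf_path: str, batch_size: int = 2) -> list[Any]:
    """Compress a PDF, read the returned manifest, then fetch images in batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Step 1: create the job. The `pdf_path` argument name is hypothetical.
    manifest = call_tool("compress_pdf", {"pdf_path": pdf_path})
    # Step 2: inspect the manifest. Both field names are hypothetical.
    job_id = manifest["job_id"]
    total = manifest["packed_image_count"]
    # Step 3: fetch the packed images a few at a time.
    images: list[Any] = []
    for start in range(0, total, batch_size):
        indices = list(range(start, min(start + batch_size, total)))
        result = call_tool("get_packed_images", {"job_id": job_id, "indices": indices})
        images.extend(result["images"])
    return images


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP client; returns canned data only."""
    if name == "compress_pdf":
        return {"job_id": "demo", "packed_image_count": 6}
    if name == "get_packed_images":
        return {"images": [f"packed_{i}.png" for i in arguments["indices"]]}
    raise ValueError(f"unexpected tool: {name}")


fetched = fetch_all_packed_images(fake_call_tool, "manual.pdf", batch_size=4)
print(fetched)
```

Passing the client in as a callable keeps the batching logic independent of any particular MCP client library.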

MCP Tools

compress_pdf: run OCR plus recomposition and create a stored job
get_job_manifest: load metadata for an existing job
get_packed_images: fetch one or more packed PNGs from an existing job
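On the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds one request per tool listed above. The envelope follows the MCP specification, but the argument names (`pdf_path`, `job_id`, `indices`) are hypothetical, since the tool input schemas are not documented here.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are illustrative guesses, not the documented schema.
messages = [
    tool_call_request("compress_pdf", {"pdf_path": "/path/to/manual.pdf"}),
    tool_call_request("get_job_manifest", {"job_id": "demo"}),
    tool_call_request("get_packed_images", {"job_id": "demo", "indices": [0, 1]}),
]

for message in messages:
    print(json.dumps(message))
```

A client can discover the real argument names by sending a `tools/list` request first and reading each tool's input schema.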

How It Works

flowchart LR
    A["Local PDF"] --> B["Mistral OCR"]
    B --> C["Page markdown + embedded images"]
    C --> D["Recomposition engine"]
    D --> E["Dense packed PNG images"]
    E --> F["Stored job artifacts"]
    F --> G["Agent fetches manifest or image batches over MCP"]
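To make the recomposition stage in the flow above more concrete, here is a toy grouping step that decides which OCR pages would share one packed image. It is a stand-in, not the project's packing logic, which is not described here. The `OcrPage` type, the `group_pages` helper, and the character budget are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OcrPage:
    """Hypothetical container for one page of OCR output."""
    number: int
    markdown: str
    image_names: list[str] = field(default_factory=list)


def group_pages(pages, max_chars):
    """Greedily group consecutive pages under a per-group character budget.

    A single page longer than `max_chars` still gets a group of its own.
    """
    groups, current, used = [], [], 0
    for page in pages:
        size = len(page.markdown)
        # Start a new group when adding this page would exceed the budget.
        if current and used + size > max_chars:
            groups.append(current)
            current, used = [], 0
        current.append(page)
        used += size
    if current:
        groups.append(current)
    return groups


lengths = [400, 500, 300, 900, 100]
pages = [OcrPage(number=n, markdown="x" * length) for n, length in enumerate(lengths, start=1)]
grouped = [[p.number for p in group] for group in group_pages(pages, max_chars=1000)]
print(grouped)  # [[1, 2], [3], [4, 5]]
```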

Why Packed Images Instead Of Just OCR Text

Packed images keep layout cues that plain OCR text tends to lose:

section grouping
table-like layout
captions near figures
visual adjacency between text and embedded graphics

For many vision-capable agents, that is a better intermediate format than a plain OCR dump.

Current Scope

depends on Mistral OCR
currently handles local file paths, not remote uploads
optimized for compression and retrieval, not final polished markdown generation
quality depends on OCR quality and the visual density of the source document
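Because only local file paths are handled, a caller may want to validate the path before invoking compress_pdf. The sketch below is a client-side check under that assumption; `check_local_pdf` is a hypothetical helper, not part of this package.

```python
import tempfile
from pathlib import Path


def check_local_pdf(path_str):
    """Return a resolved Path if `path_str` names an existing local .pdf file."""
    if "://" in path_str:
        raise ValueError(f"expected a local file path, got a URL: {path_str}")
    path = Path(path_str).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"not a .pdf file: {path}")
    return path.resolve()


# Demonstration with a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "manual.pdf"
    sample.write_bytes(b"%PDF-1.4\n")
    checked_name = check_local_pdf(str(sample)).name
    print(checked_name)
```

The path must exist on the machine that hosts the MCP server, which is not necessarily the machine the agent runs on.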

Roadmap

make the OCR layer provider-agnostic so different OCR backends can be swapped behind the same MCP workflow
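One way such a provider-agnostic layer could look is a small interface that every OCR backend implements. This is a design sketch only: the `OcrBackend` protocol, the `OcrPage` type, and the `ocr_pdf` method are hypothetical and do not describe existing or planned code in this project.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class OcrPage:
    """Hypothetical container for one page of OCR output."""
    number: int
    markdown: str
    image_names: list[str] = field(default_factory=list)


class OcrBackend(Protocol):
    """Anything with a matching `ocr_pdf` method counts as a backend."""

    def ocr_pdf(self, pdf_path: str) -> list[OcrPage]: ...


class FakeBackend:
    """Test double that satisfies OcrBackend without calling any OCR service."""

    def ocr_pdf(self, pdf_path: str) -> list[OcrPage]:
        return [OcrPage(number=1, markdown=f"# Page from {pdf_path}")]


def count_pages(backend: OcrBackend, pdf_path: str) -> int:
    """Downstream code depends only on the protocol, not on a provider."""
    return len(backend.ocr_pdf(pdf_path))


page_count = count_pages(FakeBackend(), "manual.pdf")
print(page_count)  # 1
```

With structural typing, a new backend needs no shared base class, only a method with the same signature.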

Development

uv venv --python /opt/homebrew/bin/python3.11 .venv
uv pip install --python .venv/bin/python -e ".[dev]"
.venv/bin/python -m pytest

Release history

0.1.4: Mar 8, 2026
0.1.3: Mar 7, 2026 (this version)
0.1.2: Mar 7, 2026
0.1.1: Mar 7, 2026

Download files

Source Distribution

optical_context_mcp-0.1.3.tar.gz (13.9 kB)

Uploaded Mar 7, 2026 Source

Built Distribution

optical_context_mcp-0.1.3-py3-none-any.whl (15.4 kB)

Uploaded Mar 7, 2026 Python 3

File details

Details for the file optical_context_mcp-0.1.3.tar.gz.

File metadata

Download URL: optical_context_mcp-0.1.3.tar.gz
Upload date: Mar 7, 2026
Size: 13.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for optical_context_mcp-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`ed8bab9f31f8345c008ac59037c044d4832cf5c204743b3e742afc8fdccc78ce`
MD5	`0bf7e9c14aad273ac327a15f7497155f`
BLAKE2b-256	`1af66d3a34c0a6238ab4dafa3c401f40160c2d3bfb1550f93840961dc74a39e0`
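A downloaded archive can be checked against the SHA256 digest above with the standard library. The sketch below is a minimal example; the helper names are illustrative, and the file path passed in is wherever the download was saved.

```python
import hashlib

# SHA256 digest published above for optical_context_mcp-0.1.3.tar.gz.
EXPECTED_SHA256 = "ed8bab9f31f8345c008ac59037c044d4832cf5c204743b3e742afc8fdccc78ce"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

Usage: `matches_published_hash("optical_context_mcp-0.1.3.tar.gz")` should return `True` for an intact download of the source distribution.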

Provenance

The following attestation bundles were made for optical_context_mcp-0.1.3.tar.gz:

Publisher: publish-pypi.yml on ChrBoebel/optical-context-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: optical_context_mcp-0.1.3.tar.gz
- Subject digest: ed8bab9f31f8345c008ac59037c044d4832cf5c204743b3e742afc8fdccc78ce
- Sigstore transparency entry: 1058556557
- Sigstore integration time: Mar 7, 2026
Source repository:
- Permalink: ChrBoebel/optical-context-mcp@3ecd47fabcbbecc6ada69936f437992dd2f9e5aa
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ChrBoebel
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@3ecd47fabcbbecc6ada69936f437992dd2f9e5aa
- Trigger Event: workflow_dispatch

File details

Details for the file optical_context_mcp-0.1.3-py3-none-any.whl.

File metadata

Download URL: optical_context_mcp-0.1.3-py3-none-any.whl
Upload date: Mar 7, 2026
Size: 15.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for optical_context_mcp-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c7d43cd24d5222b7f352eee3029b70e8397fd4633119d7595b5d4636c239a2f0`
MD5	`320354b1ee2cbfe2d1b72df18a30d4b7`
BLAKE2b-256	`bca84713d74e871213f08061140eeaece3a51464c00245c72bf65eb6f4f7e48b`

Provenance

The following attestation bundles were made for optical_context_mcp-0.1.3-py3-none-any.whl:

Publisher: publish-pypi.yml on ChrBoebel/optical-context-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: optical_context_mcp-0.1.3-py3-none-any.whl
- Subject digest: c7d43cd24d5222b7f352eee3029b70e8397fd4633119d7595b5d4636c239a2f0
- Sigstore transparency entry: 1058556570
- Sigstore integration time: Mar 7, 2026
Source repository:
- Permalink: ChrBoebel/optical-context-mcp@3ecd47fabcbbecc6ada69936f437992dd2f9e5aa
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ChrBoebel
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@3ecd47fabcbbecc6ada69936f437992dd2f9e5aa
- Trigger Event: workflow_dispatch

optical-context-mcp 0.1.3
