Frame Check
See what any document does not show you.
Frame Check ships an MCP server: deterministic structural framing analysis for AI-generated documents, with an opt-in frame-divergence block per FRAME_DIVERGENCE_CONTRACT_v1 c1.0. The MCP surface delegates V4.2 judgment to the caller's agent model; zero Frame Check LLM cost per query.
Frame Check is a structural framing analysis tool and a public research program on how documents, especially AI-generated ones, frame reality. It names which perspectives a document takes, which it omits, and how it positions the reader. Numerical claims are cross-checked against authoritative sources where coverage exists.
Live: https://frame.clarethium.com (paused 2026-04-23 pending 0.8.0 V4.2-first launch). Source is maintained in this repository; the MCP server runs locally per the install instructions below.
Quickstart (MCP server)
The PyPI package frame-check-mcp is the Model Context Protocol
server. It runs locally and gives any MCP-compatible AI client
(Claude Desktop, Cursor, etc.) deterministic structural framing
analysis as a tool.
pip install frame-check-mcp
Then point your MCP client at the installed entry point. For
Claude Desktop, add to claude_desktop_config.json:
{
  "mcpServers": {
    "frame-check": {
      "command": "frame-check-mcp"
    }
  }
}
Restart the client. Then in any conversation: "Can you frame-check this document?" Full install + verification details are in the MCP server section below.
What it does
Paste a document and Frame Check returns:
- A structural framing profile: which of five analytical perspectives (causes, risks, stakeholders, trends, uncertainty) the document covers, which it omits, and the density of each.
- Voice and epistemic posture: how the document positions the reader, and what share of claims are attributed to sources.
- Temporal orientation: whether the document grounds its conclusions in historical data, present state, or projections.
- Frame Vocabulary Standard candidate matches: named frame patterns whose rule-based signals fire on the text, each with identification cues and worked examples. Matches are candidate-level; precision against multi-source labeling is an active research question (fvs_eval/validation_study/REPORT.md).
- Source-network verification: numeric claims checked against SEC EDGAR, FRED, World Bank, REST Countries, Alpha Vantage, and Wolfram Alpha where those providers have coverage.
- An optional AI narrative (Grok) interpreting framing at prose level. Labelled distinctly so readers do not conflate language-model interpretation with deterministic measurement.
The comparison mode (/compare) runs the same analysis on two
documents on the same subject and surfaces structural differences in
framing, certainty, coverage, and sourcing.
Approach
Structural measurement is the floor. Every framing claim the tool makes is computed by deterministic pattern matchers that always return the same result for the same input. AI-assisted interpretation is available as enrichment where an API key is configured, but it is labelled as such and never hidden behind the structural layer.
Verification is bounded. The tool only verifies numeric claims against providers with genuine coverage for the claim type, and it surfaces its own calibration results (precision, recall, F1 per provider) rather than asserting verdicts without evidence.
Named-pattern detection is a separate reliability layer from the
structural profile. The structural framing profile (coverage, voice,
temporal, epistemic) is deterministic regex-based measurement and
ships as the load-bearing product output. Named Frame Vocabulary
Standard matches are rule-based candidates; their agreement with
careful multi-source human labeling is an open, pre-registered
research question with a published negative result
(fvs_eval/validation_study/REPORT.md). Each match ships with a
teaching question, not a verdict; the product treats candidate
matches as hypotheses for the reader to evaluate, not as detector
verdicts.
A separate pre-registered question (Track B, fvs_eval/reader_aid_study/)
tests whether the construct-honest signals (under-detection markers,
density caveats, confidence states) actually help a reader see
framing they would otherwise miss. Until Track B runs, the
construct-honesty posture is an internally-coherent design claim,
not yet an externally-validated one. The validation program named
in VALIDATION_PROGRAM.md
in the repository documents the triggers under which Track B
activates; the honest current state is that Track A falsified,
Track B is pending, and the construct-honesty pivot is the
load-bearing claim at v0.
Full methodology: https://frame.clarethium.com/corpus/methodology/
Running locally
Requirements: Python 3.12 (matches the production Docker image python:3.12-slim; build tooling uses f-string features available only on 3.12+) and the dependencies in requirements.txt.
pip install -r requirements.txt
uvicorn app:app --host 127.0.0.1 --port 8001 --reload
Open http://127.0.0.1:8001.
API keys are optional. Without any keys the structural analysis, the
Frame Vocabulary Standard matches, and the framing profile all work.
With a Gemini or xAI key set as GEMINI_API_KEY / XAI_API_KEY, the
AI narrative and per-model generation paths become available.
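For example, a key can be supplied per-invocation without touching shell profiles (a sketch assuming a POSIX shell; substitute a real key):
GEMINI_API_KEY=your-key-here uvicorn app:app --host 127.0.0.1 --port 8001 --reload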
Tests
python3 run_tests.py
Runs 45 suites (26 script-style + 19 pytest-style) covering parsing, framing detection, source-network verification, compare save/load, SSE streaming, page rendering, circuit-breaker persistence, LLM-cost measurement, origin protection, reframe, mirror, routing, trust-tier weighting, calibration substrate, and decision-readiness byte-stability. The run takes approximately 100 to 200 seconds, depending on which external API fixtures are exercised.
MCP server: the sovereignty instrument for AI conversations
Frame Check ships a Model Context Protocol server so AI agents (Claude Desktop, Cursor, any MCP-compatible client) can use structural framing analysis directly. The defining use case is not "analyse a document I have." It is "see the frame my AI just put on a question."
Consider a life question of the kind LLMs answer thousands of times a day: "is Bitcoin a good investment," "should I leave my job," "what should I tell my child." The LLM returns a prescriptive answer. The answer has a structural frame: a voice, a coverage footprint, a stance on risks and uncertainty, a set of named frame patterns from the Frame Vocabulary Standard. Frame Check surfaces that frame. The user sees the frame their AI chose and decides what to do with the seeing. That is the sovereignty case, and the category of tools that surface it to the user at the moment of the conversation is close to empty.
Two tools drive this:
- frame_check: analyse a document. An optional source_text argument unlocks Layer 4 source-fidelity verification against the material the document was supposed to ground in; an optional include_divergence=true unlocks the divergence block (see "Frame divergence" below). Used three ways: analyse a document you have, self-audit the agent's own last response, or analyse another AI's response the user pastes in. A request sketch follows this list.
- frame_compare: compare two documents on the same subject. Returns per-document summaries plus the cross-document signal: shared blind spots, unique coverage gaps, voice / temporal / epistemic deltas, and a structured framing-differences narrative. The worked example four-llms-on-bitcoin-retirement-2026 demonstrates a multi-model application: same prompt, four frontier models, four materially different framing signatures.
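A minimal sketch of the corresponding MCP tools/call request. The argument names source_text and include_divergence are documented above; the document field name is illustrative, and the canonical input schema lives in MCP_SERVER.md:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "frame_check",
    "arguments": {
      "document": "Full text of the document to analyse...",
      "include_divergence": true
    }
  }
}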
Both tools return the same three-section epistemic payload (a shape sketch follows the list):
- analysis: the measurements themselves (coverage, voice, temporal, epistemic, frame-library matches with adjacent-frame MCP URIs and per-entry versions, extracted claims, plus cross-document comparison for frame_compare).
- agent_guidance: what the tool can and cannot tell the agent, and how to cite the output faithfully. Verdict patterns ("biased," "balanced," "A is better than B") are explicitly prohibited; structure is surfaced, the user judges.
- provenance: methodology version, measurement-stack version, frame-library version, license (Apache-2.0 code, CC-BY-4.0 corpus), citation string, ISO-8601 timestamp, tool URL. Zero LLM cost; the analysis layer is deterministic end-to-end.
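A truncated sketch of that shape. The three top-level sections and the named fields provenance.analysis_timestamp_utc, provenance.analysis_cost_usd, and agent_guidance.absence_is_not_prescription appear elsewhere in this document; the remaining keys are illustrative placeholders:
{
  "analysis": {
    "coverage": { "causes": "...", "risks": "...", "stakeholders": "...", "trends": "...", "uncertainty": "..." },
    "voice": "...",
    "frame_library_matches": [ { "id": "FVS-001", "version": "...", "adjacent_uris": ["frame-check://library/FVS-002"] } ],
    "extracted_claims": [ "..." ]
  },
  "agent_guidance": {
    "absence_is_not_prescription": "...",
    "how_to_cite": "..."
  },
  "provenance": {
    "license": "Apache-2.0 code / CC-BY-4.0 corpus",
    "analysis_timestamp_utc": "2026-04-23T12:00:00Z",
    "analysis_cost_usd": 0.0
  }
}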
Four MCP prompts encode the defining use patterns, each as a
server-defined template the agent's LLM executes directly (zero
additional cost on our side). All four pass
include_divergence=true on the tool call (or walk the divergence
block if already present), surface the FVS catalog entries the
document did not use, and honor
agent_guidance.absence_is_not_prescription:
- frame_check_my_response: agent self-audit. The agent calls frame_check on its own last response and surfaces the frame without verdict or defensive rewriting. The load-bearing sovereignty prompt.
- frame_check_this_ai_response: the user pastes a response from another AI; the agent analyses what that AI did structurally.
- challenge_document: adversarial questions traced to specific structural signals (low sourcing, missing stakeholders, promotional voice, reader-relevant absent frames from the divergence block).
- explain_framing: walkthrough template for a completed frame_check result, divergence walkthrough included when the result carries the block.
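Prompts are fetched with the standard MCP prompts/get method; whether a given prompt takes arguments is server-defined, so the empty arguments object here is illustrative:
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "prompts/get",
  "params": {
    "name": "frame_check_my_response",
    "arguments": {}
  }
}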
In addition to tools and prompts, the server exposes MCP resources so an agent can read the Frame Vocabulary Standard, the methodology paper, the worked-examples corpus, the curated research transmissions from blog.clarethium.com, the calibration evidence, and the validation corpus + aggregate findings behind the (experimental, gated) decision-readiness profile directly:
- frame-check://library - the full library index (status + adjacency); the citable map of the Frame Vocabulary Standard.
- frame-check://library/FVS-001 through FVS-020 - every entry as markdown source.
- frame-check://worked-examples - the collection-level index.
- frame-check://worked-examples/{slug} - applied analyses of specific public documents.
- frame-check://transmissions - the transmissions collection index (curated research pieces from blog.clarethium.com).
- frame-check://transmissions/{slug} - individual research transmissions, served locally because a Cloudflare edge layer blocks automated fetches to the public blog.
- frame-check://methodology - the complete methodology spec.
- frame-check://calibration/reliability_tiers - per-provider F1, precision, recall, and tier from the most comprehensive calibration run on this deploy.
- frame-check://calibration/runs/{run_id}/report - narrative calibration report for a specific run.
- frame-check://calibration/runs/{run_id}/verdicts - per-claim verdicts from that run (the evidence chain behind the tier).
- frame-check://calibration/runs/{run_id}/tiers - that run's per-provider tier snapshot.
- frame-check://corpus/{slug} - a validation-corpus entry's source document as markdown. The validation corpus is the set of documents on which the decision-readiness profile is measured; entries are slug-addressed and stable.
- frame-check://corpus/{slug}/profile - the entry's decision-readiness profile (JSON): per-dimension findings (coverage, calibration, evidence, robustness, counterfactual) plus the library entries each finding cites (fired_library_entries).
- frame-check://corpus/{slug}/peer/{partner_slug} and frame-check://corpus/{slug}/diff/{partner_slug} - per-pair comparison artifacts when the same prompt was answered by multiple sources. The peer artifact carries the side-by-side numerical view; the diff artifact carries the annotated framing-level interpretation.
- frame-check://aggregate/latest - cross-question outlier findings for the most recent aggregate run, listing the library entries and corpus entries each finding cites. The citation chain aggregate -> corpus -> library lets an agent follow any aggregate finding back to the source document and the named frame pattern that fired on it.
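Resources are fetched with the standard MCP resources/read method; for example, following up a detected library match looks like this:
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "resources/read",
  "params": {
    "uri": "frame-check://library/FVS-008"
  }
}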
The decision-readiness profile is currently labelled experimental until the Phase 2 rater study lands; the external-rater invitation contract and agreement criteria live in RATERS.md in the repository.
Tools are verbs, resources are nouns. An agent that detects
FVS-008 in a document can follow up by reading the library entry
verbatim rather than scraping the public site, and a verification
tier claim can be cited with the exact per-claim verdicts that
justified it. The URIs are stable; they are the citation targets
an agent can hand back to the user. Every tool response also
carries provenance.analysis_timestamp_utc in ISO-8601 so
citations can pin the exact moment the measurement was made.
The agent_guidance and provenance blocks are the novel part:
an agent passing Frame Check's output to a user without attribution
would strip the reproducibility that makes the measurement worth
citing. Shipping "how to cite faithfully" inside the tool response
carries the integrity forward to the user.
Frame divergence (AGI-era primitive for perspective expansion)
Frame divergence is the category Frame Check claims: naming frames
a document could have used but did not, with faithfulness
constraints so surfaced absences carry citation and do not become
prescription. FRAME_DIVERGENCE_v1.md (Part 1) positions it as the
AGI-era primitive for perspective expansion; the interface is
pinned by FRAME_DIVERGENCE_CONTRACT_v1.md (Part 2 c1.0).
frame_check emits a top-level divergence block when the caller
passes include_divergence=true. The block carries absent_frames
(FVS catalog entries the V1 detector did not match, each with
citation_uri, absence_basis, and domain_relevance_rationale)
and an envelope (spec version, catalog version, V4.2 engine status,
faithfulness note, limitations). Two keys land on agent_guidance:
how_to_render_divergence (caller-side composition instructions)
and absence_is_not_prescription (the guarantee divergence never
tells the user which frames they should have used).
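A truncated sketch of the block. The key names absent_frames, citation_uri, absence_basis, domain_relevance_rationale, and envelope follow the contract description above; the envelope field spellings and the FVS id are illustrative:
{
  "divergence": {
    "absent_frames": [
      {
        "id": "FVS-012",
        "citation_uri": "frame-check://library/FVS-012",
        "absence_basis": "...",
        "domain_relevance_rationale": "..."
      }
    ],
    "envelope": {
      "spec_version": "FRAME_DIVERGENCE_CONTRACT_v1 c1.0",
      "catalog_version": "...",
      "v4_2_engine_status": "...",
      "faithfulness_note": "...",
      "limitations": ["..."]
    }
  }
}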
The MCP server does not invoke an LLM for divergence; V4.2
judgment is delegated to the caller's agent model per the engine-
tier recommendations in
ENGINE_TIER_RECOMMENDATIONS_v1.md.
provenance.analysis_cost_usd stays at 0.0; vendor-independence
is automatic because the caller chooses the model. Optional inputs
on frame_check that control the block: domain_hint,
divergence_rendering, catalog_version_pin. Full shape is
documented in MCP_SERVER.md (bundled in the wheel) under
"Divergence block," and the
frame-check://spec/frame-divergence/v1/part-1 and part-2
resources serve the canonical references.
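Put together, a divergence-controlling call looks like the sketch below; the values shown for domain_hint, divergence_rendering, and catalog_version_pin are illustrative, with the accepted values documented in MCP_SERVER.md:
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "frame_check",
    "arguments": {
      "document": "...",
      "include_divergence": true,
      "domain_hint": "personal-finance",
      "divergence_rendering": "full",
      "catalog_version_pin": "v3"
    }
  }
}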
The forward commitment is that 0.8.0 on PyPI will be V4.2-capable
by default: frame_check output includes the divergence block
when V4.2 data is available and the caller's agent runs V4.2
judgment with its own LLM. Current state is internal pre-0.8.0
with include_divergence defaulted to true; the earlier 0.7.1
V1-only PyPI plan was retired 2026-04-23 in favor of V4.2-first
launch discipline. See MCP_SERVER.md "Release arc" for the
canonical release commitments. The repository CHANGELOG.md tracks per-version commitments.
The architectural superset of v1 lives at FRAME_DIVERGENCE_v2.md in the repository (2026-04-25): a layered taxonomy (L0 receiver state, L1 atomic axes, L2 simultaneous composites, L2.5 sequential chains, L3 reality construct) plus a five-stage lifecycle (detect, diverge, chain, converge, ground) and nine cross-cutting design principles. The v1 c1.0 contract carries forward unchanged; v2 absorbs the narrow definition as one operation within the wider architecture. The paper-shaped extract is FRAME_DIVERGENCE_v2_SUMMARY.md in the repository.
Install in Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "frame-check": {
      "command": "python3",
      "args": ["/absolute/path/to/frame-check/mcp_server.py"]
    }
  }
}
Restart Claude Desktop. Then: "Can you frame-check this document?"
Install in Cursor or other MCP clients
Same pattern; the server speaks standard MCP over stdio. Protocol
version 2024-11-05. No external dependency on an MCP SDK; the
protocol surface is implemented in-repo in mcp_server.py so the
install is just a path in a config file.
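A minimal handshake sketch, one JSON-RPC message per line on the server's stdin (the clientInfo values are illustrative):
{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "example-client", "version": "0.1.0"}}}
{"jsonrpc": "2.0", "method": "notifications/initialized"}
{"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
The server replies on stdout with its capabilities, then the frame_check and frame_compare tool descriptors.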
Offline sanity check
python3 mcp_server.py --test
Runs the analyzer on a short sample and pretty-prints the full epistemic payload. Useful for verifying the pipeline wiring without an MCP client.
Install fingerprint
python3 mcp_server.py --version
Emits a single-line fingerprint (server version, protocol version,
git SHA with +dirty flag when the working tree has uncommitted
changes, frame-library version, validation-corpus slug count and
hash, Python version, and the absolute script path). Run this
before a Claude Desktop session that depends on recent repo
changes to confirm the configured MCP server is the expected one.
The corpus_hash matches the suffix on the most recent
validation/decision_readiness/results/{date}-{hash}/ run
directory when the validation tree is present.
Troubleshooting
If Claude Desktop shows "server not responding" or the tools do not appear after a restart, check the following before doing anything more aggressive.
The config path. Claude Desktop reads
claude_desktop_config.json from a platform-specific location:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
Edit that file directly; Claude Desktop does not expose config through its UI.
Absolute paths. The args entry must be an absolute path to
mcp_server.py. Relative paths and ~ are not expanded. Use
the full path printed by realpath mcp_server.py in the
Frame Check repo.
Python version. The server runs on Python 3.10+ (needs
str | None syntax). If the system python3 is older, point
command at a specific interpreter:
{
  "mcpServers": {
    "frame-check": {
      "command": "/usr/local/bin/python3.11",
      "args": ["/absolute/path/to/frame-check/mcp_server.py"]
    }
  }
}
stderr logs. Every MCP request and error is logged to
stderr as [frame-check-mcp] .... Claude Desktop captures
server stderr and exposes it in its "View Developer Tools"
window under the server name. An empty log means the server
never started (check the command path); a log line about a
failed import means the Frame Check repo is not at the path
given in args.
Sanity check before configuring the client. Run
python3 mcp_server.py --test from the repo directory first.
If this does not produce a JSON payload, no MCP client will
succeed either; fix the pipeline wiring before editing the
client config.
Corpus and research artifacts
The Frame Check corpus regenerates on each deploy and is also exported as daily NDJSON and Parquet archives.
- Frame Vocabulary Standard: https://frame.clarethium.com/corpus/library/
- Methodology paper: https://frame.clarethium.com/corpus/methodology/
- Observatory snapshot: https://frame.clarethium.com/corpus/observatory/
- Calibration results: https://frame.clarethium.com/corpus/calibration/
- Worked examples: https://frame.clarethium.com/corpus/worked-examples/
- Decision-readiness profile and validation corpus (experimental, methodology published, profile pending Phase 2 rater study): https://frame.clarethium.com/corpus/decision-readiness/
- Corpus archive: https://corpus.clarethium.com/frame-check/
- FVS-Eval (in development): fvs_eval/SPEC.md. A structural framing benchmark for language models, derived from Frame Check's detection engine. v0 specification only; v0.1 corpus and harness targeted for mid-2026. Intended audience: labs, researchers, eval practitioners.
Documentation map: where to start by audience
The wheel bundles the runtime documents the MCP server cites: methodology, MCP server spec, frame divergence specs, the V4.2 gap inventory, the FVS catalog (frame_library and the contract-pinned v3 snapshot), worked examples, transmissions, calibration results, and the validation corpus + aggregate findings.
The repository carries additional documentation (strategy doctrine, governance, reviewer protocols, validation studies, audits, historical artifacts) that is not bundled with the wheel but is public at https://github.com/lluvr/frame-check. The map below indexes the most-relevant documents by audience; documents marked with [bundled] ship with the wheel, documents marked with [repo] require the repository.
If you are considering contributing or reviewing
- CONTRIBUTING.md [repo]: pull-request mechanics (PR format, tests, commit conventions).
- RATERS.md [repo]: open invitation for external raters on the decision-readiness profile (the Phase 2 validation gate before profile output ships in the live UI).
- STRESS_TEST_ASSESSMENT_v1.md [repo]: the project's self-enumerated strategic critique (twelve audiences, ten claims named as weak-under-scrutiny, five highest-leverage next moves). Read before evaluating the project's posture.
- PUBLISH_READINESS_ASSESSMENT_v1.md [repo]: companion publish-readiness audit (12 perspectives, unified gap inventory, three zones for closing remaining work). Reviewer orientation in one document.
- ANTICIPATED_CRITIQUES.md [repo]: nineteen enumerated adversarial critiques across seven categories with the project's prepared defenses. Where the defense stands, where it lands, where it depends on external work still to come (explicit [OPEN] flags). If you are preparing a critical read, this is the project's self-enumerated attack surface.
If you want to understand what the system does and how it works
- METHODOLOGY.md [bundled]: the method paper; detection methodology, FVS curation, calibration, limitations.
- MCP_SERVER.md [bundled]: installation, tool surface, resource surface, prompt surface, divergence block, release arc.
- FRAME_DIVERGENCE_v1.md and FRAME_DIVERGENCE_CONTRACT_v1.md [bundled]: category definition (Part 1) and interface contract (Part 2 c1.0).
- V4_2_GAP_INVENTORY_v1.md [bundled]: V4.2 engine readiness inventory, Tier 2 gap status.
- METHODOLOGY_PAPER_OUTLINE_v1.md [repo]: the structured companion for the methodology paper v2, drafted with the first engaged academic reader.
- FRAMING_ANALYSIS.md [repo]: detector design specification.
- VERIFICATION_ARCHITECTURE.md [repo]: numeric-claim verification pipeline.
- CALIBRATION_SET.md [repo]: calibration corpus reference.
If you are tracking current project state and decisions
- SESSION_STATE.md [repo]: running changelog of architectural decisions and drift risks.
- NEXT_STEPS.md [repo]: prioritized unbuilt improvements.
- DETECTOR_V2_PROMOTION.md [repo]: active playbook for v2 detector promotion to production.
- VALIDATION_PROGRAM.md [repo]: validation strategy and triggers for formal reactivation.
- VISITOR_AUDIT.md [repo]: real-user-path audit of the live product.
If you are operating the deployed system
- RUNBOOK.md [repo]: operational procedures and incident response.
- CLARETHIUM_MEASURE_SYNC.md [repo]: vault/fork synchronization policy for the measurement module.
- MCP_TYPESCRIPT_SCOPE.md [repo]: TypeScript MCP port scope.
If you are tracking the v2 architecture and the post-2026-04-25 substrate phase
- FRAME_DIVERGENCE_v2.md [repo]: layered architecture spec (L0 receiver state, L1 atomic axes, L2 simultaneous composites, L2.5 sequential chains, L3 reality construct) plus five-stage lifecycle (detect, diverge, chain, converge, ground) and nine cross-cutting design principles. Shipped 2026-04-25; v2.1 increment same date.
- FRAME_DIVERGENCE_v2_SUMMARY.md [repo]: paper-shaped extract for citation and external readers.
- FRAME_DIVERGENCE_CONTRACT_v1.md [bundled]: c1.0 interface contract on the divergence block (carries forward unchanged into v2).
- ANCHOR_AUTHORSHIP_METHODOLOGY_v1.md [repo]: introspective protocol for v2 lived-experience anchor authoring (Petitmengin/Vermersch micro-phenomenology lineage; corroboration discipline; construct-honest absence as first-class outcome). Shipped 2026-04-26.
- CLAIM_A_PROTOCOL_v1.md [repo]: pre-registered empirical protocol for v2 Prediction P3 (chain-output beats single-frame-output on decision quality). CLAIM_A_SOURCE_A_GUIDE_v1.md is the operator-actionable sourcing companion (~60 concrete pre-decision documents).
- CROSS_CURATOR_OUTREACH_v1.md [repo]: operator-actionable kit for cross-curator stress-testing (six candidate practitioner domain profiles, three outreach email templates, engagement protocol, findings format).
- CHANGELOG.md [repo]: canonical release record. Currently tracks 0.8.0 unreleased (V4.2-first launch; divergence block default-on; absence_clusters substrate-side composition; corpus_intelligence module).
- SECURITY.md [repo]: security policy and disclosure path.
Historical reference
- archive/ [repo]: superseded strategic documents and completed-phase artifacts (pre-canon-play decisions, phase 1.5 gap analyses, launch drafts, earlier premium-plan drafts, completed bug-investigation diagnostics).
Most critical reads, in order
For wheel-only consumers, start with the bundled MCP_SERVER.md
(install, tools, resources, prompts, divergence) and METHODOLOGY.md
(what is being measured and how). For repository visitors, the
methodology paper
covers what the project measures and how. The repository at
https://github.com/lluvr/frame-check carries the project's full
documentation including strategic doctrine, governance, reviewer
protocols, and canon-promotion dossiers. If preparing a critical read:
STRESS_TEST_ASSESSMENT_v1.md
and
ANTICIPATED_CRITIQUES.md
surface the project's own enumeration of what is weak and what has
not yet been validated.
License
The source code in this repository is licensed under Apache-2.0. See
LICENSE for the full text and NOTICE for attribution.
The corpus artifacts (Frame Vocabulary Standard, methodology paper, observatory dataset, calibration dataset, worked examples) are licensed separately under CC-BY-4.0.
The two licenses intentionally differ: code is Apache-2.0 so it can be audited, forked, cited, and derived; corpus data is CC-BY-4.0 so the measurement artifacts compound as a citeable research resource.
Citing Frame Check
See CITATION.cff at the repository root. A short citation line:
Lucic, L. (2026). Frame Check: a research instrument for framing and verification in documents. https://frame.clarethium.com
Contact
Maintained by Lovro Lucic. https://blog.clarethium.com
Report issues or share findings: hello@clarethium.com