Academic figure agent harness for multi-step planning, generation, and evaluation through MCP
Academic Figures MCP
A multi-step academic figure agent harness for AI agents and non-engineers.
Academic Figures MCP is a workflow harness for multi-step academic reasoning and figure production. PMID ingestion is one structured entry point, but the real product value is helping an agent move through academic planning, concept decomposition, figure-type selection, prompt orchestration, image generation, evaluation, and iteration until it reaches a publication-grade result. MCP exposure and VSX packaging make that workflow usable without requiring engineering-heavy setup.
Introduction Visual
This hero visual is self-generated by the repository's own MCP workflow and is placed here intentionally so new visitors can see the product story immediately.
Why This Exists
Generating academic figures normally requires:
- Manual prompt engineering
- Journal standard research
- Color code lookup
- Quality self-review
- Retry loops
This harness automates those steps and exposes them through MCP so agents can work through the academic reasoning process in an orderly way. The image API is supporting infrastructure; the product value is the structured workflow that helps an agent plan and produce academic-grade figures.
It now includes a YAML-backed journal registry so the MCP layer can inject figure requirements for targets such as Nature, Science, JAMA, NEJM, and Lancet without forcing the agent to memorize house rules.
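As a rough illustration of how a journal registry could feed requirements into a prompt, the sketch below uses an in-memory dictionary in place of the YAML file. The field names (`max_width_mm`, `min_dpi`), the `example-journal` entry, and the `inject_journal_requirements` helper are hypothetical and are not taken from journal-profiles.yaml.

```python
from typing import Dict

# Hypothetical in-memory stand-in for the YAML-backed registry.
# Keys and values are illustrative placeholders, not the real profile schema.
JOURNAL_PROFILES: Dict[str, Dict[str, object]] = {
    "nature": {"max_width_mm": 183, "min_dpi": 300},
    "example-journal": {"max_width_mm": 170, "min_dpi": 600},
}


def inject_journal_requirements(prompt: str, target_journal: str) -> str:
    """Append the journal's figure requirements to a base prompt."""
    profile = JOURNAL_PROFILES.get(target_journal.strip().lower())
    if profile is None:
        # Unknown journal: leave the prompt untouched rather than guess.
        return prompt
    rules = "; ".join(f"{key}={value}" for key, value in sorted(profile.items()))
    return f"{prompt}\nJournal requirements ({target_journal}): {rules}"
```

Keeping the rules in data rather than in prompt text is what lets the agent switch targets without memorizing house rules.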
MCP Surface
This server targets the modern MCP Python SDK line and is intended to expose:
- 5 core execution tools for planning, generic generation, editing, evaluation, and batch workflows, with composite and manifest tools listed alongside them under MCP Tools
- resources for discovery of presets, templates, and Gemini image defaults
- reusable prompts for figure planning and style transformation
Harness Flow
The system is designed as a multi-step academic workflow:
- Start from a structured source such as a PMID, an academic objective, or a figure revision request.
- Reason about the scientific concept, communication goal, and target figure type.
- Organize the request into a structured plan using academic constraints and journal conventions.
- Generate the figure through the provider layer.
- Evaluate the result against academic-quality criteria.
- Iterate until the output is publication-grade.
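The steps above can be sketched as a small control loop. This is a minimal illustration, not the repository's orchestrator: the `Evaluation` type, the `run_harness` function, the 4.5 threshold, and the three-round cap are all assumptions, and the planning, generation, and evaluation steps are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Evaluation:
    score: float   # overall score on a 0-5 scale (assumed)
    feedback: str  # revision suggestions for the next pass


def run_harness(
    plan: Callable[[str], dict],
    generate: Callable[[dict, Optional[str]], str],
    evaluate: Callable[[str], Evaluation],
    source: str,
    threshold: float = 4.5,
    max_rounds: int = 3,
) -> Tuple[str, Evaluation, int]:
    """Plan once, then generate and evaluate until the score clears the threshold."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    payload = plan(source)
    feedback: Optional[str] = None
    image_path, result, rounds = "", Evaluation(0.0, ""), 0
    for rounds in range(1, max_rounds + 1):
        image_path = generate(payload, feedback)
        result = evaluate(image_path)
        if result.score >= threshold:
            break  # good enough, stop iterating
        feedback = result.feedback  # carry suggestions into the next round
    return image_path, result, rounds
```

The plan is built once and reused, so each retry only changes the feedback handed to the generator.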
MCP Tools
| Tool | Input | Output |
|---|---|---|
| plan_figure | pmid, figure_type?, style_preset? | Structured plan with route, constraints, and next-step arguments |
| generate_figure | planned_payload or compatibility pmid bridge | Generated asset from a generic render request (now respects render_route=composite_figure) |
| edit_figure | image_path, feedback | Refined image via Gemini edit API |
| evaluate_figure | image_path, figure_type? | 8-domain scorecard with suggestions |
| batch_generate | pmids: list, figure_type? | Batch generation results |
| composite_figure | panels, labels, title, caption?, citation? | Publication-ready multi-panel montage with labels and DPI metadata |
| list_manifests | limit? | Recent manifest metadata for replay or retargeting |
| replay_manifest | manifest_id, output_dir? | Re-run a saved manifest using the original prompt and provider |
| retarget_journal | manifest_id, target_journal, output_dir? | Regenerate with a new journal profile plus before/after diff |
generate_figure is now internally plan-first. If you pass a PMID directly, the server first builds a planning payload and then renders from that payload. The canonical contract remains plan_figure followed by generate_figure(planned_payload=...).
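The plan-first behavior described above amounts to a small dispatch rule. The sketch below is a hypothetical illustration of that rule, with `plan` and `render` injected as callables; the function name `generate_plan_first` and the payload shape are assumptions rather than the server's actual signature.

```python
from typing import Callable, Optional


def generate_plan_first(
    plan: Callable[[str], dict],
    render: Callable[[dict], str],
    planned_payload: Optional[dict] = None,
    pmid: Optional[str] = None,
) -> str:
    """Render from a planned payload, building one first when only a PMID is given."""
    if planned_payload is None:
        if pmid is None:
            raise ValueError("provide either planned_payload or pmid")
        # Compatibility bridge: plan internally, then fall through to rendering.
        planned_payload = plan(pmid)
    return render(planned_payload)
```

Because both entry points converge on the same `render(planned_payload)` call, the PMID shortcut cannot drift from the canonical two-step contract.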
Reproducibility & Retargeting
- Every successful generation now writes a manifest to .academic-figures/manifests (override with AFM_MANIFEST_DIR).
- list_manifests + replay_manifest let you rerun saved prompts without rebuilding the plan.
- retarget_journal injects a new journal profile, regenerates, and returns a before/after diff of the profile metadata.
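To make the manifest idea concrete, here is a minimal sketch of writing and listing manifests with the standard library. Only the default directory and the AFM_MANIFEST_DIR override come from the text above; the record fields (`manifest_id`, `prompt`, `provider`, `created_at`), the one-JSON-file-per-manifest layout, and the function names are assumptions.

```python
import json
import os
import time
import uuid
from pathlib import Path
from typing import List, Optional


def manifest_dir() -> Path:
    """Resolve the manifest directory, honoring the AFM_MANIFEST_DIR override."""
    return Path(os.environ.get("AFM_MANIFEST_DIR", ".academic-figures/manifests"))


def write_manifest(prompt: str, provider: str, directory: Optional[Path] = None) -> str:
    """Persist one generation record and return its (hypothetical) manifest id."""
    directory = directory or manifest_dir()
    directory.mkdir(parents=True, exist_ok=True)
    manifest_id = uuid.uuid4().hex
    record = {
        "manifest_id": manifest_id,
        "prompt": prompt,
        "provider": provider,
        "created_at": time.time(),
    }
    (directory / f"{manifest_id}.json").write_text(json.dumps(record), encoding="utf-8")
    return manifest_id


def list_recent(limit: int = 10, directory: Optional[Path] = None) -> List[dict]:
    """Return up to `limit` manifests, newest first by recorded creation time."""
    directory = directory or manifest_dir()
    records = [json.loads(p.read_text(encoding="utf-8")) for p in directory.glob("*.json")]
    records.sort(key=lambda r: r["created_at"], reverse=True)
    return records[:limit]
```

Storing the original prompt and provider in the record is what makes a later replay possible without rebuilding the plan.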
Multi-Panel & Composite Assembly
- planned_payload now accepts render_route=composite_figure with a panels list to assemble montage figures.
- The built-in composite_figure tool remains available for direct multi-panel assembly with labels, caption, and DPI metadata.
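A composite payload of the kind described above might be assembled as in the sketch below. Only `render_route=composite_figure` and the `panels` list come from the text; the per-panel keys (`image_path`, `label`), the `title` field, and the helper name are hypothetical.

```python
from typing import Dict, List


def build_composite_payload(panels: List[Dict[str, str]], title: str) -> dict:
    """Assemble a planned_payload that routes rendering to the composite path."""
    if not panels:
        raise ValueError("a composite figure needs at least one panel")
    labelled = []
    for index, panel in enumerate(panels):
        # Assign sequential labels A, B, C, ... when a panel does not carry one.
        label = panel.get("label") or chr(ord("A") + index)
        labelled.append({**panel, "label": label})
    return {"render_route": "composite_figure", "title": title, "panels": labelled}
```

The automatic labels only cover A through Z, which is more than a typical multi-panel figure needs.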
Product Positioning
The core differentiator is not simply "connected to an image model API".
- It is a complete academic-figure agent harness.
- It helps agents reason through academic concepts before they generate.
- It is exposed through MCP so multiple AI hosts can drive the same workflow.
- It is packaged as a VSX experience so non-engineers can adopt it quickly.
- Provider integrations such as Google Gemini or OpenRouter are replaceable infrastructure behind that harness.
Competitive Landscape
The current GitHub- and web-based benchmark is documented in docs/competitive-landscape.md.
That document separates:
- direct competitors
- adjacent reusable wheels
- strengths worth absorbing
- core product differences we should not copy away
Project Documents
Key repo-level documents:
- ROADMAP.md for planned capabilities and sequencing
- CHANGELOG.md for notable project changes
- docs/competitive-landscape.md for market and positioning context
Generated Visuals & QA
The following three visuals were generated by this repository's own MCP workflow and then reviewed through the built-in evaluate_figure path.
This section is intentionally self-hosting: each image below was generated from the payload files under .academic-figures/jobs, and each QA report was produced by this same repository through scripts/start_afm_local.py run evaluate against the generated output image. These are not manually drawn marketing assets or hand-written review notes.
Introduction Visual QA
QA summary:
- Score: 5.0/5.0
- Strengths: clear story from academic input to MCP workflow hub to publication-grade outputs
- Critical issues: none identified
- Full report: repo-intro-hero-eval.json
Architecture Visual
QA summary:
- Score: 5/5
- Strengths: explicit DDD layering, clear Presentation -> Application -> Domain <- Infrastructure direction, and repo-specific integration edges
- Critical issues: none identified
- Full report: repo-architecture-v2-eval.json
Workflow Visual
QA summary:
- Score: 4.6/5
- Strengths: one clean main path, strong readability, high visual polish, and the duplicate PAYLOAD error is removed in v2
- Critical issues: no formal citation or source attribution is shown inside the figure
- Full report: repo-workflow-flowchart-eval.json
Quick Install
git clone https://github.com/u9401066/academic-figures-mcp.git
cd academic-figures-mcp
uv sync
# then copy env.example to env and fill one provider key,
# or provide GOOGLE_API_KEY / OPENROUTER_API_KEY through your shell or MCP host config
Local Env File
For local runs and smoke tests, copy env.example to env and fill exactly one provider section.
Supported formats:
- KEY=value
- export KEY=value
- set KEY=value
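A parser that accepts all three line formats could look like the following sketch. It is an illustration only: the function name `parse_env_text`, the comment handling, and the stripping of surrounding quotes are assumptions, not the repository's loader.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=value, export KEY=value, and set KEY=value lines."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        for prefix in ("export ", "set "):
            if line.startswith(prefix):
                line = line[len(prefix):].lstrip()
                break
        key, separator, value = line.partition("=")
        if not separator or not key.strip():
            continue  # not an assignment
        # Assumed convenience: drop one layer of surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Accepting the `export` and `set` prefixes means the same file can be pasted from a POSIX shell profile or a Windows batch script.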
Provider examples:
- AFM_IMAGE_PROVIDER=google with GOOGLE_API_KEY
- AFM_IMAGE_PROVIDER=openrouter with OPENROUTER_API_KEY
- AFM_IMAGE_PROVIDER=ollama with OLLAMA_BASE_URL and OLLAMA_MODEL
- AFM_MANIFEST_DIR=.academic-figures/manifests to relocate persisted generation manifests
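The provider pairings above can be checked before the server starts. The sketch below is a hypothetical validator: the provider names and variable names come from the list above, while the `REQUIRED_VARS` table and the `missing_provider_vars` function are illustrative.

```python
from typing import Dict, List, Mapping

# Provider names and variables as listed above; the checker itself is hypothetical.
REQUIRED_VARS: Dict[str, List[str]] = {
    "google": ["GOOGLE_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
    "ollama": ["OLLAMA_BASE_URL", "OLLAMA_MODEL"],
}


def missing_provider_vars(env: Mapping[str, str]) -> List[str]:
    """Return the variables the selected provider still needs."""
    provider = env.get("AFM_IMAGE_PROVIDER", "").strip().lower()
    if provider not in REQUIRED_VARS:
        raise ValueError(f"unsupported AFM_IMAGE_PROVIDER: {provider!r}")
    # Empty strings count as missing.
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]
```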
Smoke Test
You can run a sanitized end-to-end smoke test with:
uv run python scripts/env_smoke_test.py env
The script only prints variable presence and a compact result summary. It never prints API key values.
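A presence-only report of the kind described above might look like the sketch below. The function name and the set/missing wording are assumptions; this is not the code in scripts/env_smoke_test.py.

```python
from typing import Dict, Iterable, Mapping


def presence_report(env: Mapping[str, str], names: Iterable[str]) -> Dict[str, str]:
    """Report whether each variable is set, without ever echoing its value."""
    return {name: ("set" if env.get(name) else "missing") for name in names}
```

Because the values never enter the report, its output is safe to paste into an issue or a CI log.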
Usage
VS Code Copilot
Add to your Copilot MCP settings (.vscode/mcp.json):
{
  "servers": {
    "academicFigures": {
      "type": "stdio",
      "envFile": "${workspaceFolder}/env",
      "command": "uv",
      "args": [
        "run",
        "--project",
        "${workspaceFolder}",
        "python",
        "-m",
        "src.presentation.server"
      ]
    }
  }
}
This launch shape is shell-neutral and works across Windows, macOS, and Linux as long as uv is installed. It also keeps the project root explicit through --project ${workspaceFolder} while loading secrets from the repo-root env file via envFile.
Manual Local Startup
Cross-platform launcher:
uv run python scripts/start_afm_local.py server
Run the first figure directly through afm-run:
uv run python scripts/start_afm_local.py run generate --pmid 41657234 --language zh-TW --output-size 1024x1536
This direct --pmid path is a compatibility bridge. It now performs the planning step internally before rendering.
Inject a journal profile explicitly when you want the planner and renderer to enforce a house style:
uv run python scripts/start_afm_local.py run plan --pmid 41657234 --target-journal Nature
uv run python scripts/start_afm_local.py run generate --pmid 41657234 --target-journal JAMA
Run generic asset generation through the same public tool using a JSON payload file:
uv run python scripts/start_afm_local.py run generate --payload-file .academic-figures/jobs/icon-request.json --output-dir .academic-figures/outputs
The same wrapper also supports direct planning and evaluation:
uv run python scripts/start_afm_local.py run plan --pmid 41657234
uv run python scripts/start_afm_local.py run evaluate --image-path .academic-figures/outputs/your-file.png
Windows PowerShell shortcut:
powershell -NoProfile -ExecutionPolicy Bypass -File scripts/start_afm_local.ps1 server
Then just ask:
- "Generate a flowchart for PMID 41657234"
- "Help me plan the right academic figure structure for PMID 41657234 before generating it"
- "Help me make a consensus flowchart for PMID 41657234" (this request can also be written in Traditional Chinese)
- "What figure type should I use for PMID 34567890?"
- "Help me turn this academic concept into a publication-grade figure plan"
The VS Code extension can now run plan, generate, transform, and evaluate commands directly through afm-run instead of copying prompts into chat.
Claude Code / Cursor / Any MCP Host
Any MCP-compatible agent can use these tools directly.
For local development with the newer MCP SDK transport options, the server defaults to stdio, and can also be started with MCP_TRANSPORT=streamable-http for HTTP-based inspection workflows.
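Transport selection from the environment can be sketched as a small resolver. Only the MCP_TRANSPORT variable, the stdio default, and the streamable-http option come from the text above; the `resolve_transport` function and its error handling are assumptions about how a server might read the setting.

```python
import os
from typing import Mapping, Optional

SUPPORTED_TRANSPORTS = ("stdio", "streamable-http")


def resolve_transport(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the MCP transport, defaulting to stdio when MCP_TRANSPORT is unset."""
    env = os.environ if env is None else env
    transport = env.get("MCP_TRANSPORT", "stdio").strip().lower() or "stdio"
    if transport not in SUPPORTED_TRANSPORTS:
        raise ValueError(f"unsupported MCP_TRANSPORT: {transport!r}")
    return transport
```

The resolved name would typically be handed to the MCP SDK when the server starts; whether this repository wires it exactly that way is not confirmed here.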
Architecture
┌──────────────────────┐
│  Your AI Agent       │   VS Code Copilot, Claude Code,
│  (Copilot, Claude,   │   OpenClaw, Hermes, etc.
│   any MCP host)      │
└──────────┬───────────┘
           │ MCP stdio / streamable-http
           ▼
┌────────────────────────────┐
│  Academic Figures MCP      │
│  ┌──────────────────────┐  │
│  │ plan_figure          │  │
│  │ generate_figure      │  │
│  │ edit_figure          │  │  5 Tools
│  │ evaluate_figure      │  │
│  │ batch_generate       │  │
│  └──────────┬───────────┘  │
│             │              │
│  ┌──────────▼───────────┐  │
│  │ Core Orchestrator    │  │
│  │                      │  │
│  │ 1. fetch_paper()     │  │  ← PubMed E-utilities
│  │ 2. classify_type()   │  │  ← Keyword + structured planning heuristics
│  │ 3. build_payload()   │  │  ← reusable render request / prompt pack
│  │ 4. generate_image()  │  │  ← single public renderer (Google / OpenRouter / Ollama SVG)
│  │ 5. evaluate()        │  │  ← 8-domain vision scoring or local critique
│  │ 6. iterate()         │  │  ← harness-guided revision loop
│  └──────────────────────┘  │
└────────────────────────────┘
Figure Types & Auto-Classification
The server auto-classifies each paper into the figure type that best fits its content:
| Type | Best For | Example Papers |
|---|---|---|
| Flowchart | Consensus, guidelines | "SSC 2026 Sepsis Guidelines" |
| Mechanism | Drug mechanisms, pathways | "Sugammadex encapsulation mechanism" |
| Comparison | RCTs, meta-analyses | "Crystalloid vs Colloid fluid resuscitation" |
| Infographic | Reviews, overviews | "Perioperative fasting consensus" |
| Timeline | Historical, longitudinal | "Evolution of general anesthesia" |
| Anatomical | Surgical techniques, blocks | "Regional anesthesia approaches" |
| Data Visual | PK/PD, dose-response | "Propofol PK modeling" |
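Keyword-based classification of the kind implied by the table can be sketched as follows. The keyword lists are hypothetical and only loosely derived from the "Best For" column; the real planner combines keywords with structured heuristics, so it will disagree with this sketch on some titles (for example, a consensus paper that is better served by an infographic).

```python
from typing import Dict, Tuple

# Hypothetical keyword lists, loosely based on the "Best For" column above.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "flowchart": ("consensus", "guideline"),
    "mechanism": ("mechanism", "pathway"),
    "comparison": ("randomized", "meta-analysis", " versus ", " vs "),
    "timeline": ("history", "evolution", "longitudinal"),
    "anatomical": ("block", "surgical", "anatomy"),
    "data_visual": ("pharmacokinetic", "dose-response"),
}


def classify_figure_type(title: str, default: str = "infographic") -> str:
    """Pick the type with the most keyword hits, falling back to a default."""
    text = f" {title.lower()} "  # pad so space-delimited keywords match at the edges
    scores = {
        figure_type: sum(keyword in text for keyword in keywords)
        for figure_type, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the earliest entry
    return best if scores[best] > 0 else default
```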
Knowledge Base (Included)
This repo ships with 9 curated reference assets:
| File | Content |
|---|---|
| prompt-templates.md | 7-block prompt templates for 9 figure types |
| anatomy-color-standards.md | Medical illustration color coding reference |
| journal-figure-standards.md | Nature/Lancet formatting requirements |
| journal-profiles.yaml | Machine-readable journal registry for automatic prompt injection |
| gemini-tips.md | Gemini 3.1 Flash prompt engineering best practices |
| model-benchmark.md | NB2 vs GPT Image 1.5 comparison data |
| code-rendering.md | matplotlib/Python figure generation reference |
| scientific-figures-guide.md | Scientific figure design principles |
| ai-medical-illustration-evaluation.md | 8-domain evaluation rubric |
Planned Rendering Ecosystem
This project is no longer framed as a single-route Gemini prompt server. The current design direction is a multi-route figure system:
- Matplotlib + SciencePlots for deterministic, publication-style charts
- D2 + Mermaid for structured diagrams and editable text-first figure specs
- FigureFirst + CairoSVG for precise multi-panel assembly and export
- Excalidraw or tldraw as future interactive vector-editing layers inside the VS Code extension
- Kroki as an optional self-hosted render gateway for compatibility with multiple DSL engines
Development
uv sync
uv run python -m src.presentation.server
The planned Gemini image integration follows the current Google Gen AI SDK pattern:
from google import genai
from google.genai import types
License
Apache License 2.0. See LICENSE.
Composite Engine (Multi-Panel Layout)
The composite module solves Gemini's weakness with multi-panel figures.
Instead of generating a single image with all panels (which often fails on
spatial layout, numbering, and mixed styles), it:
- Generates each panel independently with focused prompts
- Composites them using Pillow with precise pixel-level layout
- Overlays text programmatically, so labels are rendered exactly as specified, with no misspellings
Composite Usage
from src.infrastructure.composite import CompositeFigure, PanelSpec
from src.server import generate_figure

# Step 1: Generate panels separately
left = generate_figure(pmid="41657234", figure_type="anatomy")
right = generate_figure(pmid="41657234", figure_type="ultrasound")

# Step 2: Composite
comp = CompositeFigure()
comp.add_panel(
    PanelSpec(prompt="...", label="A", panel_type="anatomy"),
    left["image_path"]
)
comp.add_panel(
    PanelSpec(prompt="...", label="B", panel_type="ultrasound"),
    right["image_path"]
)
comp.set_title("Interscalene Brachial Plexus Block")
comp.set_citation("PMID 41657234 · Regional Anesthesia")
comp.compose("interscalene_block.pdf")
MCP Tool: composite_figure
composite_figure(
    panels=[["left.png", "anatomy"], ["right.png", "ultrasound"]],
    labels=["A", "B"],
    title="..."
)
Layout Specs
| Property | Value |
|---|---|
| Canvas | 2400 × 1600 px (8" × 5.33" @ 300 DPI) |
| Format | Double column (~183mm width, Nature standard) |
| Labels | A/B/C with pill-shaped background |
| Footer | Caption + PMIDs + citation |
| Divider | Vertical line between panels |
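The canvas arithmetic in the table can be checked with a few lines of Python. The conversion helpers follow directly from the stated 300 DPI; the `panel_boxes` function, its 40 px margin, and its 160 px footer are hypothetical values chosen only to show how side-by-side panel rectangles could be derived.

```python
from typing import List, Tuple

MM_PER_INCH = 25.4


def px_to_inches(pixels: int, dpi: int = 300) -> float:
    """Convert a pixel length to inches at the given resolution."""
    return pixels / dpi


def mm_to_px(millimetres: float, dpi: int = 300) -> int:
    """Convert a print width in millimetres to whole pixels."""
    return round(millimetres / MM_PER_INCH * dpi)


def panel_boxes(canvas_w: int, canvas_h: int, panels: int,
                margin: int = 40, footer_h: int = 160) -> List[Tuple[int, int, int, int]]:
    """Split the canvas into equal-width boxes as (left, top, right, bottom)."""
    if panels < 1:
        raise ValueError("panels must be at least 1")
    panel_w = (canvas_w - margin * (panels + 1)) // panels
    top, bottom = margin, canvas_h - footer_h - margin
    boxes = []
    for index in range(panels):
        left = margin + index * (panel_w + margin)
        boxes.append((left, top, left + panel_w, bottom))
    return boxes
```

For reference, `mm_to_px(183)` gives 2161, so a 2400 px canvas at 300 DPI is somewhat wider than a 183 mm column and would be scaled down at print size.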