
Academic figure agent harness for multi-step planning, generation, and evaluation through MCP


🎨 Academic Figures MCP

PyPI version VS Code Marketplace CI License Python 3.10+

A multi-step academic figure agent harness for AI agents and non-engineers.

Academic Figures MCP is a workflow harness for multi-step academic reasoning and figure production. PMID ingestion is one structured entry point, but the real product value is helping an agent move through academic planning, concept decomposition, figure-type selection, prompt orchestration, image generation, evaluation, and iteration until it reaches a publication-grade result. MCP exposure and VSX packaging make that workflow usable without requiring engineering-heavy setup.

One-Click Install (VS Code)

Requires uv. The server is launched with uvx --from academic-figures-mcp afm-server, a command that needs no shell-specific quoting on macOS, Linux, or Windows.


If you want guided setup instead of raw MCP configuration, install the VS Code extension. It supports SecretStorage, env files, and process environment configuration on macOS, Linux, and Windows.

Introduction Visual

Academic Figures MCP introduction hero

This hero visual is self-generated by the repository's own MCP workflow and is placed here intentionally so new visitors can see the product story immediately.

Why This Exists

Generating academic figures normally requires:

  1. Manual prompt engineering ✍️
  2. Journal standard research 📚
  3. Color code lookup 🎨
  4. Quality self-review ✅
  5. Retry loops 🔄

This harness automates those steps and exposes them through MCP so agents can work through the academic reasoning process in an orderly way. The image API is supporting infrastructure; the product value is the structured workflow that helps an agent plan and produce academic-grade figures.

It now includes a YAML-backed journal registry so the MCP layer can inject figure requirements for targets such as Nature, Science, JAMA, NEJM, and Lancet without forcing the agent to memorize house rules.
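
As a rough illustration of what profile injection could look like, the sketch below uses a small in-memory stand-in for the registry and appends a journal's requirements to a prompt. The field names (column_width_mm, min_dpi) and the inject_profile helper are assumptions made for this example, not the shipped schema of journal-profiles.yaml. The 183 mm and 300 DPI values are borrowed from the Layout Specs table in this document.

```python
# Hypothetical stand-in for entries loaded from journal-profiles.yaml.
JOURNAL_PROFILES = {
    "nature": {"column_width_mm": 183, "min_dpi": 300},
}


def inject_profile(prompt: str, target_journal: str) -> str:
    """Append journal requirements to a prompt; unknown journals pass through."""
    profile = JOURNAL_PROFILES.get(target_journal.strip().lower())
    if profile is None:
        return prompt
    rules = ", ".join(f"{key}={value}" for key, value in sorted(profile.items()))
    return f"{prompt}\n[journal:{target_journal}] {rules}"


print(inject_profile("Draw a sepsis management flowchart.", "Nature"))
```

Keeping the rules in data rather than in prompt text is what lets an agent switch targets by name only.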

MCP Surface

This server targets the modern MCP Python SDK line and is intended to expose:

  • 11 MCP tools for planning, generation, editing, evaluation, replay, retargeting, verification, and multi-step refinement workflows
  • resources for discovery of presets, templates, and Gemini image defaults
  • reusable prompts for figure planning and style transformation

Harness Flow

The system is designed as a multi-step academic workflow:

  1. Start from a structured source such as a PMID, an academic objective, or a figure revision request.
  2. Reason about the scientific concept, communication goal, and target figure type.
  3. Organize the request into a structured plan using academic constraints and journal conventions.
  4. Generate the figure through the provider layer.
  5. Evaluate the result against academic-quality criteria.
  6. Iterate until the output is publication-grade.
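
The six steps above can be sketched as a small control loop. This is a minimal sketch under stated assumptions: run_harness, the Evaluation record, the 4.5 threshold, and the stub callables are all hypothetical stand-ins for the real MCP tools, which this example does not call.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    score: float            # overall score on a 0-5 scale (assumed)
    suggestions: list[str]  # revision hints returned by the evaluator


def run_harness(plan, generate, evaluate, revise, threshold=4.5, max_rounds=3):
    """Plan once, then generate, evaluate, and revise until the score clears the threshold."""
    payload = plan()
    history = []
    for _ in range(max_rounds):
        image = generate(payload)
        result = evaluate(image)
        history.append(result.score)
        if result.score >= threshold:
            break
        payload = revise(payload, result.suggestions)
    return image, history


# Stub callables standing in for plan_figure, generate_figure, and evaluate_figure.
scores = iter([3.8, 4.2, 4.7])
image, history = run_harness(
    plan=lambda: {"prompt": "draft"},
    generate=lambda payload: f"render of {payload['prompt']}",
    evaluate=lambda img: Evaluation(next(scores), ["tighten labels"]),
    revise=lambda payload, tips: {"prompt": payload["prompt"] + " + " + tips[0]},
)
print(history)  # [3.8, 4.2, 4.7]
```

The max_rounds cap is a design choice in this sketch: it bounds provider cost when a figure never reaches the threshold.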

MCP Tools

| Tool | Input | Output |
| --- | --- | --- |
| plan_figure | pmid, figure_type?, style_preset? | Structured plan with route, constraints, and next-step arguments |
| generate_figure | planned_payload or compatibility pmid bridge | Generated asset from a generic render request (now respects render_route=composite_figure) |
| edit_figure | image_path, feedback | Refined image via Gemini edit API |
| evaluate_figure | image_path, figure_type? | 8-domain scorecard with suggestions |
| batch_generate | pmids: list, figure_type? | Batch generation results |
| composite_figure | panels, labels, title, caption?, citation? | Publication-ready multi-panel montage with labels and DPI metadata |
| list_manifests | limit? | Recent manifest metadata for replay or retargeting |
| replay_manifest | manifest_id, output_dir? | Re-run a saved manifest using the original prompt and provider |
| retarget_journal | manifest_id, target_journal, output_dir? | Regenerate with a new journal profile plus before/after diff |
| verify_figure | image_path, expected_labels?, figure_type?, language? | Standalone quality-gate verdict with domain scores and exact-label verification |
| multi_turn_edit | image_path, instructions[], max_turns? | Iterative edit session for progressive refinement without restarting from scratch |

generate_figure is now internally plan-first. If you pass a PMID directly, the server first builds a planning payload and then renders from that payload. The canonical contract remains plan_figure followed by generate_figure(planned_payload=...).
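
The plan-then-generate contract can be illustrated as two MCP tools/call requests. The JSON-RPC envelope follows the MCP specification. The argument values and the placeholder payload are assumptions for illustration, since the real planned_payload is whatever plan_figure returns.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request in JSON-RPC 2.0 form."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: ask for a plan. The figure_type value is an assumed example.
plan_request = tool_call(1, "plan_figure", {"pmid": "41657234", "figure_type": "flowchart"})

# Step 2: pass the plan back. This dict is a placeholder; a real client would
# forward the structured plan returned by plan_figure unchanged.
planned_payload = {"placeholder": "result of plan_figure"}
generate_request = tool_call(2, "generate_figure", {"planned_payload": planned_payload})

print(json.dumps(generate_request, indent=2))
```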

Reproducibility & Retargeting

  • Every successful generation now writes a manifest to .academic-figures/manifests (override with AFM_MANIFEST_DIR).
  • list_manifests + replay_manifest let you rerun saved prompts without rebuilding the plan.
  • retarget_journal injects a new journal profile, regenerates, and returns a before/after diff of the profile metadata.
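
For orientation, here is a minimal sketch of how the manifest directory could be resolved and scanned. Only the default path and the AFM_MANIFEST_DIR override come from this document. The one-JSON-file-per-manifest layout and the manifest_dir and recent_manifests helpers are assumptions, and the real on-disk format may differ.

```python
import json
import os
import tempfile
from pathlib import Path


def manifest_dir() -> Path:
    """Return the manifest directory, honoring the AFM_MANIFEST_DIR override."""
    return Path(os.environ.get("AFM_MANIFEST_DIR", ".academic-figures/manifests"))


def recent_manifests(directory: Path, limit: int = 5) -> list[dict]:
    """Load up to `limit` manifests, newest first (assumes one JSON file each)."""
    files = sorted(directory.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True)
    return [json.loads(path.read_text(encoding="utf-8")) for path in files[:limit]]


# Demo against a throwaway directory instead of a real manifest store.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for index, name in enumerate(["old", "new"]):
        path = root / f"{name}.json"
        path.write_text(json.dumps({"manifest_id": name}), encoding="utf-8")
        stamp = 1_000_000_000 + index
        os.utime(path, (stamp, stamp))  # force distinct modification times
    newest_first = [m["manifest_id"] for m in recent_manifests(root)]

print(newest_first)  # ['new', 'old']
```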

Multi-Panel & Composite Assembly

  • planned_payload now accepts render_route=composite_figure with a panels list to assemble montage figures.
  • The built-in composite_figure tool remains available for direct multi-panel assembly with labels, caption, and DPI metadata.

CJK Text Fidelity & Self-Review

  • plan_figure now accepts expected_labels so exact text strings can be propagated into prompt construction and later verification.
  • Text-heavy CJK requests can be escalated toward higher-fidelity model selection and SVG-oriented routes automatically.
  • generate_figure can return a quality_gate block with domain scores, missing labels, and a pass/fail verdict.
  • verify_figure lets you run the same quality gate independently against any generated image.
  • multi_turn_edit keeps an edit session alive across multiple instructions, which is useful when fixing garbled labels or layout issues iteratively.
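
The exact-label check behind such a gate can be approximated in a few lines. This is a sketch, not the server's implementation: the quality_gate function, the domain names, the 4.0 cut-off, and the verdict keys are hypothetical, and the detected text would in practice come from a vision or OCR step that is not shown.

```python
def quality_gate(expected_labels, detected_text, domain_scores, min_score=4.0):
    """Return a verdict: every expected label must appear verbatim and
    every domain score must reach `min_score`."""
    missing = [label for label in expected_labels if label not in detected_text]
    low = sorted(name for name, score in domain_scores.items() if score < min_score)
    return {
        "passed": not missing and not low,
        "missing_labels": missing,
        "low_domains": low,
    }


verdict = quality_gate(
    expected_labels=["腦中風", "血栓移除術"],
    detected_text="腦中風 flowchart",
    domain_scores={"text_accuracy": 3.5, "layout": 4.6},
)
print(verdict)
```

Verbatim substring matching is deliberately strict here: a CJK label with one wrong glyph counts as missing, which is the failure mode this section is concerned with.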

Product Positioning

The core differentiator is not simply "connected to an image model API".

  • It is a complete academic-figure agent harness.
  • It helps agents reason through academic concepts before they generate.
  • It is exposed through MCP so multiple AI hosts can drive the same workflow.
  • It is packaged as a VSX experience so non-engineers can adopt it quickly.
  • Provider integrations such as Google Gemini or OpenRouter are replaceable infrastructure behind that harness.

Competitive Landscape

The current GitHub- and web-based benchmark is documented in docs/competitive-landscape.md.

That document separates:

  • direct competitors
  • adjacent reusable wheels
  • strengths worth absorbing
  • core product differences we should not copy away

Project Documents

Key repo-level documents:

Generated Visuals & QA

The following three visuals were generated by this repository's own MCP workflow and then reviewed through the built-in evaluate_figure path.

This section is intentionally self-hosting: each image below was generated from the payload files under .academic-figures/jobs, and each QA report was produced by this same repository through scripts/start_afm_local.py run evaluate against the generated output image. These are not manually drawn marketing assets or hand-written review notes.

Introduction Visual QA

QA summary:

  • Score: 5.0/5.0
  • Strengths: clear story from academic input to MCP workflow hub to publication-grade outputs
  • Critical issues: none identified
  • Full report: repo-intro-hero-eval.json

Architecture Visual

Academic Figures MCP architecture v2

QA summary:

  • Score: 5/5
  • Strengths: explicit DDD layering, clear Presentation -> Application -> Domain <- Infrastructure direction, and repo-specific integration edges
  • Critical issues: none identified
  • Full report: repo-architecture-v2-eval.json

Workflow Visual

Academic Figures MCP workflow flowchart

QA summary:

  • Score: 4.6/5
  • Strengths: one clean main path, strong readability, high visual polish, and the duplicate PAYLOAD error is removed in v2
  • Critical issues: no formal citation or source attribution is shown inside the figure
  • Full report: repo-workflow-flowchart-eval.json

Quick Install

git clone https://github.com/u9401066/academic-figures-mcp.git
cd academic-figures-mcp
uv sync
# then copy env.example to env and fill one provider key,
# or provide GOOGLE_API_KEY / OPENROUTER_API_KEY through your shell or MCP host config

Local Env File

For local runs and smoke tests, copy env.example to env and fill exactly one provider section.

Supported formats:

  • KEY=value
  • export KEY=value
  • set KEY=value
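
A parser that accepts these three line formats might look like the following. It is a minimal sketch, not the project's parser: parse_env_line is a hypothetical name, and the handling of comments and surrounding quotes is an assumption.

```python
def parse_env_line(line: str):
    """Parse `KEY=value`, `export KEY=value`, or `set KEY=value`.

    Returns a (key, value) tuple, or None for blank lines, comments,
    and lines without an `=`.
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    for prefix in ("export ", "set "):
        if text.startswith(prefix):
            text = text[len(prefix):].lstrip()
            break
    key, sep, value = text.partition("=")
    if not sep or not key.strip():
        return None
    # Assumed behavior: drop one layer of surrounding quotes.
    return key.strip(), value.strip().strip('"').strip("'")


for sample in (
    "AFM_IMAGE_PROVIDER=google",
    "export AFM_IMAGE_PROVIDER=google",
    "set AFM_IMAGE_PROVIDER=google",
):
    print(parse_env_line(sample))
```

All three samples yield the same pair, which is what allows one env file to serve several shells.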

Provider examples:

  • AFM_IMAGE_PROVIDER=google with GOOGLE_API_KEY
  • AFM_IMAGE_PROVIDER=openrouter with OPENROUTER_API_KEY
  • AFM_IMAGE_PROVIDER=ollama with OLLAMA_BASE_URL and OLLAMA_MODEL
  • AFM_MANIFEST_DIR=.academic-figures/manifests to relocate persisted generation manifests

Smoke Test

You can run a sanitized end-to-end smoke test with:

uv run python scripts/env_smoke_test.py env

The script only prints variable presence and a compact result summary. It never prints API key values.
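
The presence-only reporting described above can be sketched as follows. The presence_report helper is hypothetical and is not taken from scripts/env_smoke_test.py. It only shows the idea of reporting that a variable is set without echoing its value.

```python
def presence_report(env: dict, names: list[str]) -> dict:
    """Map each variable name to 'set' or 'missing' without exposing values."""
    return {name: "set" if env.get(name) else "missing" for name in names}


report = presence_report(
    {"GOOGLE_API_KEY": "not-a-real-key"},
    ["GOOGLE_API_KEY", "OPENROUTER_API_KEY"],
)
print(report)  # {'GOOGLE_API_KEY': 'set', 'OPENROUTER_API_KEY': 'missing'}
```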

Usage

VS Code Copilot

Recommended package-mode install for macOS, Linux, and Windows users who do not want a local checkout:

{
  "servers": {
    "academicFigures": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "--from",
        "academic-figures-mcp",
        "afm-server"
      ],
      "env": {
        "AFM_IMAGE_PROVIDER": "google",
        "GOOGLE_API_KEY": "${input:googleApiKey}"
      }
    }
  }
}

For local repository development, add to your Copilot MCP settings (.vscode/mcp.json):

{
  "servers": {
    "academicFigures": {
      "type": "stdio",
      "envFile": "${workspaceFolder}/env",
      "command": "uv",
      "args": [
        "run",
        "--project",
        "${workspaceFolder}",
        "python",
        "-m",
        "src.presentation.server"
      ]
    }
  }
}

This launch shape is shell-neutral and works across Windows, macOS, and Linux as long as uv is installed. It also keeps the project root explicit through --project ${workspaceFolder} while loading secrets from the repo-root env file via envFile.

Manual Local Startup

Cross-platform launcher:

uv run python scripts/start_afm_local.py server

Run the first figure directly through afm-run:

uv run python scripts/start_afm_local.py run generate --pmid 41657234 --language zh-TW --output-size 1024x1536

This direct --pmid path is a compatibility bridge. It now performs the planning step internally before rendering.

Inject a journal profile explicitly when you want the planner and renderer to enforce a house style:

uv run python scripts/start_afm_local.py run plan --pmid 41657234 --target-journal Nature
uv run python scripts/start_afm_local.py run generate --pmid 41657234 --target-journal JAMA

Run generic asset generation through the same public tool using a JSON payload file:

uv run python scripts/start_afm_local.py run generate --payload-file .academic-figures/jobs/icon-request.json --output-dir .academic-figures/outputs

The same wrapper also supports direct planning and evaluation:

uv run python scripts/start_afm_local.py run plan --pmid 41657234
uv run python scripts/start_afm_local.py run evaluate --image-path .academic-figures/outputs/your-file.png

For exact-label generation and post-generation QA on text-heavy figures:

uv run python scripts/start_afm_local.py run plan --pmid 41657234 --language zh-TW --expected-label "腦中風" --expected-label "血栓移除術"
uv run python scripts/start_afm_local.py run verify --image-path .academic-figures/outputs/your-file.png --language zh-TW --expected-label "腦中風"

Windows PowerShell shortcut:

powershell -NoProfile -ExecutionPolicy Bypass -File scripts/start_afm_local.ps1 server

Then just ask:

  • "Generate a flowchart for PMID 41657234"
  • "Help me plan the right academic figure structure for PMID 41657234 before generating it"
  • "幫我做 PMID 41657234 的 consensus flowchart" (Traditional Chinese for "Make me a consensus flowchart for PMID 41657234")
  • "What figure type should I use for PMID 34567890?"
  • "Help me turn this academic concept into a publication-grade figure plan"

The VS Code extension can now run plan, generate, transform, and evaluate commands directly through afm-run instead of copying prompts into chat.

Claude Code / Cursor / Any MCP Host

Any MCP-compatible agent can use these tools directly.

Recommended package-mode shape for Claude Desktop or any MCP host that accepts command plus args:

{
  "mcpServers": {
    "academic-figures": {
      "command": "uvx",
      "args": [
        "--from",
        "academic-figures-mcp",
        "afm-server"
      ],
      "env": {
        "AFM_IMAGE_PROVIDER": "google",
        "GOOGLE_API_KEY": "your_google_api_key"
      }
    }
  }
}

If your MCP host prefers a checked-out repository instead of uvx, keep the repo path absolute and use the existing uv run --project /absolute/path/to/academic-figures-mcp python -m src.presentation.server form.

For local development with the newer MCP SDK transport options, the server defaults to stdio, and can also be started with MCP_TRANSPORT=streamable-http for HTTP-based inspection workflows.
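
Transport selection from the environment could be handled along these lines. The variable name and the two transport values come from the paragraph above. The resolve_transport helper and its validation behavior are assumptions, and the server's own handling may differ.

```python
import os

SUPPORTED_TRANSPORTS = ("stdio", "streamable-http")


def resolve_transport(env=None) -> str:
    """Pick the transport from MCP_TRANSPORT, defaulting to stdio."""
    env = os.environ if env is None else env
    value = env.get("MCP_TRANSPORT", "stdio").strip().lower() or "stdio"
    if value not in SUPPORTED_TRANSPORTS:
        raise ValueError(f"Unsupported MCP_TRANSPORT: {value!r}")
    return value


print(resolve_transport({}))                                     # stdio
print(resolve_transport({"MCP_TRANSPORT": "streamable-http"}))   # streamable-http
```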

Cross-Platform Notes

  • Package mode is the most portable install path: uvx --from academic-figures-mcp afm-server works without shell-specific quoting on macOS, Linux, and Windows.
  • Local checkout mode is also cross-platform: scripts/start_afm_local.py is the cross-platform launcher, and scripts/start_afm_local.ps1 is a shortcut for Windows PowerShell.
  • Environment parsing already accepts KEY=value, export KEY=value, and set KEY=value, so the same env profile can be reused across Bash, Zsh, Fish-style exports, and PowerShell/CMD-oriented files.
  • The VS Code extension falls back to package mode through uvx when no local source tree is detected, which is the safest route for non-developer users on all three platforms.

Architecture

┌──────────────────────┐
│  Your AI Agent       │     VS Code Copilot, Claude Code,
│  (Copilot, Claude,   │     OpenClaw, Hermes, etc.
│   any MCP host)      │
└──────────┬───────────┘
           │  MCP stdio / streamable-http
           ▼
┌───────────────────────────┐
│  Academic Figures MCP     │
│  ┌────────────────────┐   │
│  │ plan_figure        │   │
│  │ generate_figure    │   │
│  │ edit_figure        │   │  5 of the 11 tools shown
│  │ evaluate_figure    │   │
│  │ batch_generate     │   │
│  └────────┬───────────┘   │
│           │               │
│  ┌────────▼─────────────┐ │
│  │ Core Orchestrator    │ │
│  │                      │ │
│  │ 1. fetch_paper()     │ │  → PubMed E-utilities
│  │ 2. classify_type()   │ │  → Keyword + structured planning heuristics
│  │ 3. build_payload()   │ │  → reusable render request / prompt pack
│  │ 4. generate_image()  │ │  → single public renderer (Google / OpenRouter / Ollama SVG)
│  │ 5. evaluate()        │ │  → 8-domain vision scoring or local critique
│  │ 6. iterate()         │ │  → harness-guided revision loop
│  └──────────────────────┘ │
└───────────────────────────┘

Figure Types & Auto-Classification

The MCP auto-classifies papers into optimal figure types:

| Type | Best For | Example Papers |
| --- | --- | --- |
| Flowchart | Consensus, guidelines | "SSC 2026 Sepsis Guidelines" |
| Mechanism | Drug mechanisms, pathways | "Sugammadex encapsulation mechanism" |
| Comparison | RCTs, meta-analyses | "Crystalloid vs Colloid fluid resuscitation" |
| Infographic | Reviews, overviews | "Perioperative fasting consensus" |
| Timeline | Historical, longitudinal | "Evolution of general anesthesia" |
| Anatomical | Surgical techniques, blocks | "Regional anesthesia approaches" |
| Data Visual | PK/PD, dose-response | "Propofol PK modeling" |
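
A keyword-based first pass at this classification might look like the sketch below. The keyword table and the infographic fallback are invented for illustration. The server's real heuristics combine keywords with structured planning and are not reproduced here. Note that the function name mirrors the classify_type() step in the architecture diagram but is an independent toy.

```python
# Hypothetical keyword table; order matters because the first match wins.
KEYWORDS = {
    "flowchart": ("guideline", "consensus", "algorithm"),
    "mechanism": ("mechanism", "pathway"),
    "comparison": (" vs ", "versus", "randomized", "meta-analysis"),
    "timeline": ("evolution", "history", "longitudinal"),
    "anatomical": ("nerve block", "surgical", "anatom"),
    "data_visual": ("pharmacokinetic", " pk ", "dose-response"),
}


def classify_type(title: str, default: str = "infographic") -> str:
    """Return the first figure type whose keywords appear in the title."""
    padded = f" {title.lower()} "  # padding lets ' pk ' match at either end
    for figure_type, words in KEYWORDS.items():
        if any(word in padded for word in words):
            return figure_type
    return default


for title in (
    "SSC 2026 Sepsis Guidelines",
    "Sugammadex encapsulation mechanism",
    "Crystalloid vs Colloid fluid resuscitation",
    "Propofol PK modeling",
):
    print(title, "->", classify_type(title))
```

A pure keyword pass is brittle: a title containing "consensus" would land on flowchart here even when an infographic is the better fit, which is one reason to treat it only as a starting point for the planning step.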

Knowledge Base (Included)

This repo ships with 9 curated reference assets:

| File | Content |
| --- | --- |
| prompt-templates.md | 7-block prompt templates for 9 figure types |
| anatomy-color-standards.md | Medical illustration color coding reference |
| journal-figure-standards.md | Nature/Lancet formatting requirements |
| journal-profiles.yaml | Machine-readable journal registry for automatic prompt injection |
| gemini-tips.md | Gemini 3.1 Flash prompt engineering best practices |
| model-benchmark.md | NB2 vs GPT Image 1.5 comparison data |
| code-rendering.md | matplotlib/Python figure generation reference |
| scientific-figures-guide.md | Scientific figure design principles |
| ai-medical-illustration-evaluation.md | 8-domain evaluation rubric |

Planned Rendering Ecosystem

This project is no longer framed as a single-route Gemini prompt server. The current design direction is a multi-route figure system:

  • Matplotlib + SciencePlots for deterministic, publication-style charts
  • D2 + Mermaid for structured diagrams and editable text-first figure specs
  • FigureFirst + CairoSVG for precise multi-panel assembly and export
  • Excalidraw or tldraw as future interactive vector-editing layers inside the VS Code extension
  • Kroki as an optional self-hosted render gateway for compatibility with multiple DSL engines

Development

uv sync
uv run python -m src.presentation.server

The planned Gemini image integration follows the current Google Gen AI SDK pattern:

from google import genai
from google.genai import types

License

Apache License 2.0. See LICENSE.

Composite Engine (Multi-Panel Layout)

The composite module solves Gemini's weakness with multi-panel figures. Instead of generating a single image with all panels (which often fails on spatial layout, numbering, and mixed styles), it:

  1. Generates each panel independently with focused prompts
  2. Composites the panels using Pillow with precise pixel-level layout
  3. Overlays text programmatically, so labels are rendered exactly as specified, with no misspellings
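
The pixel-level layout in step 2 comes down to computing one rectangle per panel before anything is pasted. The sketch below shows that arithmetic only, without Pillow. The panel_boxes helper, the 40 px margin, and the 160 px footer band are assumptions for illustration. The 2400 × 1600 canvas is the one listed under Layout Specs.

```python
def panel_boxes(canvas_w, canvas_h, n_panels, margin=40, footer_h=160):
    """Split the area above the footer into equal side-by-side panel boxes.

    Returns (left, top, right, bottom) pixel tuples, one per panel.
    """
    usable_w = canvas_w - margin * (n_panels + 1)
    panel_w = usable_w // n_panels
    top, bottom = margin, canvas_h - footer_h - margin
    boxes = []
    for index in range(n_panels):
        left = margin + index * (panel_w + margin)
        boxes.append((left, top, left + panel_w, bottom))
    return boxes


print(panel_boxes(2400, 1600, 2))
# [(40, 40, 1180, 1400), (1220, 40, 2360, 1400)]
```

Each tuple is in the (left, top, right, bottom) order that Pillow uses for boxes, so the result could feed a paste or crop call directly.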

Composite Usage

from src.infrastructure.composite import CompositeFigure, PanelSpec
from src.server import generate_figure

# Step 1: Generate panels separately
left = generate_figure(pmid="41657234", figure_type="anatomy")
right = generate_figure(pmid="41657234", figure_type="ultrasound")

# Step 2: Composite
comp = CompositeFigure()
comp.add_panel(
    PanelSpec(prompt="...", label="A", panel_type="anatomy"),
    left["image_path"]
)
comp.add_panel(
    PanelSpec(prompt="...", label="B", panel_type="ultrasound"),
    right["image_path"]
)
comp.set_title("Interscalene Brachial Plexus Block")
comp.set_citation("PMID 41657234 · Regional Anesthesia")
comp.compose("interscalene_block.pdf")

MCP Tool: composite_figure

composite_figure(
    panels=[["left.png", "anatomy"], ["right.png", "ultrasound"]],
    labels=["A", "B"],
    title="..."
)

Layout Specs

| Property | Value |
| --- | --- |
| Canvas | 2400 × 1600 px (8" × 5.33" @ 300 DPI) |
| Format | Double column (~183mm width, Nature standard) |
| Labels | A/B/C with pill-shaped background |
| Footer | Caption + PMIDs + citation |
| Divider | Vertical line between panels |
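
The canvas figures can be checked with a short unit conversion. The helper names below are illustrative and only standard arithmetic is involved.

```python
MM_PER_INCH = 25.4


def px_to_inches(pixels: int, dpi: int) -> float:
    """Physical size in inches of a pixel length at a given resolution."""
    return pixels / dpi


def mm_to_px(millimetres: float, dpi: int) -> int:
    """Pixel length needed to cover a physical width at a given resolution."""
    return round(millimetres / MM_PER_INCH * dpi)


print(px_to_inches(2400, 300))            # 8.0
print(round(px_to_inches(1600, 300), 2))  # 5.33
print(mm_to_px(183, 300))                 # 2161
```

Read together, the numbers suggest that a 2400 px wide canvas placed at a 183 mm column width would print at roughly 333 DPI, slightly above the 300 DPI stated for the canvas.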
