Claude Code Skill for publication-quality academic diagrams and plots — built on the PaperBanana framework (arXiv:2601.23265)

PaperBanana CC

A Claude Code Skill/Plugin for generating publication-quality academic diagrams and plots.

Built entirely upon PaperBanana — this project re-implements the full pipeline from the original research as a Claude Code-native skill, where Claude Code itself acts as the orchestrating VLM (Vision-Language Model) for every stage: paper analysis, reference retrieval, structured planning, venue-specific styling, and iterative vision-based critique.

Based On

This project is a Claude Code adaptation of the following works:

Paper: PaperBanana: A Multi-Modal Academic Figure Generation Framework
Original Repos: dwzhu-pku/PaperBanana (research prototype) · llmsresearch/paperbanana (production package)

All prompts, reference datasets, style guides, and pipeline design are derived from the above. PaperBanana CC restructures them into a Claude Code Skill so that Claude Code subscribers can generate academic figures through natural conversation — no separate VLM server or complex setup required.

Why Claude Code?

Claude Code already includes a state-of-the-art VLM with tool use, vision, and long-context capabilities. Instead of running a separate inference server, PaperBanana CC leverages Claude Code as both the reasoning engine and the orchestrator:

  • Zero infrastructure — no GPU server, no model downloads, no Docker
  • Full context across phases — the Critic can compare the generated image against the original methodology, not just the final prompt
  • Interactive refinement — the user confirms direction at each stage via natural conversation
  • Plugin distribution — install once, use in any project

Image Generation Options

| Method | Requirement | Best For |
| --- | --- | --- |
| OpenAI API (gpt-image-1) | OPENAI_API_KEY | Highest quality diagrams |
| Gemini API (Imagen 3) | GOOGLE_API_KEY | Fast iteration, flexible ratios |
| Manual generation | Any image AI subscription | No API key needed; use ChatGPT, Gemini, or any web-based image generator |

The manual generation mode is designed for users who subscribe to Google AI Pro, ChatGPT Plus, or a similar service but don't want to set up API keys. PaperBanana CC generates the optimized prompt, you paste it into your preferred web UI, and then you hand the resulting image back to Claude Code for critique.

Pipeline

[Phase 0] Input Enrichment
  Methodology text + caption → 7-axis structuring + 6-spec caption enhancement
  ← User confirms direction

[Phase 1] Reference Retrieval
  538 reference examples → 2-axis semantic matching → top-10 with images
  ← User selects references

[Phase 2] Plan → Style → Generate
  Venue (NeurIPS/ICML/ACL/IEEE) + generation method selection
  → 7-item structured plan → venue-specific styling → image generation
  Prompt always displayed and saved

[Critic Loop] Iterative Refinement (default 3 rounds)
  Claude Code vision critique → revision or completion
  ← User can provide additional feedback at any point

All phases except image generation are performed directly by Claude Code. Prompts and style guides from the original PaperBanana research are used to instruct Claude Code at each stage.
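
The control flow of the Critic Loop can be sketched in a few lines of Python. This is a minimal illustration, not the skill's implementation: `refine`, `Critique`, and the `generate` and `critique` callables are hypothetical names, and reading "3 rounds" as "up to three critiques, each of which may trigger one regeneration" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Critique:
    accepted: bool  # True when the critic finds nothing left to fix
    feedback: str   # revision notes appended to the next prompt


def refine(
    methodology: str,
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], Critique],
    max_rounds: int = 3,
) -> tuple[str, list[str]]:
    """Generate an image, then revise it for at most `max_rounds` critic rounds.

    Returns the final image reference and every prompt that was used,
    mirroring the "prompt always displayed and saved" step.
    """
    prompts = [prompt]
    image = generate(prompt)
    for _ in range(max_rounds):
        # The critic sees the image and the original methodology text.
        verdict = critique(image, methodology)
        if verdict.accepted:
            break
        prompt = f"{prompt}\nRevision: {verdict.feedback}"
        prompts.append(prompt)
        image = generate(prompt)
    return image, prompts
```

In manual generation mode, `generate` would stand for the step where the user pastes the prompt into a web UI and supplies the resulting image.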

Installation

Requirements

  • Claude Code subscription (Claude Pro / Max / Team)
  • Python 3.12+
  • uv package manager
  • Optional: OpenAI API key and/or Google API key (not required for manual generation)
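
If you want to confirm the two local requirements before running the setup steps, a small check along these lines may help. It is a sketch using only the standard library; the function name `missing_prerequisites` is hypothetical and not part of the package.

```python
import shutil
import sys


def missing_prerequisites(python_version: tuple[int, int], has_uv: bool) -> list[str]:
    """List unmet local requirements (Python 3.12+ and uv); empty means OK."""
    problems = []
    if python_version < (3, 12):
        found = f"{python_version[0]}.{python_version[1]}"
        problems.append(f"Python 3.12+ required, found {found}")
    if not has_uv:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites(
        sys.version_info[:2],
        shutil.which("uv") is not None,  # looks for the uv executable on PATH
    )
    print("\n".join(issues) or "Python and uv look fine")
```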

Setup

# 1. Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Clone & install dependencies
git clone https://github.com/eunsan-jo/paperbanana-cc.git
cd paperbanana-cc && uv sync

# 3. (Optional) Set API keys — not required for manual generation mode
echo "OPENAI_API_KEY=sk-..." >> .env
echo "GOOGLE_API_KEY=AI..." >> .env

# 4. Open your project and launch Claude Code with the plugin
cd /path/to/your-project
claude --plugin-dir /path/to/paperbanana-cc

Once loaded, the skill is available as /paperbanana-cc:paperbanana. Reference data (~300MB) is downloaded automatically on first run.

Note: This plugin includes Python dependencies for image generation and plot execution, so git clone + uv sync is required. claude --plugin-dir loads the skill definitions into your Claude Code session.

Usage

Since Claude Code runs inside your project, PaperBanana CC can read your codebase directly — no copy-pasting needed. Just point it at the relevant source files:

> /paperbanana

Analyze src/model/transformer.py and generate an architecture diagram
showing the data flow from input embedding through the attention layers.

Caption: "Overview of the proposed multi-modal transformer architecture."

Claude Code reads the code, understands the architecture, enriches the context, retrieves similar reference figures, and generates a publication-ready diagram — all within your project's context.

More examples

# Generate from a method description
/paperbanana A 4-layer CNN with batch normalization for image classification

# Create a plot from experimental results
/paperbanana Plot the accuracy comparison from results/ablation.csv

# Analyze multiple files for a system diagram
/paperbanana Read src/pipeline/ and create a data flow diagram of the entire inference pipeline

Supported Venues

| Venue | Diagram Style | Plot Style |
| --- | --- | --- |
| NeurIPS | Clean, minimal, blue-tone | Publication DPI, colorblind-safe |
| ICML | Similar to NeurIPS | Same conventions |
| ACL | NLP-focused conventions | Same conventions |
| IEEE | Two-column optimized | IEEE figure standards |

License

Apache-2.0

Citation

If you use PaperBanana CC in your research, please cite the original paper:

@article{zhu2025paperbanana,
  title={PaperBanana: A Multi-Modal Academic Figure Generation Framework},
  author={Zhu, Dewei and others},
  journal={arXiv preprint arXiv:2601.23265},
  year={2026}
}

Acknowledgments

This project is built entirely upon the PaperBanana framework by dwzhu-pku and llmsresearch. All core prompts, reference datasets, evaluation criteria, and style guides originate from their work. PaperBanana CC restructures these into a Claude Code Skill to make the pipeline accessible to Claude Code subscribers without additional infrastructure.
