Wenqiao (文桥): Markdown as the single source of truth for academic writing
文桥 · Wenqiao
Write once, render anywhere.
One manuscript, many outputs.
Academic Writing Intermediate Format & Multi-target Conversion Tool
Academic writing suffers not from deep thought, but from format entanglement.
Lost in the forest of .tex brackets, wandering in the maze of \begin{} \end{},
you yearn for Markdown's simplicity, yet citations, cross-references, and figure labels
exile you to the wilderness.
Wenqiao is a bridge:
One end connects your thoughts (Markdown's purity),
the other end connects the world's rules (LaTeX's rigor, HTML's openness, rich text's friendliness).
You simply stand at the center, writing a manuscript called .mid.md.
Wenqiao defines a Markdown-based intermediate format (.mid.md) for academic writing.
Write your paper once in plain Markdown with metadata encoded in HTML comments, then
convert to LaTeX, rich Markdown, or self-contained HTML — all from a single
source file.
graph LR
A["paper.mid.md"] --> B["wenqiao"]
B --> C["paper.tex"]
B --> D["paper.html"]
B --> E["paper.md"]
style A fill:#f9f,stroke:#333
style C fill:#ffa,stroke:#333
style D fill:#aff,stroke:#333
style E fill:#afa,stroke:#333
Features
- Multi-target output: LaTeX (.tex), rich Markdown (.md), and HTML with MathJax
- 8 citation commands: cite, citep, citet, citeauthor, citeyear, textcite, parencite, autocite with BibTeX file parsing; also supports bare [cite:key] shortcuts
- Math: inline $...$ and display $$...$$ with labels and equation environments
- Cross-references: labels and refs that become \ref{} / <a href> / {#id} per target; also supports bare [ref:label] shortcuts
- Figures & tables: caption, label, width, placement via HTML comment directives; table captions support inline [text](ref:label) / [text](cite:key) markup
- Smart table layout: auto-scales wide tables with \scalebox, wraps long cells with \makecell[lt], uses per-column dynamic wrap threshold to prevent page overflow
- Environments: <!-- begin: algorithm --> / <!-- end: algorithm --> blocks
- Include TeX: <!-- include-tex: fragment.tex --> for external LaTeX fragments
- AI figure generation: optional pipeline with nanobanana-compatible runners
- 5-layer config: CLI > directives > config file > template > defaults
- i18n: zh (Chinese) and en locale support for figure/table labels
Architecture
flowchart TD
subgraph Input
SRC["paper.mid.md"]
end
subgraph "Parsing Pipeline"
P["Markdown Parser<br/><i>markdown-it-py + plugins</i>"]
C["Comment Processor<br/><i>4-phase directive extraction</i>"]
EAST["Enhanced AST (EAST)<br/><i>32 node types</i>"]
end
subgraph Renderers
LTX["LaTeX Renderer<br/><code>.tex</code>"]
MD["Markdown Renderer<br/><code>.md</code>"]
HTML["HTML Renderer<br/><code>.html</code> + MathJax"]
end
SRC --> P --> C --> EAST
EAST --> LTX
EAST --> MD
EAST --> HTML
Each renderer supports three output modes:
| Mode | LaTeX | Markdown | HTML |
|---|---|---|---|
| full | Preamble + \begin{document} + bibliography | YAML front matter + body + footnotes | <!DOCTYPE html> + CSS + MathJax CDN |
| body | Content inside \begin{document}...\end{document} | Body + footnotes (no front matter) | <body> content only |
| fragment | Bare content, headings degraded one level | Bare content | Bare content |
EAST Node Types (32 total)
Block nodes (16):
Document · Heading · Paragraph · Blockquote · List · ListItem · CodeBlock ·
MathBlock · Figure · Table · Environment · RawBlock · ThematicBreak ·
FootnoteDef · HardBreak · SoftBreak
Inline nodes (16):
Text · Strong · Emphasis · CodeInline · MathInline · Link · Image ·
Citation · CrossRef · FootnoteRef · FootnoteDef · SoftBreak · HardBreak ·
RawInline · Strikethrough · Superscript
All nodes extend a base Node class with children, metadata, and position fields.
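As a rough illustration of that shape, the sketch below models a base node carrying the three fields named above and walks a small tree. The class name SketchNode, the field types, and the (start_line, end_line) meaning of position are assumptions made for illustration; they are not taken from wenqiao's nodes.py.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for an EAST node (not wenqiao's real class)."""

    type: str
    children: list["SketchNode"] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    # Assumed to be (start_line, end_line); the real representation may differ.
    position: Optional[tuple[int, int]] = None


def walk(node: SketchNode) -> Iterator[SketchNode]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


doc = SketchNode("document", children=[
    SketchNode("heading", metadata={"label": "sec:intro"}, position=(1, 1)),
    SketchNode("paragraph", children=[SketchNode("text")], position=(3, 3)),
])
node_types = [n.type for n in walk(doc)]
print(node_types)
```

A uniform children list is what lets one generic walker serve every renderer and validator.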
Getting Started
Prerequisites
- Python 3.12+
- uv package manager (recommended)
Installation
From PyPI:
pip install wenqiao
# Or with uv
uv pip install wenqiao
# With AI figure generation support
pip install wenqiao[figures]
From source (for development):
git clone https://github.com/nerdneilsfield/wenqiao.git
cd wenqiao
uv sync
Quick Start
# Markdown → LaTeX (default)
wenqiao paper.mid.md -o paper.tex
# Explicit convert subcommand (same as above)
wenqiao convert paper.mid.md -o paper.tex
# Markdown → HTML with MathJax
wenqiao paper.mid.md -o paper.html -t html
# Markdown → Rich Markdown
wenqiao paper.mid.md -o paper.md -t markdown
# Validate citations, cross-references, and images
wenqiao validate paper.mid.md --bib refs.bib --strict
# Check formatting (exit 1 if unformatted)
wenqiao format paper.mid.md --check --diff
# Format with change statistics
wenqiao format paper.mid.md --stats
# Read from stdin, body-only mode
cat paper.mid.md | wenqiao - --mode body -o paper.tex
# Dump the Enhanced AST for debugging
wenqiao paper.mid.md --dump-east | jq .
Full CLI Reference
Wenqiao uses subcommands: convert (default), validate, generate, and format.
The convert subcommand is implicit — wenqiao file.mid.md is equivalent to
wenqiao convert file.mid.md.
Usage: wenqiao [OPTIONS] COMMAND [ARGS]...
Commands:
convert Convert academic Markdown to LaTeX/Markdown/HTML (default)
validate Validate citations, cross-references, and images
format Normalize academic Markdown formatting
convert (default):
Usage: wenqiao convert [OPTIONS] INPUT
Options:
-o, --output PATH Output file (stdout if omitted)
-t, --target [latex|markdown|html] Output format (default: latex)
--mode [full|body|fragment] Output scope (default: full)
--config PATH Config file (wenqiao.yaml)
--template PATH LaTeX template (.yaml)
--bib PATH Bibliography file (.bib)
--bibliography-mode MODE auto | standalone | external | none
--heading-id-style [attr|html] Heading anchor format
--locale [zh|en] Label language (default: zh)
--generate-figures Enable AI figure generation
--figures-config PATH Runner config (TOML)
--force-regenerate Re-generate existing images
--concurrency INTEGER Max concurrent figure generations (default: 4)
--strict Strict parsing mode
--verbose Verbose output
--dump-east Dump Enhanced AST as JSON
validate:
Usage: wenqiao validate [OPTIONS] INPUT
Options:
--bib PATH BibTeX file for citation validation
--config PATH External config file (wenqiao.yaml)
--template PATH LaTeX template file (.yaml)
--strict Exit 1 on any diagnostic warnings
--verbose Show all diagnostics
generate:
Usage: wenqiao generate [OPTIONS] INPUT
Generate AI figures in a .mid.md file concurrently.
Options:
--figures-config PATH TOML config for AI backend (API key, model, URL)
--model TEXT Override model name from config
--base-url TEXT Override API base URL
--api-key TEXT API key (also reads WENQIAO_API_KEY env var)
--type [openai] Backend type (default: openai)
--concurrency INTEGER Max concurrent generations, must be >= 1 (default: 4)
--start-id INTEGER Start figure index, 1-based inclusive (default: 1)
--end-id INTEGER End figure index, 1-based inclusive (default: last)
--force Re-generate even if output file exists
--no-writeback Skip writing <!-- ai-done: true --> to source file
format:
Usage: wenqiao format [OPTIONS] INPUT
Options:
-o, --output PATH Output path (default: overwrite input)
--check Check only, exit 1 if unformatted
--diff Show unified diff of changes
--no-rumdl Skip rumdl formatting step
--stats Show formatting statistics
Python API
Wenqiao exposes a clean Python API for programmatic use in build systems, Jupyter
notebooks, web services, and custom tooling. All public symbols are available
directly from the wenqiao package.
from wenqiao import convert, validate_text, format_text, parse_document
from wenqiao import ConvertResult, ConversionError, WenqiaoConfig, Diagnostic, Document
convert() — Convert Academic Markdown
The primary entry point. Converts Markdown source to LaTeX, HTML, or rich Markdown.
from wenqiao import convert
# Basic: string → LaTeX
result = convert("# Introduction\n\nHello world.\n")
print(result.text) # \documentclass[12pt,a4paper]{article} ...
print(result.config) # WenqiaoConfig(target='latex', mode='full', ...)
print(result.document) # Document(children=[Heading(...), Paragraph(...)])
print(result.diagnostics) # [] (empty if no warnings/errors)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| source | str \| Path | required | Markdown text string or file path |
| target | str | "latex" | Output format: "latex" / "markdown" / "html" |
| mode | str \| None | None | Output scope: "full" / "body" / "fragment" |
| locale | str \| None | None | Label language: "zh" / "en" |
| config | WenqiaoConfig \| dict \| None | None | Pre-built config object or overrides dict |
| template | Path \| None | None | Template YAML file path |
| bib | Path \| str \| dict \| None | None | .bib file path, raw text, or pre-parsed dict |
| strict | bool | False | Raise ConversionError on diagnostic errors |
Returns: ConvertResult — a frozen dataclass with .text, .diagnostics, .config, .document.
Output targets
# LaTeX (default)
latex_result = convert(source)
# Rich Markdown with BibTeX footnotes
md_result = convert(source, target="markdown", bib=Path("refs.bib"))
# Self-contained HTML with MathJax
html_result = convert(source, target="html")
Output modes
# Full document with preamble (default)
full = convert(source, mode="full")
# Body only — no \documentclass or \begin{document}
body = convert(source, mode="body")
# Fragment — bare content, headings degraded
fragment = convert(source, mode="fragment")
File path input
from pathlib import Path
# Read directly from a .mid.md file
result = convert(Path("paper.mid.md"), target="html")
Configuration
Three ways to pass configuration:
from wenqiao import convert, WenqiaoConfig
from pathlib import Path
# 1. Dict overrides — merged with defaults
result = convert(source, config={
    "documentclass": "report",
    "classoptions": ["11pt", "letterpaper"],
    "locale": "en",
})
# 2. Pre-built WenqiaoConfig — used as-is, no merging
cfg = WenqiaoConfig(mode="body", locale="en", documentclass="IEEEtran")
result = convert(source, config=cfg)
# 3. Template YAML file — merged at the template layer
result = convert(source, template=Path("templates/ieee.yaml"))
Bibliography
Three ways to provide bibliography data:
from pathlib import Path
# .bib file path
result = convert(md, target="markdown", bib=Path("refs.bib"))
# Raw .bib text content
bib_text = '@article{wang2024, author={Wang}, title={Test}, year={2024}}'
result = convert(md, target="markdown", bib=bib_text)
# Pre-parsed dict (cite_key → display string)
result = convert(md, target="markdown", bib={"wang2024": "Wang. Test. 2024."})
Strict mode
from wenqiao import convert, ConversionError
try:
    result = convert(source, strict=True)
except ConversionError as e:
    print(f"Conversion failed: {e}")
    for diag in e.diagnostics:
        print(f"  {diag}")
validate_text() — Validate Document
Runs the EAST walker and validators to check citations, cross-references, and more.
Returns a list of Diagnostic objects.
from pathlib import Path
from wenqiao import validate_text
# Basic validation
diagnostics = validate_text("See [ref](cite:missing_key).\n", bib={})
for d in diagnostics:
    print(d)  # [WARNING] <string> - Citation key 'missing_key' not found ...
# With .bib file
diagnostics = validate_text(Path("paper.mid.md"), bib=Path("refs.bib"))
# Strict mode: raises ConversionError on any errors
from wenqiao import ConversionError
try:
    validate_text(source, strict=True)
except ConversionError as e:
    print(f"Validation failed with {len(e.diagnostics)} issues")
format_text() — Normalize Formatting
Round-trip normalization: parse → render back as Markdown. Idempotent — formatting an already-formatted document returns the same text.
Built-in normalization applies the same math cleanup as wenqiao format:
- Unicode math operators to LaTeX (≤ → $\leq$, in math spans to \leq)
- Bare Greek letters to LaTeX (σ → $\sigma$, in math spans to \sigma)
- Unicode super/subscripts (m² → $m^2$, x₀ → $x_0$)
- Blank-line separation around display-math blocks delimited by standalone $$
from pathlib import Path
from wenqiao import format_text
formatted = format_text("# Hello\n\nWorld.\n")
print(formatted)
# Works with file paths too
formatted = format_text(Path("paper.mid.md"))
# Idempotent check
assert format_text(formatted) == formatted
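To make the idempotence property concrete, here is a minimal standalone sketch of one of the listed cleanups, the Unicode super/subscript rewrite. It is not wenqiao's implementation: the function name normalize_scripts is hypothetical, it only handles a single letter followed by digit scripts, and it does not detect text that is already inside a math span.

```python
import re

SUPER_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
SUB_DIGITS = "₀₁₂₃₄₅₆₇₈₉"
TO_ASCII = str.maketrans(SUPER_DIGITS + SUB_DIGITS, "0123456789" * 2)

# One ASCII letter followed by a run of superscript or subscript digits.
SCRIPT_RE = re.compile(rf"([A-Za-z])([{SUPER_DIGITS}]+|[{SUB_DIGITS}]+)")


def normalize_scripts(text: str) -> str:
    """Rewrite Unicode digit scripts as inline LaTeX math (hypothetical helper)."""

    def repl(match: re.Match) -> str:
        base, script = match.group(1), match.group(2)
        operator = "^" if script[0] in SUPER_DIGITS else "_"
        digits = script.translate(TO_ASCII)
        # Braces are only needed when the script has more than one digit.
        body = digits if len(digits) == 1 else "{" + digits + "}"
        return f"${base}{operator}{body}$"

    return SCRIPT_RE.sub(repl, text)


once = normalize_scripts("area in m² at x₀")
twice = normalize_scripts(once)
print(once)
```

Because the output contains no Unicode script characters, a second pass finds nothing to rewrite, which is the same fixed-point behaviour the idempotence check above relies on.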
parse_document() — Low-level EAST Access
Returns the raw EAST Document tree for custom processing. Runs parse +
comment directive processing but no rendering.
from wenqiao import parse_document, Document
from wenqiao.nodes import Heading, Paragraph
doc = parse_document("# Hello\n\nWorld.\n")
assert isinstance(doc, Document)
# Inspect the tree
for child in doc.children:
    print(f"{child.type}: {child}")
# Access document-level metadata from directives
doc = parse_document("""
<!-- title: My Paper -->
<!-- author: Author -->
# Introduction
""")
print(doc.metadata) # {'title': 'My Paper', 'author': 'Author'}
ConvertResult — Result Object
@dataclass(frozen=True)
class ConvertResult:
    text: str                      # Rendered output string
    diagnostics: list[Diagnostic]  # Warnings and errors
    config: WenqiaoConfig          # Resolved configuration
    document: Document             # EAST tree (for inspection)
ConversionError — Error Type
Raised when strict=True and diagnostics contain errors.
class ConversionError(Exception):
    diagnostics: list[Diagnostic]  # All diagnostic messages
Integration Examples
Jupyter Notebook
from pathlib import Path
from wenqiao import convert
from IPython.display import HTML
source = Path("paper.mid.md").read_text(encoding="utf-8")
result = convert(source, target="html", mode="body")
HTML(result.text)
Build system (Makefile / script)
#!/usr/bin/env python3
"""Batch convert all .mid.md files to LaTeX."""
from pathlib import Path
from wenqiao import convert
for md_file in Path("chapters/").glob("*.mid.md"):
    result = convert(md_file, template=Path("templates/ieee.yaml"))
    out = md_file.with_suffix(".tex")
    out.write_text(result.text, encoding="utf-8")
    print(f"{md_file} → {out} ({len(result.diagnostics)} diagnostics)")
Web service (FastAPI)
from fastapi import FastAPI, HTTPException
from wenqiao import convert, ConversionError
app = FastAPI()
@app.post("/convert")
def convert_markdown(source: str, target: str = "latex"):
    try:
        result = convert(source, target=target, strict=True)
        return {"text": result.text, "diagnostics": [str(d) for d in result.diagnostics]}
    except ConversionError as e:
        raise HTTPException(400, detail=[str(d) for d in e.diagnostics])
Custom EAST processing
from pathlib import Path
from wenqiao import parse_document
from wenqiao.nodes import Heading, Citation
doc = parse_document(Path("paper.mid.md"))
# Extract all headings
headings = [
    (child.level, child)
    for child in doc.children
    if isinstance(child, Heading)
]
# Collect all citation keys by walking the tree recursively
def collect_cites(node, keys=None):
    if keys is None:
        keys = set()
    if isinstance(node, Citation):
        keys.update(node.keys)
    for child in node.children:
        collect_cites(child, keys)
    return keys
all_keys = collect_cites(doc)
print(f"Found {len(all_keys)} unique citation keys")
Document Format
Wenqiao documents are standard Markdown files with the .mid.md extension. All academic
metadata is encoded in HTML comments (<!-- key: value -->), so the source is readable
in any Markdown viewer while carrying full LaTeX semantics.
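To show why this encoding stays easy to process, the following standalone sketch pulls single-line <!-- key: value --> directives out of a document with a regular expression. It is an illustration only: the pattern and the extract_directives helper are assumptions, not wenqiao's parser, and multi-line block values such as "abstract: |" are deliberately out of scope.

```python
import re

# Matches single-line directives such as <!-- label: fig:pipeline -->.
# The key is everything up to the first colon; the value may itself contain colons.
DIRECTIVE_RE = re.compile(r"<!--\s*([A-Za-z][\w-]*)\s*:\s*(.*?)\s*-->")


def extract_directives(text: str) -> dict[str, str]:
    """Return key/value pairs for all single-line comment directives (sketch)."""
    return {key: value for key, value in DIRECTIVE_RE.findall(text)}


sample = """<!-- title: My Paper Title -->
<!-- label: sec:intro -->
Plain prose stays untouched.
"""
directives = extract_directives(sample)
print(directives)
```

A Markdown viewer ignores the comments entirely, while a tool like the one sketched here can recover them without a custom syntax extension.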
Document-level Directives
These go at the top of your .mid.md file and control the LaTeX preamble:
<!-- documentclass: article -->
<!-- classoptions: [12pt, a4paper] -->
<!-- packages: [amsmath, graphicx, hyperref] -->
<!-- bibliography: refs.bib -->
<!-- bibstyle: IEEEtran -->
<!-- title: My Paper Title -->
<!-- author: Author Name -->
<!-- date: 2026 -->
<!-- abstract: |
This paper presents a novel method ...
-->
package-options passes options to individual packages:
<!-- packages: [amsmath, graphicx, geometry] -->
<!-- package-options: {geometry: "margin=1in,top=2cm"} -->
This generates \usepackage[margin=1in,top=2cm]{geometry}. The value is passed verbatim into \usepackage[...]{pkg}.
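The mapping from the two directives to preamble lines can be sketched in a few lines of Python. The helper name usepackage_lines is hypothetical and the real renderer may order or deduplicate packages differently; the sketch only mirrors the verbatim pass-through described above.

```python
def usepackage_lines(packages: list[str], options: dict[str, str]) -> list[str]:
    """Build \\usepackage lines, passing per-package options through verbatim (sketch)."""
    lines = []
    for pkg in packages:
        opt = options.get(pkg)
        if opt:
            lines.append(f"\\usepackage[{opt}]{{{pkg}}}")
        else:
            lines.append(f"\\usepackage{{{pkg}}}")
    return lines


preamble = usepackage_lines(
    ["amsmath", "graphicx", "geometry"],
    {"geometry": "margin=1in,top=2cm"},
)
print("\n".join(preamble))
```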
Generated LaTeX preamble
\documentclass[12pt,a4paper]{article}
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage{hyperref}
\bibliographystyle{IEEEtran}
\title{My Paper Title}
\author{Author Name}
\date{2026}
\begin{document}
\maketitle
\begin{abstract}
This paper presents a novel method ...
\end{abstract}
% ... body content ...
\bibliography{refs}
\end{document}
Citations
Use Markdown link syntax with a cite: prefix in the URL:
Prior work [Wang et al.](cite:wang2024) showed that ...
Classical methods [1](citep:fischler1981) have limitations.
As [Smith](citeauthor:smith2023) demonstrated ...
Bare citation shortcuts are also supported:
[cite:wang2024]
[cite:wang2024?cmd=citet]
[cite:a,b,c]
They behave like empty-display citations and map to the same citation commands.
| wenqiao Syntax | LaTeX Output | HTML Output |
|---|---|---|
| [text](cite:key) | \cite{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](citep:key) | \citep{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](citet:key) | \citet{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](citeauthor:key) | \citeauthor{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](citeyear:key) | \citeyear{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](textcite:key) | \textcite{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](parencite:key) | \parencite{key} | <sup><a href="#cite-key">[1]</a></sup> |
| [text](autocite:key) | \autocite{key} | <sup><a href="#cite-key">[1]</a></sup> |
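The LaTeX column of this mapping can be approximated with two regular expressions, one for the link form and one for the bare shortcut. This is a standalone sketch, not wenqiao's renderer: the function name cites_to_latex is hypothetical, and it assumes that a bare [cite:key] without ?cmd= maps to \cite and that comma-separated keys are passed through as one argument.

```python
import re

CITE_COMMANDS = {
    "cite", "citep", "citet", "citeauthor",
    "citeyear", "textcite", "parencite", "autocite",
}

# [display text](command:key)
LINK_RE = re.compile(r"\[([^\]]*)\]\((\w+):([^)\s]+)\)")
# [cite:key] or [cite:key?cmd=command]
BARE_RE = re.compile(r"\[cite:([^\]?]+)(?:\?cmd=(\w+))?\]")


def cites_to_latex(text: str) -> str:
    """Rewrite citation links and bare shortcuts as LaTeX commands (sketch)."""

    def link(match: re.Match) -> str:
        command, keys = match.group(2), match.group(3)
        if command not in CITE_COMMANDS:
            return match.group(0)  # ordinary links and ref: links are left alone
        return f"\\{command}{{{keys}}}"

    def bare(match: re.Match) -> str:
        command = match.group(2) or "cite"  # assumed default command
        if command not in CITE_COMMANDS:
            return match.group(0)
        return f"\\{command}{{{match.group(1)}}}"

    return BARE_RE.sub(bare, LINK_RE.sub(link, text))


print(cites_to_latex("Prior work [Wang et al.](cite:wang2024) and [cite:a,b,c]."))
```

The display text is dropped in this target because LaTeX generates the visible citation itself; the HTML and Markdown targets would need the bibliography to produce their numbered links.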
Cross-references
# Introduction
<!-- label: sec:intro -->
See [Section 1](ref:sec:intro) for details.
[ref:sec:intro]
Bare ref shortcuts use the label itself as display text in Markdown/HTML output.
| Target | Output |
|---|---|
| LaTeX | \label{sec:intro} + \ref{sec:intro} |
| HTML | <h1 id="sec:intro"> + <a href="#sec:intro"> |
| Markdown | {#sec:intro} + <a href="#sec:intro"> |
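The reference side of this table can be sketched as a small per-target rewrite. The code below is a simplified illustration with a hypothetical render_refs helper; it ignores the label side, the ref_tilde option, and any escaping the real renderers may apply.

```python
import re

REF_LINK_RE = re.compile(r"\[([^\]]*)\]\(ref:([^)\s]+)\)")  # [text](ref:label)
BARE_REF_RE = re.compile(r"\[ref:([^\]]+)\]")                # [ref:label]


def render_refs(text: str, target: str) -> str:
    """Rewrite cross-references for one output target (sketch)."""

    def emit(label: str, display: str) -> str:
        if target == "latex":
            return f"\\ref{{{label}}}"
        if target in ("html", "markdown"):
            return f'<a href="#{label}">{display}</a>'
        raise ValueError(f"unknown target: {target}")

    text = REF_LINK_RE.sub(lambda m: emit(m.group(2), m.group(1)), text)
    # Bare shortcuts reuse the label as display text.
    return BARE_REF_RE.sub(lambda m: emit(m.group(1), m.group(1)), text)


print(render_refs("See [Section 1](ref:sec:intro) for details.", "latex"))
print(render_refs("[ref:sec:intro]", "html"))
```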
Figures with Metadata

<!-- caption: Point cloud registration pipeline -->
<!-- label: fig:pipeline -->
<!-- width: 0.85\textwidth -->
<!-- placement: htbp -->
Generated LaTeX figure
\begin{figure}[htbp]
\centering
\includegraphics[width=0.85\textwidth]{figures/pipeline.png}
\caption{Point cloud registration pipeline}
\label{fig:pipeline}
\end{figure}
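How the four directives end up in that environment can be sketched as a plain function from image path and metadata to LaTeX. The helper latex_figure is hypothetical, the htbp fallback for a missing placement is an assumption, and no LaTeX escaping is applied to the caption.

```python
def latex_figure(src: str, meta: dict[str, str]) -> str:
    """Render an image plus its attached directives as a figure environment (sketch)."""
    placement = meta.get("placement", "htbp")  # assumed default
    width = meta.get("width")
    option = f"[width={width}]" if width else ""
    lines = [f"\\begin{{figure}}[{placement}]", "\\centering"]
    lines.append(f"\\includegraphics{option}{{{src}}}")
    if "caption" in meta:
        lines.append(f"\\caption{{{meta['caption']}}}")
    if "label" in meta:
        lines.append(f"\\label{{{meta['label']}}}")
    lines.append("\\end{figure}")
    return "\n".join(lines)


figure_tex = latex_figure(
    "figures/pipeline.png",
    {
        "caption": "Point cloud registration pipeline",
        "label": "fig:pipeline",
        "width": "0.85\\textwidth",
        "placement": "htbp",
    },
)
print(figure_tex)
```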
Generated HTML figure
<figure id="fig:pipeline">
<img src="figures/pipeline.png"
alt="Pipeline overview"
loading="lazy">
<figcaption>Figure 1: Point cloud registration pipeline</figcaption>
</figure>
Generated rich Markdown figure
<figure id="fig:pipeline">
<img src="figures/pipeline.png"
alt="Pipeline overview"
style="max-width:100%">
<figcaption><strong>Figure 1</strong>: Point cloud registration pipeline</figcaption>
</figure>
AI-generated Figures
Mark a figure as AI-generated to include provenance metadata in the output:

<!-- caption: Method taxonomy -->
<!-- label: fig:taxonomy -->
<!-- ai-generated: true -->
<!-- ai-model: dall-e-3 -->
<!-- ai-prompt: |
Academic diagram showing method taxonomy,
clean minimal style, white background
-->
<!-- ai-negative-prompt: photorealistic, 3D -->
In LaTeX output, AI metadata becomes % comments. In HTML and rich Markdown, it renders
as a collapsible <details> block.
Use --generate-figures to automatically generate images from prompts:
wenqiao paper.mid.md -o paper.tex \
--generate-figures \
--figures-config api.toml
Tables
| Method | RMSE (cm) | Time (ms) | Platform |
|--------|-----------|-----------|----------|
| RANSAC | 2.3 | 150 | CPU |
| Ours | 1.9 | 8 | FPGA |
<!-- caption: Performance comparison on ModelNet40 -->
<!-- label: tab:results -->
Generated LaTeX table
\begin{table}[htbp]
\centering
\caption{Performance comparison on ModelNet40}
\label{tab:results}
\begin{tabular}{llll}
\hline
Method & RMSE (cm) & Time (ms) & Platform \\
\hline
RANSAC & 2.3 & 150 & CPU \\
Ours & 1.9 & 8 & FPGA \\
\hline
\end{tabular}
\end{table}
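The inner tabular part of that output can be reproduced with a short standalone sketch. It assumes a well-formed pipe table, left-aligns every column, and applies no LaTeX escaping, scaling, or cell wrapping; pipe_table_to_tabular is a hypothetical name, not a wenqiao function.

```python
def pipe_table_to_tabular(markdown: str) -> str:
    """Convert a simple Markdown pipe table to a LaTeX tabular (sketch)."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the |---|---| separator row between header and body.
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    header, body = rows[0], rows[1:]
    out = [f"\\begin{{tabular}}{{{'l' * len(header)}}}", "\\hline"]
    out.append(" & ".join(header) + " \\\\")
    out.append("\\hline")
    out.extend(" & ".join(row) + " \\\\" for row in body)
    out.extend(["\\hline", "\\end{tabular}"])
    return "\n".join(out)


table_md = """
| Method | RMSE (cm) | Time (ms) | Platform |
|--------|-----------|-----------|----------|
| RANSAC | 2.3 | 150 | CPU |
| Ours | 1.9 | 8 | FPGA |
"""
tabular = pipe_table_to_tabular(table_md)
print(tabular)
```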
Complex tables (merged cells, booktabs, multicolumn) use a raw LaTeX passthrough block:
<!-- begin: raw -->
\begin{table}[htbp]
\centering
\caption{Multi-column results}
\label{tab:complex}
\begin{tabular}{lcc}
\hline
\multicolumn{2}{c}{Performance} & Score \\
\hline
ICP & 85.3 & RMSE \\
Ours & 93.1 & RMSE \\
\hline
\end{tabular}
\end{table}
<!-- end: raw -->
Raw passthrough also preserves math delimiters verbatim, including inline $...$
and display $$...$$ spans inside the raw block.
Add booktabs to your packages list to use \toprule, \midrule, \bottomrule.
Math
Inline: the transform $T \in SE(3)$ is defined by ...
Display with label:
$$
T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}
$$
<!-- label: eq:transform -->
| Target | Inline | Display |
|---|---|---|
| LaTeX | $T \in SE(3)$ | \begin{equation} ... \label{eq:transform} \end{equation} |
| HTML | $T \in SE(3)$ (MathJax) | \[ ... \] with id="eq:transform" |
| Markdown | $T \in SE(3)$ | $$ ... $$ with <a id="eq:transform"> |
Environments
<!-- begin: algorithm -->
**Input:** Point clouds $P$ and $Q$
1. Compute coplanar bases
2. Find congruent sets
3. Verify and refine
**Output:** Rigid transform $T$
<!-- end: algorithm -->
Generated LaTeX environment
\begin{algorithm}
\textbf{Input:} Point clouds $P$ and $Q$
\begin{enumerate}
\item Compute coplanar bases
\item Find congruent sets
\item Verify and refine
\end{enumerate}
\textbf{Output:} Rigid transform $T$
\end{algorithm}
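The pairing of begin/end comments into an environment can be sketched with a small stack-based pass over the source lines. This is an illustration under simplifying assumptions: it works on raw lines rather than parsed nodes, represents an environment as a plain ("env", name, body) tuple, and the pair_environments name is hypothetical.

```python
import re

BEGIN_RE = re.compile(r"<!--\s*begin:\s*([\w-]+)\s*-->")
END_RE = re.compile(r"<!--\s*end:\s*([\w-]+)\s*-->")


def pair_environments(lines: list[str]) -> list:
    """Group lines between matching begin/end comments into nested tuples (sketch)."""
    root: list = []
    stack = [("", root)]  # (environment name, body list) of each open level
    for line in lines:
        if m := BEGIN_RE.fullmatch(line.strip()):
            body: list = []
            stack[-1][1].append(("env", m.group(1), body))
            stack.append((m.group(1), body))
        elif m := END_RE.fullmatch(line.strip()):
            name, _ = stack.pop() if len(stack) > 1 else ("", None)
            if name != m.group(1):
                raise ValueError(f"unmatched end: {m.group(1)}")
        else:
            stack[-1][1].append(line)
    if len(stack) > 1:
        raise ValueError(f"unclosed begin: {stack[-1][0]}")
    return root


tree = pair_environments([
    "Intro",
    "<!-- begin: algorithm -->",
    "1. Compute coplanar bases",
    "<!-- end: algorithm -->",
    "Outro",
])
print(tree)
```

Using a stack means nested environments fall out naturally, and a mismatched or missing end comment is reported instead of silently swallowing the rest of the document.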
Include TeX
Insert external LaTeX fragments (e.g., complex TikZ diagrams):
<!-- include-tex: figures/architecture.tex -->
This reads the file and inserts it as a RawBlock node. Works inside environments too.
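The pipeline description notes that includes are resolved recursively. A minimal sketch of that expansion follows; to stay self-contained it reads fragments from an in-memory dict instead of the filesystem, and both the expand_includes helper and its cycle check are assumptions rather than documented wenqiao behaviour.

```python
import re

INCLUDE_RE = re.compile(r"<!--\s*include-tex:\s*(\S+)\s*-->")


def expand_includes(text: str, files: dict[str, str], _seen: tuple = ()) -> str:
    """Replace include-tex comments with fragment contents, recursively (sketch)."""

    def repl(match: re.Match) -> str:
        path = match.group(1)
        if path in _seen:
            raise ValueError(f"circular include: {path}")
        # Fragments may themselves contain include-tex comments.
        return expand_includes(files[path], files, _seen + (path,))

    return INCLUDE_RE.sub(repl, text)


fragments = {
    "figures/architecture.tex": "\\begin{tikzpicture} <!-- include-tex: nodes.tex --> \\end{tikzpicture}",
    "nodes.tex": "\\node {A};",
}
expanded = expand_includes("<!-- include-tex: figures/architecture.tex -->", fragments)
print(expanded)
```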
Full Example
See tests/fixtures/full_example.mid.md for a
complete demonstration of all features.
Configuration
flowchart LR
CLI["CLI flags"] --> R["Resolver"]
DIR["In-document<br/>directives"] --> R
CFG["Config file<br/><code>wenqiao.yaml</code>"] --> R
TPL["Template<br/><code>ieee.yaml</code>"] --> R
DEF["Built-in<br/>defaults"] --> R
R --> OUT["Final Config"]
style CLI fill:#ffa
style DIR fill:#fda
style CFG fill:#fca
style TPL fill:#faa
style DEF fill:#eee
Priority: CLI > directives > config file > template > defaults. Higher layers override lower layers. This lets you set venue defaults in a template, override per-paper in the config file, and fine-tune per-build on the command line.
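The layering rule can be modelled with the standard library's ChainMap, where the first mapping that defines a key wins. This is only a sketch of the precedence order: the layer contents are made-up example values, and wenqiao's actual resolver may merge list-valued fields differently.

```python
from collections import ChainMap

# Lowest priority first in prose, highest priority first in the ChainMap call.
defaults = {"documentclass": "article", "locale": "zh", "bibstyle": "plain"}
template = {"documentclass": "IEEEtran", "bibstyle": "IEEEtran"}  # venue defaults
config_file = {"locale": "en"}                                    # per-paper override
directives = {"documentclass": "report"}                          # in-document override
cli = {}                                                          # nothing set on this build

resolved = dict(ChainMap(cli, directives, config_file, template, defaults))
print(resolved)
```

Each key is resolved independently, so a template can set bibstyle while a directive overrides only documentclass.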
Config File (wenqiao.yaml)
documentclass: article
classoptions: [12pt, a4paper]
packages: [amsmath, graphicx]
code_style: lstlisting # or: minted
locale: zh # or: en
target: latex # or: markdown, html
bibliography_mode: auto # or: standalone, external, none
heading_id_style: attr # or: html
extra-preamble: |
  \DeclareMathOperator{\argmin}{argmin}
All config fields
| Field | Type | Default | Description |
|---|---|---|---|
| documentclass | str | "article" | LaTeX document class |
| classoptions | list[str] | [] | Class options like 12pt, a4paper |
| packages | list[str] | [] | LaTeX packages to load |
| title | str | "" | Document title |
| author | str | "" | Author name(s) |
| date | str | "" | Date string |
| abstract | str | "" | Abstract text |
| bibliography | str | "" | BibTeX file path |
| bibstyle | str | "plain" | Bibliography style |
| code_style | str | "lstlisting" | Code block rendering style |
| locale | str | "zh" | Label language |
| target | str | "latex" | Default output target |
| bibliography_mode | str | "auto" | Bibliography output strategy |
| heading_id_style | str | "attr" | Heading anchor format |
| extra-preamble | str | "" | Raw LaTeX for preamble |
| thematic_break | str | "newpage" | newpage / hrule / ignore |
| ref_tilde | bool | true | Use ~\ref instead of \ref |
Template File
Templates provide reusable defaults for specific venues. Example — IEEE conference:
# templates/ieee.yaml
documentclass: IEEEtran
classoptions: [conference]
packages:
  - amsmath
  - graphicx
  - cite
extra-preamble: |
  \IEEEoverridecommandlockouts
bibstyle: IEEEtran
wenqiao paper.mid.md --template templates/ieee.yaml -o paper.tex
Project Structure
wenqiao/
├── src/wenqiao/ # Source code (17 modules)
│ ├── __init__.py # Public API re-exports
│ ├── api.py # Public Python API (convert, validate, format)
│ ├── cli.py # Click CLI entry point
│ ├── parser.py # Markdown → EAST parser
│ ├── nodes.py # EAST node definitions (32 types)
│ ├── comment.py # 4-phase comment directive processor
│ ├── config.py # 5-layer configuration resolution
│ ├── latex.py # LaTeX renderer
│ ├── markdown.py # Rich Markdown renderer (2-pass)
│ ├── html.py # HTML renderer (MathJax CDN)
│ ├── bibtex.py # Minimal BibTeX parser
│ ├── genfig.py # AI figure generation pipeline
│ ├── escape.py # LaTeX special character escaping
│ ├── sanitize.py # HTML input sanitization
│ ├── url_check.py # URL safety validation
│ ├── ai_meta.py # Shared AI metadata rendering
│ └── diagnostic.py # Error/warning diagnostics
├── tests/ # Test suite (17 files, 479 tests)
│ ├── fixtures/ # Test .mid.md documents
│ └── conftest.py # Shared pytest fixtures
├── templates/ # LaTeX venue templates (ieee.yaml, ...)
├── docs/ # Documentation and plans
├── pyproject.toml # Project metadata & tool config
├── Makefile # Build commands
└── CLAUDE.md # AI agent coding standards
Comment Processor 4-phase Pipeline
flowchart TD
A["Phase 1: Document Directives<br/><i>documentclass, packages, title, ...</i>"]
B["Phase 2: Begin/End Environments<br/><i>algorithm, theorem, proof, ...</i>"]
C["Phase 3: Include-TeX<br/><i>insert external .tex fragments</i>"]
D["Phase 4: Attach-Up Directives<br/><i>caption, label, width, placement, ai-*</i>"]
A --> B --> C --> D
- Phase 1 extracts top-level metadata (documentclass, packages, title, author, etc.)
- Phase 2 pairs <!-- begin: X --> / <!-- end: X --> into Environment nodes
- Phase 3 replaces <!-- include-tex: file.tex --> with RawBlock content (recursive)
- Phase 4 attaches trailing comment metadata to the preceding figure/table/math node
Development
Setup
uv sync # Install all dependencies
Commands
| Command | Description |
|---|---|
| make check | Run lint + typecheck + test (required before committing) |
| make test | Run pytest with verbose output |
| make lint | Run ruff linter |
| make format | Run ruff formatter |
| make typecheck | Run mypy in strict mode |
| make fix | Auto-fix lint issues and format |
Coding Standards
| Rule | Example |
|---|---|
| Type annotations on all functions | def parse(text: str) -> Document: |
| Bilingual comments (EN + CN) | # Calculate average (计算平均值) |
| Google-style docstrings (bilingual) | See CLAUDE.md |
| 100 char max line length | Enforced by ruff |
| snake_case functions, PascalCase classes | render_figure(), LaTeXRenderer |
Docstring example
def render_figure(self, node: Node) -> str:
    """Render a Figure node as LaTeX figure environment.

    将 Figure 节点渲染为 LaTeX figure 环境。

    Args:
        node: Figure node to render (待渲染的 Figure 节点)

    Returns:
        LaTeX figure environment string (LaTeX figure 环境字符串)
    """
Testing
Tests mirror source modules one-to-one (parser.py → test_parser.py).
make test # Run the full test suite
| Test file | Covers |
|---|---|
| test_api.py | Public Python API (convert, validate, format, parse) |
| test_parser.py | Markdown parsing, node creation |
| test_nodes.py | EAST serialization, type properties |
| test_latex.py | LaTeX rendering (headings, math, citations, tables, figures, scaling) |
| test_markdown.py | Rich Markdown rendering, index pass |
| test_html.py | HTML rendering, sanitization, MathJax |
| test_comment.py | 4-phase comment directive processing |
| test_config.py | Config loading, precedence, validation |
| test_cli.py | CLI options, error handling |
| test_e2e.py | End-to-end conversion pipelines |
| test_bibtex.py | BibTeX file parsing |
| test_genfig.py | AI figure generation jobs |
| test_escape.py | LaTeX special character escaping |
| test_sanitize.py | HTML input sanitization |
| test_url_check.py | URL safety validation |
| test_diagnostic.py | Diagnostic error/warning collection |
Test fixtures in tests/fixtures/ provide reusable .mid.md
documents: minimal, heading_para, math, cite_ref, comments, full_example.
Claude Code Skill
This project ships a Claude Code skill
(wenqiao-writer) that teaches Claude how to write well-formed .mid.md documents.
Setup
Symlink the skill into your Claude Code configuration:
# From the project root
ln -s "$(pwd)/skills/wenqiao-writer" ~/.claude/skills/wenqiao-writer
Or, if you have the repo cloned elsewhere, the project already includes a symlink at
.claude/skills/wenqiao-writer pointing to skills/wenqiao-writer.
Usage
Once installed, invoke the skill in Claude Code by name:
/wenqiao-writer
Or simply ask Claude to "write a .mid.md paper" — it will automatically pick up the
skill. The skill teaches Claude:
- All .mid.md directives (document headers, labels, captions, environments, etc.)
- Correct citation syntax ([text](cite:key)) and cross-references ([text](ref:label))
- AI figure metadata directives
- Common mistakes to avoid
- A full feature coverage checklist for test fixtures
Example prompt
Write a .mid.md draft for a paper about point cloud registration using FPGA acceleration. Include an abstract, 3 sections, a comparison table, and 2 figures with AI generation prompts.
Built-in Presets
Presets provide a one-line starting configuration for common document types:
<!-- preset: zh -->
<!-- title: 我的论文 -->
| Preset | documentclass | locale | Use case |
|---|---|---|---|
| zh | ctexart | zh | Chinese academic paper; compile with XeLaTeX |
| en | article | en | Standard English paper |
Both presets include a comprehensive package set covering all wenqiao features:
amsmath, amssymb, graphicx, geometry (2 cm margins), hyperref, xcolor, listings,
amsthm, algorithm2e, booktabs, makecell (multi-line table cells).
Add <!-- package-options: {...} --> to configure individual packages, or add more via <!-- packages: [...] -->.
All document directives override the preset:
<!-- preset: zh -->
<!-- documentclass: IEEEtran --> <!-- overrides ctexart -->
Via CLI:
wenqiao paper.mid.md --preset zh -o paper.tex
Priority chain: CLI > directives > config file > template > preset > defaults
Contributing
- Fork the repository
- Create a feature branch
- Write tests first (TDD encouraged)
- Ensure make check passes (ruff, mypy, pytest)
- Submit a pull request
All code must include complete type annotations and bilingual (EN + CN) comments. See CLAUDE.md for the full coding standards.