Data-bound Markdown-to-Word builder for scientific papers

vibepaper

Build scientific papers from Markdown where every number traces back to the analysis that produced it.

The problem

Numbers in scientific prose go stale. You finish the analysis, write up the results, and six months later a reviewer asks you to rerun with a corrected dataset. Now you have updated CSVs and a paper full of hardcoded figures: "the mean increased from 4.6 to 9.2", "of the 7,318 variants that lost HIGH impact". Finding every number, checking whether it is still current, and updating it without introducing new errors is tedious, error-prone, and nearly impossible to review.

The solution

vibepaper separates computation from communication. Analysis scripts write their key results to named CSV files. The paper references those values by name using Jinja2 template syntax. The build pipeline substitutes every reference before passing the document to pandoc for final Word output.

Mean transcripts per variant doubled from
{{ vep_impact.giab_mean_v112 | dp(1) }} to
{{ vep_impact.giab_mean_v115_full | dp(1) }} on upgrading to Ensembl v115.

When you rerun the analysis, you rerun the build. The numbers update everywhere, simultaneously, with a loud error if any reference is missing.

Three design principles:

Templates express intent; scripts express computation. No arithmetic in templates. If you need a percentage increase, the analysis script computes and writes it. The template formats it.
Loud failures over silent omissions. A missing or renamed CSV column is a build error, not an empty string in the output.
Every number is traceable. Any figure in the rendered paper can be grepped back to the template reference and the script that wrote the CSV.
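
The first principle can be sketched from the analysis side. The following is a minimal illustration, not vibepaper's own code: the helper names write_facts and summarise and the field mean_pct_increase are hypothetical, and the only assumption taken from this README is that a facts file is a 1-row CSV whose column names become field names.

```python
import csv
from pathlib import Path


def write_facts(path, facts):
    """Write a dict of named results as a 1-row CSV (header row plus one value row)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(facts))
        writer.writeheader()
        writer.writerow(facts)


def summarise(mean_old, mean_new):
    """Compute every derived number here so the template only has to format it."""
    return {
        "mean_v112": mean_old,
        "mean_v115": mean_new,
        "mean_fold": mean_new / mean_old,
        # Hypothetical extra field: the percentage increase is computed in the
        # script, never in the template.
        "mean_pct_increase": 100 * (mean_new - mean_old) / mean_old,
    }
```

A script would then call something like write_facts("output/facts/vep.csv", summarise(old, new)), after which the paper can refer to {{ vep.mean_fold | fold(1) }} without doing any arithmetic itself.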

Installation

pip install vibepaper

Requires pandoc as a system dependency:

# macOS
brew install pandoc

# Debian/Ubuntu
sudo apt-get install pandoc

Quick start

Option 1 — paper.toml (recommended for full papers)

Create paper.toml in your project root:

[paper]
sections = [
    "paper/abstract.md",
    "paper/introduction.md",
    "paper/methods.md",
    "paper/results.md",
    "paper/discussion.md",
    "paper/references.md",
    "paper/figures.md",
]
supplementary = ["paper/supplementary.md"]
name = "my_paper"

Then build:

vibepaper
# outputs: output/my_paper_2025-06-01.docx
#          output/my_paper_supplementary_2025-06-01.docx

Option 2 — sections file

Create a plain text file listing your sections in order:

# order.txt
paper/abstract.md
paper/methods.md
paper/results.md
paper/discussion.md

Then:

vibepaper --sections-file order.txt --name my_paper

Lines starting with # and blank lines are ignored. Paths are relative to the sections file's location.
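
The parsing rules above are simple enough to sketch in a few lines. This is an illustration of the described behaviour only; read_sections is a hypothetical name and not part of vibepaper's public API.

```python
from pathlib import Path


def read_sections(sections_file):
    """Return the section paths listed in a sections file, in file order."""
    sections_file = Path(sections_file)
    base = sections_file.parent  # paths are relative to the sections file
    paths = []
    for raw in sections_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        paths.append(base / line)
    return paths
```

Resolving against the file's own directory means the same order.txt works regardless of the directory the build is started from.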

Option 3 — direct file list

vibepaper paper/abstract.md paper/results.md paper/discussion.md --name my_paper

Template syntax

vibepaper uses Jinja2 for template substitution. References follow the pattern {{ namespace.field | filter }}.

Number formatting filters

Filter	Example	Output
`| commas`	`{{ n | commas }}`	`254,129`
`| dp(n)`	`{{ mean | dp(1) }}`	`9.2`
`| pct(n)`	`{{ rate | pct(1) }}`	`52.2%`
`| fold(n)`	`{{ ratio | fold(1) }}`	`2.0-fold`
`| fmt(spec)`	`{{ v | fmt('+.1f') }}`	`+3.7`

dp (decimal places) formats a bare number for use in running text (e.g. "mean TPV was 9.2", where TPV is transcripts per variant). pct appends the % sign. Use fmt for any format spec that Python's format() accepts.
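
To make the table concrete, here is a rough Python equivalent of each filter, written as plain functions. It is a sketch under two assumptions that the README does not confirm: that pct receives a value already expressed in percent units (it only appends the sign), and that the default number of places is 1. vibepaper's real implementations may differ.

```python
def commas(value):
    """Insert thousands separators: 254129 becomes '254,129'."""
    return f"{value:,}"


def dp(value, places=1):
    """Round to a fixed number of decimal places."""
    return f"{float(value):.{places}f}"


def pct(value, places=1):
    """Assumed behaviour: format a value already in percent units, then append '%'."""
    return f"{float(value):.{places}f}%"


def fold(value, places=1):
    """Format a ratio as an 'x-fold' change."""
    return f"{float(value):.{places}f}-fold"


def fmt(value, spec):
    """Pass any format spec straight through to Python's built-in format()."""
    return format(value, spec)
```

Each function returns a string, which is what a Jinja2 filter is expected to hand back to the template.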

Examples

Of {{ clinvar.total_variants | commas }} ClinVar variants, {{ clinvar.gained_high_count | commas }}
({{ clinvar.gained_high_pct | dp(2) }}%) gained HIGH impact on upgrading to v115.

Mean transcripts per variant increased {{ vep.mean_fold | fold }} from
{{ vep.mean_v112 | dp(1) }} to {{ vep.mean_v115 | dp(1) }}.

Data sources

Facts CSVs (primary)

The main data binding mechanism. Analysis scripts write 1-row CSVs to output/facts/. The filename stem becomes the template namespace; column names become field names.

output/facts/
    transcript_growth.csv      → {{ transcript_growth.v112_count | commas }}
    vep_impact.csv             → {{ vep_impact.giab_mean_v115_full | dp(1) }}
    clinvar_reclassification.csv → {{ clinvar_reclassification.total_variants | commas }}

A CSV named transcript_growth.csv with columns v112_count, v115_count:

v112_count,v115_count
254129,509650

This file is referenced in a template as:

Transcripts grew from {{ transcript_growth.v112_count | commas }}
to {{ transcript_growth.v115_count | commas }}.

vibepaper raises a hard error if a referenced column doesn't exist. It warns if the rendered output contains literal nan, None, or unresolved {{.
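
The binding and the failure behaviour described above can be approximated with the standard library. This is a sketch, not vibepaper's implementation: load_facts, lookup, and suspicious_tokens are hypothetical names, the 1-row check is an assumption about how strictly the format is enforced, and the token scan is a crude substring match.

```python
import csv
from pathlib import Path


def load_facts(facts_dir):
    """Map each CSV's filename stem to a dict built from its single data row."""
    context = {}
    for csv_path in sorted(Path(facts_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
        if len(rows) != 1:
            raise ValueError(f"{csv_path.name}: expected 1 data row, found {len(rows)}")
        context[csv_path.stem] = rows[0]
    return context


def lookup(context, reference):
    """Resolve 'namespace.field', raising instead of returning an empty string."""
    namespace, field = reference.split(".", 1)
    try:
        return context[namespace][field]
    except KeyError:
        raise KeyError(f"unresolved reference: {reference}") from None


def suspicious_tokens(rendered):
    """Return which of the warning tokens appear in the rendered text."""
    # Naive substring scan: it would also flag a word that merely contains 'nan'.
    return [token for token in ("nan", "None", "{{") if token in rendered]
```

Note that csv.DictReader yields every value as a string, so a numeric filter has to convert before formatting.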

JSON data (supplemental)

Pass additional values directly without creating a CSV file:

# Inline dict
vibepaper --data '{"cohort_size": 412, "stats": {"pvalue": 0.003}}'

# From file
vibepaper --data results.json

Top-level keys become template variables, and nested objects become namespaces:

Cohort: {{ cohort_size }} participants (p = {{ stats.pvalue | dp(3) }}).

JSON is merged on top of facts CSVs. Nested dicts are deep-merged at the namespace level; scalar values override directly.
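
One plausible reading of this merge rule is a recursive dictionary merge in which the JSON side wins on conflicts. The sketch below illustrates that reading only; merge_context is a hypothetical name, and the exact depth at which vibepaper merges is not specified here.

```python
import copy


def merge_context(facts, extra):
    """Return a new context with `extra` layered on top of `facts`."""
    merged = copy.deepcopy(facts)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides hold a namespace: merge field by field.
            merged[key] = merge_context(merged[key], value)
        else:
            # Scalars, and namespaces that exist on one side only, override directly.
            merged[key] = copy.deepcopy(value)
    return merged
```

Under this reading, a stats namespace loaded from a facts CSV keeps its existing fields when --data adds stats.pvalue, while a top-level scalar such as cohort_size is simply added or replaced.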

Table directives

For supplementary tables, embed CSVs directly into the Markdown with a directive comment:

<!-- include-csv: output/consequence_changes.csv
  columns: [consequence, v112_count, v115_count, pct_change]
  rename:
    v112_count: v112
    v115_count: v115
    pct_change: Change (%)
  format:
    v112_count: ",d"
    v115_count: ",d"
    pct_change: ".1f"
  sort: [-pct_change]
  max_rows: 20
-->

Directive options:

Option	Description
`columns`	List of columns to include, in order
`rename`	Dict mapping column names to display names
`format`	Dict mapping column names to Python format specs
`align`	`left`, `right`, `center`, or per-column dict
`sort`	List of column names; prefix `-` for descending
`filter`	pandas `query()` expression
`max_rows`	Truncate to this many rows
`na_rep`	String to use for missing values (default: `—`)
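
To show how several of these options interact, here is a small standard-library sketch that applies columns, rename, format, sort, and max_rows to a list of row dicts and emits a Markdown pipe table. It is not vibepaper's implementation (the filter option suggests the real one is built on pandas); render_table is a hypothetical name, and align, filter, and na_rep are left out for brevity.

```python
def render_table(rows, columns, rename=None, formats=None, sort=None, max_rows=None):
    """Render a list of row dicts as a Markdown pipe table."""
    rename = rename or {}
    formats = formats or {}
    rows = list(rows)
    # Apply sort keys last-to-first so the first key dominates (list.sort is stable).
    for key in reversed(sort or []):
        descending = key.startswith("-")
        column = key.lstrip("-")
        rows.sort(key=lambda row: row[column], reverse=descending)
    if max_rows is not None:
        rows = rows[:max_rows]
    header = "| " + " | ".join(rename.get(c, c) for c in columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| "
        + " | ".join(
            # Format specs are keyed by the original column name, as in the directive.
            format(row[c], formats[c]) if c in formats else str(row[c])
            for c in columns
        )
        + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])
```

Sorting happens before truncation, so max_rows keeps the top rows of the sorted table rather than the first rows of the file.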

paper.toml reference

[paper]
# Manuscript sections in order (paths relative to paper.toml)
sections = [
    "paper/title.md",
    "paper/abstract.md",
    "paper/introduction.md",
    "paper/methods.md",
    "paper/results.md",
    "paper/discussion.md",
    "paper/references.md",
    "paper/figures.md",
]

# Built as a separate .docx unless --combined is passed
supplementary = ["paper/supplementary.md"]

# Output filename stem: {name}_{date}.docx
# Default: parent directory name
name = "my_paper"

# Directory of 1-row facts CSVs
# Default: "output/facts"
facts_dir = "output/facts"

# Output directory for .docx files
# Default: "output"
output_dir = "output"

# Intermediate build directory
# Default: "build"
build_dir = "build"

# Word reference document for custom formatting (double spacing, line numbers, etc.)
# Only used if the file exists; silently skipped otherwise.
# Default: "paper/reference.docx"
reference_doc = "paper/reference.docx"

Word reference document

To apply journal-specific formatting (e.g. double line spacing, continuous line numbering):

1. Open a blank Word document
2. Set paragraph spacing to Double and enable Layout → Line Numbers → Continuous
3. Save as paper/reference.docx

vibepaper will use it automatically if it exists at the configured path.

CLI reference

vibepaper [FILE.md ...] [options]

Input (choose one):
  FILE.md ...           Markdown files in order (no paper.toml needed)
  --sections-file FILE  Plain text file with one .md path per line
  --config FILE         paper.toml config file (default: paper.toml)

Data:
  --data JSON           JSON file path or inline dict for template context
  --facts-dir DIR       Override facts CSV directory

Output:
  --output-dir DIR      Output directory for .docx files
  --name NAME           Output filename stem
  --combined            Merge supplementary into main document
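
For readers who want to wrap the tool in their own build script, the documented options can be mirrored with argparse. This is only a sketch of the interface as listed above, not vibepaper's actual source; build_parser is a hypothetical name, and the help strings are paraphrased from the usage text.

```python
import argparse


def build_parser():
    """Declare the documented command-line options."""
    parser = argparse.ArgumentParser(prog="vibepaper")
    # Input (choose one)
    parser.add_argument("files", nargs="*", metavar="FILE.md",
                        help="Markdown files in order")
    parser.add_argument("--sections-file", metavar="FILE",
                        help="plain text file with one .md path per line")
    parser.add_argument("--config", metavar="FILE", default="paper.toml",
                        help="paper.toml config file")
    # Data
    parser.add_argument("--data", metavar="JSON",
                        help="JSON file path or inline dict for template context")
    parser.add_argument("--facts-dir", metavar="DIR",
                        help="override facts CSV directory")
    # Output
    parser.add_argument("--output-dir", metavar="DIR",
                        help="output directory for .docx files")
    parser.add_argument("--name", metavar="NAME", help="output filename stem")
    parser.add_argument("--combined", action="store_true",
                        help="merge supplementary into main document")
    return parser
```

Calling build_parser().parse_args([...]) with the arguments from Option 3 of the quick start yields the file list in args.files and the stem in args.name.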

Project layout convention

my_paper/
├── paper.toml
├── paper/
│   ├── abstract.md
│   ├── introduction.md
│   ├── methods.md
│   ├── results.md
│   ├── discussion.md
│   ├── references.md
│   ├── figures.md
│   ├── supplementary.md
│   └── reference.docx        ← optional Word formatting template
├── output/
│   ├── facts/
│   │   ├── cohort.csv        ← 1-row: n_patients, n_controls, ...
│   │   ├── model_results.csv ← 1-row: auc, pvalue, effect_size, ...
│   │   └── ...
│   └── tables/
│       └── full_results.csv  ← multi-row: used in include-csv directives
└── scripts/
    └── run_analysis.py       ← writes to output/facts/
