static site generator built on pandoc + jinja2
pdj_sitegen
Pandoc and Jinja Site Generator
- docs: miv.name/pdj_sitegen/
- demo site: miv.name/pdj_sitegen/demo_site/
- source: github.com/mivanit/pdj-sitegen
Installation:
pip install pdj-sitegen
You need Pandoc installed. If it is not already available, run
python -m pdj_sitegen.install_pandoc
which installs Pandoc using pypandoc.
Usage
Quick Start
Scaffold a new site with all default files:
python -m pdj_sitegen.setup_site [directory]
This creates:
- config.yml - default configuration
- templates/default.html.jinja2 - default HTML template
- content/index.md - sample index page
- content/resources/style.css - basic stylesheet
- content/resources/syntax.css - code syntax highlighting
Manual Setup
- create a config file. For an example, see
pdj_sitegen.config.DEFAULT_CONFIG_YAML, or print a copy of it via
python -m pdj_sitegen.config
- adjust the config file to your needs. most importantly:
# directory with markdown content files and resources, relative to cwd
content_dir: content
# templates directory, relative to cwd
templates_dir: templates
# default template file, relative to `templates_dir`
default_template: default.html.jinja2
# output directory, relative to cwd
output_dir: docs
- populate the content directory with markdown files and resources (images, css, etc.), and adjust templates in the templates directory. See the demo site for usage examples.
- run the generator:
python -m pdj_sitegen your_config.yaml
CLI Arguments
python -m pdj_sitegen your_config.yaml [-q] [-s]
- -q, --quiet: Disable verbose output (suppress progress messages)
- -s, --smart-rebuild: Only rebuild files modified since last build
Smart Rebuild
The smart rebuild feature (-s flag) enables incremental builds by tracking file modification times:
- A .build_time file in your project root stores the timestamp of the last successful build
- Source files are compared against this timestamp; only newer files are rebuilt
- Ideal for large sites during development, where rebuilding every page on each change slows iteration
# Full rebuild (always safe)
python -m pdj_sitegen config.yml
# Smart rebuild (faster, for content-only changes)
python -m pdj_sitegen config.yml -s
When to use full rebuild: After modifying templates or config, since these changes affect all pages. The .build_time file is automatically created and updated.
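The timestamp comparison described above can be sketched roughly as follows. This is a minimal illustration, not pdj-sitegen's actual implementation: the function name `files_to_rebuild` is hypothetical, and the last build time is passed in as a plain Unix timestamp because the on-disk format of .build_time is not specified here.

```python
from pathlib import Path


def files_to_rebuild(content_dir: Path, last_build_time: float) -> list[Path]:
    """Return markdown files modified after `last_build_time` (a Unix timestamp).

    Hypothetical helper: it only illustrates the "newer than last build" check.
    """
    return sorted(
        path
        for path in content_dir.rglob("*.md")
        # st_mtime is the file's last modification time in seconds since the epoch
        if path.stat().st_mtime > last_build_time
    )
```

Note that a check like this only looks at content files, which is why template or config changes call for a full rebuild.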
Configuration
Config File Formats
pdj-sitegen supports multiple configuration file formats:
- YAML (.yml, .yaml) - recommended, human-friendly
- TOML (.toml) - also supported
- JSON (.json) - for programmatic generation
python -m pdj_sitegen config.yml # YAML
python -m pdj_sitegen config.toml # TOML
python -m pdj_sitegen config.json # JSON
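As a rough sketch of how a format could be chosen from the file extension, assuming the extension alone decides the parser (the helper name `detect_config_format` is hypothetical and not part of pdj-sitegen's API):

```python
from pathlib import Path

# Extension-to-format mapping taken from the list of supported formats above
CONFIG_FORMATS = {".yml": "yaml", ".yaml": "yaml", ".toml": "toml", ".json": "json"}


def detect_config_format(config_path: str) -> str:
    """Return "yaml", "toml", or "json" based on the file extension (assumed rule)."""
    suffix = Path(config_path).suffix.lower()
    if suffix not in CONFIG_FORMATS:
        raise ValueError(f"unsupported config extension: {suffix!r}")
    return CONFIG_FORMATS[suffix]
```

For example, `detect_config_format("config.yml")` returns `"yaml"`, and an unknown extension raises `ValueError`.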
Complete Configuration Examples
YAML Configuration
content_dir: content
templates_dir: templates
default_template: default.html.jinja2
output_dir: docs
copy_include: []
copy_exclude:
- "*.md"
prettify: false
pandoc_fmt_from: markdown+smart
pandoc_fmt_to: html
__pandoc__:
  mathjax: true
  toc: true
jinja_env_kwargs: {}
globals_:
  site_name: "My Site"
  author: "Your Name"
TOML Configuration
content_dir = "content"
templates_dir = "templates"
default_template = "default.html.jinja2"
output_dir = "docs"
copy_include = []
copy_exclude = ["*.md"]
prettify = false
pandoc_fmt_from = "markdown+smart"
pandoc_fmt_to = "html"
[__pandoc__]
mathjax = true
toc = true
[jinja_env_kwargs]
[globals_]
site_name = "My Site"
author = "Your Name"
JSON Configuration
{
  "content_dir": "content",
  "templates_dir": "templates",
  "default_template": "default.html.jinja2",
  "output_dir": "docs",
  "copy_include": [],
  "copy_exclude": ["*.md"],
  "prettify": false,
  "pandoc_fmt_from": "markdown+smart",
  "pandoc_fmt_to": "html",
  "__pandoc__": {
    "mathjax": true,
    "toc": true
  },
  "jinja_env_kwargs": {},
  "globals_": {
    "site_name": "My Site",
    "author": "Your Name"
  }
}
Content Mirroring
Files from content_dir are automatically copied to output_dir, excluding markdown files (which are processed into HTML). Control this with copy_include and copy_exclude:
# Default: copy everything except .md files
copy_include: []
copy_exclude:
- "*.md"
# Also exclude temp files and .git
copy_exclude:
- "*.md"
- "*.tmp"
- ".git*"
# Copy only specific file types
copy_include:
- "*.css"
- "*.js"
- "*.png"
- "*.jpg"
copy_exclude: []
# Force copy .md files too (include wins over exclude)
copy_include:
- "*.md"
copy_exclude:
- "*.md"
Additional Options
# Global template variables accessible in all templates
globals_:
  site_name: "My Site"
  author: "Your Name"
# Directory to save intermediate processing files (for debugging)
intermediates_dir: null # or "_intermediates"
# Prettify HTML output (uses BeautifulSoup)
prettify: false
# Pandoc format settings
pandoc_fmt_from: "markdown+smart"
pandoc_fmt_to: "html"
# Global Pandoc options (can be overridden per-file in frontmatter)
__pandoc__:
  mathjax: true
# Jinja2 environment customization
jinja_env_kwargs: {}
Debugging with Intermediates
Setting intermediates_dir saves intermediate processing stages for debugging template and Pandoc issues:
intermediates_dir: _intermediates
This creates the following structure:
_intermediates/
  frontmatter_txt/   # Raw frontmatter as parsed
  frontmatter_json/  # Frontmatter as JSON (for inspection)
  md/                # Rendered Markdown (after Jinja2, before Pandoc)
  html/              # Pandoc output (before template wrapping)
Useful for debugging Jinja2 template rendering in content, inspecting what Pandoc receives vs. outputs, and understanding frontmatter parsing issues.
HTML Prettification
When prettify: true is set, the final HTML output is reformatted using BeautifulSoup for readable, indented HTML:
prettify: true
Considerations: Increases build time and output file size slightly. Useful for debugging or when HTML readability matters. For production, false (default) produces more compact output.
Jinja2 Environment Customization
The jinja_env_kwargs option allows you to customize the Jinja2 environment:
jinja_env_kwargs:
  # Trim whitespace around blocks
  trim_blocks: true
  lstrip_blocks: true
  # Change template delimiters (useful if content conflicts with {{ }})
  variable_start_string: "[["
  variable_end_string: "]]"
For the full list of options, see the Jinja2 Environment documentation.
Error Reporting
pdj-sitegen provides detailed error handling with actionable error messages:
Terminal Output
When a build error occurs, you'll see a terse, actionable error message showing:
- The file path and line number where the error occurred
- The problematic source line (when available)
- The root cause of the error
Example output:
on content/blog/post.md:6:
{{ undefined_variable }}
UndefinedError: 'undefined_variable' is undefined
1/15 files failed to convert
Full details: .pdj-sitegen/2024-01-27_14-30-45/
Detailed Error Dumps
For debugging complex errors, full context is saved to .pdj-sitegen/<timestamp>/:
- traceback_<file>.txt - Full Python stack trace
- context_<file>.json - Template context (all variables available)
- template_<file>.txt - The template content that failed
This directory is created automatically when build errors occur. Add .pdj-sitegen/ to your .gitignore:
# pdj-sitegen error dumps
.pdj-sitegen/
Content Organization
pdj-sitegen supports both flat and nested content structures:
Flat structure (using dot notation):
content/
  index.md
  blog.md
  blog.post-1.md
  blog.post-2.md
Outputs: index.html, blog.html, blog.post-1.html, blog.post-2.html
Nested structure (using directories):
content/
  index.md
  blog/
    index.md
    post-1.md
    post-2.md
Outputs: index.html, blog/index.html, blog/post-1.html, blog/post-2.html
Both approaches work with child_docs_dotlist (path prefix matching) and child_docs_folder (same directory) in templates for hierarchical navigation.
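The two lookups can be pictured with a small sketch. It is an illustration only: the real template variables are dictionaries of documents, whereas this version works on plain lists of extension-less paths, and both function bodies (including the choice to leave out the current document) are assumptions about the matching rules rather than pdj-sitegen's code.

```python
from pathlib import PurePosixPath


def child_docs_dotlist(current: str, all_docs: list[str]) -> list[str]:
    """Docs whose path starts with the current path plus a dot (prefix matching)."""
    prefix = current + "."
    return [doc for doc in all_docs if doc.startswith(prefix)]


def child_docs_folder(current: str, all_docs: list[str]) -> list[str]:
    """Docs in the same directory as the current one, excluding the current doc."""
    folder = PurePosixPath(current).parent
    return [
        doc
        for doc in all_docs
        if doc != current and PurePosixPath(doc).parent == folder
    ]


flat = ["index", "blog", "blog.post-1", "blog.post-2"]
nested = ["index", "blog/index", "blog/post-1", "blog/post-2"]
flat_children = child_docs_dotlist("blog", flat)
nested_children = child_docs_folder("blog/index", nested)
```

With the flat layout, `flat_children` holds the two `blog.post-*` entries; with the nested layout, `nested_children` holds the two posts next to `blog/index`.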
Pandoc Filters
pdj-sitegen includes two built-in pandoc filters:
links_md2html
Converts links ending in .md to .html during conversion. Enable in frontmatter or global config:
__pandoc__:
  filter: links_md2html
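The rewrite this filter performs on each link target can be approximated as below. This is a sketch under assumptions: the real filter operates on Pandoc's document tree, the function name `md_link_to_html` is hypothetical, and preserving a trailing `#fragment` is a choice made for this example rather than documented behaviour.

```python
from urllib.parse import urlsplit, urlunsplit


def md_link_to_html(url: str) -> str:
    """Swap a trailing .md in the link's path for .html, leaving other links alone."""
    parts = urlsplit(url)
    if parts.path.endswith(".md"):
        # Only the path component changes; query and fragment are kept as-is
        parts = parts._replace(path=parts.path[: -len(".md")] + ".html")
    return urlunsplit(parts)
```

For example, `md_link_to_html("blog/post-1.md")` gives `"blog/post-1.html"`, while a link such as `"style.css"` is returned unchanged.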
csv_code_table
Converts fenced code blocks with class csv_table to HTML tables.
In your markdown, use a fenced code block with the csv_table class and options:
'''{.csv_table header=1 aligns=LCR caption="My Table"}
Name,Count,Status
Alice,42,Active
Bob,17,Pending
'''
NOTE: in the above, use backticks (`) instead of single quotes (') for the fenced code block; single quotes are used here to avoid rendering issues.
Options:
- header: Number of header rows (default: 1)
- source: Path to external CSV file
- aligns: Column alignments (L=left, C=center, R=right, D=default)
- caption: Table caption
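To make the options concrete, here is a minimal standalone sketch of turning CSV text into an HTML table. It is not the filter's actual code (which works through Pandoc): the function `csv_to_html_table`, its inline-style output, and the omission of the `source` option are all assumptions made for illustration.

```python
import csv
import html
import io

# Map alignment letters to CSS values; "D" (default) adds no style
ALIGN_STYLES = {"L": "left", "C": "center", "R": "right"}


def csv_to_html_table(csv_text: str, header: int = 1, aligns: str = "", caption: str = "") -> str:
    """Render CSV text as an HTML table; the first `header` rows use <th> cells."""
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    lines = ["<table>"]
    if caption:
        lines.append(f"<caption>{html.escape(caption)}</caption>")
    for row_index, row in enumerate(rows):
        tag = "th" if row_index < header else "td"
        cells = []
        for col_index, value in enumerate(row):
            letter = aligns[col_index] if col_index < len(aligns) else "D"
            align = ALIGN_STYLES.get(letter)
            style = f' style="text-align: {align}"' if align else ""
            cells.append(f"<{tag}{style}>{html.escape(value)}</{tag}>")
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


table_html = csv_to_html_table(
    "Name,Count,Status\nAlice,42,Active\nBob,17,Pending",
    header=1,
    aligns="LCR",
    caption="My Table",
)
```

Here `table_html` contains one header row built from `Name,Count,Status` and two body rows, with the three columns aligned left, center, and right.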
Template Variables
The following variables are available in templates:
| Variable | Description | Example |
|---|---|---|
| frontmatter | Full frontmatter dict from the current document | {"title": "My Page"} |
| file_meta.path | Relative path without extension | blog/post-1 |
| file_meta.path_html | HTML output path | blog/post-1.html |
| file_meta.path_raw | Original file path | content/blog/post-1.md |
| file_meta.path_to_root | Relative path prefix to site root (no trailing slash) | . or .. or ../.. |
| file_meta.modified_time | Unix timestamp of last modification | 1706380800.0 |
| file_meta.modified_time_str | Human-readable modification time | 2024-01-27 12:00:00 |
| config | Serialized site configuration | {"output_dir": "docs"} |
| docs | Dictionary of all documents in the site | {"index": {...}} |
| child_docs_dotlist | Documents matching by path prefix | {"blog.post-1": {...}} |
| child_docs_folder | Documents in the same directory | {"about": {...}} |
| dir_files | List of all filenames in the directory | ["index.md", "about.md"] |
| dir_subdirs | List of subdirectory names | ["images", "posts"] |
| dir_contents_recursive | List of all files recursively (relative paths) | ["images/logo.png"] |
| content | Rendered HTML content (in final template only) | <p>Hello</p> |
All frontmatter fields are also available directly (e.g., {{ title }}).
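The relationship between file_meta.path_html and file_meta.path_to_root can be sketched as a function of directory depth. This is an illustrative reconstruction from the examples in the table, with a hypothetical helper name, not the library's own code.

```python
from pathlib import PurePosixPath


def path_to_root(path_html: str) -> str:
    """Return the relative prefix from a page back to the site root, no trailing slash."""
    # Number of directories the page sits below the output root
    depth = len(PurePosixPath(path_html).parts) - 1
    return "/".join([".."] * depth) if depth > 0 else "."
```

A template can then build root-relative links by prefixing a resource path with this value and a slash, for instance to reach resources/style.css from a page inside blog/.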
Frontmatter Formats
Frontmatter can be written in YAML, JSON, or TOML:
YAML (recommended):
---
title: My Page
tags: [foo, bar]
---
JSON:
;;;
{"title": "My Page", "tags": ["foo", "bar"]}
;;;
TOML:
+++
title = "My Page"
tags = ["foo", "bar"]
+++
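A rough sketch of how the delimiter could select the frontmatter format is shown below. The function `split_frontmatter` and its return shape are hypothetical; it only separates the raw frontmatter from the body and names the format, leaving the actual parsing to a YAML, JSON, or TOML parser.

```python
from typing import Optional

# Opening/closing delimiter for each frontmatter format, as listed above
FRONTMATTER_DELIMITERS = {"---": "yaml", ";;;": "json", "+++": "toml"}


def split_frontmatter(text: str) -> tuple[Optional[str], str, str]:
    """Return (format, raw_frontmatter, body); format is None if no frontmatter is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() not in FRONTMATTER_DELIMITERS:
        return None, "", text
    delimiter = lines[0].strip()
    for index in range(1, len(lines)):
        # The frontmatter ends at the next line equal to the opening delimiter
        if lines[index].strip() == delimiter:
            raw = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return FRONTMATTER_DELIMITERS[delimiter], raw, body
    return None, "", text
```

For a JSON document, the returned raw string can be passed straight to `json.loads`; the YAML and TOML cases would need their respective parsers.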
Per-file Overrides
Override global settings in frontmatter:
---
title: My Page
__template__: custom.html.jinja2 # Use different template
__pandoc__:
  toc: true # Override pandoc options
  number-sections: true
---
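One way to picture the override is a merge where per-file values win over global ones. The sketch below assumes a shallow, key-by-key merge and uses a hypothetical helper `resolve_page_settings`; how pdj-sitegen actually combines the two levels may differ.

```python
def resolve_page_settings(config: dict, frontmatter: dict) -> dict:
    """Combine global config with per-file frontmatter; frontmatter takes precedence."""
    return {
        # Fall back to the site-wide default template unless the page names its own
        "template": frontmatter.get("__template__", config["default_template"]),
        # Shallow merge (assumed): per-file pandoc options replace global ones key by key
        "pandoc": {**config.get("__pandoc__", {}), **frontmatter.get("__pandoc__", {})},
    }


settings = resolve_page_settings(
    {"default_template": "default.html.jinja2", "__pandoc__": {"mathjax": True, "toc": False}},
    {
        "title": "My Page",
        "__template__": "custom.html.jinja2",
        "__pandoc__": {"toc": True, "number-sections": True},
    },
)
```

In this example `settings["template"]` is the per-file template, and the merged pandoc options keep the global `mathjax` setting while taking `toc` and `number-sections` from the frontmatter.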
similar tools/resources
This project is a descendant of my old project pandoc-sitegen, which was very similar but used mustache templates instead of jinja2.
Some other similar projects:
- https://github.com/brianbuccola/brianbuccola.github.io
- https://runningcrocodile.fi/pandoc_static_site/
- http://pdsite.org/installing/
- https://github.com/locua/pandoc-python-static-site-gen
- https://github.com/lukasschwab/pandoc-blog
- https://github.com/fcanas/bake
If you end up using this script for your site and would like me to list it here, email me or submit a PR :)