Ruamel Yaml Doc preprocessor (pronounced: /rɑɪt/ like the verb "write")

## Project description

ryd ( /rɑɪt/, pronounced like the verb “write” ) is a preprocessor for text-based documents that builds upon the multi-document capabilities of YAML files/streams.

The use of multiple documents in ryd allows for a clear separation between document text and any programs referenced in that text. This makes it possible to run (or, where applicable, compile) the program parts of a document, e.g. to check whether they are syntactically correct. ryd can also capture the actual output of those programs and include it in the document. It is also possible to recognise different kinds of documents, run a different formatter on each, and then recombine the results.

This allows for easier maintenance of (correct) program sources in document source texts like reStructuredText, LaTeX, Markdown, etc.

The first document in a ryd file has a mapping at the root level. This mapping is the ryd configuration metadata for the rest of the stream of documents in the file. The metadata defines which ryd document version is used, what the basic text style is (currently rst for reStructuredText, so for Stack Overflow Markdown), whether any postprocessing (PDF, HTML) needs to be done, and other configuration information. This first document doesn’t normally have any directives. That the document is YAML 1.2 is implicit, so no %YAML 1.2 directive is needed. Without a directive and without a preceding document, there is also no directives-end marker line (---) at the top.

The documents following the first document are normally block style literal scalars with an optional tag. The tag influences how the scalar string is processed within the selected output text style.

## Example

version: 0.2
text: rst
fix_inline_single_backquotes: true
--- |
Example Python program
++++++++++++++++++++++

This is an example of a Python program:
--- !python |
n = 7
print(n**2 - n)
--- !stdout |


This will generate (using ryd convert test.ryd) the following test.rst:

Example Python program
++++++++++++++++++++++

This is an example of a Python program:

.. code:: python

  n = 7
  print(n**2 - n)

.. code::

  42


## Postprocessing

The output can be converted to PDF using rst2pdf, or to HTML using rst2html, with images embedded using webpage2html. Invocation of these programs can be specified in the metadata (e.g. post: pdf).

The (image embedded) HTML output has the indentation of included code fragments inserted as actual spaces, so you can copy and paste program code (or YAML) from the HTML without a problem, something that is not possible with PDF files generated by rst2pdf.

ryd generates its text output stand-alone, but the programs actually used for postprocessing have to be installed separately and be available in your PATH.

(There is currently no postprocessing for Markdown, as pandoc is not working on Arch (March 2022).)

## Config

You can create a file ~/.config/ryd/ryd.yaml with defaults for the command-line options. For example, to always embed images when converting to HTML and always enable the (global) verbose option:

global:
  verbose: true
convert:
  embed: true

### Command-line options

The command-line of ryd consists of multiple components:

ryd [--global-option] command [--options] [arguments]


Although not indicated, most global options can also occur after the command.

#### commands

convert             generate output as per first YAML document
from-rst (fromrst)  convert .rst to .ryd


You’ll most often use convert. It takes one or more filenames as arguments and generates output as specified in the ryd configuration data. Some options allow you to override settings there (e.g. --pdf and --no-pdf).

The command from-rst converts a .rst file into a .ryd file, doing some section underline checking and adding the ryd configuration data document.

Running ryd command --help may show extra options that have not yet made it into the documentation and/or that are incompletely implemented.

### Documents and document tags

Each YAML document has to be separated from other documents in the stream by at least the directives-end marker ---. Apart from the first document, most documents contain a single, multi-line, non-indented scalar. The directives-end marker is therefore followed by the pipe (|) symbol, which is the YAML indicator for a multi-line literal scalar.

That scalar can be “typed” in the normal YAML way by inserting a tag before the |. E.g. a document that is a Python program has the tag !python and thus starts with:

--- !python |


What exactly a document tag does depends on the tag, but potentially also on the selected output file format, on previously processed tagged documents, on other .ryd files processed previously, and on the environment.
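
To make the stream layout concrete, here is a minimal sketch in plain Python that splits a ryd-style stream on its directives-end markers. It is not how ryd itself reads files; the function name `split_ryd` and the regular expression are assumptions made for illustration, and only the `--- |` and `--- !tag |` forms are recognised.

```python
import re

# Matches a directives-end marker that starts a literal scalar document,
# optionally with a tag: "--- |" or "--- !python |".
# (Illustrative pattern only, not taken from ryd.)
MARKER = re.compile(r'^---(?: +!(?P<tag>[\w-]+))? +\|\s*$')


def split_ryd(stream):
    """Return (header_text, [(tag_or_None, body_text), ...])."""
    header = []
    docs = []
    current = None  # lines of the document currently being collected
    for line in stream.splitlines():
        m = MARKER.match(line)
        if m:
            current = []
            docs.append((m.group('tag'), current))
        elif current is None:
            header.append(line)  # still inside the first (metadata) document
        else:
            current.append(line)
    return '\n'.join(header), [(tag, '\n'.join(body)) for tag, body in docs]


example = """\
version: 0.2
text: rst
--- |
This is an example of a Python program:
--- !python |
n = 7
print(n**2 - n)
--- !stdout |
"""

header, docs = split_ryd(example)
for tag, body in docs:
    print(tag, repr(body))
```

The first element of the result holds the untouched metadata mapping; each following pair carries the tag (or `None` for plain text) and the scalar body.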

In addition to the basic tag (like !python), a tag can have subfunctions such as !python-pre. If an unknown subfunction is specified you’ll get a runtime error. The following are short descriptions for all tags, independent of the selected output format:

!code
Include program in text. Do not mark as executable, doesn’t influence !stdout.
!comment
The whole document will be discarded, i.e. not included in the output.
!inc
Include the content of the listed files (indented), without other processing, into the output. Precede with :: if necessary.
!inc-raw
Include the content of the listed files (indented), without other processing, into the output. Precede with :: if necessary.
!lastcompile
Include output from last compilation as code.
!nim
Include Nim program in text. Prefix and mark as executable.
!nim-pre
Include Nim program in text. Prefix and mark as executable.
!python
Include Python program in text. Prefix and mark as executable.
!python-hidden
Include Python program in text. Prefix and mark as executable.
!python-pre
Include Python program in text. Prefix and mark as executable.
!stdout
Include output from last executable document (e.g. !python) as code.
!stdout-raw
Include output from last executable document (e.g. !python) as code.
!yamlout
Include output from last executable document (e.g. !python) as code tagged as YAML document.
!zig
Include Zig program in text. Prefix and execute setting !stdout.
!zig-pre
Include Zig program in text. Prefix and execute setting !stdout.
!zsh
Run each line in zsh, interspersing the lines with the output.
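
The relation between a base tag and its subfunctions can be pictured as a method lookup on a handler object. The following is a sketch under that assumption: `PythonHandler`, `dispatch` and `UnknownSubfunction` are hypothetical names, loosely modeled on the `pre` and `__call__` methods of the user-defined Mypy handler shown further on in this document, and not ryd's actual code.

```python
class UnknownSubfunction(RuntimeError):
    """Raised when a tag names a subfunction the handler does not provide."""


class PythonHandler:
    """Hypothetical handler: the base tag maps to __call__, subfunctions to methods."""

    def __init__(self):
        self.pre_text = ''
        self.seen = []

    def __call__(self, text):
        # !python: record the program, prefixed with the active -pre text
        self.seen.append(self.pre_text + text)

    def pre(self, text):
        # !python-pre: remember the prefix for later documents
        self.pre_text = text


def dispatch(handlers, tag, text):
    """Split 'python-pre' into base 'python' and subfunction 'pre', then call it."""
    base, _, sub = tag.partition('-')
    handler = handlers[base]
    if not sub:
        return handler(text)
    method = getattr(handler, sub, None)
    if not callable(method):
        raise UnknownSubfunction(f'!{base} has no subfunction {sub!r}')
    return method(text)


handlers = {'python': PythonHandler()}
dispatch(handlers, 'python-pre', 'import sys\n')
dispatch(handlers, 'python', 'print(sys.version_info.major)\n')
print(handlers['python'].seen[0])
```

An unknown subfunction such as `python-nope` raises `UnknownSubfunction` in this sketch, mirroring the runtime error mentioned above.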

## RST

The output to .rst expects non-code YAML documents to be valid reStructuredText. Any non-tagged documents, i.e. those starting with:

--- |


are assumed to be text input, in the format specified in the ryd configuration data.

### Section underlining

Because --- (and ...) has a special meaning at the beginning of a line when followed by a newline or space, the section under/over-line characters used in .ryd files that are the source for .rst should not be - or . if any of the section names consists of three letters (e.g. a section named API or RST). It is recommended to use the following scheme:

Sections, subsections, etc. in .ryd files
# with over-line, for parts
* with over-line, for chapters
=, for sections
+, for subsections
^, for sub-subsections
", for paragraphs

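The recommended scheme, and the reason for avoiding - and . as adornment characters, can be sketched in a few lines of Python. The names `SCHEME`, `heading` and `clashes_with_yaml` are made up for this illustration; ryd does not necessarily expose anything like them.

```python
# Recommended adornment characters; parts and chapters also get an over-line.
# (Illustrative table, following the scheme listed above.)
SCHEME = {
    'part': ('#', True),
    'chapter': ('*', True),
    'section': ('=', False),
    'subsection': ('+', False),
    'subsubsection': ('^', False),
    'paragraph': ('"', False),
}


def heading(title, level):
    """Return the title with its under-line (and over-line where recommended)."""
    char, overline = SCHEME[level]
    rule = char * len(title)
    lines = [rule, title, rule] if overline else [title, rule]
    return '\n'.join(lines)


def clashes_with_yaml(line):
    """True if the line would be read as a document start (---) or end (...) marker."""
    return line[:3] in ('---', '...') and (len(line) == 3 or line[3] == ' ')


print(heading('API', 'section'))
print(clashes_with_yaml('---'), clashes_with_yaml('==='), clashes_with_yaml('-----'))
```

A three-letter title underlined with - would produce exactly the line `---`, which is why the check only fires for three-character runs.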

### Single backquotes

The setting fix_inline_single_backquotes: true tells ryd to report lines that contain single backquotes that need fixing (by replacing them with double backquotes):

README.ryd
47: this will generate (ryd convert test.ryd) the following
--^
--^


(If you are used to inline code markup that uses single backquotes, e.g. on Stack Overflow, you’ll come to appreciate this.)
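
A check of this kind can be approximated with a regular expression. The sketch below is an assumption about how such a report could be produced, not ryd's implementation; `SINGLE`, `find_single_backquotes` and `report` are hypothetical names. It also flags legitimate single-backquote constructs in reStructuredText, such as roles and hyperlink references, which a real check would have to exclude.

```python
import re

# An opening backquote that is not part of a double-backquote pair, some text,
# and a closing backquote that is not doubled either. (Illustrative pattern.)
SINGLE = re.compile(r'(?<!`)`(?!`)[^`\n]+`(?!`)')


def find_single_backquotes(text):
    """Return (line_number, column) pairs; lines count from 1, columns from 0."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in SINGLE.finditer(line):
            hits.append((lineno, m.start()))
    return hits


def report(text):
    """Show each offending line followed by a caret under the opening backquote."""
    lines = text.splitlines()
    out = []
    for lineno, col in find_single_backquotes(text):
        prefix = f'{lineno}: '
        out.append(prefix + lines[lineno - 1])
        out.append(' ' * (len(prefix) + col) + '^')
    return '\n'.join(out)


sample = 'Use ``ok`` here\nbut `bad` there\n'
print(report(sample))
```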

### Python

Python code is indicated by:

--- !python |


The document is inserted into the .rst preceded by .. code:: python and each line with a two space indent.
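
As an illustration of that transformation, the following sketch renders a program as an rst code directive with a two-space indent. The helper name `as_rst_code` is hypothetical and the exact whitespace ryd emits may differ.

```python
import textwrap


def as_rst_code(source, language=''):
    """Render source as an rst code directive with a two-space indented body."""
    directive = f'.. code:: {language}'.rstrip()
    body = textwrap.indent(source.rstrip('\n'), '  ')
    return f'{directive}\n\n{body}\n'


print(as_rst_code('n = 7\nprint(n**2 - n)\n', 'python'))
```

Calling it without a language gives a bare `.. code::` directive, which is the form used for captured output in the example at the top of this document.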

If your program relies on specific packages, those packages need to be available in the environment in which ryd is started (which can e.g. be a specifically set up virtualenv).

It is possible to have “partial programs” by preceding a Python document with e.g.:

--- !python-pre |
from __future__ import print_function
import sys
import ruamel.yaml
from ruamel.std.pathlib import Path, pushd, popd, PathLibConversionHelper
pl = PathLibConversionHelper()


Such a block is prepended to all following --- !python | documents (until superseded by another --- !python-pre | block).
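
The "until superseded" behaviour can be sketched as a single pass over the documents that keeps track of the active prefix. `assemble_programs` and the `(tag, text)` pairs are assumptions for this illustration, not ryd's internal representation.

```python
def assemble_programs(documents):
    """Yield the full source for each !python document, with the active prefix."""
    prefix = ''
    for tag, text in documents:
        if tag == 'python-pre':
            prefix = text  # supersedes any earlier prefix
        elif tag == 'python':
            yield prefix + text


# Hypothetical stream: two prefixes, each applying to the program that follows it.
documents = [
    ('python-pre', 'import sys\n'),
    ('python', 'print(sys.argv[0])\n'),
    (None, 'Some explanatory text.\n'),
    ('python-pre', 'import os\n'),
    ('python', 'print(os.sep)\n'),
]
programs = list(assemble_programs(documents))
print(programs[1])
```

The second program only sees `import os`, because the later prefix replaces the earlier one rather than adding to it.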

### Captured output

The output from the last program that was run (--- !python |) is stored, and is appended to the text of a document that is tagged !stdout (i.e. --- !stdout |).

### non-running code

A document tagged !code is rendered like one tagged !python, but the code is not run (and hence the output used for !stdout is not changed).
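
The interplay of !python, !code and !stdout can be sketched as follows. This is a simplification under stated assumptions: `process` is a hypothetical function, and it runs programs in-process with `exec` as a stand-in, whereas the user-defined handler later in this document runs its tool as a separate process.

```python
import contextlib
import io


def process(documents):
    """Return the output-side view of each document as (kind, text) pairs."""
    last_output = ''
    result = []
    for tag, text in documents:
        if tag == 'python':
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                # stand-in for running the program; captures what it prints
                exec(text, {'__name__': '__main__'})
            last_output = buffer.getvalue()
            result.append(('code', text))
        elif tag == 'code':
            result.append(('code', text))  # shown like !python, but never run
        elif tag == 'stdout':
            result.append(('output', last_output))
    return result


documents = [
    ('python', 'n = 7\nprint(n**2 - n)\n'),
    ('code', 'print("this is never executed")\n'),
    ('stdout', ''),
]
result = process(documents)
print(result[-1])
```

Because the !code document is skipped by the runner, the !stdout document still shows the output of the earlier !python document.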

### Zig

Zig code is indicated by:

--- !zig |


The document is inserted as with Python, there can be a !zig-pre document, and output is captured and displayed with --- !stdout |:

// const std = @import("std");   is defined in zig-pre

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    try stdout.print("Hello, {s}!\n", .{"world"});
}


which outputs:

Hello, world!


The compilation is done with option build-exe.

### Nim

Nim code is indicated by --- !nim |, and there can be a !nim-pre document:

let a = 123
let x = 0b0010_1010
echo(fmt"The answer to the question: {x}")


which outputs:

The answer to the question: 42


The compilation is done with options --verbosity:0 --hint[Processing]:off.

#### compiler output

If you are interested in the textual output of the compiler, you can use --- !lastcompile |:

/tmp/ryd-of-anthon/ryd-1169/tmp_02.nim(4, 5) Hint: 'a' is declared but not used [XDeclaredButNotUsed]


### Comments

Block style literal scalars do not allow YAML comments. To insert comments in a text, either use the format accepted by the output, e.g. when generating .rst use:

..
   this will show up in the resulting .rst file, but will
   not render


Alternatively you can create a comment YAML document (--- !comment |), for which the text will not be represented in the output file format at all.

If you already have a tagged document, e.g. a --- !python | document, you can turn it into a comment by inserting comment- before the tag name:

--- !comment-python |


This has been implemented by not reporting an error when an unknown subfunction is invoked on !comment.

[ ] not done yet
[v] done
[x] no longer going to do


resulting in

☐ not done yet

☑ done

☒ no longer going to do

(it would be nice to know if there is a way to create a real list with user specified bullet items)
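
The mapping from bracket markers to box characters shown above can be written down directly. How and where ryd applies it is not described here, so the following is only a sketch with hypothetical names (`BOXES`, `convert_checkboxes`); it also does not reproduce the blank lines between the items in the output.

```python
# Marker-to-character mapping as shown in the example above.
BOXES = {'[ ]': '☐', '[v]': '☑', '[x]': '☒'}


def convert_checkboxes(text):
    """Replace a leading [ ], [v] or [x] marker on each line by a box character."""
    converted = []
    for line in text.splitlines():
        marker = line[:3]
        if marker in BOXES:
            line = BOXES[marker] + line[3:]
        converted.append(line)
    return '\n'.join(converted)


todo = '[ ] not done yet\n[v] done\n[x] no longer going to do'
print(convert_checkboxes(todo))
```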

Before trying to load a tag !yourtag from its known files, ryd tries to load it from ~/.config/ryd/tag/. This mechanism can be used to implement your own improvements to existing tags, or to extend the set of tags with your own.

Let’s assume you want to explain the use of mypy in your ryd document, including output of a mypy run on some source. First create a file ~/.config/ryd/tag/mypy.tag with the following content:

# coding: 'utf-8'

from __future__ import annotations

import os
import subprocess
from typing import Any, TYPE_CHECKING
from ryd._tag._handler import ProgramHandler

if TYPE_CHECKING:
    from ryd._convertor._base import ConvertorBase
else:
    ConvertorBase = Any

class Mypy(ProgramHandler):   # class name is capitalization of the stem of the filename
    def __init__(self, convertor: ConvertorBase) -> None:
        super().__init__(convertor)
        self._pre = ''

    def pre(self, d: Any) -> None:  # like !python-pre you can have !mypy-pre
        self._pre = str(d)

    def __call__(self, d: Any) -> None:
        """
        Include Python program in text. Prefix, save and run mypy, setting !stdout.
        """
        s = str(d)
        # depending on the util, you may not need to do a chdir to the tempdir
        old_dir = os.getcwd()
        self.c.temp_dir.chdir()
        path = self.c.temp_file_path('.py')
        path.write_text(self._pre + s)
        self.c.last_output = subprocess.run([
            'mypy',
            '--strict', '--follow-imports', 'silent', '--implicit-reexport',
            str(path),
        ], stderr=subprocess.STDOUT, stdout=subprocess.PIPE, encoding='utf-8').stdout
        os.chdir(old_dir)
        self.c.add_code(s, 'python')  # format the code as python


and you include in your ryd document:

--- !mypy |
def main(arg1, arg2):
    return arg1

--- !stdout |
which gives:

--- |

from the mypy output you can see ....


Your .rst will then contain the Python source and the mypy output:

.. code:: python

  def main(arg1, arg2):
      return arg1

which gives:

.. code::

  tmp_03.py:2: error: Function is missing a type annotation
  Found 1 error in 1 file (checked 1 source file)

from the mypy output you can see ....
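
The lookup order and naming convention described above (user directory first, class name derived from the file stem) could be sketched as follows. The function names, the location of the bundled tag files, and the assumption that bundled tags also use the `.tag` extension are all hypothetical; only the user directory and the capitalization rule come from this document.

```python
from pathlib import Path


def handler_location(tag, user_dir, builtin_dir):
    """Return (candidate_paths, class_name, subfunction) for a document tag."""
    stem, _, subfunction = tag.lstrip('!').partition('-')
    filename = f'{stem}.tag'
    # user files come first, so they shadow the bundled ones
    candidates = [Path(user_dir) / filename, Path(builtin_dir) / filename]
    return candidates, stem.capitalize(), subfunction or None


def first_existing(candidates):
    """Pick the first candidate that exists on disk, or None."""
    for path in candidates:
        if path.is_file():
            return path
    return None


# '~' is left unexpanded here; real use would call Path.expanduser().
# The bundled directory name is a placeholder, not ryd's actual layout.
candidates, class_name, subfunction = handler_location(
    '!mypy-pre', '~/.config/ryd/tag', '/hypothetical/ryd/_tag'
)
print(candidates[0].name, class_name, subfunction)
```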


## History

ryd grew out of an in-house solution where sections of reStructuredText files were updated, in-place, by running Python programs specified in separate files, which also allowed the inclusion of the (error) output.

An example of this can be seen in this old version of the example.rst file of the ruamel.yaml package:

Basic round trip of parsing YAML to Python objects, modifying
and generating YAML::

  import sys
  from ruamel.yaml import YAML

  inp = """\
  # example
  name:
    # details
    family: Smith   # very common
    given: Alice    # one of the siblings
  """

  yaml = YAML()
  code = yaml.load(inp)
  code['name']['given'] = 'Bob'

  yaml.dump(code, sys.stdout)

.. example code small.py

Resulting in ::

  # example
  name:
    # details
    family: Smith   # very common
    given: Bob      # one of the siblings

.. example output small.py


The program was inserted before the .. example code line and its output before .. example output, replacing all the text starting after the previous :: marker.

Here small.py refers to a separate file containing this piece of code. This resulted in multiple source files that were associated with a single .rst file. There was no mechanism to have partial programs that could be tested by execution, which also precluded getting output from such programs.

Although the code could have been edited in place, and used to get the output, this would force one to use the extra indentation required for lines following ReST’s ::.

Once this system came under review, the solution with a structured YAML header, as used with various file formats, combined with multiple documents consisting of (tagged) top-level, non-indented, block style literal scalars, was chosen instead.

In early 2022 an update of the 0.1 format was implemented to make tags and convertors into separate files, thereby making them more easily upgradable and extensible.
