Skip to main content

Yet Another Word Processor, an automatic word processor for text and Python files

Project description

usage: yawp [-h] [-V] [-v] [-u] [-t TAB_BLANKS] [-w WIDTH] [-l] [-g] [-c] [-p]
            file

Yet Another Word Processor, an automatic word processor for text and Python files

                                    I sound my barbaric yawp over the roofs of the world
                                                                          (Walt Whitman)

CONTENT

    1.  Installation
    2.  Help
    3.  Introduction
    4.  Logic
    5.  Pretty Graphics
    6.  Pretty Chapters
    7.  Python Files
    8.  Messages
    9.  History
    10. Author
    11. Arguments

1. INSTALLATION

Type from terminal:

    $ pip3 install yawp

2. HELP

Type from terminal:

    $ yawp -h

3. INTRODUCTION

The  name  "yawp"  here  means Yet Another Word Processor, and yawp is a simple automatic
word processor for text files and Python files, with the following features:

    • yawp makes a timestamped backup of file to be processed, allowing "undo" operation
    • yawp works "in place", reading formatting and rewriting the file to be processed
    • yawp operation is driven only by the content of the file and by a few parameters
    • yawp justifies "text" at left and right in:
        • unindented paragraphs
        • dot-marked indented paragraphs
    • yawp  accepts  unjustified "pictures" (as schemas, tables and code examples) freely
      intermixed with text
    • yawp's "pretty graphics" feature allows you to sketch draft pictures with lines and
      arrowheads  by  '#'  and '@' characters, which are automatically replaced by proper
      graphic characters
    • yawp's  "pretty  chapters"  feature ensures automatic multi-level chapter numbering
      and generation of table of contents
    • yawp  adopts  an ad hoc policy with Python files, formatting the docstrings but not
      the Python code
    • yawp  is  "stable",  if after a run of yawp on a file you run yawp a second time on
      the same file with the same arguments then the file content doesn't change

As  an  example,  this  documentation  you're  reading  has been formatted by yawp. Other
examples are scattered below.

4. LOGIC

"Pretty text" is an optional feature, it's activated by the -t/--pretty-text argument. If
is  not set, then input lines are not justified, otherwise text justification takes place
as follows.

Let's distinguish in four categories the file lines:

    • a  line  is  an  "empty  line" if it contains no characters (note that in all input
      lines all trailing blanks are stripped away, hence every input line containing only
      blanks becomes an empty line)
    • otherwise a line is a "dot line" if the first nonblank character is a dot character
      '.' or '•' followed by a blank (on output such a '.' is always replaced by a '•')
    • otherwise a line is an "indented line" if it starts with a blank character
    • otherwise a line is an "unindented line"

More exactly, the "dot characters" are:

    • '.' is Unicode "decimal point", Python chr(46) or '\x2e'
    • '•' is Unicode "black small circle", Python chr(8226) or '\u2022'

The yawp algorithm, driven by the input lines, oscillates between two states:

    • "picture state", where input lines are directly written out as they are
    • "text state", where input lines are accumulated into a paragraph buffer for further
      justification and writing at paragraph end

The picture state is the initial state. In this state, if the input is:

    • an empty line or an indented line: the line is written out as is
    • an unindented line: text state is entered, an unindented paragraph begins, the line
      is  shrinked  (initial  and  multiple  blanks  are  eliminated) and assigned to the
      paragraph buffer, paragraph left indentation is set to zero
    • a  dot  line:  text  state  is  entered,  an indented paragraph begins, the line is
      shrinked and assigned to the paragraph buffer, paragraph left indentation is set to
      the position of initial dot character plus two
    • end of input file: processing is terminated

When we are in text state, if the input line is:

    • an  empty  line:  the  paragraph  buffer  is  flushed  (justified,  written out and
      emptied), state goes back to picture state, the empty line is written out
    • an  indented or unindented line: the line is shrinked and appended to the paragraph
      buffer
    • a  dot  line:  paragraph buffer is flushed, a new paragraph is started, the line is
      shrinked and assigned to the paragraph buffer, paragraph left indentation is set to
      the position of initial dot plus two
    • end of input file: paragraph buffer is flushed, processing is ended

For sake of clarity, the following diagram illustrates states and transitions.

        empty line,                                                 unindented line,
        indented line:                                              indented line:
        write                                                       append
        ┌──────────┐                                                ┌──────────┐
        │          │                                                │          │
        │          │                                                │          │
        │    ┌─────┴─────┐ unindented line, dot line: assign  ┌─────┴─────┐    │
        └───▷│           ├───────────────────────────────────▷│           │◁───┘
             │  picture  │      empty line: flush, write      │  text     │
   ─────────▷│           │◁───────────────────────────────────┤           │
             │  state    │                                    │  state    │
             │           │                             ┌──────┤           │◁───┐
             └─────┬─────┘                             │      └─────┬─────┘    │
                   │                                   │            │          │
                   │                                   │            │          │
        ◁──────────┘                        ◁──────────┘            └──────────┘
        end of file:                        end of file:            dot line:
        stop                                flush, stop             flush, assign

Actions associated to transitions are:

    • "write": the input line is immediately written out as is
    • "assign":  the  input  line  is  shrinked  (initial,  multiple and final blanks are
      removed) and assigned to the paragraph buffer
    • "append": the input line is shrinked and appended to the paragraph buffer
    • "flush": the paragraph buffer is flushed (justified, written out and emptied)
    • "stop": execution is finished

5. PRETTY GRAPHICS

"Pretty  graphics"  is  an  optional  feature, it's activated by the -g/--pretty-graphics
argument.

In  a  picture  you  can  sketch  draft  pictures  with  lines  and arrows by two special
characters:

    • use '#' to draw horizontal and vertical lines
    • use '@' to mark an arrowhead

Each  such  character  is  (possibly)  replaced  by yawp with a proper graphic character,
depending  on  the  other  four  characters  around (over, under, at left, and at right).
Isolated characters are not replaced.

This feature is active in image mode only, so it does not work in paragraphs.

An example:

    $ cat graphics.txt
           #############
           # x - y     #
         x #   #############               ######
           #   # x & y #   #               #N#N2#
           #############   # y             ######
               #     y - x #               #0# 0#
               #############               #1# 1#
                                           #2# 4#
                #####     #####            #3# 9#
          #####@# A #####@# B #@#####      #4#16#
                #####     #####     #      #5#25#
                  @         #       #      #6#36#
                  #         #       #      #7#49#
                  #         @       #      #8#64#
                #####     #####     #      #9#81#
          @###### D #@##### C #@#####      ######
                #####     #####
    $ yawp -v -w55 -g -p graphics.txt
    ... Backup of file '~/graphics.txt' into file '~/graphics-2022.01.03-17.14.27.txt'
    ... Rewriting file '~/graphics.txt'
    ... Printing file '~/graphics.txt'
           ┌───────────┐
           │ x - y     │
         x │   ┌───────┼───┐               ┌─┬──┐
           │   │ x & y │   │               │N│N2│
           └───┼───────┘   │ y             ├─┼──┤
               │     y - x │               │0│ 0│
               └───────────┘               │1│ 1│
                                           │2│ 4│
                ┌───┐     ┌───┐            │3│ 9│
          ─────▷│ A ├────▷│ B │◁────┐      │4│16│
                └───┘     └─┬─┘     │      │5│25│
                  △         │       │      │6│36│
                  │         │       │      │7│49│
                  │         ▽       │      │8│64│
                ┌─┴─┐     ┌───┐     │      │9│81│
          ◁─────┤ D │◁────┤ C │◁────┘      └─┴──┘
                └───┘     └───┘
    $

6. PRETTY CHAPTERS

"Pretty  chapters"  is  an  optional  feature, it's activated by the -c/--pretty-chapters
argument.  If  set,  it ensures automatic multi-level chapter numbering and generation of
table of contents.

If pretty chapters is active, the file must contain:

    • a "content line", containing the title of the table of content
    • one or more "chapter lines", containing a multi-level numbering and a chapter title

A line is a content line if:

    • is the first line or is preceded by an empty line
    • is followed by an empty line
    • starts with an uppercase letter
    • contains any character but lowercase letters

A content line must precede all chapter lines. Examples:

    • 'CONTENT'
    • 'TABLE OF CONTENT'
    • 'INDEX OF CHAPTERS'

A line is a chapter line if:

    • is preceded by an empty line
    • is the last line or is followed by an empty line
    • contains:
        • one or more unsigned decimal integer constants, each followed by a '.' dot
        • a blank
        • a chapter title, containing any character but lowercase letters

The "level" of chapter line is the count of number-dot couples in its prefix, examples:

    • '12345. A LEVEL-1 CHAPTER LINE'
    • '1.345. A LEVEL-2 CHAPTER LINE'
    • '0.0.0. A LEVEL-3 CHAPTER LINE'

Chapter lines must follow two sequence rules:

    • first chapter line must be a level-1 chapter line
    • each  other  chapter  line can have a level between 1 and the level of the previous
      chapter line plus 1

Numbers  in  input  don't  matter,  they  are  replaced by the right ones, only the level
matters.

Lines  between the content line and the first chapter line are suppose to contain the old
table  of content, hence they are deleted and replaced by the new automatically generated
table of content.

This  feature  is active in text mode only, so it does not work in pictures. For what has
been  said  we  can  observe  that  each  content line or chapter line must be a one-line
unindented paragraph.

An  error  in content line or in chapter lines could erase a piece of your file, so after
yawp  processing  check  the  result  and  if  needed  go back to previous version by the
-u/--undo argument.

An example:

    $ cat chapters.txt
        TITLE OF DOCUMENT

    TABLE OF CONTENTS

        (old table of content
         will be replaced
         by the new one)

    0. AAA AAA

    ...

    32.33. BBB BBB

    0.0. CCC CCC

    0. DDD DDD

    0.0. EEE EEE

    0.0.0. FFF FFF

    0.0. GGG GGG
    $ yawp -v -w55 -c -p chapters.txt
    ... Backup of file '~/chapters.txt' into file '~/chapters-2022.01.03-17.01.14.txt'
    ... Rewriting file '~/chapters.txt'
    ... Printing file '~/chapters.txt'
        TITLE OF DOCUMENT

    TABLE OF CONTENTS

        1.     Aaa Aaa
        1.1.   Bbb Bbb
        1.2.   Ccc Ccc
        2.     Ddd Ddd
        2.1.   Eee Eee
        2.1.1. Fff Fff
        2.2.   Ggg Ggg

    1. AAA AAA

    ...

    1.1. BBB BBB

    1.2. CCC CCC

    2. DDD DDD

    2.1. EEE EEE

    2.1.1. FFF FFF

    2.2. GGG GGG
    $

7. PYTHON FILES

Python  files  deserve  a  special  treatment.  If  the textfile filename ends with '.py'
extension, then we suppose the file is a Python source, hence we are interested to format
docstrings and not Python code. So the formatting function is alternatively turned on and
off by switch lines. A "switch line" is a line containing a "'''" string.

Note that yawp never formats switch lines, formatting takes place from the line after the
"on" switch line until the line before the next "off" switch line.

So your Python file must follow some simple rules:

    • docstrings to be formatted must start and ended by "'''" and not '"""'
    • long strings not to be formatted must start and end with '"""' and not "'''"
    • a "'''" inside a string must be coded as "\'\'\'"

An  error  in switch lines could format and destroy your Python code. A preliminary check
prints an error message and stops execution before file formatting if the total number of
switch  lines  is  odd. This should intercept 90% of errors, anyway after yawp processing
check the result and if needed go back to previous version by the -u/--undo argument.

An example:

    $ cat pycode.py
    #!/usr/bin/python3

    ''' Text in "on" switch line is not formatted.
    This is a one-line unindented paragraph.

    This is a multi-line unindented paragraph.
    This is a multi-line unindented paragraph.
    This is a multi-line unindented paragraph.

        This is a picture, it remains as is.
                This is a picture, it remains as is.
            This is a picture, it remains as is.

        . This is a multi-line indented paragraph.
    This is a multi-line indented paragraph.
        This is a multi-line indented paragraph.
            . This is another multi-line indented paragraph.
    This is another multi-line indented paragraph.
        This is another multi-line indented paragraph.
    ''' # Text in "off" switch line is not formatted.

    def double(x): # Python code is not formatted.
        '''
    This is another multi-line unindented paragraph.
    This is another multi-line unindented paragraph.
    This is another multi-line unindented paragraph.
    '''
        return x + x # Python code is not formatted.
    $ yawp -v -w55 -p pycode.py
    ... Backup of file '~/pycode.py' into file '~/pycode-2022.01.03-16.49.27.py'
    ... Rewriting file '~/pycode.py'
    ... Printing file '~/pycode.py'
    #!/usr/bin/python3

    ''' Text in "on" switch line is not formatted.
    This is a one-line unindented paragraph.

    This  is  a  multi-line unindented paragraph. This is a
    multi-line  unindented  paragraph. This is a multi-line
    unindented paragraph.

        This is a picture, it remains as is.
                This is a picture, it remains as is.
            This is a picture, it remains as is.

        • This  is a multi-line indented paragraph. This is
          a   multi-line  indented  paragraph.  This  is  a
          multi-line indented paragraph.
            • This    is    another   multi-line   indented
              paragraph.   This   is   another   multi-line
              indented    paragraph.    This   is   another
              multi-line indented paragraph.
    ''' # Text in "off" switch line is not formatted.

    def double(x): # Python code is not formatted.
        '''
    This  is  another multi-line unindented paragraph. This
    is  another  multi-line  unindented  paragraph. This is
    another multi-line unindented paragraph.
    '''
        return x + x # Python code is not formatted.
    $

8. MESSAGES

All  messages  are  written  on  stderr,  in  order to avoid interference with -p/--print
option, which writes on stdout.

There are two types of messages:
    • "information messages" say what's going on if -v/--verbose argument is set:
        • ... Backup of file '...' into file '...'
        • ... Restore of file '...' from file '...'
        • ... Processing of file '...'
        • ... Printing of file '...'
    • "error  messages"  are  written  when  execution  must be interrupted, in this case
      backup is not performed and file is not rewritten:
        • !!! File '...' not found, program halted
        • !!! Backup file for file '...' not found', program halted
        • !!! Line ..., impossible to left-justify: '...', program halted
        • !!! Line ..., impossible to right-justify: '...', program halted
        • !!! Line ... is too long: '...', program halted
        • !!! Python file, odd number of switch lines, program halted

9. HISTORY

    • version 0.4.2
        • reformatted: generated table of contents in pretty chapters option

    • version 0.4.1
        • first version published on pypi.org

10. AUTHOR

Written by Carlo Alessandro Verre, carlo.alessandro.verre@gmail.com.

11. ARGUMENTS

If -u is set then -w -l -g and -c are allowed but have no effect.

positional arguments:
  file                  file to be formatted (or restored if -u/--undo)

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         show what happens
  -u, --undo            don't format, restore file to the previous version
                        (default: backup and format)
  -t TAB_BLANKS, --tab-blanks TAB_BLANKS
                        blanks replacing a tab char in input (default: 4)
  -w WIDTH, --width WIDTH
                        output line width (default: 89)
  -l, --left-only       justify at left only (default: at left and right)
  -g, --pretty-graphics
                        replace '#' and '@' in pictures with lines and
                        arrowheads
  -c, --pretty-chapters
                        renumber text chapters and refresh table of contents
  -p, --print           at end print formatted (or restored) file on stdout

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yawp-0.4.2.tar.gz (19.3 kB view hashes)

Uploaded Source

Built Distribution

yawp-0.4.2-py3-none-any.whl (32.2 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page