A tag parser with optional functional support
Abstract
This software provides a string token parser, useful in cases where a fixed a priori template string is to be resolved at run time by some process.
Overview
pftag is a simple app that works both as a standalone client and as a Python module. Its main purpose is to parse template strings. A template string is one in which sub-parts of the string are marked as tokens by a token marker. These tokens are resolved at execution time.
From a taxonomy perspective, pftag is an example of a string-based (somewhat opinionated) SGMLish parser.
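To make the idea of a template string concrete, here is a minimal sketch of token resolution. It is not pftag's implementation: the `resolve` helper, its regular expression, and the fixed lookup values are all assumptions made for illustration.

```python
import re


def resolve(template: str, lookup: dict, marker: str = "%") -> str:
    """Replace each <marker><name> token with lookup[name].

    Tokens with no entry in the lookup are left untouched.
    """
    pattern = re.compile(re.escape(marker) + r"([A-Za-z]+)")

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(lookup.get(name, match.group(0)))

    return pattern.sub(substitute, template)


# Hypothetical lookup values, fixed here so the result is reproducible.
lookup = {"platform": "Linux", "arch": "64bit-ELF"}
resolved = resolve("run-on-%platform-%arch.log", lookup)
print(resolved)
```

The template itself stays fixed; only the lookup changes from run to run.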
Installation
Local python venv
For bare-metal installations, install with pip:
pip install pftag
docker container
docker pull fnndsc/pftag
Running
Script mode
To use pftag in script mode, call the script with the appropriate CLI arguments:
pftag --tag "run-%timestamp-on-%platform-%arch.log"
run-2023-03-10T13:41:58.921660-05:00-on-Linux-64bit-ELF.log
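The values in the sample output above resemble what the Python standard library reports for the current time and host. The sketch below builds such values with `datetime` and `platform` and substitutes them into the same template. Whether pftag derives its values this way is an assumption; `build_lookup` is a hypothetical helper, not part of pftag.

```python
import platform
from datetime import datetime


def build_lookup() -> dict:
    """Collect run-time values for the three tokens used in the example above."""
    return {
        # Local time with UTC offset, in ISO 8601 form.
        "timestamp": datetime.now().astimezone().isoformat(),
        # Operating system name, for example "Linux".
        "platform": platform.system(),
        # Bit width and linkage format of the Python executable.
        "arch": "-".join(platform.architecture()),
    }


template = "run-%timestamp-on-%platform-%arch.log"
resolved = template
for name, value in build_lookup().items():
    resolved = resolved.replace("%" + name, value)
print(resolved)
```

The exact string depends on the machine and the moment the code runs.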
Module mode
There are several ways to use pftag in Python module mode. Perhaps the simplest is to instantiate an object with an empty dictionary and then call the object with the tag to process.
If additional values need to be set in the declaration, use an appropriate dictionary. The dictionary keys are identical to the script CLI keys (sans the leading --):
from pftag import pftag
str_tag:str = r'run-%timestamp-on-%platform-%arch.log'
tagger:pftag.Pftag = pftag.Pftag({})
d_tag:dict = tagger(str_tag)
# The result is in the 'results' key
print(d_tag['results'])
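Because the dictionary keys mirror the CLI keys without the leading `--`, a CLI-style argument list can be converted mechanically into such a dictionary. The helper below is a hypothetical sketch and not part of pftag. Representing value-less flags such as `--debug` as `True` is an assumption.

```python
def cli_to_options(argv: list) -> dict:
    """Turn ['--key', 'value', '--flag'] into {'key': 'value', 'flag': True}."""
    options = {}
    i = 0
    while i < len(argv):
        token = argv[i]
        if not token.startswith("--"):
            raise ValueError(f"expected an option, got {token!r}")
        key = token[2:]  # drop the leading "--"
        # An option has a value if the next item exists and is not itself an option.
        has_value = i + 1 < len(argv) and not argv[i + 1].startswith("--")
        options[key] = argv[i + 1] if has_value else True
        i += 2 if has_value else 1
    return options


options = cli_to_options(["--tagMarker", "@", "--funcSep", ";", "--debug"])
print(options)
```

A dictionary built like this could then be passed where the example above passes `{}`.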
Arguments
The set of CLI arguments can also be passed in a dictionary of the form
{
"CLIkey1": "value1",
"CLIkey2": "value2",
}
--tag <tagString>
The tag string to process.
[--lookupDictAdd <listOfDictionaryString>]
A string list of additional named lookup dictionary tags and values to
add.
[--tagMarker <mark>]
The marker string that identifies a tag (default "%")
[--funcMarker <mark>]
The marker string that pre- and post marks a function (default "_").
[--funcArgMarker <mark>]
The marker string between function arguments and also between arg list
and function (default "|").
[--funcSep <mark>]
The marker string separating successive function/argument constructs
(default ",").
[--inputdir <inputdir>]
An optional input directory specifier. Reserved for future use.
[--outputdir <outputdir>]
An optional output directory specifier. Reserved for future use.
[--man]
If specified, show this help page and quit.
[--verbosity <level>]
Set the verbosity level. The app is currently chatty at level 0 and level 1
provides even more information.
[--debug]
If specified, toggle internal debugging. This will break at any breakpoints
specified with 'Env.set_trace()'
[--debugTermsize <253,62>]
Debugging is via telnet session. This specifies the <cols>,<rows> size of
the terminal.
[--debugHost <0.0.0.0>]
Debugging is via telnet session. This specifies the host to which to connect.
[--debugPort <7900>]
Debugging is via telnet session. This specifies the port on which the telnet
session is listening.
Function detail
Overview
In addition to performing a lookup on a template string token, this package can also process the lookup value in various ways. These process functions follow a Reverse Polish Notation (RPN) schema of
tag func1(args1) func2(args2) func3(args3) ...
which, read from left to right, is treated as a stack from top to bottom:
tag
func1(args1)
func2(args2)
func3(args3)
where first the <tag> is looked up, then this lookup is processed by <func1>. The result is then processed by <func2>, and so on, each function optionally taking a set of arguments. This RPN approach also mirrors the standard UNIX piping schema.
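The left-to-right chaining described above can be sketched as a fold over a list of (function, arguments) pairs. The function names `strip`, `replace`, and `upper` are hypothetical stand-ins chosen for illustration and are not claimed to be functions that pftag provides.

```python
from functools import reduce

# Hypothetical processing functions: each takes the current value first,
# followed by any arguments.
FUNCTIONS = {
    "upper": lambda value: value.upper(),
    "strip": lambda value, chars: value.strip(chars),
    "replace": lambda value, old, new: value.replace(old, new),
}


def apply_chain(value: str, chain: list) -> str:
    """Apply (name, args) pairs in order, feeding each result to the next function."""
    return reduce(lambda acc, step: FUNCTIONS[step[0]](acc, *step[1]), chain, value)


chain = [("strip", ["-"]), ("replace", [" ", "_"]), ("upper", [])]
result = apply_chain("-my host-", chain)
print(result)  # MY_HOST
```

Each step sees only the output of the step before it, much like commands joined by a UNIX pipe.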
Syntax
A function (or function list) that is to be applied to a <tag> is connected to the tag with a <funcMarker> string, usually ’_’. The final function should end with the same <funcMarker>, so
%tag_func1,func2,...,funcN_
will apply the function list in order to the tag value lookup called “tag”; each successive evaluation consuming the result of its predecessor as input.
Some functions can accept arguments. Arguments are passed to a function with a <funcArgMarker> string, typically |, that also separates arguments:
%tag_func|a1|a2|a3_
will pass a1, a2, and a3 as parameters to “func”.
Finally, several functions can be chained within the _…_ by separating the <func>|<argList> constructs with commas, so, written out in full,
%tag_func1|a1|a2|a3,func2|b1|b2|b3_
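As a rough illustration of the syntax above, the following sketch splits a single token into its tag name and an ordered list of functions with their arguments, using the default markers. The helpers `split_token` and `parse_function_spec` are hypothetical and do not reflect how pftag parses tokens internally.

```python
def split_token(token: str, tag_marker: str = "%", func_marker: str = "_"):
    """Split '%tag_spec_' into ('tag', 'spec'); a bare '%tag' gives ('tag', '')."""
    body = token[len(tag_marker):]
    tag, _, rest = body.partition(func_marker)
    # Drop the closing function marker if it is present.
    spec = rest[:-len(func_marker)] if rest.endswith(func_marker) else rest
    return tag, spec


def parse_function_spec(spec: str, arg_marker: str = "|", func_sep: str = ","):
    """Split 'f1|a|b,f2|c' into [('f1', ['a', 'b']), ('f2', ['c'])]."""
    if not spec:
        return []
    parsed = []
    for construct in spec.split(func_sep):
        name, *args = construct.split(arg_marker)
        parsed.append((name, args))
    return parsed


tag, spec = split_token("%tag_func1|a1|a2|a3,func2|b1|b2|b3_")
steps = parse_function_spec(spec)
print(tag, steps)
```

This sketch assumes the tag name itself does not contain the function marker.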
All these special characters (tag marker, function pre- and post, arg separation, function separation) can be overridden. For instance, with a selection of
--tagMarker "@" --funcMarker "[" --funcArgMarker "," --funcSep "|"
strings can be specified as
@tag[func,a1,a2,a3|func2,b1,b2,b3[
where preference/legibility is left to the user.
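The custom-marker form and the default-marker form carry the same structure. The sketch below shows this by translating one into the other character by character. It is an illustration only: it assumes every marker is a single character, and it would also rewrite marker characters that happen to appear outside a token.

```python
# Custom markers from the example above, in the order:
# tag marker, function marker, argument marker, function separator.
custom_markers = "@[,|"
default_markers = "%_|,"

# str.translate maps all characters in a single pass, so swapping
# "," and "|" does not cause one replacement to undo the other.
to_default = str.maketrans(custom_markers, default_markers)

custom_token = "@tag[func,a1,a2,a3|func2,b1,b2,b3["
translated = custom_token.translate(to_default)
print(translated)  # %tag_func|a1|a2|a3,func2|b1|b2|b3_
```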
Development
Instructions for developers.
To debug, the simplest mechanism is to trigger the internal remote telnet session with the --debug CLI. Then, in the code, simply add Env.set_trace() calls where appropriate. These can remain in the codebase (i.e. you don’t need to delete/comment them out) since they are only live when a --debug flag is passed.
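The pattern of breakpoints that stay inert unless debugging is switched on can be sketched as follows. `DebugEnv` is a hypothetical stand-in, not pftag's `Env` class. It does not open a telnet session, and the `tracer` parameter is an assumption added so the behaviour can be shown without stopping in a debugger.

```python
class DebugEnv:
    """Hypothetical stand-in: set_trace() does nothing unless debug is enabled."""

    def __init__(self, debug: bool = False, tracer=None):
        self.debug = debug
        # Fall back to Python's built-in breakpoint() when no tracer is supplied.
        self.tracer = tracer if tracer is not None else breakpoint

    def set_trace(self) -> None:
        if self.debug:
            self.tracer()


calls = []
quiet = DebugEnv(debug=False, tracer=lambda: calls.append("hit"))
quiet.set_trace()  # ignored: debugging is off

live = DebugEnv(debug=True, tracer=lambda: calls.append("hit"))
live.set_trace()  # fires the tracer once
print(calls)  # ['hit']
```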
Testing
Run unit tests using pytest.
# In repo root dir:
pytest
-30-