Tools for parsing two-dimensional programming languages
Suppose we want to parse a diagram representing a path, with ">", "v", "<", and "^" each being a single step.
    >v  >>
     v  ^
     >>>^
One way of tokenizing this is to interpret each of these steps as a token, with a value representing its direction.
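    from parse_2d import Diagram, TinyTokenizer, tokenize

    # Build the diagram from its text form, one row per line.
    diagram = Diagram.from_string(">v  >>\n v  ^\n >>>^")

    # One tokenizer per step symbol; the second argument is the token's value.
    tokenizers = [
        TinyTokenizer(">", 0),
        TinyTokenizer("v", 1),
        TinyTokenizer("<", 2),
        TinyTokenizer("^", 3),
    ]

    # Print every token found in the diagram.
    for token in tokenize(diagram, tokenizers):
        print(token)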
Token has a region and a value. The region is the area it covers in the original diagram, while the value can be any Python object representing what you've tokenized.
Alternatively, you can extract the path as a single token using the WireTokenizer, or as a directed path with a child class that overrides its connections method.
Diagram is an infinite two-dimensional grid of "symbols", with a distinguished "whitespace" symbol.
Diagrams may be instantiated with a list of lists and the whitespace symbol, or from a string with Diagram.from_string.
    >>> diagram = Diagram([[1, 2], [3]], 0)
    >>> diagram[(0, 1)]
    3
    >>> diagram[(1, 1)]
    0
    >>> diagram[(-30, 17)]
    0
    >>> diagram = Diagram.from_string("ab\nc")
    >>> diagram[(0, 1)]
    'c'
    >>> diagram[(1, 1)]
    ' '
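As a rough illustration of the indexing behaviour above, the following sketch models an infinite grid as a dictionary that falls back to the whitespace symbol. SketchGrid is a hypothetical stand-in written for this example. It is not how parse_2d implements Diagram; only the (x, y) indexing and the whitespace default are taken from the examples above.

```python
class SketchGrid:
    """Hypothetical stand-in for an infinite grid; not parse_2d's Diagram."""

    def __init__(self, rows, whitespace):
        self.whitespace = whitespace
        # Store only the non-whitespace cells, keyed by (x, y).
        self.cells = {
            (x, y): symbol
            for y, row in enumerate(rows)
            for x, symbol in enumerate(row)
            if symbol != whitespace
        }

    @classmethod
    def from_string(cls, text, whitespace=" "):
        # Each line of text becomes one row of single-character symbols.
        return cls(text.split("\n"), whitespace)

    def __getitem__(self, location):
        # Any location that was never filled reads as whitespace.
        return self.cells.get(location, self.whitespace)


grid = SketchGrid([[1, 2], [3]], 0)
print(grid[(0, 1)], grid[(1, 1)], grid[(-30, 17)])

text_grid = SketchGrid.from_string("ab\nc")
print(repr(text_grid[(0, 1)]), repr(text_grid[(1, 1)]))
```

Storing only the filled cells is one simple way to make every other location, including negative coordinates, read as whitespace without allocating anything for it.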
Region is an area on a diagram. Custom Regions may be made by inheriting from Region. The following Regions are provided by default:
Region consisting of a single point. Has the location property to provide that point.
Region, aligned with the axes, consisting of the points bounded on the lower right by bottom_right. It includes the top and left edges and excludes the bottom and right edges, so its bounds are half-open.
Region consisting of a collection of disparate points. Has the contents property to provide that frozenset of points.
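The edge convention of the axis-aligned region above (top and left edges included, bottom and right edges excluded) can be sketched as a plain membership test. The function name in_half_open_rect and the parameter name top_left are assumptions made for this example and are not taken from parse_2d.

```python
def in_half_open_rect(point, top_left, bottom_right):
    """Return True if point lies inside the half-open rectangle.

    top_left is a hypothetical name for the corner opposite bottom_right.
    Points on the top and left edges count as inside; points on the
    bottom and right edges do not.
    """
    x, y = point
    left, top = top_left
    right, bottom = bottom_right
    return left <= x < right and top <= y < bottom


corner_a, corner_b = (0, 0), (2, 3)
inside = [
    (x, y)
    for y in range(-1, 5)
    for x in range(-1, 4)
    if in_half_open_rect((x, y), corner_a, corner_b)
]
print(inside)
```

With these corners the rectangle holds two columns (x = 0, 1) and three rows (y = 0, 1, 2), the same way range(0, 2) and range(0, 3) exclude their upper bounds.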
Token consists of a region covered, and a value that the token represents.
Tokenizer is an object for extracting tokens from diagrams. Custom Tokenizer classes may be made by inheriting from Tokenizer and overriding its methods, including extract_token. See the Tokenizer docstring for more details.
Tokenizer for tokens represented by a single symbol.
Extracts a token of value token_value for every occurrence of that symbol in the diagram.
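The idea of single-symbol tokenizing can be sketched without the library: scan the grid and report a location and value for each cell holding the symbol. The helper single_symbol_tokens and its (location, value) pairs are hypothetical simplifications for this example; parse_2d itself returns Token objects with a region and a value.

```python
def single_symbol_tokens(text, symbol, token_value):
    """Yield ((x, y), token_value) for each cell of text equal to symbol."""
    for y, row in enumerate(text.split("\n")):
        for x, cell in enumerate(row):
            if cell == symbol:
                yield (x, y), token_value


steps = ">>v\n  v\n  >"
rights = list(single_symbol_tokens(steps, ">", 0))
downs = list(single_symbol_tokens(steps, "v", 1))
print(rights)
print(downs)
```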
Tokenizer for tokens represented by a fixed template of symbols.
template is either a mapping of relative locations to symbols, or a
Extracts a token of value token_value for every non-overlapping translation of the template found in the parent Diagram.
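To make "non-overlapping translation" concrete, here is a minimal sketch of matching a template given as a mapping of relative locations to symbols. The function find_template is hypothetical, and two choices in it are assumptions rather than documented parse_2d behaviour: the template is anchored at offset (0, 0), and when two matches overlap, the one found first in reading order wins.

```python
def find_template(text, template):
    """Return the origins of non-overlapping matches of template in text.

    template maps relative (dx, dy) offsets to the symbol expected there.
    """
    rows = text.split("\n")
    cells = {(x, y): s for y, row in enumerate(rows) for x, s in enumerate(row)}
    used = set()      # locations already claimed by an accepted match
    origins = []
    # Scan candidate origins in reading order: top to bottom, left to right.
    for x, y in sorted(cells, key=lambda p: (p[1], p[0])):
        covered = {(x + dx, y + dy) for dx, dy in template}
        if covered & used:
            continue  # would overlap an earlier match
        if all(cells.get((x + dx, y + dy)) == s
               for (dx, dy), s in template.items()):
            origins.append((x, y))
            used |= covered
    return origins


pair = {(0, 0): "a", (1, 0): "a"}
print(find_template("aaa", pair))
print(find_template("ab.ab", {(0, 0): "a", (1, 0): "b"}))
```

In "aaa" the pair could match at x = 0 or x = 1, but the two placements share a cell, so only the first is kept.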
Tokenizer for wire tokens, represented by a path through a diagram.
A wire consists of multiple symbol "segments", each of which has a fixed collection of directions it can connect to.
segment_connections is a mapping from segment symbols to a collection of that segment's available connections.
Extracts a wire token representing the available connections to that wire.
This class assumes that segments connect all possible incoming directions to all possible outgoing directions. Child classes may override this behavior by overriding the connections method. See the WireTokenizer docstring for more details.
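The wire model described above can be sketched as a flood fill: start at one segment, follow each of its directions, and join a neighbour only if that neighbour connects back. Everything here (trace_wire, the direction names, and the shape of the returned value) is a hypothetical illustration under the "all directions connect" assumption, not the WireTokenizer implementation.

```python
# Direction names and offsets are assumptions for this sketch; y grows downward.
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def trace_wire(text, start, segment_connections):
    """Return (locations in the wire, connections that leave the wire)."""
    rows = text.split("\n")
    cells = {(x, y): s for y, row in enumerate(rows) for x, s in enumerate(row)}
    region, open_ends = set(), set()
    pending = [start]
    while pending:
        location = pending.pop()
        if location in region:
            continue
        region.add(location)
        x, y = location
        for direction in segment_connections[cells[location]]:
            dx, dy = DIRECTIONS[direction]
            neighbour = (x + dx, y + dy)
            # The neighbour joins the wire only if it connects back to us.
            accepts = segment_connections.get(cells.get(neighbour), ())
            if OPPOSITE[direction] in accepts:
                pending.append(neighbour)
            else:
                open_ends.add((location, direction))
    return frozenset(region), frozenset(open_ends)


connections = {
    "-": {"left", "right"},
    "|": {"up", "down"},
    "+": {"left", "down"},
}
wire_region, wire_ends = trace_wire("-+\n |", (0, 0), connections)
print(sorted(wire_region))
print(sorted(wire_ends))
```

The open ends are the places where something else could attach to the wire, which is one plausible reading of "the available connections to that wire".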
Tokenizer for tokens represented by a box of edge symbols.
edge_tokens is a mapping from a side of the box, to the collection of symbols that may be used for that edge.
contents_tokenizer is a function to determine the value of the extracted token, and is passed the entire box (including the edge) as its only parameter.
Yields the non-overlapping tokens found in the diagram by the list of tokenizers.
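One simple way to keep tokens from overlapping is to track which locations have been claimed and skip any candidate that touches them. The sketch below assumes candidates arrive as (region, value) pairs with the region given as a set of points, and that earlier candidates win; both are assumptions for illustration, and the library may resolve overlaps differently.

```python
def non_overlapping(candidates):
    """Yield (region, value) pairs whose regions avoid all earlier ones."""
    claimed = set()
    for region, value in candidates:
        if claimed.isdisjoint(region):
            claimed.update(region)
            yield region, value


candidates = [
    (frozenset({(0, 0), (1, 0)}), "ab"),
    (frozenset({(1, 0)}), "b"),   # overlaps "ab", so it is skipped
    (frozenset({(2, 0)}), "c"),
]
kept = [value for _, value in non_overlapping(candidates)]
print(kept)
```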