parse_2d
Tools for parsing two-dimensional programming languages.
Example
Suppose we want to parse a diagram representing a path, with >, v, <, and ^ each being single steps.
>v  >>
 v  ^
 >>>^
One way of tokenizing this is to interpret each of these steps as a token, with a value representing its direction.
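from parse_2d import Diagram, TinyTokenizer, tokenize

# Rows must keep their column alignment, so the gaps are two spaces wide.
diagram = Diagram.from_string(">v  >>\n v  ^\n >>>^")

# One tokenizer per arrow symbol; the value encodes the direction.
tokenizers = [
    TinyTokenizer(">", 0),
    TinyTokenizer("v", 1),
    TinyTokenizer("<", 2),
    TinyTokenizer("^", 3),
]

for token in tokenize(diagram, tokenizers):
    print(token)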
Each Token has a region and a value. The region is the area the token covers in the original diagram; the value can be any Python object representing what you've tokenized.
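To make the "one token per step" idea concrete, here is a minimal standalone sketch in plain Python. It does not import parse_2d and is not how the library is implemented; the helper name single_symbol_tokens is hypothetical, and it represents each token as a bare ((x, y), value) pair rather than a Token object. Locations are written as (x, y), matching the indexing used in the Diagram examples of the Reference.

```python
# Standalone sketch: plain Python, no parse_2d import.
DIAGRAM = ">v  >>\n v  ^\n >>>^"
DIRECTIONS = {">": 0, "v": 1, "<": 2, "^": 3}


def single_symbol_tokens(text, values):
    """Yield ((x, y), value) for every character that has an entry in values."""
    for y, row in enumerate(text.split("\n")):
        for x, symbol in enumerate(row):
            if symbol in values:
                yield (x, y), values[symbol]


tokens = list(single_symbol_tokens(DIAGRAM, DIRECTIONS))
for location, value in tokens[:3]:
    print(location, value)
# (0, 0) 0
# (1, 0) 1
# (4, 0) 0
```

The sketch finds ten pairs in total, one per arrow in the diagram.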
Alternatively, you can extract the path as a single token, using the WireTokenizer, or as a directed path, by subclassing WireTokenizer.
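As an illustration of what "a directed path" means for this diagram, the following standalone sketch walks the arrows from the top-left corner, taking one step in the direction each arrow points. It is plain Python and does not use WireTokenizer; the function follow_path and the STEPS table are hypothetical names introduced only for this example.

```python
# Standalone sketch: plain Python, no parse_2d import.
DIAGRAM = ">v  >>\n v  ^\n >>>^"

# (dx, dy) offset for each arrow; y grows downwards.
STEPS = {">": (1, 0), "v": (0, 1), "<": (-1, 0), "^": (0, -1)}


def follow_path(text, start=(0, 0)):
    """Return the locations visited by following arrows from start."""
    grid = {
        (x, y): symbol
        for y, row in enumerate(text.split("\n"))
        for x, symbol in enumerate(row)
    }
    path = []
    seen = set()
    location = start
    # Stop when we leave the arrows or would revisit a cell (a loop).
    while grid.get(location) in STEPS and location not in seen:
        seen.add(location)
        path.append(location)
        dx, dy = STEPS[grid[location]]
        location = (location[0] + dx, location[1] + dy)
    return path


path = follow_path(DIAGRAM)
print(len(path))  # 10
print(path[-1])   # (5, 0)
```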
A more complete sample, which parses the Circuit Diagram language, is also provided to demonstrate the use of these tools.
Reference
Diagram
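A Diagram is an infinite two-dimensional grid of "symbols", with a distinguished "whitespace" symbol that fills every location not given explicitly. Locations are (x, y) pairs, where x is the column and y is the row, and (0, 0) is the first symbol of the first row. Diagrams may be instantiated with a list of lists (one inner list per row) and the whitespace symbol, or by the from_string method.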
Manual instantiation
>>> diagram = Diagram([[1, 2], [3]], 0)
>>> diagram[(0, 1)]
3
>>> diagram[(1, 1)]
0
>>> diagram[(-30, 17)]
0
from_string
>>> diagram = Diagram.from_string("ab\nc")
>>> diagram[(0, 1)]
'c'
>>> diagram[(1, 1)]
' '
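One way to picture the "infinite grid" behaviour is a dictionary of explicitly set cells plus a default symbol. The class below is a hypothetical stand-in: SketchGrid is not part of parse_2d and says nothing about how Diagram is actually implemented. It only reproduces the lookups shown in the examples above.

```python
class SketchGrid:
    """Hypothetical dict-backed grid; every unset location reads as whitespace."""

    def __init__(self, rows, whitespace):
        self.whitespace = whitespace
        # Store only the cells that differ from the whitespace symbol.
        self._cells = {
            (x, y): symbol
            for y, row in enumerate(rows)
            for x, symbol in enumerate(row)
            if symbol != whitespace
        }

    def __getitem__(self, location):
        return self._cells.get(location, self.whitespace)


numbers = SketchGrid([[1, 2], [3]], 0)
print(numbers[(0, 1)], numbers[(1, 1)], numbers[(-30, 17)])  # 3 0 0

letters = SketchGrid("ab\nc".split("\n"), " ")
print(repr(letters[(0, 1)]), repr(letters[(1, 1)]))  # 'c' ' '
```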
Region
A Region is an area on a diagram. Custom Regions may be made by inheriting from Region. The following Regions are provided by default:
TinyRegion(location)
A Region consisting of a single point. Has the location property to provide that point.
RectRegion(top_left, bottom_right)
A rectangular Region, aligned with the axes, consisting of the points bounded by top_left and bottom_right, including the top and left edges, and excluding the bottom and right edges (analogously to range).
SparseRegion(contents)
A Region consisting of a collection of disparate points. Has the contents property to provide that frozenset of points.
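The half-open convention of RectRegion is easiest to see by listing the points it covers. The sketch below is standalone Python and assumes that points are (x, y) tuples and that top_left is (left, top); rect_points is a hypothetical helper, not a parse_2d function.

```python
def rect_points(top_left, bottom_right):
    """Points of an axis-aligned rectangle, half-open like range()."""
    (left, top), (right, bottom) = top_left, bottom_right
    return frozenset(
        (x, y) for x in range(left, right) for y in range(top, bottom)
    )


points = rect_points((1, 1), (3, 2))
print(sorted(points))    # [(1, 1), (2, 1)]
print((3, 1) in points)  # False: the right edge is excluded
print((1, 2) in points)  # False: the bottom edge is excluded
```

With every kind of region reduced to a set of points like this, two regions overlap exactly when their point sets share an element.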
Token
A Token consists of a region covered, and a value that the token represents.
Tokenizer
A Tokenizer is an object for extracting tokens from diagrams. Custom Tokenizer classes may be made by inheriting from Tokenizer, and overriding the starts_on and extract_token methods. See the Tokenizer docstring for more details.
TinyTokenizer(symbol, value)
Tokenizer for tokens represented by a single symbol.
Extracts a token with the given value for every occurrence of symbol in the diagram.
TemplateTokenizer(template, token_value)
Tokenizer for tokens represented by a fixed template of symbols.
The template is either a mapping of relative locations to symbols, or a Diagram.
Extracts a token of value token_value for every non-overlapping translation of the template found in the parent Diagram.
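The idea of "non-overlapping translations of a template" can be sketched without the library. The code below is an assumption-laden illustration, not TemplateTokenizer itself: the grid and template are plain dicts, the template is assumed to contain a symbol at offset (0, 0), and candidates are scanned row by row with the earliest match winning. The scan order parse_2d actually uses is not stated in this document.

```python
def parse(text):
    """Map (x, y) -> symbol for every non-space character."""
    return {
        (x, y): symbol
        for y, row in enumerate(text.split("\n"))
        for x, symbol in enumerate(row)
        if symbol != " "
    }


def find_template(grid, template):
    """Return origins where template matches, skipping overlapping matches."""
    used = set()
    origins = []
    # Scan in reading order: top row first, left to right.
    for ox, oy in sorted(grid, key=lambda p: (p[1], p[0])):
        cells = {
            (ox + dx, oy + dy): symbol for (dx, dy), symbol in template.items()
        }
        fits = all(grid.get(loc) == symbol for loc, symbol in cells.items())
        if fits and used.isdisjoint(cells):
            used.update(cells)
            origins.append((ox, oy))
    return origins


pair = {(0, 0): "a", (1, 0): "a"}  # the two-symbol template "aa"
matches = find_template(parse("aaa\naaaa"), pair)
print(matches)  # [(0, 0), (0, 1), (2, 1)]
```

In the first row the candidate at (1, 0) also fits the template, but it is skipped because it shares a cell with the match at (0, 0).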
WireTokenizer(segment_connections)
Tokenizer for wire tokens, represented by a path through a diagram.
A wire consists of multiple symbol "segments", each of which has a fixed collection of directions it can connect to.
The segment_connections is a mapping from segment symbols to a collection of that segment's available connections.
Extracts a wire token representing the available connections to that wire.
This class assumes that segments connect all possible incoming directions to all possible outgoing directions. Child classes may override this behavior by overriding the connections method. See the WireTokenizer docstring for more details.
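The description above can be pictured as a flood fill: starting from one segment, follow each of its connections and join the neighbouring cell if that neighbour connects back. The sketch below is standalone Python written under several assumptions: directions are (dx, dy) offsets, the grid is a plain dict, and the result is the covered points plus the locations where the wire has an unjoined connection. The names trace_wire and SEGMENTS are hypothetical, and the value produced by the real WireTokenizer may be shaped differently.

```python
LEFT, RIGHT, UP, DOWN = (-1, 0), (1, 0), (0, -1), (0, 1)

# Hypothetical segment table: symbol -> directions it can connect to.
SEGMENTS = {
    "-": {LEFT, RIGHT},
    "|": {UP, DOWN},
    "+": {LEFT, RIGHT, UP, DOWN},
}


def trace_wire(grid, start, segments):
    """Flood-fill the wire containing start; return (region, open_ends)."""
    region = set()
    open_ends = set()
    pending = [start]
    while pending:
        x, y = pending.pop()
        if (x, y) in region:
            continue
        region.add((x, y))
        for dx, dy in segments[grid[(x, y)]]:
            neighbour = (x + dx, y + dy)
            # The neighbour joins the wire only if it connects back to us.
            if (-dx, -dy) in segments.get(grid.get(neighbour), ()):
                pending.append(neighbour)
            else:
                open_ends.add(neighbour)
    return frozenset(region), frozenset(open_ends)


text = "-+-\n |"
grid = {
    (x, y): symbol
    for y, row in enumerate(text.split("\n"))
    for x, symbol in enumerate(row)
    if symbol != " "
}
wire_region, open_ends = trace_wire(grid, (0, 0), SEGMENTS)
print(sorted(wire_region))  # [(0, 0), (1, 0), (1, 1), (2, 0)]
print(sorted(open_ends))    # [(-1, 0), (1, -1), (1, 2), (3, 0)]
```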
BoxTokenizer(edge_symbols, contents_tokenizer)
Tokenizer for tokens represented by a box of edge symbols.
edge_symbols is a mapping from a side of the box to the collection of symbols that may be used for that edge.
contents_tokenizer is a function to determine the value of the extracted token, and is passed the entire box (including the edge) as its only parameter.
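A rough sketch of the edge check may help here. It is standalone Python and makes assumptions this document does not confirm: sides are keyed by the strings "top", "bottom", "left" and "right", corners are treated as part of the top and bottom edges, and both corner locations are inclusive. The helper is_box is hypothetical and is not the BoxTokenizer implementation.

```python
# Hypothetical edge table: side -> symbols allowed on that side.
EDGE_SYMBOLS = {
    "top": {"-", "+"},
    "bottom": {"-", "+"},
    "left": {"|"},
    "right": {"|"},
}


def is_box(grid, top_left, bottom_right, edge_symbols):
    """Check that the border of the rectangle uses only allowed symbols."""
    (left, top), (right, bottom) = top_left, bottom_right  # inclusive corners
    horizontal = all(
        grid.get((x, top)) in edge_symbols["top"]
        and grid.get((x, bottom)) in edge_symbols["bottom"]
        for x in range(left, right + 1)
    )
    vertical = all(
        grid.get((left, y)) in edge_symbols["left"]
        and grid.get((right, y)) in edge_symbols["right"]
        for y in range(top + 1, bottom)
    )
    return horizontal and vertical


text = "+--+\n|ab|\n+--+"
grid = {
    (x, y): symbol
    for y, row in enumerate(text.split("\n"))
    for x, symbol in enumerate(row)
}
print(is_box(grid, (0, 0), (3, 2), EDGE_SYMBOLS))  # True
print(is_box(grid, (0, 0), (2, 2), EDGE_SYMBOLS))  # False: "a" is not a right-edge symbol
```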
tokenize(diagram, tokenizers)
Yields the non-overlapping tokens found in the diagram by the list of tokenizers.
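The non-overlap guarantee can be illustrated by tracking which points are already covered. This is a standalone sketch with candidates given as (points, value) pairs; drop_overlaps is a hypothetical name, and the rule that the earlier candidate wins is an assumption, since this document does not say how tokenize resolves conflicts.

```python
def drop_overlaps(candidates):
    """Yield candidates whose points do not touch any earlier kept candidate."""
    covered = set()
    for points, value in candidates:
        if covered.isdisjoint(points):
            covered.update(points)
            yield points, value


candidates = [
    (frozenset({(0, 0), (1, 0)}), "a"),
    (frozenset({(1, 0), (2, 0)}), "b"),  # shares (1, 0) with "a"
    (frozenset({(3, 0)}), "c"),
]
kept = [value for _, value in drop_overlaps(candidates)]
print(kept)  # ['a', 'c']
```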
Download files
File details
Details for the file parse_2d-1.0.0.tar.gz.
File metadata
- Download URL: parse_2d-1.0.0.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cf6cfe1238e40be56ab689fb0c8c25c53d4043b681d2d370da33645dd2592803 |
| MD5 | ed2735aabe528ddce6b32014c0c15281 |
| BLAKE2b-256 | cd90b8cfeacc34404d309e40f3ced233778e5bf2a39dd0753a49bdbee4862722 |
File details
Details for the file parse_2d-1.0.0-py3-none-any.whl.
File metadata
- Download URL: parse_2d-1.0.0-py3-none-any.whl
- Upload date:
- Size: 11.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | af49420675134582cb2a1885145080dbb83b1254a977026e8ab0e3911c93c9b1 |
| MD5 | 27de187ca125858121a0fbc351c66209 |
| BLAKE2b-256 | 50ed70132580627b7c8549da6e0c9547f54baae0de9faa3946dcb30eaac054cf |