TatSu takes a grammar in a variation of EBNF as input, and outputs a memoizing PEG/Packrat parser in Python.
At least for the people who send me mail about a new language that they’re designing, the general advice is: do it to learn about how to write a compiler. Don’t have any expectations that anyone will use it, unless you hook up with some sort of organization in a position to push it hard. It’s a lottery, and some can buy a lot of the tickets. There are plenty of beautiful languages (more beautiful than C) that didn’t catch on. But someone does win the lottery, and doing a language at least teaches you something.
Dennis Ritchie (1941-2011), creator of the C programming language and of Unix
竜 TatSu
竜 TatSu (the successor to Grako) is a tool that takes grammars in a variation of EBNF as input, and outputs memoizing (Packrat) PEG parsers in Python.
竜 TatSu can compile a grammar stored in a string into a tatsu.grammars.Grammar object that can be used to parse any given input, much as the re module does with regular expressions. Alternatively, it can generate a Python module that implements the parser.
竜 TatSu fully supports left-recursive rules in PEG grammars using the algorithm by Laurent and Mens. The generated AST has the expected left associativity.
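To make the left-associativity claim concrete, here is a minimal sketch. The grammar and the names `SUBTRACTION_GRAMMAR` and `tree` are made up for this illustration; only `tatsu.parse` and `tatsu.util.asjson`, both used elsewhere in this description, come from the library.

```python
import tatsu
from tatsu.util import asjson

# Hypothetical grammar for illustration: `expression` refers to itself
# in the leftmost position, so it is directly left-recursive.
SUBTRACTION_GRAMMAR = r'''
    start = expression $ ;

    expression
        =
        | expression '-' number
        | number
        ;

    number = /\d+/ ;
'''

# asjson() turns the AST into plain JSON-compatible Python objects,
# which makes the nesting easy to inspect.
tree = asjson(tatsu.parse(SUBTRACTION_GRAMMAR, '1 - 2 - 3'))
print(tree)
```

With left associativity, the first element of `tree` should be the nested result for `1 - 2` and the last element the lone `'3'`, which matches evaluating the subtraction from left to right.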
Installation
$ pip install TatSu
Using the Tool
竜 TatSu can be used as a library, much like Python’s re, by embedding grammars as strings and generating grammar models instead of generating Python code.
tatsu.compile(grammar, name=None, **kwargs)
Compiles the grammar and generates a model that can subsequently be used to parse input.
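A minimal sketch of the compile-once, parse-many pattern, by analogy with `re.compile()`. The toy grammar and the names `NUMBERS_GRAMMAR`, `model`, and `results` are hypothetical and exist only for this illustration.

```python
import tatsu

# Hypothetical toy grammar: one or more whitespace-separated integers.
NUMBERS_GRAMMAR = r'''
    start = { number }+ $ ;
    number = /\d+/ ;
'''

# Compile once, as one would with re.compile() ...
model = tatsu.compile(NUMBERS_GRAMMAR)

# ... then reuse the same model for as many inputs as needed.
results = [model.parse(text) for text in ('1 2 3', '42')]
for result in results:
    print(result)
```

Holding on to the model avoids passing the grammar text around on every call and keeps the parsing step a single method call.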
tatsu.parse(grammar, input, **kwargs)
Compiles the grammar and parses the given input, producing an AST as the result. The result is equivalent to calling:
model = compile(grammar)
ast = model.parse(input)
Compiled grammars are cached for efficiency.
tatsu.to_python_sourcecode(grammar, name=None, filename=None, **kwargs)
Compiles the grammar to the Python source code that implements the parser.
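As a rough sketch of the code-generation path, the snippet below asks for the parser source as a string and checks that it is valid Python. The toy grammar, the name `'Numbers'`, and the variable names are hypothetical; the exact contents of the generated module depend on the installed version.

```python
import ast as python_ast

import tatsu

# Hypothetical toy grammar, used only to have something to compile.
NUMBERS_GRAMMAR = r'''
    start = { number }+ $ ;
    number = /\d+/ ;
'''

# The result is the text of a Python module, returned as a string.
source = tatsu.to_python_sourcecode(NUMBERS_GRAMMAR, name='Numbers')

# Sanity check: the generated text should be syntactically valid Python.
python_ast.parse(source)

# Show the beginning of the generated module.
print('\n'.join(source.splitlines()[:15]))
```

The string could then be written to a file such as `numbers_parser.py` (a hypothetical name) and imported like any other module.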
This is an example of how to use 竜 TatSu as a library:
GRAMMAR = '''
    @@grammar::CALC


    start = expression $ ;


    expression
        =
        | expression '+' term
        | expression '-' term
        | term
        ;


    term
        =
        | term '*' factor
        | term '/' factor
        | factor
        ;


    factor
        =
        | '(' expression ')'
        | number
        ;


    number = /\d+/ ;
'''


if __name__ == '__main__':
    import pprint
    import json
    from tatsu import parse
    from tatsu.util import asjson

    ast = parse(GRAMMAR, '3 + 5 * ( 10 - 20 )')
    print('# PPRINT')
    pprint.pprint(ast, indent=2, width=20)
    print()

    print('# JSON')
    print(json.dumps(asjson(ast), indent=2))
    print()
竜 TatSu will use the first rule defined in the grammar as the start rule.
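The sketch below illustrates the default start rule. The grammar and variable names are hypothetical, and the `start=` keyword used to begin at a different rule is an assumption about the model's `parse()` method that should be checked against the installed version.

```python
import tatsu

# Hypothetical grammar: `start` is listed first, so it is the default start rule.
PAIR_GRAMMAR = r'''
    start = number ',' number $ ;
    number = /\d+/ ;
'''

model = tatsu.compile(PAIR_GRAMMAR)

# No start rule given: parsing begins at the first rule, `start`.
pair = model.parse('1, 2')

# Assumption: the `start=` keyword selects another rule for this call only.
single = model.parse('7', start='number')

print(pair)
print(single)
```

Putting the intended entry point first in the grammar keeps the common case free of extra arguments.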
This is the output:
# PPRINT
[ '3',
  '+',
  [ '5',
    '*',
    [ '10',
      '-',
      '20']]]

# JSON
[
  "3",
  "+",
  [
    "5",
    "*",
    [
      "10",
      "-",
      "20"
    ]
  ]
]
License
You may use 竜 TatSu under the terms of the BSD-style license described in the enclosed LICENSE.txt file. If your project requires different licensing, please email.
Documentation
For a detailed explanation of what 竜 TatSu is capable of, please see the documentation.
Questions?
For general Q&A, please use the [tatsu] tag on StackOverflow.
Changes
See the CHANGELOG for details.
Hashes for TatSu-4.2.6-py2.py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | de4c681b65c8b8ab072393e8b423ca2773f9f93720d6386d99d7bd818cb0973f |
MD5 | e465cea73613f399e414ed84fce3d8d7 |
BLAKE2b-256 | 064ff6b8ecbf4133fe8d71917fd828975d4a1a5107baae6d755c06e9a303b136 |