Statically generating standalone regex-based lexers and highly optimized LL(k) parsers
Frontend-For-Free
A bootstrap of RBNF.hs to generate standalone parsers targeting multiple programming languages.
Standalone: the generated code can run without runtime dependencies other than the language and standard libraries.
Installation
Install from Sources
You can build and install the binaries from source via The Haskell Tool Stack:
sh> stack install .
Install from Binaries
Alternatively, pre-built binaries for several platforms (Win64, generic Linux, macOS 10.13-10.15) are released on GitHub.
Download them from Releases and add fff-lex and fff-pgen to your PATH.
In Addition, You Need the Python Wrapper
frontend-for-free currently provides a wrapper for Python only:
pip install frontend-for-free
or install it from GitHub.
Usage
sh> fff <xxx>.rbnf --trace [--lexer_out <xxx>_lex.py] [--parser_out <xxx>_parser.py]
sh> # note that you should also provide a <xxx>.rlex file
sh> ls | grep <xxx>
<xxx>_parser.py <xxx>_lex.py
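As a concrete, purely illustrative instance: for a grammar named arith (as in the Galleries below), with arith.rbnf and arith.rlex in the working directory, the invocation would be:
sh> fff arith.rbnf --trace --lexer_out arith_lex.py --parser_out arith_parser.py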
See examples at runtest.
What is Frontend-For-Free?
A framework for generating context-free parsers with the following features:
- cross-language
- distributed with a lexer generator, but feel free to use your own lexers.
- LL(k) capability
- efficient left recursions
- standalone: no third-party library is introduced by the generated code, although the generator itself requires Python 3.6+ with a few dependencies
- defined with a highly intuitive and expressive BNF derivative:
  - action/rewrite:
    pair := a b { ($1, $2) }
  - parameterised polymorphism for productions:
    nonEmpty[A] := A { [$1] } | hd=A tl=nonEmpty[A] { tl.append(hd); tl }
    where append shall be provided by the user code (see the Python sketch below).
Currently,
- the parser generator's support for a programming language is hard-coded in src/RBNF/BackEnds/<LanguageName>.hs.
- the lexer generator's support for a programming language is hard-coded in ffflex.py.
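To make the two action examples above concrete, here is a minimal, hypothetical Python sketch of what those actions compute when the target language is Python; the function names are invented for illustration and are not part of any generated API.

# Hypothetical hand-written equivalents of the actions shown above,
# assuming semantic values are plain Python tuples/lists (illustration only).

def pair_action(a, b):
    # pair := a b { ($1, $2) } -- build a 2-tuple from the two matched values
    return (a, b)

def non_empty_single(a):
    # nonEmpty[A] := A { [$1] } -- wrap the single element in a list
    return [a]

def non_empty_cons(hd, tl):
    # ... | hd=A tl=nonEmpty[A] { tl.append(hd); tl }
    # `append` must be provided by the user's semantic values;
    # for Python lists it is simply the built-in list.append
    tl.append(hd)
    return tl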
Galleries
- Parsing JSON
- Parser as Interpreter: Implementing a Programming Language within 20 Minutes
- Parsing LaTeX
  - lexer: gkdtex.rlex
  - parser: gkdtex.gg
- Parsing LLVM IR (a major subset)
  - lexer: llvmir.rlex
  - parser: llvmir.rbnf
- Parsing nested arithmetic expressions
  - lexer: arith.rlex
  - parser: arith.rbnf
- Parsing the BNF derivative used by FFF (bootstrap)
  - lexer: fffbnf.rlex
  - parser: fffbnf.rbnf
- Parsing ML syntax:
  - (OLD VER 0) Parsing ML syntax and converting it to DrRacket
    - lexer: yesml.rlex
    - parser: yesml.rbnf
- (OLD VER 1) Muridesu: in the style of Mulan, building a language more powerful than Python and shaped like GoLang within three hours
  - lexer: muridesu.rlex
  - parser: muridesu.exrbnf
- (OLD VER 2) Parsing Python ASDL files
  - lexer: asdl.rlex
  - parser: asdl.exrbnf
OLD VER 2, OLD VER 1 and OLD VER 0 are out of date, hence their code generation no longer works with the master branch.
However, the previously generated code is permanent and still works.
Furthermore, OLD VER 2 can easily be brought up to date by manually performing the following transformations:
- changing slots $0, $1, $2, ... to $1, $2, $3, ...
- changing list(rule) to list[rule], and providing the definition of the list production:
  list[p] ::= p { [$1] } | list[p] p { $1.append($2); $1 }
- changing separated_list(sep, rule) to separated_list[sep, rule], and providing the definition of the separated_list production:
  separated_list[sep, p] ::= p { [$1] } | separated_list[sep, p] sep p { $1.append($3); $1 }
End-To-End: A Common Pattern for Using the Generated Parser
In most cases, you don't need to understand parsing components such as lexers, token tables, or parser states.
In fact, you can access your generated parser through a single function, parse(source_code, filename="<unknown>"):
from <the generated parser module> import *
from <the generated lexer module> import lexer

__all__ = ["parse"]

_parse = mk_parser()


def parse(text: str, filename: str = "unknown"):
    tokens = lexer(filename, text)
    status, res_or_err = _parse(None, Tokens(tokens))
    if status:
        return res_or_err

    # on failure, res_or_err holds (token index, message) pairs;
    # use the first one to build a SyntaxError
    lineno = None
    colno = None
    filename = None
    offset = 0
    msg = ""
    for i, msg in res_or_err:
        token = tokens[i]
        lineno = token.lineno + 1
        colno = token.colno
        offset = token.offset
        filename = token.filename
        break

    e = SyntaxError(msg)
    e.lineno = lineno
    e.colno = colno
    e.filename = filename
    e.text = text[offset - colno:text.find('\n', offset)]
    e.offset = colno
    raise e
Calling parse gives you either the expected parse result or a reasonably readable error message.
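For example, a caller might drive the wrapper as follows; the file name example.src is a placeholder, not anything produced by the generator.

# Hypothetical usage of the parse() wrapper defined above.
if __name__ == "__main__":
    with open("example.src") as f:
        source = f.read()
    try:
        result = parse(source, filename="example.src")
        print(result)
    except SyntaxError as exc:
        # parse() fills filename/lineno/offset/text from the offending token
        print(f"{exc.filename}:{exc.lineno}:{exc.offset}: {exc.msg}")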