Skip to main content

LL(1) parser generator

Project description

Arcee

Build Status

It is a Python parser generator, use EBNF-like syntax.

Install

$ pip install Arcee

Example

It's really readable.

grammar:

KEYWORDS        : let, if, zero, -
NUMBER          : \d+(\.\d*)?
ASSIGN          : =
SUBTRACTION     : -
RIGHT_BRACKET   : (
COLON           : ,
LETF_BRACKET    : )
ID              : [A-Za-z]+
SKIP            : [ \\t]+

program : expression ;
expression : zeroexp
    | diffexp
    | ifexp
    | varexp
    | letexp
    | constexp
    ;
constexp : $NUMBER ;
diffexp : '-' '(' expression ',' expression ')' ;
zeroexp : 'zero' '(' expression ')' ;
ifexp : 'if' expression 'then' expression 'else' expression ;
varexp : $ID ;
letexp : 'let' $ID '=' expression 'in' expression ;
$ arcee grammar > result.py

result.py has three parts:

Token

from collections import namedtuple

Token = namedtuple('Token', ['type', 'value', 'line', 'column'])
Program = namedtuple('Program', ['expression'])
# ...

Lexer

import re

def tokenize(code):
    pass # ...

Parser

class Parser:
    def __init__(self, token_list):
        pass

    # ... 

    def parse_expression(self):
        if xxx:
            self.parse_constexp()
        elif yyy:
            self.parse_diffexp()
        #...

    def parse_constexp(self):
        pass

    def parse_diffexp(self):
        pass

    def parse_zeroexp(self):
        pass

    def parse_ifexp(self):
        pass

    def parse_varexp(self):
        pass

    def parse_letexp(self):
        pass

You can parse input such as:

input = '''let a = 0 in if zero(a) then -(a, 1) else -(a, 2)'''

tokens = list(tokenize(input))

parser = Parser(tokens)

parser.parse_program()

result is:

result = Program(
    expression=Expression(
        nonterminal=Letexp(
            ID=Token(type='ID', value='a', line=2, column=4),
            expression1=Expression(
                nonterminal=Constexp(
                    NUMBER=Token(type='NUMBER', value='0', line=2, column=8))),
            expression2=Expression(
                nonterminal=Ifexp(
                    expression1=Expression(
                        nonterminal=Zeroexp(
                            expression=Expression(
                                nonterminal=Varexp(
                                    ID=Token(type='ID', value='a', line=2, column=21))))),
                    expression2=Expression(
                        nonterminal=Diffexp(
                            expression1=Expression(
                                nonterminal=Varexp(
                                    ID=Token(type='ID', value='a', line=2, column=31))),
                            expression2=Expression(
                                nonterminal=Constexp(
                                    NUMBER=Token(type='NUMBER', value='1', line=2,
                                                 column=34))))),
                    expression3=Expression(
                        nonterminal=Diffexp(
                            expression1=Expression(
                                nonterminal=Varexp(
                                    ID=Token(type='ID', value='a', line=2, column=44))),
                            expression2=Expression(
                                nonterminal=Constexp(
                                    NUMBER=Token(type='NUMBER', value='2', line=2,
                                                 column=47))))))))))

Now, you can use this ast to do what you like.

You can also use api in your Python file.

from arcee import generate

grammar = '''KEYWORDS        : let, if, zero, -
NUMBER          : \d+(\.\d*)?
ASSIGN          : =
SUBTRACTION     : -
RIGHT_BRACKET   : (
COLON           : ,
LETF_BRACKET    : )
ID              : [A-Za-z]+
SKIP            : [ \\t]+

program : expression ;
expression : zeroexp
    | diffexp
    | ifexp
    | varexp
    | letexp
    | constexp
    ;
constexp : $NUMBER ;
diffexp : '-' '(' expression ',' expression ')' ;
zeroexp : 'zero' '(' expression ')' ;
ifexp : 'if' expression 'then' expression 'else' expression ;
varexp : $ID ;
letexp : 'let' $ID '=' expression 'in' expression ;'''

print(generate(grammar))

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

arcee-0.1.1.tar.gz (8.7 kB view hashes)

Uploaded Source

Built Distribution

arcee-0.1.1-py3-none-any.whl (14.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page