Skip to main content

Simple parser that creates ASTs

Project description

A simple parsing tool in 200 lines of python

Give it a grammar and tokens, and it’ll give you back an ast.

Inputs

Use the parse() function to parse text. The tokens parameter is a dictionary of token names to compiled regular expressions. Prefixing a token with % will ignore it. Token names must be uppercase.

The rules parameter is a dictionary of names to lists of rules, and the start parameter is what node to start parsing first. postlex and postparse and post-processing functions. The default will:

  • after lexing remove all tokens that start with a %
  • after parsing flatten all nodes with 1 child

Rules

A rule is a list of rulesegments. A rulesegment can be a string, which means parse that TOKEN/rule_name. A list rule segement is a group, the last item in it can be one of the following:

  • ?: this group is optional
  • +: match at least one of this group, and possibly more
  • *: match any number of this group, including zero

An example rule looks like this:

[["NUM", "?"], ["var", "*"]]

The rule dictionary is a dictionary of rule_names to lists of these rules, each one being a possibility, like this:

{
    "multi_poly": [
        ["LPAREN", "polynomial", "RPAREN", "LPAREN", "polynomial", "RPAREN"],
        ["NUM", "LPAREN", "polynomial", "RPAREN"],
        ["polynomial"]
    ]
}

Output

The tree format of ParseIt is very simple: for rule instances (nodes) you have a tuple with the first value being the node name, and the second being its children.

Tokens are represented as a tuple of: the token name, the token content, and its position in the stream

treeify

There is also a module called treeify in the package that (if pydot is installed) allows you to create a graphical representation of the generated trees.

Example

example.py has an example grammar and code to convert the generated tree into a pydot

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
parseit-1.0.1-py2.7.egg (9.8 kB) Copy SHA256 hash SHA256 Egg 2.7
parseit-1.0.1-py2-none-any.whl (6.9 kB) Copy SHA256 hash SHA256 Wheel py2
parseit-1.0.1-py3.6.egg (5.1 kB) Copy SHA256 hash SHA256 Egg 3.6
parseit-1.0.1-py3-none-any.whl (6.9 kB) Copy SHA256 hash SHA256 Wheel py3
parseit-1.0.1.tar.gz (4.9 kB) Copy SHA256 hash SHA256 Source None

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page