Another PEG Parsing Tool

mo-parsing

An experimental fork of pyparsing

Summary of Differences

This has been forked to experiment with faster parsing in the moz-sql-parser.

More features

  • Added Engine, which controls parsing context and whitespace (think lexer)
  • faster infix parsing (main reason for this fork)
  • ParseResults point to ParserElement for reduced size
  • packrat parser is always on
  • less stack used
  • the wildcard ("*"), which in pyparsing indicates that multiple values are expected, is not allowed: all values are multi-values
  • all actions are in f(tokens, index, string) form, which is the opposite of pyparsing's f(string, index, tokens) form
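
When porting an existing pyparsing action, the argument order has to be reversed. A minimal sketch of an adapter, assuming the original action takes all three arguments; flip_action and shout are hypothetical names and are not part of either library:

```python
from functools import wraps


def flip_action(pyparsing_style_action):
    """Wrap an f(string, index, tokens) action so it is callable as f(tokens, index, string)."""

    @wraps(pyparsing_style_action)
    def action(tokens, index, string):
        return pyparsing_style_action(string, index, tokens)

    return action


# A pyparsing-style action: (string, index, tokens)
def shout(string, index, tokens):
    return [str(t).upper() for t in tokens]


ported = flip_action(shout)
result = ported(["select", "from"], 0, "select from")
```

Here plain lists stand in for tokens; a real action would receive a ParseResults object instead.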

More focused

  • removed all backward-compatibility settings
  • no support for binary serialization (no pickle)
  • ParseActions must adhere to a strict interface

More functional

  • tokens are static and cannot be changed; parsing functions must emit new objects
  • ParserElements are static: many are generated during language definition

Details

The Engine

The mo_parsing.engine.CURRENT is used during parser creation: it is effectively the lexer, with additional features to simplify the language definition. You declare a standard Engine like so:

with Engine() as engine:
    # PUT YOUR LANGUAGE DEFINITION HERE

If you are declaring a large language, and you want to minimize indentation, and you are careful, you may also use this pattern:

engine = Engine().use()
# PUT YOUR LANGUAGE DEFINITION HERE
engine.release()
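
To see why the second pattern needs care, here is a minimal sketch of how a "current engine" could be tracked. SketchEngine is a hypothetical stand-in written for this illustration; it is not mo-parsing's Engine, and its internals are assumptions:

```python
class SketchEngine:
    """Toy stand-in for an engine that tracks a class-level 'current' instance."""

    CURRENT = None  # the engine that newly created parser elements would pick up

    def __init__(self, white=" \t\n"):
        self.white = white
        self._previous = None

    def use(self):
        # Remember whatever was current, then install ourselves
        self._previous = SketchEngine.CURRENT
        SketchEngine.CURRENT = self
        return self

    def release(self):
        # Restore the engine that was current before use()
        SketchEngine.CURRENT = self._previous
        self._previous = None

    def __enter__(self):
        return self.use()

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
        return False  # do not swallow exceptions


with SketchEngine(white=" ") as outer:
    inside_with = SketchEngine.CURRENT is outer

engine = SketchEngine().use()
try:
    inside_use = SketchEngine.CURRENT is engine
finally:
    engine.release()  # runs even if the language definition raises
after_release = SketchEngine.CURRENT
```

The with form restores the previous engine automatically. With use() and release(), a forgotten or skipped release() leaves the engine installed for everything defined afterwards, which is why the sketch wraps the definition in try/finally.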

The engine can be used to set global parsing parameters, such as:

  • set_whitespace() - set the ignored characters (like whitespace)
  • add_ignore() - include whole patterns that are ignored (like comments)
  • set_debug_actions() - insert functions to run for detailed debugging
  • set_literal() - set the definition for what Literal() means
  • set_keyword_chars() - set the keyword characters for the default Keyword()

The engine.CURRENT is attached to every parser element created, and it is used during parsing to packrat (memoize) the string currently being parsed.
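
Packrat parsing means remembering the outcome of each (element, position) attempt so it is never repeated on the same string. A minimal sketch of the idea, using hypothetical toy classes (CountingLiteral, PackratCache) rather than mo-parsing's actual cache:

```python
class CountingLiteral:
    """Toy parser element that matches a fixed string and counts real attempts."""

    def __init__(self, text):
        self.text = text
        self.attempts = 0

    def parse(self, string, start):
        self.attempts += 1
        if string.startswith(self.text, start):
            return start + len(self.text), self.text
        return None  # no match


class PackratCache:
    """Memoize (element, position) -> result for one input string."""

    def __init__(self, string):
        self.string = string
        self.cache = {}

    def parse(self, element, start):
        key = (id(element), start)
        if key not in self.cache:
            self.cache[key] = element.parse(self.string, start)
        return self.cache[key]


select = CountingLiteral("select")
packrat = PackratCache("select a from b")
first = packrat.parse(select, 0)
second = packrat.parse(select, 0)  # served from the cache, no new attempt
```

The cache is tied to one input string, which matches the description above of the engine holding the packrat state for the string being parsed.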

Navigating ParseResults

ParseResults are in the form of an n-ary tree, with the children found in ParseResults.tokens. Each ParseResults.type points to the ParserElement that made it. In general, if you want to get fancy with post-processing (or in a parseAction), you will need to navigate the raw tokens to generate a final result.

There are some convenience methods:

  • __iter__() - allows you to iterate through parse results in depth-first order. Empty results are skipped, and Grouped results are treated as atoms (which can be further iterated if required)
  • name - a convenience property for ParseResults.type.token_name
  • __getitem__() - allows you to jump into the parse tree to the given name. This is blocked by any names found inside Grouped results (because groups are considered atoms).
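
The traversal rules above can be sketched with a toy tree. ToyElement and ToyResult are hypothetical stand-ins written for this illustration; only the attribute names tokens, type, token_name and name come from the description above, and the real classes may behave differently in detail:

```python
class ToyElement:
    """Stand-in for a ParserElement: a token name and a 'grouped' flag."""

    def __init__(self, token_name=None, grouped=False):
        self.token_name = token_name
        self.grouped = grouped


class ToyResult:
    """Stand-in for ParseResults: a type plus child tokens."""

    def __init__(self, type_, tokens):
        self.type = type_
        self.tokens = tokens

    @property
    def name(self):
        return self.type.token_name

    def __iter__(self):
        # Depth-first walk: skip empty results, yield grouped results whole
        for token in self.tokens:
            if not isinstance(token, ToyResult):
                yield token
            elif not token.tokens:
                continue
            elif token.type.grouped:
                yield token
            else:
                yield from token

    def __getitem__(self, name):
        # Search by name, but never look inside a grouped result
        for token in self.tokens:
            if not isinstance(token, ToyResult):
                continue
            if token.name == name:
                return token
            if not token.type.grouped:
                try:
                    return token[name]
                except KeyError:
                    pass
        raise KeyError(name)


column = ToyResult(ToyElement("column"), ["a"])
hidden = ToyResult(ToyElement("alias"), ["x"])
group = ToyResult(ToyElement("table", grouped=True), ["b", hidden])
empty = ToyResult(ToyElement("where"), [])
root = ToyResult(ToyElement("query"), [ToyResult(ToyElement(), ["select", column]), empty, group])

flat = list(root)               # the group appears as a single atom
inside_group = list(root["table"])  # iterate the atom further when needed
try:
    root["alias"]
    blocked = False
except KeyError:
    blocked = True              # "alias" lives inside the group, so it is hidden
```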

addParseAction

Parse actions are functions that run after a ParserElement finds a match.

  • Parameters must be accepted in (tokens, index, string) order (the opposite of pyparsing)
  • Parse actions are wrapped to ensure the output is a legitimate ParseResult
    • If your parse action returns None then the result is the original tokens
    • If your parse action returns an object, or list, or tuple, then it will be packaged in a ParseResult with the same type as tokens.
    • If your parse action returns a ParseResult then it is accepted even if it belongs to some other pattern
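
The three return-value rules can be sketched as a wrapper. This is an illustration under assumptions, not the library's code: ToyResult and wrap_parse_action are hypothetical names, and a plain string stands in for the ParserElement stored in type:

```python
class ToyResult:
    """Stand-in for a ParseResult: the element type that made it, plus tokens."""

    def __init__(self, type_, tokens):
        self.type = type_
        self.tokens = tokens


def wrap_parse_action(action):
    """Normalize whatever the action returns into a ToyResult."""

    def wrapped(tokens, index, string):
        output = action(tokens, index, string)
        if output is None:
            return tokens  # keep the original result
        if isinstance(output, ToyResult):
            return output  # accepted as-is, whatever its type
        if isinstance(output, (list, tuple)):
            return ToyResult(tokens.type, list(output))
        return ToyResult(tokens.type, [output])  # single object

    return wrapped


original = ToyResult("integer", ["42"])
to_int = wrap_parse_action(lambda tokens, index, string: int(tokens.tokens[0]))
noop = wrap_parse_action(lambda tokens, index, string: None)

converted = to_int(original, 0, "42")
unchanged = noop(original, 0, "42")
```

Note that to_int does not modify original; it produces a new result, in keeping with the rule that tokens are static.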
