Another PEG Parsing Tool

More Parsing!

A fork of pyparsing for faster parsing

Installation

mo-parsing is published on PyPI:

pip install mo-parsing

Usage

This module allows you to define a PEG parser using predefined patterns and Python operators. Here is an example:

>>> from mo_parsing import Word
>>> from mo_parsing.utils import alphas
>>>
>>> greet = Word(alphas)("greeting") + "," + Word(alphas)("person") + "!"
>>> result = greet.parse_string("Hello, World!")

The result can be accessed as a nested list:

>>> list(result)
['Hello', ',', 'World', '!']

The result can also be accessed as a dictionary:

>>> dict(result)
{'greeting': 'Hello', 'person': 'World'}

Read the pyparsing documentation for more details.

The `Whitespace` Context

mo_parsing.whitespaces.CURRENT is used during parser creation: it effectively defines what "whitespace" to skip during parsing, with additional features to simplify the language definition. You declare "standard" Whitespace like so:

with Whitespace() as whitespace:
    # PUT YOUR LANGUAGE DEFINITION HERE (space, tab and newline are "whitespace")

If you are declaring a large language, and you want to minimize indentation, and you are careful, you may also use this pattern:

whitespace = Whitespace().use()
# PUT YOUR LANGUAGE DEFINITION HERE
whitespace.release()

The whitespace context can be used to set global parsing parameters, such as:

set_whitespace() - set the ignored characters (default: "\t\n ")
add_ignore() - include whole patterns that are ignored (like comments)
set_literal() - set the definition for what Literal() means
set_keyword_chars() - set the characters for the default Keyword() (important for defining word boundaries)
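
To make the two usage patterns above concrete, here is a minimal, standard-library-only sketch of how a context object with both `with` support and explicit `use()`/`release()` calls might be built. This is not mo-parsing's implementation: `WhitespaceSketch`, `CURRENT_STACK`, `current()` and `skip()` are hypothetical names chosen for illustration, and only the method names `use`, `release` and `set_whitespace` are taken from the text above.

```python
# Illustrative sketch only, NOT mo-parsing's code.
# WhitespaceSketch, CURRENT_STACK, current() and skip() are hypothetical names.

CURRENT_STACK = []  # the innermost active context is the last item


class WhitespaceSketch:
    def __init__(self, white_chars="\t\n "):
        self.white_chars = white_chars

    def set_whitespace(self, chars):
        # Change which characters are skipped while this context is active
        self.white_chars = chars

    def use(self):
        # Make this context the current one and return it
        CURRENT_STACK.append(self)
        return self

    def release(self):
        # Contexts must be released in the reverse order they were used
        popped = CURRENT_STACK.pop()
        if popped is not self:
            raise RuntimeError("whitespace contexts released out of order")

    def __enter__(self):
        return self.use()

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False  # never swallow exceptions

    def skip(self, text, index):
        # Advance index past any characters considered whitespace
        while index < len(text) and text[index] in self.white_chars:
            index += 1
        return index


def current():
    return CURRENT_STACK[-1]


# Pattern 1: the with-block releases the context automatically
with WhitespaceSketch() as ws:
    inside_with = current() is ws
    skipped_to = current().skip("  \t hello", 0)

# Pattern 2: less indentation, but release() must be called by hand
ws2 = WhitespaceSketch().use()
ws2.set_whitespace(" ")  # newline is no longer skipped
stops_at_newline = current().skip(" \nx", 0)
ws2.release()
```

The explicit form saves a level of indentation, at the cost that a forgotten `release()` leaves the context active for everything defined afterwards, which is why the text asks you to be careful.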

Navigating ParseResults

Parsing results are returned as ParseResults, which form an n-ary tree with the children found in ParseResults.tokens. Each ParseResults.type points to the ParserElement that made it. In general, if you want to get fancy with post-processing (or in a parse_action), you will need to navigate the raw tokens to generate a final result.

There are some convenience methods:

__iter__() - allows you to iterate through parse results in depth-first order. Empty results are skipped, and Grouped results are treated as atoms (which can be further iterated if required)
name - a convenience property for ParseResults.type.token_name
__getitem__() - allows you to jump into the parse tree to the given name. This is blocked by any names found inside Grouped results (because groups are considered atoms).
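
The navigation rules above can be sketched with a small tree class. This is an illustration under stated assumptions, not the ParseResults implementation: `Node` and its `grouped` flag are hypothetical stand-ins, and only the behaviours described above (depth-first iteration, skipped empties, groups as atoms, name lookup blocked by groups) are modelled.

```python
# Illustrative sketch only; Node is a hypothetical stand-in for ParseResults.

class Node:
    def __init__(self, name=None, tokens=(), grouped=False):
        self.name = name
        self.tokens = list(tokens)  # children: Node instances or plain values
        self.grouped = grouped

    def __iter__(self):
        # Depth-first walk; a grouped child is yielded whole (as an atom),
        # and an empty child contributes nothing
        for token in self.tokens:
            if isinstance(token, Node):
                if token.grouped:
                    yield token
                else:
                    yield from token
            else:
                yield token

    def __getitem__(self, name):
        # Find the first descendant with the given name, without looking
        # inside grouped children
        for token in self.tokens:
            if not isinstance(token, Node):
                continue
            if token.name == name:
                return token
            if not token.grouped:
                try:
                    return token[name]
                except KeyError:
                    pass
        raise KeyError(name)


group = Node(name="args", grouped=True, tokens=[Node(name="person", tokens=["World"])])
tree = Node(tokens=[
    Node(name="greeting", tokens=["Hello"]),
    Node(tokens=[]),  # empty result: skipped during iteration
    ",",
    group,
])

flat = list(tree)             # the group appears as a single item
inner = list(flat[-1])        # iterate the group explicitly to look inside
person = tree["args"]["person"]  # tree["person"] would raise KeyError
```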

Parse Actions

Parse actions are functions that run after a ParserElement finds a match.

Parameters must be accepted in (tokens, index, string) order (the opposite of pyparsing)
Parse actions are wrapped to ensure the output is a legitimate ParseResult:
- If your parse action returns None then the result is the original tokens
- If your parse action returns an object, or list, or tuple, then it will be packaged in a ParseResult with same type as tokens.
- If your parse action returns a ParseResult then it is accepted even if it belongs to some other pattern
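
The three wrapping rules above can be sketched as a plain function wrapper. This is a sketch under assumptions, not the library's code: `Result` and `wrap_parse_action` are hypothetical names, and the choice to treat a returned list or tuple as the new token list (rather than as a single token) is an assumption the text does not settle.

```python
# Illustrative sketch only; Result and wrap_parse_action are hypothetical.

class Result:
    """Stand-in for a parse result: the pattern type plus its tokens."""

    def __init__(self, type_, tokens):
        self.type = type_
        self.tokens = list(tokens)


def wrap_parse_action(action):
    def wrapped(tokens, index, string):
        out = action(tokens, index, string)
        if out is None:
            return tokens  # rule 1: keep the original tokens
        if isinstance(out, Result):
            return out  # rule 3: accept as-is, whatever pattern it came from
        if isinstance(out, (list, tuple)):
            # rule 2 (assumption: the sequence becomes the token list)
            return Result(tokens.type, out)
        return Result(tokens.type, [out])  # rule 2: package a single object

    return wrapped


original = Result("integer", ["42"])
to_int = wrap_parse_action(lambda t, i, s: int(t.tokens[0]))
converted = to_int(original, 0, "42")
unchanged = wrap_parse_action(lambda t, i, s: None)(original, 0, "42")
```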

Simple example:

integer = Word("0123456789").add_parse_action(lambda t, i, s: int(t[0]))
result = integer.parse_string("42")
assert (result[0] == 42)

For slightly shorter specification, you may use the / operator and only parameters you need:

integer = Word("0123456789") / (lambda t: int(t[0]))
result = integer.parse_string("42")
assert (result[0] == 42)
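
One way a library could let an action declare "only the parameters you need" is to inspect its signature and pass just the leading arguments. The sketch below shows that technique with the standard `inspect` module; whether mo-parsing does it this way is not stated above, and `adapt_action` is a hypothetical helper.

```python
# Illustrative sketch only; adapt_action is a hypothetical helper and this
# may not be how mo-parsing adapts parse actions.
import inspect


def adapt_action(action):
    """Return a (tokens, index, string) callable that forwards only as many
    leading arguments as `action` declares."""
    params = list(inspect.signature(action).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params):
        count = 3  # *args accepts everything
    else:
        positional = [
            p for p in params
            if p.kind in (
                inspect.Parameter.POSITIONAL_ONLY,
                inspect.Parameter.POSITIONAL_OR_KEYWORD,
            )
        ]
        count = min(len(positional), 3)

    def adapted(tokens, index, string):
        return action(*(tokens, index, string)[:count])

    return adapted


tokens_only = adapt_action(lambda t: int(t[0]))
all_three = adapt_action(lambda t, i, s: (t[0], i, s))

value = tokens_only(["42"], 0, "42")
triple = all_three(["a"], 3, "xyz")
```

Because arguments are passed by position, the (tokens, index, string) order matters: a one-parameter action always receives the tokens.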

Debugging

The PEG style of mo-parsing (inherited from pyparsing) makes for a very expressive and readable specification, but debugging a parser is still hard. To look deeper into what the parser is doing, use the Debugger:

with Debugger():
    expr.parse_string("my new language")

The debugger will print out details of what's happening:

Each attempt, and whether it matched or failed
A small number of bytes to show you the current position
Location, line and column for more info about the current position
Whitespace indicating stack depth
A printout of the ParserElement performing the attempt

This should help to isolate the exact position where your grammar is failing.
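
To show the kind of trace described above, here is a small tracing sketch built on plain functions. It is not the Debugger's implementation, and the output format is invented: `traced`, `trace_lines`, `word` and `greeting` are hypothetical names, and matchers are simple functions taking (text, index) and returning the end index or None.

```python
# Illustrative sketch only; this is not mo-parsing's Debugger and the
# line format below is made up for the example.

trace_lines = []
state = {"depth": 0}  # current nesting depth of match attempts


def traced(name):
    def decorator(match):
        def wrapper(text, index):
            # Derive 1-based line and column from the character offset
            line = text.count("\n", 0, index) + 1
            column = index - (text.rfind("\n", 0, index) + 1) + 1
            indent = "  " * state["depth"]  # whitespace shows stack depth
            snippet = text[index:index + 10]
            trace_lines.append(
                f"{indent}attempt {name} at loc={index} "
                f"(line {line}, col {column}): {snippet!r}"
            )
            state["depth"] += 1
            try:
                end = match(text, index)
            finally:
                state["depth"] -= 1
            outcome = "matched" if end is not None else "failed"
            trace_lines.append(f"{indent}{outcome} {name}")
            return end

        return wrapper

    return decorator


@traced("word")
def word(text, index):
    end = index
    while end < len(text) and text[end].isalpha():
        end += 1
    return end if end > index else None


@traced("greeting")
def greeting(text, index):
    end = word(text, index)
    if end is None or text[end:end + 1] != ",":
        return None
    return end + 1


greeting("Hello, World!", 0)  # succeeds
greeting("42", 0)             # fails inside word
print("\n".join(trace_lines))
```

Reading the indented trace from the bottom up shows the innermost pattern that failed and where, which is the same idea the Debugger output is meant to support.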

Regular Expressions

mo-parsing can parse and generate regular expressions. ParserElement has a __regex__() function that returns the regular expression for the given grammar; this works up to a limit, and is used internally to accelerate parsing. The Regex class parses regular expressions into a grammar; it is used to optimize parsing, and you may find it useful for decomposing regular expressions that look like line noise.
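
As a rough picture of how a grammar can be turned into a regular expression, the sketch below compiles a tiny pattern tree into a pattern for the standard `re` module. The classes `Lit`, `Chars`, `Seq` and `Alt` and the method name `to_regex` are hypothetical; the real `__regex__()` may return something different, and constructs such as recursion have no regex equivalent, which is one reason this only "works up to a limit".

```python
# Illustrative sketch only; Lit, Chars, Seq, Alt and to_regex are
# hypothetical and do not mirror mo-parsing's classes.
import re


class Lit:
    """Match an exact string."""

    def __init__(self, text):
        self.text = text

    def to_regex(self):
        return re.escape(self.text)


class Chars:
    """Match one or more characters from a set."""

    def __init__(self, chars):
        self.chars = chars

    def to_regex(self):
        return "[" + re.escape(self.chars) + "]+"


class Seq:
    """Match each part in order."""

    def __init__(self, *parts):
        self.parts = parts

    def to_regex(self):
        return "".join(part.to_regex() for part in self.parts)


class Alt:
    """Try each alternative in order."""

    def __init__(self, *options):
        self.options = options

    def to_regex(self):
        return "(?:" + "|".join(option.to_regex() for option in self.options) + ")"


signed_integer = Seq(Alt(Lit("+"), Lit("-")), Chars("0123456789"))
pattern = re.compile(signed_integer.to_regex())

accepts_negative = pattern.fullmatch("-42") is not None
rejects_unsigned = pattern.fullmatch("42") is None
```

A compiled pattern like this can be used as a cheap pre-check: if the regex cannot match at a position, the full grammar does not need to be attempted there.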

Differences from PyParsing

This fork was originally created to support faster parsing for mo-sql-parsing. Since then it has deviated sufficiently to be its own collection of parser specification functions. Here are the differences:

Added Whitespace, which controls parsing context and whitespace. It replaces the whitespace-modifying methods of pyparsing
The wildcard ("*") could be used in pyparsing to indicate that multiple values are expected; this is not allowed in mo-parsing: all values are multi-values
ParserElements are static: for example, expr.add_parse_action(action) creates a new ParserElement, so it must be assigned to a variable or it is lost. This is the biggest source of bugs when converting from pyparsing
Removed all backward-compatibility settings
No support for binary serialization (no pickle)
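
The "static ParserElement" point in the list above is easy to trip over, so here is a standard-library-only sketch of the same pattern. `Pattern` is a hypothetical immutable class, not a mo-parsing type; it only mimics the behaviour described above, where `add_parse_action` returns a new object and leaves the original untouched.

```python
# Illustrative sketch only; Pattern is a hypothetical immutable class.

class Pattern:
    """Match a run of allowed characters, then apply an optional action."""

    def __init__(self, chars, action=None):
        self._chars = chars
        self._action = action

    def add_parse_action(self, action):
        # Return a NEW pattern; self is never modified
        return Pattern(self._chars, action)

    def parse(self, text):
        end = 0
        while end < len(text) and text[end] in self._chars:
            end += 1
        if end == 0:
            raise ValueError("no match")
        token = text[:end]
        return self._action(token) if self._action else token


digits = Pattern("0123456789")

digits.add_parse_action(int)            # BUG: the new pattern is discarded
still_text = digits.parse("42")         # digits is unchanged

integer = digits.add_parse_action(int)  # correct: keep the returned pattern
number = integer.parse("42")
```

Code ported from pyparsing often calls such methods for their side effect; with immutable elements, every such call needs its result assigned.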

Faster Parsing

faster infix operator parsing (main reason for this fork)
ParseResults point to ParserElement for reduced size
regex used to reduce the number of failed parse attempts
packrat parser is not needed
less stack used

Contributing

If you plan to extend or enhance this code, please see the README in the tests directory.

Release history

8.581.24094 (this version), Apr 3, 2024


