Simple parser combinator made in Python

Project description

Pyrsec

This is where we are now in the journey of creating a parser combinator in Python while being as type safe as possible. I don't recommend using this for anything important, only for exploration and fun. This library is a mostly undocumented, bare-bones implementation of a parser combinator. No error recovery is currently in place: the parser only returns None when it can't continue. I started with a minimal implementation, added a basic JSON parser as a test, and kept adding functionality as needed.
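To make the "only None is returned" behaviour concrete, here is a minimal standalone sketch of the general idea. It is not Pyrsec's actual implementation: it assumes a parser is just a function from the input string to either a `(value, rest)` pair or `None`, and the names `Parser`, `literal`, and `regex` are hypothetical.

```python
import re
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")

# Assumption: a parser takes the remaining input and returns
# (parsed value, unconsumed input) on success, or None on failure.
Parser = Callable[[str], Optional[Tuple[T, str]]]


def literal(expected: str) -> "Parser[str]":
    """Hypothetical helper: match an exact string at the start of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None  # No error message and no recovery, just None.

    return parse


def regex(pattern: "re.Pattern[str]") -> "Parser[str]":
    """Hypothetical helper: match a compiled regex at the start of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        match = pattern.match(text)
        if match is None:
            return None
        return match.group(0), text[match.end():]

    return parse


ok = literal("true")("true, 1")
failed = literal("true")("false")
digits = regex(re.compile(r"-?\d+"))("-42]")
```

Because a failure carries no position or message, the caller can only tell that parsing stopped, not where or why.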

A JSON parser as an example

See the tests to convince yourself that it works.

If you paste this into an editor with type-checking support, you might be able to hover over the variables to see their types.

import re
from typing import Dict, List, Union

from pyrsec import Parsec

# Recursive type alias 👀. Note that floats are not parsed here.
JSON = Union[bool, int, None, str, List["JSON"], Dict[str, "JSON"]]

# To be defined later
json: Parsec[JSON]

# For recursive parsers like `list_` and `dict_`
deferred_json_ = Parsec.from_deferred(lambda: json)

# Basic values
true = Parsec.from_string("true").map(lambda _: True)
false = Parsec.from_string("false").map(lambda _: False)
null = Parsec.from_string("null").map(lambda _: None)
number = Parsec.from_re(re.compile(r"-?\d+")).map(int)

quote = Parsec.from_string('"').ignore()
string = quote >> Parsec.from_re(re.compile(r"[^\"]*")) << quote

# Whitespace is always optional in JSON, that's why the regular expression uses `*`.
# `ignore` only turns a `Parsec[_T]` into a `Parsec[None]` by discarding its consumed value.
space = Parsec.from_re(re.compile(r"\s*")).ignore()
comma = Parsec.from_string(",").ignore()

opened_square_bracket = Parsec.from_string("[")
closed_square_bracket = Parsec.from_string("]")

# Operator overloading is pretty handy here: `|`, `&`, `>>`, and `<<` were overloaded to express pretty much
# what you already expect from them.
# If you use `a | b` you will get `a or b`.
# If you use `a & b` you will get `a and b`.
# If you use `a >> b` it will discard the left side after parsing; `a << b` discards the right side.
list_ = (
    opened_square_bracket
    >> Parsec.sep_by(
        comma,
        deferred_json_,  # See the use of the recursive json parser?
    )
    << closed_square_bracket
)

opened_bracket = Parsec.from_string("{").ignore()
closed_bracket = Parsec.from_string("}").ignore()
colon = Parsec.from_string(":").ignore()

pair = ((space >> string << space) << colon) & deferred_json_

dict_ = (
    opened_bracket
    >> Parsec.sep_by(
        comma,
        pair,
    ).map(lambda xs: dict(xs))  # With only `dict` the linter goes crazy, idk.
    << closed_bracket
)

json = space >> (true | false | number | null | string | list_ | dict_) << space

json(
    """
{
    "json_parser": true
}
"""
)  # ({ 'json_parser': True }, '')
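The operator comments in the example can be read as follows. This is a sketch built on a hypothetical `P` class, assuming the same `(value, rest)` or `None` convention that the example output suggests; Pyrsec's real `Parsec` class may be implemented differently.

```python
from typing import Any, Callable, Optional, Tuple

Result = Optional[Tuple[Any, str]]


class P:
    """Hypothetical stand-in for a parser object; not Pyrsec's Parsec class."""

    def __init__(self, run: Callable[[str], Result]) -> None:
        self.run = run

    def __call__(self, text: str) -> Result:
        return self.run(text)

    def map(self, fn: Callable[[Any], Any]) -> "P":
        # Transform the parsed value, leaving the remaining input untouched.
        def run(text: str) -> Result:
            result = self(text)
            if result is None:
                return None
            value, rest = result
            return fn(value), rest

        return P(run)

    def __or__(self, other: "P") -> "P":
        # a | b: try a; if it fails, try b on the same input.
        def run(text: str) -> Result:
            result = self(text)
            return result if result is not None else other(text)

        return P(run)

    def __and__(self, other: "P") -> "P":
        # a & b: run a, then b on what is left; keep both values as a pair.
        def run(text: str) -> Result:
            first = self(text)
            if first is None:
                return None
            left, rest = first
            second = other(rest)
            if second is None:
                return None
            right, rest = second
            return (left, right), rest

        return P(run)

    def __rshift__(self, other: "P") -> "P":
        # a >> b: run both in sequence, keep only b's value.
        return (self & other).map(lambda pair: pair[1])

    def __lshift__(self, other: "P") -> "P":
        # a << b: run both in sequence, keep only a's value.
        return (self & other).map(lambda pair: pair[0])


def lit(expected: str) -> P:
    """Hypothetical helper: match an exact string."""
    return P(
        lambda text: (expected, text[len(expected):])
        if text.startswith(expected)
        else None
    )


true = lit("true").map(lambda _: True)
false = lit("false").map(lambda _: False)
boolean = true | false
bracketed = lit("[") >> boolean << lit("]")
```

Since `>>` and `<<` share the same precedence and associate to the left in Python, `lit("[") >> boolean << lit("]")` groups as `(lit("[") >> boolean) << lit("]")`, which keeps only the value in the middle.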

Enjoy!

Download files

Source Distribution

pyrsec-0.1.0.tar.gz (5.9 kB view details)

Uploaded Source

Built Distribution

pyrsec-0.1.0-py3-none-any.whl (5.8 kB view details)

Uploaded Python 3

File details

Details for the file pyrsec-0.1.0.tar.gz.

File metadata

  • Download URL: pyrsec-0.1.0.tar.gz
  • Upload date:
  • Size: 5.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for pyrsec-0.1.0.tar.gz
Algorithm Hash digest
SHA256 46633789838f2d823e7b0b6a7a8aeb696e48246b87d5a3135f9061ff6b4da739
MD5 b9d5eb141d8b834a5b07144c1064cc0c
BLAKE2b-256 f17327f281e6b945629354c66e29abf92a0429abacadca668e0ee2a220c37828

File details

Details for the file pyrsec-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pyrsec-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for pyrsec-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d82eee9853b4bc5defb46902a9ad36ff3fe8a7f4b58847adca83f687646b2470
MD5 306319f680cbeb3902c3e48411703b13
BLAKE2b-256 8bad391ca9958077e12c778d166c77375c596062daf3e07d6db96694cd40e062
