Simple parser combinator made in Python

Pyrsec

This is where I am so far in the journey of writing a parser combinator in Python that is as type safe as possible. I don't recommend using it for anything important, only for exploration and fun. The library is a mostly undocumented, bare-bones implementation of a parser combinator. No error recovery is in place yet: a parser simply returns None when it can't continue. I started with a minimal implementation, added a basic JSON parser as a test, and kept adding functionality as needed.
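To make the "returns None" behaviour concrete, here is a minimal sketch of the idea. It is not pyrsec's actual code. It assumes a parser is just a function from the input text to either `(value, rest)`, the shape shown in the JSON example's output, or `None`. The names `literal` and `fmap` are made up for illustration.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")

# Assumption: a parser maps the input text to (value, remaining_text),
# or to None when it cannot continue.
Parser = Callable[[str], Optional[Tuple[T, str]]]


def literal(expected: str) -> Parser[str]:
    """Hypothetical helper: match an exact prefix of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None  # no error message and no recovery, just None

    return parse


def fmap(parser: Parser[T], fn: Callable[[T], U]) -> Parser[U]:
    """Hypothetical helper: transform the value of a successful parse."""

    def parse(text: str) -> Optional[Tuple[U, str]]:
        result = parser(text)
        if result is None:
            return None  # failure passes through untouched
        value, rest = result
        return fn(value), rest

    return parse


true = fmap(literal("true"), lambda _: True)

print(true("true!"))  # (True, '!')
print(true("false"))  # None
```

Since failure is only ever `None`, the caller has to check the result before unpacking it, and there is no hint about where or why the parse stopped.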

A JSON parser as an example

See the tests to convince yourself that it works.

If you paste this in an editor with type linting support you might be able to hover over the variables to see their types.

import re
from typing import Dict, List, Union

from pyrsec import Parsec

# Recursive type alias 👀. Note that floats are not parsed here.
JSON = Union[bool, int, None, str, List["JSON"], Dict[str, "JSON"]]

# To be defined later
json: Parsec[JSON]

# For recursive parsers like `list_` and `dict_`
deferred_json_ = Parsec.from_deferred(lambda: json)

# Basic values
true = Parsec.from_string("true").map(lambda _: True)
false = Parsec.from_string("false").map(lambda _: False)
null = Parsec.from_string("null").map(lambda _: None)
number = Parsec.from_re(re.compile(r"-?\d+")).map(int)

quote = Parsec.from_string('"').ignore()
string = quote >> Parsec.from_re(re.compile(r"[^\"]*")) << quote

# Space is always optional in JSON, that's why the `*` is in the regular expression.
# `ignore` only takes a `Parsec[_T]` to a `Parsec[None]` by discarding its consumed
# value.
space = Parsec.from_re(re.compile(r"\s*")).ignore()
comma = Parsec.from_string(",").ignore()

opened_square_bracket = Parsec.from_string("[")
closed_square_bracket = Parsec.from_string("]")

# Operator overloading is pretty handy here: `|`, `&`, `>>`, and `<<` were overloaded
# to express pretty much what you already expect from them.
# If you use `a | b` you will get `a or b`.
# If you use `a & b` you will get `a and b`, with both results kept as a pair.
# If you use `a >> b` it will discard the left side's result after parsing, and
# `a << b` will discard the right side's.
list_ = (
    opened_square_bracket
    >> Parsec.sep_by(
        comma,
        deferred_json_,  # See the use of the recursive json parser?
    )
    << closed_square_bracket
)

opened_bracket = Parsec.from_string("{").ignore()
closed_bracket = Parsec.from_string("}").ignore()
colon = Parsec.from_string(":").ignore()

pair = ((space >> string << space) << colon) & deferred_json_

dict_ = (
    opened_bracket
    >> Parsec.sep_by(
        comma,
        pair,
    ).map(lambda xs: dict(xs))  # With only `dict` the linter goes crazy, idk.
    << closed_bracket
)

json = space >> (true | false | number | null | string | list_ | dict_) << space

json(
    """
{
    "json_parser": true
}
"""
)  # ({ 'json_parser': True }, '')
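If you are curious how operators like the ones used above could be wired up, here is a rough, self-contained sketch. It is not pyrsec's implementation: the `Parser` class and the `literal` and `deferred` helpers are hypothetical stand-ins, and the only things taken from this README are the `(value, rest)` result shape, `None` on failure, and the described meaning of `|`, `&`, `>>` and `<<`.

```python
from typing import Callable, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Parser(Generic[T]):
    """Hypothetical stand-in for illustration; this is not pyrsec's `Parsec`."""

    def __init__(self, run: Callable[[str], Optional[Tuple[T, str]]]) -> None:
        self.run = run

    def __call__(self, text: str) -> Optional[Tuple[T, str]]:
        return self.run(text)

    def map(self, fn: Callable[[T], U]) -> "Parser[U]":
        def run(text: str) -> Optional[Tuple[U, str]]:
            result = self(text)
            if result is None:
                return None
            value, rest = result
            return fn(value), rest

        return Parser(run)

    def __or__(self, other: "Parser[T]") -> "Parser[T]":
        # a | b: try a; if it fails, try b on the same input.
        def run(text: str) -> Optional[Tuple[T, str]]:
            result = self(text)
            return result if result is not None else other(text)

        return Parser(run)

    def __and__(self, other: "Parser[U]") -> "Parser[Tuple[T, U]]":
        # a & b: run a, then b on what is left, and keep both values as a pair.
        def run(text: str) -> Optional[Tuple[Tuple[T, U], str]]:
            first = self(text)
            if first is None:
                return None
            left, rest = first
            second = other(rest)
            if second is None:
                return None
            right, rest = second
            return (left, right), rest

        return Parser(run)

    def __rshift__(self, other: "Parser[U]") -> "Parser[U]":
        # a >> b: both must match, only b's value is kept.
        return (self & other).map(lambda pair: pair[1])

    def __lshift__(self, other: "Parser[U]") -> "Parser[T]":
        # a << b: both must match, only a's value is kept.
        return (self & other).map(lambda pair: pair[0])


def literal(expected: str) -> Parser[str]:
    def run(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None

    return Parser(run)


def deferred(get: Callable[[], Parser[T]]) -> Parser[T]:
    # Look the parser up only at parse time, so it can be defined later.
    return Parser(lambda text: get()(text))


# A tiny recursive grammar: an "x" wrapped in any number of parentheses.
nested: Parser[str] = (
    literal("(") >> deferred(lambda: nested) << literal(")")
) | literal("x")

print(nested("((x))!"))  # ('x', '!')
print(nested("((x)!"))   # None
```

Python's precedence does part of the work here: `>>` and `<<` bind tighter than `&`, which binds tighter than `|`, so the alternatives in the last expression need no extra parentheses beyond the ones shown.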

Enjoy!
