Simple parser combinator made in Python

Pyrsec

This library is the current result of an attempt to write a parser combinator in Python while staying as type safe as possible. I don't recommend it for anything important; use it for exploration and fun. It is a mostly undocumented, bare-bones implementation with no error recovery: a parser simply returns None when it can't continue. I started with a minimal implementation, added a basic JSON parser as a test, and kept adding functionality as needed.
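To make the "returns None when it can't continue" behaviour concrete, here is a minimal sketch of the idea. It is not pyrsec's implementation: the names `Parser`, `literal` and `regex` are hypothetical, and the only assumption taken from this README is that a successful parse yields a `(value, remaining_input)` pair, as the JSON example's output suggests.

```python
import re
from typing import Callable, Optional, Pattern, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical model: a parser is a function from the input text to
# (parsed value, remaining text), or None when it cannot continue.
Parser = Callable[[str], Optional[Tuple[T, str]]]


def literal(expected: str) -> Parser[str]:
    """Build a parser that matches an exact string at the start of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None  # failure: no message and no recovery, just None

    return parse


def regex(pattern: Pattern[str]) -> Parser[str]:
    """Build a parser that matches a compiled regex at the start of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        match = pattern.match(text)
        if match is None:
            return None
        return match.group(0), text[match.end():]

    return parse


true_literal = literal("true")
digits = regex(re.compile(r"-?\d+"))

print(true_literal("true, 1"))  # ('true', ', 1')
print(true_literal("false"))    # None
print(digits("42]"))            # ('42', ']')
```

Because failure is a plain None, a caller cannot tell where or why parsing stopped, which is the trade-off described above.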

A JSON parser as an example

See the tests to convince yourself that it works.

If you paste this into an editor with type checking support, you should be able to hover over the variables to see their types.

import re
from typing import Dict, List, Union

from pyrsec import Parsec

# Recursive type alias 👀. Note that `float` is left out: this parser does not handle floats.
JSON = Union[bool, int, None, str, List["JSON"], Dict[str, "JSON"]]

# To be defined later
json: Parsec[JSON]

# For recursive parsers like `list_` and `dict_`
deferred_json_ = Parsec.from_deferred(lambda: json)

# Basic values
true = Parsec.from_string("true").map(lambda _: True)
false = Parsec.from_string("false").map(lambda _: False)
null = Parsec.from_string("null").map(lambda _: None)
number = Parsec.from_re(re.compile(r"-?\d+")).map(int)

quote = Parsec.from_string('"').ignore()
string = quote >> Parsec.from_re(re.compile(r"[^\"]*")) << quote

# Space is always optional in JSON, that's why the regular expression uses `*`.
# `ignore` only turns a `Parsec[_T]` into a `Parsec[None]` by discarding the
# consumed value.
space = Parsec.from_re(re.compile(r"\s*")).ignore()
comma = Parsec.from_string(",").ignore()

opened_square_bracket = Parsec.from_string("[")
closed_square_bracket = Parsec.from_string("]")

# Operator overloading is very handy here: `|`, `&`, `>>`, and `<<` are overloaded
# to express pretty much what you would expect from them.
# If you use `a | b` you will get `a or b`.
# If you use `a & b` you will get `a and b` (both results are kept; `pair` relies on this).
# If you use `a >> b` it will discard the left side after parsing; `a << b` does the
# same for the right side.

list_ = (
    opened_square_bracket
    >> (deferred_json_.sep_by(comma))  # See the use of the recursive json parser?
    << closed_square_bracket
)

opened_bracket = Parsec.from_string("{").ignore()
closed_bracket = Parsec.from_string("}").ignore()
colon = Parsec.from_string(":").ignore()

pair = ((space >> string << space) << colon) & deferred_json_

dict_ = (
    opened_bracket >> pair.sep_by(comma).map(lambda xs: dict(xs)) << closed_bracket
)

json = space >> (true | false | number | null | string | list_ | dict_) << space

json(
    """
{
    "json_parser": true
}
"""
)  # ({ 'json_parser': True }, '')
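The operator semantics described in the comments above can be sketched with a small stand-alone class. This is only an illustration under stated assumptions: `MiniParser` and its `literal` helper are hypothetical and are not pyrsec's `Parsec` class, and `&` is assumed to yield a tuple because the `pair` parser feeds its results to `dict`.

```python
from typing import Any, Callable, Optional, Tuple

Result = Optional[Tuple[Any, str]]


class MiniParser:
    """Hypothetical stand-in used for illustration; not pyrsec's Parsec."""

    def __init__(self, run: Callable[[str], Result]) -> None:
        self.run = run

    def __call__(self, text: str) -> Result:
        return self.run(text)

    @staticmethod
    def literal(expected: str) -> "MiniParser":
        def run(text: str) -> Result:
            if text.startswith(expected):
                return expected, text[len(expected):]
            return None

        return MiniParser(run)

    def __or__(self, other: "MiniParser") -> "MiniParser":
        # a | b: try a; if it fails, try b on the same input.
        def run(text: str) -> Result:
            result = self(text)
            return result if result is not None else other(text)

        return MiniParser(run)

    def __and__(self, other: "MiniParser") -> "MiniParser":
        # a & b: run a, then b on what is left; keep both values as a tuple.
        def run(text: str) -> Result:
            first = self(text)
            if first is None:
                return None
            left, rest = first
            second = other(rest)
            if second is None:
                return None
            right, rest = second
            return (left, right), rest

        return MiniParser(run)

    def __rshift__(self, other: "MiniParser") -> "MiniParser":
        # a >> b: both must match, only the right value is kept.
        def run(text: str) -> Result:
            both = (self & other)(text)
            if both is None:
                return None
            (_, right), rest = both
            return right, rest

        return MiniParser(run)

    def __lshift__(self, other: "MiniParser") -> "MiniParser":
        # a << b: both must match, only the left value is kept.
        def run(text: str) -> Result:
            both = (self & other)(text)
            if both is None:
                return None
            (left, _), rest = both
            return left, rest

        return MiniParser(run)


a = MiniParser.literal("a")
b = MiniParser.literal("b")

print((a | b)("bc"))    # ('b', 'c')
print((a & b)("abc"))   # (('a', 'b'), 'c')
print((a >> b)("abc"))  # ('b', 'c')
print((a << b)("abc"))  # ('a', 'c')
print((a & b)("ax"))    # None
```

Building `>>` and `<<` on top of `&` keeps the sequencing logic in one place; a real library may well implement them directly.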

Enjoy!
