Pyrsec

Simple parser combinator made in Python

This is where the journey of building a parser combinator in Python, while staying as type safe as possible, has led so far. I don't recommend using it for anything important; it is meant for exploration and fun. The library is a mostly undocumented, bare-bones implementation of a parser combinator. No error recovery is in place yet: a parser simply returns None when it can't continue. I started with a minimal implementation, added a basic JSON parser as a test, and kept adding functionality as needed.
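To make the "only None is returned" behaviour concrete, here is a minimal sketch of the general idea, written without pyrsec. It assumes a parser is just a function that returns a `(value, remaining_input)` pair on success and `None` on failure. The names `Parser` and `from_string` below are hypothetical stand-ins for illustration, not the library's actual implementation.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical alias (an assumption, not pyrsec's real type): a parser takes
# the input text and returns (value, remaining input), or None on failure.
Parser = Callable[[str], Optional[Tuple[T, str]]]


def from_string(expected: str) -> Parser[str]:
    """Build a parser that matches `expected` at the start of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        # No error message and no recovery: failure is just None.
        return None

    return parse


true = from_string("true")
print(true("true!"))  # ('true', '!')
print(true("false"))  # None
```

Because failure carries no position or message, the caller can only tell that parsing stopped, not where or why.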

A JSON parser as an example

See the tests to convince yourself that it works.

If you paste this into an editor with type-checking support, you should be able to hover over the variables to see their types.

import re
from typing import Dict, List, Union

from pyrsec import Parsec

# Recursive type alias 👀. Note that floats are not parsed here.
JSON = Union[bool, int, None, str, List["JSON"], Dict[str, "JSON"]]

# To be defined later
json: Parsec[JSON]

# For recursive parsers like `list_` and `dict_`
deferred_json_ = Parsec.from_deferred(lambda: json)

# Basic values
true = Parsec.from_string("true").map(lambda _: True)
false = Parsec.from_string("false").map(lambda _: False)
null = Parsec.from_string("null").map(lambda _: None)
number = Parsec.from_re(re.compile(r"-?\d+")).map(int)

quote = Parsec.from_string('"').ignore()
string = quote >> Parsec.from_re(re.compile(r"[^\"]*")) << quote

# Space is always optional in JSON, that's why the `*` is in the regular expression.
# `ignore` only takes a `Parsec[_T]` to a `Parsec[None]` by discarding its consumed
# value.
space = Parsec.from_re(re.compile(r"\s*")).ignore()
comma = Parsec.from_string(",").ignore()

opened_square_bracket = Parsec.from_string("[")
closed_square_bracket = Parsec.from_string("]")

# Operator overloading is very handy here: `|`, `&`, `>>`, and `<<` are overloaded to
# express pretty much what you would expect from them.
# `a | b` gives you `a or b`.
# `a & b` gives you `a and b`, with both results kept.
# `a >> b` parses both and discards the left side; `a << b` discards the right side.

list_ = (
    opened_square_bracket
    >> (deferred_json_.sep_by(comma))  # See the use of the recursive json parser?
    << closed_square_bracket
)

opened_bracket = Parsec.from_string("{").ignore()
closed_bracket = Parsec.from_string("}").ignore()
colon = Parsec.from_string(":").ignore()

pair = ((space >> string << space) << colon) & deferred_json_

dict_ = (
    opened_bracket >> pair.sep_by(comma).map(lambda xs: dict(xs)) << closed_bracket
)

json = space >> (true | false | number | null | string | list_ | dict_) << space

json(
    """
{
    "json_parser": true
}
"""
)  # ({ 'json_parser': True }, '')
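The operators used above can be illustrated with a small, self-contained sketch. This is not pyrsec's code: `MiniParser` and `literal` are hypothetical names, and the semantics (for example, `&` producing a pair) are assumptions inferred from the example and its comments.

```python
from __future__ import annotations

from typing import Callable, Generic, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


class MiniParser(Generic[A]):
    """Hypothetical stand-in used only to illustrate the operators."""

    def __init__(self, run: Callable[[str], Optional[Tuple[A, str]]]) -> None:
        self.run = run

    def __call__(self, text: str) -> Optional[Tuple[A, str]]:
        return self.run(text)

    def map(self, fn: Callable[[A], B]) -> MiniParser[B]:
        # Transform the parsed value and leave the remaining input untouched.
        def run(text: str) -> Optional[Tuple[B, str]]:
            result = self(text)
            if result is None:
                return None
            return fn(result[0]), result[1]

        return MiniParser(run)

    def __or__(self, other: MiniParser[A]) -> MiniParser[A]:
        # a | b: try a; if it fails, try b on the same input.
        def run(text: str) -> Optional[Tuple[A, str]]:
            result = self(text)
            return result if result is not None else other(text)

        return MiniParser(run)

    def __and__(self, other: MiniParser[B]) -> MiniParser[Tuple[A, B]]:
        # a & b: run a, then b on what is left, and keep both values as a pair.
        def run(text: str) -> Optional[Tuple[Tuple[A, B], str]]:
            first = self(text)
            if first is None:
                return None
            second = other(first[1])
            if second is None:
                return None
            return (first[0], second[0]), second[1]

        return MiniParser(run)

    def __rshift__(self, other: MiniParser[B]) -> MiniParser[B]:
        # a >> b: both must match, only b's value is kept.
        return (self & other).map(lambda pair: pair[1])

    def __lshift__(self, other: MiniParser[B]) -> MiniParser[A]:
        # a << b: both must match, only a's value is kept.
        return (self & other).map(lambda pair: pair[0])


def literal(expected: str) -> MiniParser[str]:
    """Match an exact string at the start of the input."""

    def run(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None

    return MiniParser(run)


a, b = literal("a"), literal("b")
print((a | b)("bc"))    # ('b', 'c')
print((a & b)("abc"))   # (('a', 'b'), 'c')
print((a >> b)("abc"))  # ('b', 'c')
print((a << b)("abc"))  # ('a', 'c')
print((a & b)("ac"))    # None
```

Defining `>>` and `<<` in terms of `&` and `map` keeps the sketch short. Whether the library takes the same route is not stated in this description.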

Enjoy!
