Simple parser combinator made in Python
Pyrsec
This is where we are so far in the journey of writing a parser combinator in Python that is as type safe as possible.
I don't recommend using this for anything important, only for exploration and fun.
This library is a mostly undocumented, bare-bones implementation of a parser combinator. No error recovery is currently in place: a parser simply returns None when it can't continue.
I started with a minimal implementation, added a basic JSON parser as a test, and kept adding functionality as needed.
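To make the `None` convention concrete, here is a minimal sketch that doesn't use pyrsec at all. It assumes a parser is just a function from the input string to either a `(value, rest)` pair or `None`. `Parser`, `literal` and `run_or_raise` are hypothetical names made up for this illustration, not part of the library.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")

# Assumed shape of a parser: input string in, (value, rest) out, or None on failure.
Parser = Callable[[str], Optional[Tuple[T, str]]]


def literal(expected: str) -> "Parser[str]":
    """Hypothetical helper: match an exact prefix of the input."""

    def parse(text: str) -> Optional[Tuple[str, str]]:
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None  # no error details, only "could not continue"

    return parse


def run_or_raise(parser: "Parser[T]", text: str) -> T:
    """Hypothetical wrapper that turns a None result into an exception."""
    result = parser(text)
    if result is None:
        raise ValueError(f"could not parse: {text!r}")
    value, rest = result
    if rest:
        raise ValueError(f"unconsumed input: {rest!r}")
    return value
```

Since there is no error recovery, a wrapper like `run_or_raise` is probably the simplest way to get a loud failure instead of a silent `None`.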
A JSON parser as an example
See the tests to convince yourself that it works.
If you paste this into an editor with type-checking support, you should be able to hover over the variables to see their types.
import re
from typing import Dict, List, Union

from pyrsec import Parsec
# Recursive type alias 👀. See how we will not parse `floats` here.
JSON = Union[bool, int, None, str, List["JSON"], Dict[str, "JSON"]]
# To be defined later
json: Parsec[JSON]
# For recursive parsers like `list_` and `dict_`
deferred_json_ = Parsec.from_deferred(lambda: json)
# Basic values
true = Parsec.from_string("true").map(lambda _: True)
false = Parsec.from_string("false").map(lambda _: False)
null = Parsec.from_string("null").map(lambda _: None)
number = Parsec.from_re(re.compile(r"-?\d+")).map(int)
quote = Parsec.from_string('"').ignore()
string = quote >> Parsec.from_re(re.compile(r"[^\"]*")) << quote
# Space is always optional in JSON, that's why the `*` is in the regular expression.
# `ignore` only takes a `Parsec[_T]` to a `Parsec[None]` by discarding its consumed value.
space = Parsec.from_re(re.compile(r"\s*")).ignore()
comma = Parsec.from_string(",").ignore()
opened_square_bracket = Parsec.from_string("[")
closed_square_bracket = Parsec.from_string("]")
# Operator overloading is pretty handy here: `|`, `&`, `>>` and `<<` were overloaded to express pretty much
# what you already expect from them.
# If you use `a | b` you will get `a` or `b`.
# If you use `a & b` you will get `a` and then `b`, with both results kept as a pair.
# If you use `a >> b` it will discard the left side after parsing; `a << b` discards the right side instead.
list_ = (
    opened_square_bracket
    >> Parsec.sep_by(
        comma,
        deferred_json_,  # See the use of the recursive json parser?
    )
    << closed_square_bracket
)
opened_bracket = Parsec.from_string("{").ignore()
closed_bracket = Parsec.from_string("}").ignore()
colon = Parsec.from_string(":").ignore()
pair = ((space >> string << space) << colon) & deferred_json_
dict_ = (
    opened_bracket
    >> Parsec.sep_by(
        comma,
        pair,
    ).map(lambda xs: dict(xs))  # With only `dict` the linter goes crazy, idk.
    << closed_bracket
)
json = space >> (true | false | number | null | string | list_ | dict_) << space
json(
    """
{
    "json_parser": true
}
"""
)  # ({ 'json_parser': True }, '')
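If you are curious how the operators and the deferred parser could work under the hood, here is a rough sketch. It is not pyrsec's actual implementation: `MiniParser`, `literal` and `deferred` are hypothetical names invented for this example, and the only assumption taken from above is that a parser returns `(value, rest)` on success and `None` on failure.

```python
from typing import Callable, Generic, Optional, Tuple, TypeVar

A = TypeVar("A")
Result = Optional[Tuple[A, str]]


class MiniParser(Generic[A]):
    """Hypothetical stand-in for a combinator class; not pyrsec's code."""

    def __init__(self, run: Callable[[str], "Result[A]"]) -> None:
        self.run = run

    def __call__(self, text: str) -> "Result[A]":
        return self.run(text)

    def map(self, fn):
        # Transform the parsed value, leaving the remaining input untouched.
        def run(text):
            result = self(text)
            return None if result is None else (fn(result[0]), result[1])
        return MiniParser(run)

    def __or__(self, other):
        # a | b: try a, and fall back to b on the same input if a fails.
        def run(text):
            result = self(text)
            return result if result is not None else other(text)
        return MiniParser(run)

    def __and__(self, other):
        # a & b: run a, then b on what is left, and keep both values as a pair.
        def run(text):
            first = self(text)
            if first is None:
                return None
            second = other(first[1])
            if second is None:
                return None
            return (first[0], second[0]), second[1]
        return MiniParser(run)

    def __rshift__(self, other):
        # a >> b: both must succeed, only b's value is kept.
        return (self & other).map(lambda pair: pair[1])

    def __lshift__(self, other):
        # a << b: both must succeed, only a's value is kept.
        return (self & other).map(lambda pair: pair[0])


def literal(expected: str) -> MiniParser[str]:
    # Match an exact prefix of the input.
    def run(text):
        if text.startswith(expected):
            return expected, text[len(expected):]
        return None
    return MiniParser(run)


def deferred(get_parser):
    # Look up the real parser only when input arrives, so a grammar
    # can refer to a name that is assigned further down the file.
    return MiniParser(lambda text: get_parser()(text))


# A tiny recursive grammar: an "x" wrapped in any number of parentheses.
nested = deferred(lambda: expr)
expr = (literal("(") >> nested << literal(")")) | literal("x")
```

With these definitions, `expr("((x))")` peels off both pairs of parentheses, and an unbalanced input such as `"(x"` gives `None`.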
Enjoy!
File details
Details for the file pyrsec-0.1.0.tar.gz.
File metadata
- Download URL: pyrsec-0.1.0.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 46633789838f2d823e7b0b6a7a8aeb696e48246b87d5a3135f9061ff6b4da739
MD5 | b9d5eb141d8b834a5b07144c1064cc0c
BLAKE2b-256 | f17327f281e6b945629354c66e29abf92a0429abacadca668e0ee2a220c37828
File details
Details for the file pyrsec-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pyrsec-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | d82eee9853b4bc5defb46902a9ad36ff3fe8a7f4b58847adca83f687646b2470
MD5 | 306319f680cbeb3902c3e48411703b13
BLAKE2b-256 | 8bad391ca9958077e12c778d166c77375c596062daf3e07d6db96694cd40e062