Persil is a pure-Python parsing library, inspired by Parsy

Persil

Persil is a pure-Python parsing library that draws much (most, let's be honest) of its inspiration from the excellent Parsy library.

Hence the name, "Persil" ([pɛʁ.sil] or [pɛʁ.si]), the French word for parsley: a most questionable pun on Parsy -> Parsley -> Persil, in case anyone missed it.

Like Parsy, Persil is a "monadic parser combinator library for LL(infinity) grammars". As a rough approximation, you can think of Persil as a typed "fork" of Parsy. However, although the APIs are very similar, there are notable differences that you might want to review if you're coming from Parsy.

If you're merely looking for a somewhat type-aware version of Parsy, parsy-stubs may be what you need. Mypy can use it to infer most of the types, but you'll find that shortcuts had to be taken in many cases.

Getting started

Persil is a pure-Python library. You can install it with pip:

pip install git+https://github.com/bdura/persil

Then, you can play with Persil much the same way you would with Parsy, and enjoy the great developer experience that type-hinting brings to Persil.

A basic example

from persil import regex

year = regex(r"\d{4}").map(int)
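
To make the basic example more concrete, here is a toy model of what a regex parser combined with .map does. It is a sketch for illustration only, not Persil's implementation: ToyParser, toy_regex and the run signature are hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class ToyParser(Generic[T]):
    """Hypothetical stand-in for a parser.

    Wraps a function from (text, index) to (value, next index),
    or None when the input does not match.
    """

    run: Callable[[str, int], Optional[Tuple[T, int]]]

    def map(self, fn: Callable[[T], U]) -> "ToyParser[U]":
        # Return a *new* parser that applies fn to the parsed value.
        def mapped(text: str, index: int) -> Optional[Tuple[U, int]]:
            result = self.run(text, index)
            if result is None:
                return None
            value, next_index = result
            return fn(value), next_index

        return ToyParser(mapped)


def toy_regex(pattern: str) -> ToyParser[str]:
    # Compile once, when the parser is built, not when it is run.
    compiled = re.compile(pattern)

    def run(text: str, index: int) -> Optional[Tuple[str, int]]:
        match = compiled.match(text, index)
        if match is None:
            return None
        return match.group(0), match.end()

    return ToyParser(run)


# Same shape as the Persil example: a str parser turned into an int parser.
toy_year = toy_regex(r"\d{4}").map(int)
```

Because map is typed as ToyParser[T] -> ToyParser[U], a type checker can infer that toy_year yields an int, which is the kind of inference Persil aims to provide.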

More complex parsers

Parsy uses generator functions as a most elegant solution to define complex parsers.

While you can still use this approach with Persil, you're encouraged to favour the from_stream decorator:

@from_stream
def parser(
    stream: Stream[str],
) -> CustomType:
    a = stream(parser_a)
    b = stream(parser_b)
    c = stream(parser_c)

    return CustomType(a, b, c)

The equivalent code, using generate instead (deprecated in Persil):

@generate
def parser() -> Generator[Parser, Any, CustomType]:
    a = yield parser_a
    b = yield parser_b
    c = yield parser_c

    return CustomType(a, b, c)

The main issue with generate is that the results of intermediate parsers cannot be typed, because every yield expression in a generator shares a single send type, whereas Stream.__call__ plays nicely with modern Python tooling like mypy.
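
The difference can be illustrated with a small self-contained sketch. Everything below (ToyStream, toy_from_stream, take_digits, take_dash) is hypothetical and only mimics the idea; it is not Persil's code. The point is that a generic __call__ lets the type checker infer a distinct type for each intermediate result.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Hypothetical parser shape: (text, index) -> (value, next index).
# Failure is signalled by raising ValueError.
ToyParser = Callable[[str, int], Tuple[T, int]]


class ToyStream:
    """Tracks the current position while parsers are applied in sequence."""

    def __init__(self, text: str, index: int = 0) -> None:
        self.text = text
        self.index = index

    def __call__(self, parser: "ToyParser[T]") -> T:
        # The return type follows the parser's type parameter,
        # so each call site gets its own inferred type.
        value, self.index = parser(self.text, self.index)
        return value


def toy_from_stream(fn: Callable[[ToyStream], R]) -> "ToyParser[R]":
    # Turn a function that consumes a stream into a parser.
    def parser(text: str, index: int) -> Tuple[R, int]:
        stream = ToyStream(text, index)
        value = fn(stream)
        return value, stream.index

    return parser


def take_digits(text: str, index: int) -> Tuple[int, int]:
    end = index
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == index:
        raise ValueError(f"expected digits at index {index}")
    return int(text[index:end]), end


def take_dash(text: str, index: int) -> Tuple[str, int]:
    if not text.startswith("-", index):
        raise ValueError(f"expected '-' at index {index}")
    return "-", index + 1


@toy_from_stream
def year_month(stream: ToyStream) -> Tuple[int, int]:
    year = stream(take_digits)   # inferred as int
    stream(take_dash)            # inferred as str, result discarded
    month = stream(take_digits)  # inferred as int
    return year, month
```

With a generator-based version, year and month would both take the generator's send type (typically Any), so a mistake such as year.upper() would go unnoticed by the type checker.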

Relation with Parsy

First of all, I am not affiliated in any way with the Parsy project.

Rationale

Parsy's last commit is from a year ago at the time of writing. Moreover, although the authors have started some development to propose a typed version of their library, efforts in that area have stalled for two years.

Compatibility with Parsy

Although Persil draws most of its inspiration from Parsy, maintaining a one-for-one equivalence with the latter's API is NOT among Persil's goals.

For those coming from Parsy, here are some notable differences:

  • the Result type is now a union of Ok and Err, which allows for a more type-safe API.
  • Err is its own error: it inherits from Exception and can be raised.
  • Persil introduces the Stream class, a wrapper around the input that can apply parsers sequentially and takes care of the book-keeping.
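
As a rough illustration of the first two points, the sketch below models a result type as a union of a success value and an error that is itself an exception. The names ToyOk, ToyErr, ToyResult and their fields are assumptions made for this example; Persil's actual Ok and Err classes may be shaped differently.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass
class ToyOk(Generic[T]):
    """Hypothetical success case: the parsed value and the next index."""

    value: T
    index: int


class ToyErr(Exception):
    """Hypothetical failure case: an exception, so it can be raised as-is."""

    def __init__(self, index: int, expected: str) -> None:
        super().__init__(f"expected {expected} at index {index}")
        self.index = index
        self.expected = expected


# The result is either a success or an error, never both.
ToyResult = Union[ToyOk[T], ToyErr]


def parse_digit(text: str, index: int = 0) -> "ToyResult[int]":
    if index < len(text) and text[index].isdigit():
        return ToyOk(int(text[index]), index + 1)
    return ToyErr(index, "a digit")


def unwrap(result: "ToyResult[T]") -> T:
    # isinstance narrows the union: after this check, a type checker
    # knows that result is a ToyOk[T].
    if isinstance(result, ToyErr):
        raise result
    return result.value
```

Callers can either branch on the result with isinstance, or call a helper such as unwrap and handle the error with a regular try/except.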

Performance tips

Since Persil takes a functional approach, every transformation on a parser produces a new parser. With that in mind, the way you define/use/combine parsers may substantially affect performance.

Consider the following example:

from datetime import datetime

from persil import Stream, from_stream, regex, string


@from_stream
def datetime_parser(stream: Stream[str]) -> datetime:
    year = stream.apply(regex(r"\d{4}").map(int))
    stream.apply(string("/"))
    month = stream.apply(regex(r"\d{2}").map(int))
    stream.apply(string("/"))
    day = stream.apply(regex(r"\d{2}").map(int))
    return datetime(year, month, day)

The resulting datetime_parser will re-create all of its sub-parsers (three regex parsers, their mapped versions, and two string parsers) every time it is run.

A much better alternative:

from datetime import datetime

from persil import Stream, from_stream, regex, string


year_parser = regex(r"\d{4}").map(int)
day_month_parser = regex(r"\d{2}").map(int)
slash_parser = string("/")

@from_stream
def datetime_parser(stream: Stream[str]) -> datetime:
    year = stream.apply(year_parser)
    stream.apply(slash_parser)
    month = stream.apply(day_month_parser)
    stream.apply(slash_parser)
    day = stream.apply(day_month_parser)
    return datetime(year, month, day)

That way, the parsers are only defined once.
