Skip to main content

A Monadic Parser Combinator for Python.

Project description

Parsec Ask DeepWiki

A Monadic Parser Combinator for Python.

License Typed Tests

Documents · Example · Issue


Parsec provides a declarative, modular way to build complex text parsers. By leveraging the powerful expressiveness of Monads, you can compose simple parsers like building blocks to handle complex grammar structures, while elegantly managing state and errors during the parsing process.

Unlike traditional parser generation tools (like Lark, PLY), Parsec allows you to define and combine your parsing logic directly within Python code, without the need for separate grammar files, making parser construction more flexible and dynamic.

Features

  • Monadic A Parser is a monad.
  • Declarative Declarative grammar definition
  • Operator Operator-based combinators(<<, >>, &, |, /, @)
  • Lazy Evalution Lazy evaluation for recursion
  • Curried Curried functional interfaces
  • Typed Support type inference

Intallation

Requirements

For full type support, Python 3.13+ is required.

Install from source code

Install from source code:

git clone https://github.com/lunexnocty/parsec-python.git
cd parsec-python
pip install .

Install from pypi:

pip install parsec-python

If you use uv:

uv add parsec-python

Quick Start

The parsec.text module provides a set of fundamental text parsers, such as number, char, blank, etc. The parsec.combinator module offers a rich, curried functional interface for combinators, enabling you to compose these basic parsers into powerful and expressive parsing logic.

To make parser composition even more intuitive, several operators have been overloaded:

  • p << f equivalent to f(p), enables successive application of combinators to a parser, supporting a piping style of composition.
  • p >> f equivalent to p.bind(f), represents the monadic bind operation, allowing the result of parser p to determine and sequence the next parser generated by function f. This enables context-sensitive and dependent parsing, as found in the Monad interface in functional programming.
  • p @ f equivalent to p.map(f), mapping the f over the result of the parser p. This corresponds to the Functor's map (or fmap) operation, allowing you to transform the output of a parser in a declarative and compositional way.
  • The & operator combines multiple parsers in sequence and collects their results into a tuple.
  • The | operator tries each parser in sequence and returns the first successful result.
  • The / operator is similar to |, but never backtracking.

Parsers can also be defined lazily to support recursive grammar definitions—essential for constructs like nested expressions or parentheses.

Below is an arithmetic expression grammar supporting operator precedence, parentheses, and left-associative chaining:

expr           := <add_or_sub_exp>
add_or_sub_exp := <mul_or_div_exp> (('+' | '-') <mul_or_div_exp>)*
mul_or_div_exp := <factor> (('*' | '/') <factor>)*
factor         := '(' <expr> ')' | <num>
num            := { number }

Based on the BNF description above, we can easily build a parser to analyze the language defined by that syntax.

from parsec import Parser, item, text
from parsec.text import lex
from parsec.utils import curry

def calc(op: str):
    @curry
    def _(x: int | float, y: int | float):
        return {'+': x + y, '-': x - y, '*': x * y, '/': x / y}[op]
    return _

expr = Parser[str, int | float]()
factor = expr.between(lex.l_round, lex.r_round) | lex.number
mul_or_div = lex.char('*') | lex.char('/')  # operator `|`
mul_or_div_op = mul_or_div @ calc  # operator `@`
mul_or_div_expr = factor.chainl1(mul_or_div_op)  # left associativity
add_or_sub = item.range('+-') << lex.lexeme(text.blank)  # `range` combinator
add_or_sub_op = add_or_sub @ calc
add_or_sub_expr = mul_or_div_expr.chainl1(add_or_sub_op)
expr.define(add_or_sub_expr)  # lazy definition

src = '(1. + .2e-1) * 100 - 1 / 2.5 '
assert text.parse(expr, src) == eval(src)  # True

The parsec.combinator.chainl1 combinator handles left-associative chaining of operations, parsing one or more occurrences of a parser p separated by an operator parser, and combining results in a left-associative manner.

This approach is highly extensible: you can add additional operators, functions, or syntax features by composing and reusing combinators.

Architecture

A parser is a function that takes a Context[I] as input and returns a Result[I, R], where I and R are generic type parameters. Here, I represents the type of each element in the input stream, and R denotes the type of the value produced by the parser.

parser[I, R]: Context[I] -> Result[I, R]

The parsing function is wrapped in the Parser[I, R] class, endowing it with a monadic interface for functional composition and declarative parsing.

class Parser[I, R]:
  def bind[S](self, fn: R -> Parser[I, S]) -> Parser[I, S]: ...
  def okay(self, value: R) -> Parser[I, R]: ...
  def fail(self, errpr: ParseErr) -> Parser[I, R]: ...

A Context[I] consists of two primary components: a stream[I], which provides access to the underlying input sequence, and a State[I], which manages auxiliary parsing state. If you need to parse data types other than text, you can extend IStream and IState to implement custom stream and state management logic, enabling the parsing of arbitrary data sequences.

class Context[I]:
  stream: IStream[I]
  state: IState[I]

The Result[I, R] type represents the outcome of a parsing operation, containing the updated context, the parsing result (either a successfully parsed value or an error), and the number of input elements consumed during parsing.

class Result[I, R]:
  context: Context[I]
  outcome: Okay[R] | Fail
  consumed: int

The combinator module provides a curried functional interface for composing parsers, and also supports method chaining. Both styles are equivalent in expressive power.

Parsec do not support left-recursive grammars.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

parsec_python-0.1.1.tar.gz (27.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

parsec_python-0.1.1-py3-none-any.whl (25.5 kB view details)

Uploaded Python 3

File details

Details for the file parsec_python-0.1.1.tar.gz.

File metadata

  • Download URL: parsec_python-0.1.1.tar.gz
  • Upload date:
  • Size: 27.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for parsec_python-0.1.1.tar.gz
Algorithm Hash digest
SHA256 edc4a1d391457babddd83c0374b5ff01b31bb8b4a20007303092c62a7fa1e2a9
MD5 eaa0708e14a22380e84654cbcb951aca
BLAKE2b-256 c2b0320847b6ec18a3c0ed28d85b3ed6e8f905f042ec4a73b8e7a39a66d89258

See more details on using hashes here.

File details

Details for the file parsec_python-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: parsec_python-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 25.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for parsec_python-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5d375745d842fbb2b41ad313d2701442e6e3e21c4c685531195443dddbd473f3
MD5 a6ffd23e625465d3e4531fd4387ba38a
BLAKE2b-256 0f88e1c9cfa8d3f6aea91927146219304e5549d5d50355934af8b23813d1ec5b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page