tokenstream
A versatile token stream for handwritten parsers.
from tokenstream import TokenStream

def parse_sexp(stream: TokenStream):
    """A basic S-expression parser."""
    with stream.syntax(brace=r"\(|\)", number=r"\d+", name=r"\w+"):
        brace, number, name = stream.expect(("brace", "("), "number", "name")
        if brace:
            return [parse_sexp(stream) for _ in stream.peek_until(("brace", ")"))]
        elif number:
            return int(number.value)
        elif name:
            return name.value

print(parse_sexp(TokenStream("(hello (world 42))")))  # ['hello', ['world', 42]]
Introduction
Writing recursive-descent parsers by hand can be quite elegant, but it's often more verbose than expected. In particular, handling indentation and reporting proper syntax errors can be challenging. This package provides a general-purpose token stream that addresses these issues and more.
Features
- Define token types with regular expressions
- The set of recognizable tokens can be defined dynamically during parsing
- Transparently skip over irrelevant tokens
- Expressive API for matching, collecting, peeking, and expecting tokens
- Clean error reporting with line numbers and column numbers
- Natively understands indentation-based syntax
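To make the first few features concrete, here is a minimal standalone sketch of the underlying idea, written with only the standard library. It is not how tokenstream is implemented and it does not use the package's API: the names `Token`, `SyntaxProblem`, and `tokenize` are hypothetical and exist only for illustration. The sketch assumes token types are passed as keyword arguments mapping a name to a regular expression, that whitespace is the only ignored token type, and that positions are reported as 1-based line and column numbers.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    """Hypothetical token record: type, matched text, and 1-based position."""
    type: str
    value: str
    line: int
    column: int


class SyntaxProblem(Exception):
    """Hypothetical error type whose message includes the source location."""


def tokenize(source: str, ignore: tuple = ("whitespace",), **rules: str) -> Iterator[Token]:
    # Add a whitespace rule, then combine all rules into a single alternation
    # of named groups. Earlier rules take priority over later ones.
    rules = {**rules, "whitespace": r"\s+"}
    pattern = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in rules.items()))

    pos = 0
    while pos < len(source):
        # Derive the 1-based line and column from the current offset.
        line = source.count("\n", 0, pos) + 1
        column = pos - (source.rfind("\n", 0, pos) + 1) + 1

        match = pattern.match(source, pos)
        if match is None or match.end() == pos:
            raise SyntaxProblem(f"line {line}, column {column}: unexpected {source[pos]!r}")

        pos = match.end()
        # Ignored token types are consumed but never yielded.
        if match.lastgroup not in ignore:
            yield Token(match.lastgroup, match.group(), line, column)


if __name__ == "__main__":
    for token in tokenize("(hello\n  42)", brace=r"\(|\)", number=r"\d+", name=r"\w+"):
        print(token)
```

Recomputing the line and column from the offset on every token keeps the sketch short at the cost of rescanning the source; a real implementation would more likely track the position incrementally.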
Installation
The package can be installed with pip.
pip install tokenstream
Contributing
Contributions are welcome. Make sure to first open an issue discussing the problem or the new feature before creating a pull request. The project uses poetry.
$ poetry install
You can run the tests with poetry run pytest.
$ poetry run pytest
The project must type-check with pyright. If you're using VSCode, the pylance extension should report diagnostics automatically. You can also install the type-checker locally with npm install and run it from the command line.
$ npm run watch
$ npm run check
$ npm run verifytypes
The code follows the black code style. Import statements are sorted with isort.
$ poetry run isort tokenstream examples tests
$ poetry run black tokenstream examples tests
$ poetry run black --check tokenstream examples tests
License - MIT
Hashes for tokenstream-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 9fe2aa6d36edd2f4c9abb00aa03f20b8d76e1d19bd2777b5dc4535f8206c811f
MD5 | f699e6123ab1be86a7f762de93055299
BLAKE2b-256 | cd75814f9357d17d4887ca70b7441a19fbd1f66ea100751a37f8fbfe44350f46