A simple API to scan and tokenize text for the purpose of structured language processing.
nr.parsing.core
The nr.parsing.core package provides a simple API to scan and tokenize text for the purpose of structured language processing.
Example
import typing as t

from nr.parsing.core import RuleSet, Tokenizer, rules

# Rules are tried in the order they are registered; whitespace is skipped.
ruleset = RuleSet()
ruleset.rule('number', rules.regex_extract(r'\-?(0|[1-9]\d*)', 0))
ruleset.rule('operator', rules.regex_extract(r'[\-\+]', 0))
ruleset.rule('whitespace', rules.regex(r'\s+'), skip=True)


def calculate(expr: str) -> int:
    tokenizer = Tokenizer(ruleset, expr)
    result = 0
    # Sign applied to the next number; None once no operator is pending.
    sign: t.Optional[int] = 1
    while tokenizer:
        if tokenizer.current.type != 'number':
            raise ValueError(f'unexpected token {tokenizer.current}')
        assert sign is not None
        result += sign * int(tokenizer.current.value)
        tokenizer.next()
        if tokenizer.current.type == 'operator':
            sign = -1 if tokenizer.current.value == '-' else 1
            tokenizer.next()
        else:
            sign = None
    # A sign that is still set means the input ended right after an operator.
    if sign is not None:
        raise ValueError('unexpected trailing operator')
    return result


assert calculate('3 + 5 - 1') == 7
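For readers who want to see the idea behind rule-based tokenizing without installing the package, here is a minimal sketch that uses only Python's standard `re` module. It is not how nr.parsing.core is implemented; the `Token` type, the `RULES` list and the `tokenize` function are hypothetical names chosen for this illustration, and the patterns are copied from the example above.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    """Hypothetical token record: rule name, matched text, start offset."""
    type: str
    value: str
    pos: int


# (name, compiled pattern, skip) triples, tried in order at each position.
# Illustrative stand-in for a rule set, not the library's own data structure.
RULES = [
    ('number', re.compile(r'-?(0|[1-9]\d*)'), False),
    ('operator', re.compile(r'[-+]'), False),
    ('whitespace', re.compile(r'\s+'), True),
]


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens by trying each rule at the current offset."""
    pos = 0
    while pos < len(text):
        for name, pattern, skip in RULES:
            match = pattern.match(text, pos)
            if match:
                if not skip:
                    yield Token(name, match.group(0), pos)
                pos = match.end()
                break
        else:
            # No rule matched at this offset.
            raise ValueError(f'no rule matches at offset {pos}: {text[pos:pos + 10]!r}')


tokens = list(tokenize('3 + 5 - 1'))
```

Because the first matching rule wins, rule order matters: in this sketch `-1` written without a space is read as a single number token rather than as an operator followed by a number.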
Copyright © 2020 Niklas Rosenstein