A simple API to scan and tokenize text for the purpose of structured language processing.
Project description
nr.parsing.core
The nr.parsing.core package provides a simple API to scan and tokenize text for the purpose of
structured language processing.
Example
import typing as t

from nr.parsing.core import RuleSet, Tokenizer, rules

# Rules are registered by name; the whitespace rule is marked skip=True.
ruleset = RuleSet()
ruleset.rule('number', rules.regex_extract(r'\-?(0|[1-9]\d*)', 0))
ruleset.rule('operator', rules.regex_extract(r'[\-\+]', 0))
ruleset.rule('whitespace', rules.regex(r'\s+'), skip=True)


def calculate(expr: str) -> int:
    tokenizer = Tokenizer(ruleset, expr)
    result = 0
    # Sign to apply to the next number; None means no operator is pending.
    sign: t.Optional[int] = 1
    while tokenizer:
        # Each iteration expects a number, optionally followed by an operator.
        if tokenizer.current.type != 'number':
            raise ValueError(f'unexpected token {tokenizer.current}')
        assert sign is not None
        result += sign * int(tokenizer.current.value)
        tokenizer.next()
        if tokenizer.current.type == 'operator':
            sign = -1 if tokenizer.current.value == '-' else 1
            tokenizer.next()
        else:
            sign = None
    # A sign that is still set means the input ended right after an operator.
    if sign is not None:
        raise ValueError('unexpected trailing operator')
    return result


assert calculate('3 + 5 - 1') == 7
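To illustrate the general idea behind a rule-based tokenizer, here is a minimal sketch that uses only the standard library `re` module. It is not how nr.parsing.core is implemented; the names `Token`, `RULES`, `tokenize` and `calculate_with_re` are hypothetical and exist only for this illustration. It assumes that rules are tried in order at the current position and that a skipped rule consumes text without producing a token.

```python
import re
from typing import Iterator, List, NamedTuple, Optional, Pattern, Tuple


class Token(NamedTuple):
    type: str
    value: str
    pos: int


# (name, pattern, skip) triples, tried in order at each position.
# These mirror the rules of the example but are plain `re` patterns,
# not nr.parsing.core objects.
RULES: List[Tuple[str, Pattern[str], bool]] = [
    ('number', re.compile(r'-?(0|[1-9]\d*)'), False),
    ('operator', re.compile(r'[-+]'), False),
    ('whitespace', re.compile(r'\s+'), True),
]


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens by applying the first matching rule at each position."""
    pos = 0
    while pos < len(text):
        for name, pattern, skip in RULES:
            match = pattern.match(text, pos)
            if match:
                if not skip:
                    yield Token(name, match.group(0), pos)
                pos = match.end()
                break
        else:
            # No rule matched at this position.
            raise ValueError(f'no rule matches at position {pos}')


def calculate_with_re(expr: str) -> int:
    """Sum a sequence of numbers separated by '+' and '-' operators."""
    result = 0
    # Sign for the next number; None means an operator is expected next.
    sign: Optional[int] = 1
    for token in tokenize(expr):
        if sign is not None:
            if token.type != 'number':
                raise ValueError(f'unexpected token {token}')
            result += sign * int(token.value)
            sign = None
        else:
            if token.type != 'operator':
                raise ValueError(f'unexpected token {token}')
            sign = -1 if token.value == '-' else 1
    if sign is not None:
        raise ValueError('expression ended where a number was expected')
    return result
```

Because the number pattern is tried first and accepts a leading minus sign, an input such as `3 -1` yields two consecutive number tokens in this sketch and is rejected, while `3 - 1` is accepted.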
Copyright © 2020 Niklas Rosenstein
Download files
Source Distribution
nr.parsing.core-2.0.3.tar.gz (11.7 kB)
Built Distribution
Hashes for nr.parsing.core-2.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 3485e3b571a780e46659741c5aa73fb15042b3f9bc71681b66cd8e6ff49979ac
MD5 | 73e94fb578a262e2e3fc0afb466e7747
BLAKE2b-256 | 06af26c828ea61ec23ab997adba067f06b3cd1d13fe848a9860fb3bee328cde5
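
A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard library `hashlib` module; the helper names `sha256_of` and `verify` are hypothetical, and the local path to the downloaded file is an assumption.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for nr.parsing.core-2.0.3-py3-none-any.whl.
EXPECTED_SHA256 = '3485e3b571a780e46659741c5aa73fb15042b3f9bc71681b66cd8e6ff49979ac'


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open('rb') as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `verify(Path('nr.parsing.core-2.0.3-py3-none-any.whl'))` would return `True` only if the file at that (assumed) location matches the listed digest.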