Natural eXpression Parsing — A Python 3 parsing library.
NXP: Natural eXpression Parsing
NXP is a parsing library written in Python 3, inspired by pyparsing and Microsoft Monarch.
It allows users to do two things:
- Define text patterns by combining Python objects, instead of writing complicated regular expressions.
- Define and parse complex languages, with a simple dictionary!
Can it be that simple, you ask?
Don't take my word for it; check out the example below, and see for yourself. :blush:
Example: simple LaTeX-like language
We want to parse the following file, say foo.txt, which contains LaTeX-like patterns \command{ body }:
Inspirational quote:
\quote{
    Time you enjoy wasting is \it{not} wasted time.
}

Command without a body \command, or with an empty one \command{}.
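To try the example locally, the input file has to exist first. The sketch below writes it using only the standard library. The helper name `write_sample` is hypothetical, and the four-space indent on line 2 and the blank line 4 are assumptions inferred from the (line,col) positions in the AST output reported further down, not taken from a verified copy of the file.

```python
from pathlib import Path

# Sample input for the example. The four-space indent on line 2 and the
# blank line 4 are inferred from the positions in the AST output
# (e.g. "\it" at column 30, "\command" on line 5); treat them as assumptions.
SAMPLE = (
    "Inspirational quote:\n"
    "\\quote{\n"
    "    Time you enjoy wasting is \\it{not} wasted time.\n"
    "}\n"
    "\n"
    "Command without a body \\command, or with an empty one \\command{}.\n"
)

def write_sample(path="foo.txt"):
    """Write the sample text to `path` and return the resulting Path."""
    target = Path(path)
    target.write_text(SAMPLE, encoding="utf-8")
    return target
```

Calling `write_sample()` creates foo.txt in the current directory, ready to be passed to the parser.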
NXP allows you to easily define a language to match such patterns:
import nxp

# define these rules separately so they can be re-used
backslash = [ r'\\\\', ('rep','\\') ]
command = [ r'\\(\w+)', ('open','command'), ('tag','cmd') ]

# create a parser
parser = nxp.make_parser({
    'lang': {
        'main': [
            backslash,  # replace escaped backslashes
            command     # open "command" scope if we find something like '\word'
        ],
        'command': {  # the "command" scope
            'main': [
                [ r'\{', ('open','command.body'), ('tag','body') ],
                # open "body" subscope if command is followed by '{'
                [ None, 'close' ]
                # otherwise close the scope
            ],
            'body': [  # the "command.body" scope
                backslash,
                [ r'\\\{', ('rep','{') ],
                [ r'\\\}', ('rep','}') ],
                # deal with escapes before looking for a nested command
                command,
                # look for nested commands
                [ r'\}', ('tag','/body'), ('close',2) ]
                # the command ends when the body ends: close both scopes
            ]
        }
    }
})

print(nxp.parsefile( parser, 'foo.txt' ))
The output is a simple AST:
+ Scope("main"): 3 element(s)
    [0] Scope("command"): 2 element(s)
        [0] \\(\w+)
            (0) (1, 0) - (1, 6) \quote
        [1] Scope("command.body"): 3 element(s)
            [0] \{
                (0) (1, 6) - (1, 7) {
            [1] Scope("command"): 2 element(s)
                [0] \\(\w+)
                    (0) (2, 30) - (2, 33) \it
                [1] Scope("command.body"): 2 element(s)
                    [0] \{
                        (0) (2, 33) - (2, 34) {
                    [1] \}
                        (0) (2, 37) - (2, 38) }
            [2] \}
                (0) (3, 0) - (3, 1) }
    [1] Scope("command"): 1 element(s)
        [0] \\(\w+)
            (0) (5, 23) - (5, 31) \command
    [2] Scope("command"): 2 element(s)
        [0] \\(\w+)
            (0) (5, 54) - (5, 62) \command
        [1] Scope("command.body"): 2 element(s)
            [0] \{
                (0) (5, 62) - (5, 63) {
            [1] \}
                (0) (5, 63) - (5, 64) }
Note: begin/end positions are given in the format (line,col), starting at 0 (not 1).
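To make the position format concrete, here is a small sketch that uses only Python's standard `re` module, not NXP itself. It applies the same `\\(\w+)` pattern as the `command` rule and reports zero-based (line,col) spans. The function name `find_commands` is hypothetical, the exclusive end column is an assumption read off the spans in the output above, and the four-space indent in the sample line is inferred from those spans as well. Unlike the parser, this sketch does not handle escaped backslashes or scopes.

```python
import re

# Same regular expression as the `command` rule in the example:
# a backslash followed by one or more word characters.
COMMAND = re.compile(r'\\(\w+)')

def find_commands(text):
    """Return (name, (line, col_begin), (line, col_end)) for each match.

    Lines and columns are zero-based; the end column is exclusive
    (an assumption based on the spans shown in the AST output).
    """
    found = []
    for line_no, line in enumerate(text.split("\n")):
        for m in COMMAND.finditer(line):
            found.append((m.group(1), (line_no, m.start()), (line_no, m.end())))
    return found

sample = (
    "Inspirational quote:\n"
    "\\quote{\n"
    "    Time you enjoy wasting is \\it{not} wasted time.\n"
    "}\n"
)
for name, begin, end in find_commands(sample):
    print(name, begin, end)
# quote (1, 0) (1, 6)
# it (2, 30) (2, 33)
```

These spans agree with the `\quote` and `\it` entries in the AST, which is a quick way to check that positions are being read as zero-based.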