Natural eXpression Parsing — A Python 3 parsing library.

License: MPLv2 Documentation

NXP: Natural eXpression Parsing

NXP is a parsing library written in Python 3, inspired by pyparsing and Microsoft Monarch.

It allows users to do two things:

  • Define text patterns by combining Python objects, instead of writing complicated regular expressions.
  • Define and parse complex languages, with a simple dictionary!

Can it be that simple, you ask?
Don't take my word for it; check out the example below and see for yourself.

Example: simple LaTeX-like language

We want to parse the following file, say foo.txt, which contains LaTeX-like patterns \command{ body }:

Inspirational quote:
\quote{
    Time you enjoy wasting is \it{not} wasted time.
}

Command without a body \command, or with an empty one \command{}.
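
To follow along, the sample file can be written out with a few lines of standard-library Python. This is only a convenience sketch: nothing in it is NXP-specific, and the variable name FOO_TXT is made up for illustration.

```python
from pathlib import Path

# Raw string, so the backslashes reach the file unchanged.
FOO_TXT = r"""Inspirational quote:
\quote{
    Time you enjoy wasting is \it{not} wasted time.
}

Command without a body \command, or with an empty one \command{}.
"""

# Write the sample next to the script, under the name used in this README.
Path("foo.txt").write_text(FOO_TXT, encoding="utf-8")
```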

NXP allows you to easily define a language to match such patterns:

import nxp

# define these rules separately so they can be re-used
backslash = [ r'\\\\', ('rep','\\') ] 
command = [ r'\\(\w+)', ('open','command'), ('tag','cmd') ] 

# create a parser
parser = nxp.make_parser({
	'lang': {
		'main': [
			backslash,  # replace escaped backslashes
			command     # open "command" scope if we find something like '\word'
		],
		'command': { # the "command" scope
			'main': [
				[ r'\{', ('open','command.body'), ('tag','body') ],
					# open "body" subscope if command is followed by '{'
				[ None, 'close' ] 
					# otherwise close the scope
			],
			'body': [ # the "command.body" scope
				backslash,
				[ r'\\\{', ('rep','{') ],
				[ r'\\\}', ('rep','}') ],
					# deal with escapes before looking for a nested command
				command, 
					# look for nested commands
				[ r'\}', ('tag','/body'), ('close',2) ]
					# the command ends when the body ends: close both scopes
			]
		}
	}
})

print(nxp.parsefile( parser, 'foo.txt' ))
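
To make the 'open' and 'close' actions more concrete, here is a minimal scope-stack scanner written with the standard re module only. It is an illustration of the idea, not NXP's implementation: the RULES table format, the scan function, and the choice to try rules at the cursor and skip one character when none matches are all assumptions. The escape rules from the example are left out for brevity.

```python
import re

# Hypothetical rule table (NOT NXP's internal format):
# scope name -> ordered list of (regex or None, actions).
# A None pattern always applies and consumes no text.
RULES = {
    "main": [
        (r"\\(\w+)", [("open", "command")]),
    ],
    "command": [
        (r"\{", [("open", "command.body")]),
        (None, [("close", 1)]),
    ],
    "command.body": [
        (r"\\(\w+)", [("open", "command")]),
        (r"\}", [("close", 2)]),
    ],
}

def scan(text):
    """Return (scope, matched_text) pairs found by a scope-stack scan."""
    stack = ["main"]   # the innermost open scope is stack[-1]
    found = []
    pos = 0
    while pos < len(text):
        for pattern, actions in RULES[stack[-1]]:
            if pattern is not None:
                m = re.compile(pattern).match(text, pos)
                if m is None:
                    continue  # try the next rule of this scope
                found.append((stack[-1], m.group(0)))
                pos = m.end()
            # Apply the actions of the rule that fired.
            for kind, arg in actions:
                if kind == "open":
                    stack.append(arg)
                elif kind == "close":
                    del stack[-arg:]  # close the last `arg` scopes
            break
        else:
            pos += 1  # no rule applies here: skip one character
    return found

print(scan(r"\it{not}"))
```

Each match is recorded together with the scope that was active when it was found, which is the same nesting that the real parser reports. The call to nxp.parsefile above remains the actual entry point.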

The output is a simple AST:

+ Scope("main"): 3 element(s)
	[0] Scope("command"): 2 element(s)
		[0] \\(\w+)
			(0) (1, 0) - (1, 6) \quote
		[1] Scope("command.body"): 3 element(s)
			[0] \{
				(0) (1, 6) - (1, 7) {
			[1] Scope("command"): 2 element(s)
				[0] \\(\w+)
					(0) (2, 30) - (2, 33) \it
				[1] Scope("command.body"): 2 element(s)
					[0] \{
						(0) (2, 33) - (2, 34) {
					[1] \}
						(0) (2, 37) - (2, 38) }
			[2] \}
				(0) (3, 0) - (3, 1) }
	[1] Scope("command"): 1 element(s)
		[0] \\(\w+)
			(0) (5, 23) - (5, 31) \command
	[2] Scope("command"): 2 element(s)
		[0] \\(\w+)
			(0) (5, 54) - (5, 62) \command
		[1] Scope("command.body"): 2 element(s)
			[0] \{
				(0) (5, 62) - (5, 63) {
			[1] \}
				(0) (5, 63) - (5, 64) }

Note: begin and end positions are given as (line, col) pairs, both counted from 0 (not 1). As the output above shows, the end column is exclusive: \quote spans (1, 0) - (1, 6).
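
As a quick check of this convention, the positions can be mapped back to the source text with plain Python slicing. The helper name span_text is hypothetical and not part of NXP; it assumes 0-based (line, col) pairs with an exclusive end, as in the output above.

```python
def span_text(lines, begin, end):
    """Return the text between two 0-based (line, col) positions (end exclusive)."""
    (l0, c0), (l1, c1) = begin, end
    if l0 == l1:
        return lines[l0][c0:c1]
    # Multi-line span: tail of the first line, whole middle lines, head of the last.
    middle = lines[l0 + 1:l1]
    return "\n".join([lines[l0][c0:], *middle, lines[l1][:c1]])

# The first four lines of foo.txt, without trailing newlines.
lines = [
    "Inspirational quote:",
    r"\quote{",
    r"    Time you enjoy wasting is \it{not} wasted time.",
    "}",
]

print(span_text(lines, (2, 30), (2, 33)))  # \it
```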
