# Trishula - The modern parser combinator for Python 3
Trishula is a parser combinator library that extends PEG syntax, inspired by Parsimmon (ES) and boost::spirit::qi (C++).
Trishula supports Python 3.7.0 and later.
## Examples
```python
from trishula import Not, Parser, Regexp, Value

grammar = (
    Value("aaa")
    >> (Value("bbb") | Value("ccc"))
    >> (+Value("eee") >= (lambda x: "modified"))
    >> -Value("f")
    >> Value("g")
    >> Regexp(r"a+")
    >> Not(Value("hhh"))
)
# This works
print(vars(Parser().parse(grammar, "aaaccceeeeeeeeeeeefgaaa")))
# {
# 'status': <Status.SUCCEED: 1>,
# 'index': 23,
# 'value': [[[[[['aaa', 'ccc'], 'modified'], 'f'], 'g'], 'aaa'], None]
# }
```
More examples are in the ["example" directory](https://github.com/minamorl/trishula/blob/master/example); run them from inside that directory.
## Description
Grammars are built from the **Value** and **Regexp** primitives combined with operators. The operators are described below.
## Operators
As mentioned above, Trishula overloads several operators to make parser definitions more concise.
| operator | result |
|----|----|
| >> | Sequence |
| \| | OrderedChoise |
| ~ | ZeroOrMore |
| + | OneOrMore |
| - | Optional |
| >= | Map |
| @ | NamedParser |
Trishula also provides the **Not** and **And** classes, which act as lookahead predicates.
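To illustrate how operator overloading can turn ordinary Python expressions into parsers, here is a minimal, self-contained sketch. It does not use Trishula and is not its implementation: the names `P` and `lit` are hypothetical, and only `>>`, `|`, and `>=` are modelled.

```python
from typing import Any, Callable, Optional, Tuple

# A parse result is (value, next_index), or None on failure.
Result = Optional[Tuple[Any, int]]


class P:
    """Hypothetical mini parser; a stand-in, not a Trishula class."""

    def __init__(self, run: Callable[[str, int], Result]):
        self.run = run

    def __rshift__(self, other: "P") -> "P":
        # a >> b: run a, then b from where a stopped, and pair the values.
        def run(text: str, i: int) -> Result:
            first = self.run(text, i)
            if first is None:
                return None
            second = other.run(text, first[1])
            if second is None:
                return None
            return [first[0], second[0]], second[1]
        return P(run)

    def __or__(self, other: "P") -> "P":
        # a | b: ordered choice, b is tried only if a fails.
        def run(text: str, i: int) -> Result:
            result = self.run(text, i)
            return result if result is not None else other.run(text, i)
        return P(run)

    def __ge__(self, fn: Callable[[Any], Any]) -> "P":
        # a >= fn: transform the parsed value with fn.
        def run(text: str, i: int) -> Result:
            result = self.run(text, i)
            if result is None:
                return None
            return fn(result[0]), result[1]
        return P(run)


def lit(s: str) -> P:
    # Match the literal string s at the current index.
    return P(lambda text, i: (s, i + len(s)) if text.startswith(s, i) else None)


demo = lit("aaa") >> (lit("bbb") | lit("ccc")) >= (lambda v: "-".join(v))
print(demo.run("aaaccc", 0))
```

Python's precedence rules matter here: `>>` binds tighter than `|`, which binds tighter than `>=`, so the map in `demo` applies to the whole sequence. This is why the alternatives in the examples are wrapped in parentheses.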
## Recursion
Trishula supports recursion with `Ref`. Recursion can be written like this:
```python
from trishula import Parser, Ref, Value


def grammar():
    return (
        (Value("[]") >= (lambda x: [])) |
        ((
            Value("[") >>
            Ref(grammar) >>
            Value("]")
        ) >= (lambda x: [x[0][1]]))
    )


def main():
    result = Parser().parse(grammar(), "[[[]]]")
    print(vars(result))
    # => {'status': <Status.SUCCEED: 1>, 'index': 6, 'value': [[[]]]}


main()
```
Be aware that `Ref` calls the function only once, so that the resulting parser can be memoized.
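The call-once behaviour can be pictured with a small sketch. This is only an illustration of the idea under stated assumptions, not Trishula's `Ref`: the class `LazyRef`, its `resolve` method, and `build_grammar` are hypothetical names.

```python
from typing import Any, Callable


class LazyRef:
    """Hypothetical lazy reference: defers a factory call and caches the result."""

    def __init__(self, factory: Callable[[], Any]):
        self._factory = factory
        self._cached: Any = None
        self._resolved = False

    def resolve(self) -> Any:
        # Call the factory on first use only, then reuse the cached object.
        if not self._resolved:
            self._cached = self._factory()
            self._resolved = True
        return self._cached


calls = []


def build_grammar():
    calls.append(1)            # record each invocation
    return {"rule": "list"}    # placeholder standing in for a parser object


ref = LazyRef(build_grammar)
first = ref.resolve()
second = ref.resolve()
print(len(calls), first is second)
```

A consequence of this design is that a grammar function which reads mutable outside state would not pick up later changes once the reference has been resolved.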
## Namespace
Namespace is one of Trishula's most powerful features. You can name parsers and retrieve their values as a dict in a map function.
Usage is simple: mark a parser with the `@` operator, as in `parser @ "name"`, and wrap the grammar in `Namespace(parser)`. You can then grab the values with `Namespace(parser) >= fn`, where `fn` is a callable that takes a dict and returns a new value.
```python
import trishula as T


def main():
    grammar = T.Namespace(
        T.Value("[") >> (T.Regexp(r"[0-9]+") >= (float)) @ "value" >> T.Value("]")
    ) >= (lambda a_dict: a_dict["value"])
    result = T.Parser().parse(grammar, "[12345]")
    print(vars(result))
    # ==> {'status': <Status.SUCCEED: 1>, 'index': 7, 'value': 12345.0, 'namespace': {}}


main()
```
Note that after the mapped function is called, the internal namespace is reset to an empty dict.
## Conditional parsing
`Conditional` lets you choose a parser at runtime:
```python
import trishula as T


def main():
    def cond(value):
        d = {
            "(": ")",
            "{": "}",
            "[": "]",
        }
        return T.Value(d.get(value[0]))

    grammar = T.Namespace(
        T.Value("[")
        >> +(T.Regexp(r"[a-z]") | T.Value("\n")) @ "value"
        >> T.Conditional(cond)
    )
    result = T.Parser().parse(grammar, "[abcd\n\nefg]")
    print(result)


main()
```
`Conditional` takes one argument: a function that receives a value and returns a parser. It is evaluated dynamically, so you can choose a parser at runtime.
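The same idea, picking the closing delimiter only after the opening one has been read, can be sketched in plain Python without Trishula. The function `parse_bracketed` and the `CLOSERS` table are hypothetical names used for illustration, and the body is restricted to lowercase letters to keep the sketch short.

```python
import re
from typing import Optional

# Maps each opening bracket to the closer that must follow the body.
CLOSERS = {"(": ")", "{": "}", "[": "]"}


def parse_bracketed(text: str) -> Optional[str]:
    """Return the lowercase body of <open><a-z...><matching close>, else None."""
    if not text or text[0] not in CLOSERS:
        return None
    # The expected closer is decided at runtime from the first character.
    closer = CLOSERS[text[0]]
    match = re.fullmatch(r"([a-z]+)" + re.escape(closer), text[1:])
    return match.group(1) if match else None


print(parse_bracketed("[abc]"))
print(parse_bracketed("[abc)"))
```

A mismatched pair such as `[abc)` is rejected because the closer is looked up from the opener rather than hard-coded in the grammar.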
## Utils
There are `sep_by`, `sep_by1`, and `index`.
## Generator
```python
import trishula as T


@T.define_parser
def parser():
    yield T.Value("aaa")
    v = yield T.Value("bbb")
    yield T.Value("ccc")
    # Do not forget to yield the result value at the end
    yield v


print(T.Parser().parse(parser, "aaabbbccc"))
# ==> <Success index='9' value='bbb' namespace='{}'>
```
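For readers unfamiliar with generator-based parsers, the following sketch shows one way such a generator could be driven with `send()`. It is an assumption-laden illustration, not how `define_parser` is implemented: `literal`, `run_generator`, and `abc` are hypothetical names, and treating a non-callable yielded item as the final result is a simplification chosen for this sketch.

```python
from typing import Any, Callable, Optional, Tuple

# A step takes (text, index) and returns (value, next_index), or None on failure.
Step = Callable[[str, int], Optional[Tuple[str, int]]]


def literal(s: str) -> Step:
    return lambda text, i: (s, i + len(s)) if text.startswith(s, i) else None


def run_generator(make_gen: Callable[[], Any], text: str) -> Optional[Tuple[Any, int]]:
    gen = make_gen()
    index = 0
    result_value = None
    to_send = None
    while True:
        try:
            # Resume the generator, handing back the value of the previous step.
            item = gen.send(to_send)
        except StopIteration:
            return result_value, index
        if callable(item):
            outcome = item(text, index)
            if outcome is None:
                gen.close()
                return None  # a step failed, so the whole parse fails
            to_send, index = outcome
        else:
            # A non-callable item is taken as the parse result.
            result_value = item
            to_send = None


def abc():
    yield literal("aaa")
    v = yield literal("bbb")
    yield literal("ccc")
    yield v


print(run_generator(abc, "aaabbbccc"))
```

The driver feeds each step's parsed value back into the generator, which is what lets `v = yield ...` capture an intermediate result in the middle of a sequence.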