High-performance PEG parsing (a port of TatSu to Rust)
铁修 TieXiu
A high-performance port of TatSu to Rust.
TieXiu (铁修) is a PEG (Parsing Expression Grammar) engine that brings the flexibility and power of the original **TatSu** lineage into a memory-safe, high-concurrency architecture optimized for modern CPU caches.
TieXiu is a tool that takes grammars in extended EBNF as input, and
outputs memoizing (Packrat) PEG parsers as a Rust model. The classic
variations of EBNF (Tomassetti, EasyExtend, Wirth) and ISO EBNF are
supported as input grammar formats.
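As a rough illustration of what "memoizing (Packrat)" means, the sketch below recognizes a made-up toy grammar in plain Python. It is not TieXiu's implementation, and the names `make_recognizer`, `num`, `binary`, and `start` are hypothetical. It only shows the idea that a rule's result at a given input position is computed once and reused when an ordered choice backtracks.

```python
from functools import lru_cache


def make_recognizer(text):
    """Toy grammar:  start <- num '+' num / num '-' num ;  num <- [0-9]+"""
    stats = {"num_evaluations": 0}

    @lru_cache(maxsize=None)  # the memo table, keyed by input position
    def num(pos):
        stats["num_evaluations"] += 1  # counts real evaluations, not cache hits
        end = pos
        while end < len(text) and text[end] in "0123456789":
            end += 1
        return end if end > pos else None  # None signals failure

    def binary(pos, op):
        # num op num: returns the end position, or None on failure
        left = num(pos)
        if left is None or text[left:left + 1] != op:
            return None
        return num(left + 1)

    def start(pos=0):
        # Ordered choice: try the '+' alternative first, then backtrack to '-'.
        result = binary(pos, "+")
        if result is None:
            result = binary(pos, "-")
        return result

    return start, stats


start, stats = make_recognizer("12-3")
end = start()
print(end, stats["num_evaluations"])  # prints: 4 2
```

The second alternative asks for `num` at position 0 again, but the memo table answers it, so `num` is evaluated only twice (at positions 0 and 3) instead of three times.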
The TatSu Documentation provides a vision of where the TieXiu project is heading. A copy of the grammar syntax can be accessed locally in the SYNTAX document.
TieXiu is foremost a Rust library that is also published as a Python library with the help of PyO3/Maturin. The Rust API may return objects of types in the internal parser or tree model. The Python API has strings as input and json.dumps() compatible Python objects as output.
TatSu is a mature project with an important user base, so it is difficult to make certain changes even if they are improvements or fixes for long-standing quirks (as is well known among experienced software engineers, a long-lived quirk becomes a feature). TieXiu is an opportunity to start from scratch with a modern approach, even if the grammar syntax and its semantics are preserved.
Non-Features
Most features of TatSu are available in TieXiu. Some features have not yet been implemented, and a few never will:
- Generation of synthetic classes from grammar parameters will not be implemented in Rust.
- Generation of source code with an object model for definitions in the grammar may be implemented if a way is found to make the parser or postprocessing bind the Tree output of a parse to the model (serde_json provides the infrastructure for trying).
- TatSu recently moved from generating parser code to loading a model of the Grammar and using it as the parser. Although the generated procedural parser may produce 1.3x higher throughput in Python, supporting generated code is hard and complicates the internal interfaces. For Rust, TieXiu already knows how to quickly load a Grammar model from TatSu JSON, which it can already produce, and a generated model constructor would be precompiled.
- Parsing of boolean and numeric values happens in TatSu through synthetic models, which call the constructors for those types passing the parsed strings.
- Interpolation and evaluation of `constant` expressions have not had any known use cases with TatSu. They will not be implemented in TieXiu until a use case appears.
API
The needs of most users are met by parsing input with the rules in a grammar and receiving the structured output as a JSON-compatible value. For other use cases, TieXiu exposes its internal model and APIs (to be documented).
The Python API
Return values typed as Any are basic Python types, as defined in the json module documentation (see Encoders and Decoders).
| JSON | Python |
|---|---|
| object | dict |
| array | list |
| string | str |
| number (int) | int |
| number (real) | float |
| true | True |
| false | False |
| null | None |
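The mapping in the table can be checked with the standard library alone. The sketch below does not call TieXiu. It decodes a made-up JSON document (the keys and values are invented for illustration) and inspects the resulting Python types, which are the same kinds of objects the table describes.

```python
import json

# A made-up JSON document that covers every row of the table.
document = (
    '{"name": "expr", "items": [1, 2.5, "x"], '
    '"ok": true, "skip": false, "value": null}'
)

value = json.loads(document)

# Type of each top-level member of the decoded object.
kinds = {key: type(item).__name__ for key, item in value.items()}

# Types of the array elements: integer, real, and string.
item_kinds = [type(item).__name__ for item in value["items"]]

# Values made only of these types survive a json.dumps() round trip.
round_trip = json.loads(json.dumps(value))

print(type(value).__name__, kinds, item_kinds, round_trip == value)
```

Because only these basic types appear, a result can be passed to `json.dumps()` without a custom encoder.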
Keyword arguments can be passed for runtime configuration. The only recognized argument is trace=.
These functions are available from package tiexiu.
def parse(grammar: str, text: str, **kwargs: Any) -> Any:
def parse_grammar(grammar: str, **kwargs: Any) -> Any:
def parse_grammar_to_json(grammar: str, **kwargs: Any) -> Any:
def parse_to_json(grammar: str, text: str, **kwargs: Any) -> Any:
def pretty(grammar: str, **kwargs: Any) -> str:
def compile_to_json(grammar: str, **kwargs: Any) -> Any:
The Rust API
pub fn parse_grammar(grammar: &str, cfg: &CfgA) -> Result<Tree>;
pub fn parse_grammar_to_json(grammar: &str, cfg: &CfgA) -> Result<serde_json::Value>;
pub fn parse_grammar_to_json_string(grammar: &str, cfg: &CfgA) -> Result<String>;
pub fn parse_grammar_with<U>(cursor: U, cfg: &CfgA) -> Result<Tree>
pub fn parse_grammar_to_json_with<U>(cursor: U, cfg: &CfgA) -> Result<serde_json::Value>
pub fn compile(grammar: &str, cfg: &CfgA) -> Result<Grammar>;
pub fn compile_to_json(grammar: &str, cfg: &CfgA) -> Result<serde_json::Value>;
pub fn compile_to_json_string(grammar: &str, cfg: &CfgA) -> Result<String>;
pub fn compile_with<U>(cursor: U, cfg: &CfgA) -> Result<Grammar>
pub fn compile_to_json_with<U>(cursor: U, cfg: &CfgA) -> Result<serde_json::Value>
pub fn load(json: &str, _cfg: &CfgA) -> Result<Grammar>;
pub fn load_to_json(json: &str, cfg: &CfgA) -> Result<serde_json::Value>;
pub fn load_tree(json: &str, _cfg: &CfgA) -> Result<Tree>;
pub fn load_tree_to_json(json: &str, cfg: &CfgA) -> Result<serde_json::Value>;
pub fn grammar_pretty(grammar: &str, cfg: &CfgA) -> Result<String>;
pub fn pretty_tree(tree: &Tree, _cfg: &CfgA) -> Result<String>;
pub fn pretty_tree_json(tree: &Tree, _cfg: &CfgA) -> Result<String>;
pub fn parse(grammar: &str, text: &str, cfg: &CfgA) -> Result<Tree>;
pub fn parse_to_json(grammar: &str, text: &str, cfg: &CfgA) -> Result<serde_json::Value>;
pub fn parse_to_json_string(grammar: &str, text: &str, cfg: &CfgA) -> Result<String>;
pub fn parse_input(parser: &Grammar, text: &str, cfg: &CfgA) -> Result<Tree>;
pub fn parse_input_to_json(parser: &Grammar, text: &str, cfg: &CfgA) -> Result<serde_json::Value>;
pub fn parse_input_to_json_string(parser: &Grammar, text: &str, cfg: &CfgA) -> Result<String>;
Roadmap
The project is functionally complete, as described above. Comments about implementation strategies and possible improvements are now in ROADMAP.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless explicitly stated otherwise, any contribution intentionally submitted for inclusion in the work, as defined in the Apache-2.0 license, shall be dual-licensed as above, without any additional terms or conditions.