Blazingly fast Rust rewrite of pyparsing - 50-200x faster
pyparsing-rs
Rust rewrite of Python's pyparsing parser combinator library with pyo3 bindings. Target: 50-200x performance improvement with 100% API compatibility.
Mission
Work autonomously until 50x speedup is achieved on all benchmarks. Do not stop. Do not ask for permission. Iterate relentlessly.
Environment
- Rust: stable toolchain
- Project: /home/aibrush/pyparsing-rs
- Reference: /home/aibrush/pyparsing-original (original pyparsing source + tests)
Commands
# Build
maturin develop --release
# Test
python -m pytest tests/ -v
# Benchmark
python tests/test_performance.py
# Full loop
maturin develop --release && python -m pytest tests/ -v && python tests/test_performance.py
# Profile when stuck
cargo flamegraph --release
# Install Python packages
uv pip install <package>
Architecture
src/
├── lib.rs # pyo3 module entry point
├── core/ # Core infrastructure
│ ├── parser.rs # ParserElement trait
│ ├── context.rs # Parse context, position tracking
│ ├── results.rs # ParseResults (list + dict)
│ ├── exceptions.rs # ParseException
│ └── memoization.rs # Packrat memoization
├── elements/ # Parser elements
│ ├── literals.rs # Literal, Keyword, CaselessLiteral
│ ├── chars.rs # Word, Char, CharsNotIn, Regex
│ ├── combinators.rs # And, Or, MatchFirst
│ ├── repetition.rs # ZeroOrMore, OneOrMore, Opt
│ ├── structure.rs # Group, Suppress, Combine
│ └── forward.rs # Forward (recursive grammars)
└── helpers/
└── common.rs # pyparsing_common equivalents
tests/
├── test_api_compat.py # Must match original pyparsing behavior
└── test_performance.py # Benchmark comparisons (goal: 50x)
test_grammars/ # Sample grammars
Implementation Priority
1. ParserElement trait
2. Literal, Keyword
3. Word, Regex
4. And, Or, MatchFirst
5. ZeroOrMore, OneOrMore
6. Group, Suppress
7. Forward
Code Rules
- Zero-copy: Use &str slices, return indices into original string
- Inline hot paths: #[inline] and #[inline(always)] on frequently called methods
- Avoid dyn trait: Use enum dispatch or generics for hot paths
- Fast hashing: Use FxHashMap from rustc-hash for memoization
- API parity: Same class names, methods, operators as original pyparsing
- Cargo.toml: Enable lto = true, codegen-units = 1 in release profile
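The zero-copy rule is easiest to see in a small model. The sketch below is in Python rather than Rust and is only an illustration: the `Matcher` signature and both helper names are hypothetical, not part of pyparsing or pyparsing-rs. Each matcher returns the end index of its match instead of a new string, so a substring is built only when the caller asks for one.

```python
from typing import Callable, Optional

# Hypothetical matcher signature for this sketch: (text, start) -> end index or None.
Matcher = Callable[[str, int], Optional[int]]


def match_literal(literal: str) -> Matcher:
    """Build a matcher that reports only where the match ends."""
    def matcher(text: str, start: int) -> Optional[int]:
        # str.startswith with an offset compares in place; no substring is created.
        return start + len(literal) if text.startswith(literal, start) else None
    return matcher


def match_sequence(*parts: Matcher) -> Matcher:
    """Run matchers back to back, threading the position through."""
    def matcher(text: str, start: int) -> Optional[int]:
        pos = start
        for part in parts:
            end = part(text, pos)
            if end is None:
                return None
            pos = end
        return pos
    return matcher


greeting = match_sequence(match_literal("hello"), match_literal(", "), match_literal("world"))
text = "hello, world!"
end = greeting(text, 0)   # index just past the match
matched = text[0:end]     # the substring is materialised only here, on request
```

In Rust the same idea would presumably be expressed with `&str` slices borrowed from the input, which is what the rule above asks for.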
Python API to Match
import pyparsing_rs as pp
# Basic elements
lit = pp.Literal("hello")
word = pp.Word(pp.alphas, pp.alphanums)  # in original pyparsing, alphas and alphanums are string constants
regex = pp.Regex(r"\d+")
# Combinators (via operators)
sequence = lit + word # And
first_match = lit | word # MatchFirst
longest_match = lit ^ word # Or
# Repetition
zero_or_more = pp.ZeroOrMore(word)
one_or_more = pp.OneOrMore(word)
optional = pp.Opt(word)
# Result manipulation
grouped = pp.Group(word + word)
suppressed = pp.Suppress(lit)
combined = pp.Combine(word + word)
# Recursive (Forward reference)
expr = pp.Forward()
expr <<= word | "(" + expr + ")"
# Parse (grammar stands for any expression built above)
result = grammar.parse_string("input text")
result[0] # List access
result["name"] # Dict access (if named)
result.as_list() # Convert to list
result.as_dict() # Convert to dict
Testing Strategy
- Copy test files: cp -r /home/aibrush/pyparsing-original/tests/* tests/
- Run baseline: python baseline_benchmark.py → saves baseline_results.json
- Compare: Rust implementation must return identical data to original
- Benchmark: Track speedup in performance_results.json
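The "Compare" step can be sketched as a small differential check. This is a minimal illustration, not the project's test harness: `find_mismatches` and the two stand-in parsers are hypothetical names, and the stand-ins replace the real libraries so the block runs without either installed.

```python
from typing import Any, Callable, Iterable, List, Tuple


def find_mismatches(
    reference: Callable[[str], Any],
    candidate: Callable[[str], Any],
    inputs: Iterable[str],
) -> List[Tuple[str, Any, Any]]:
    """Return (input, reference outcome, candidate outcome) for every disagreement."""
    def outcome(parse, text):
        # A raised exception is part of the behaviour being compared.
        try:
            return ("ok", parse(text))
        except Exception as exc:
            return ("error", type(exc).__name__)

    mismatches = []
    for text in inputs:
        expected = outcome(reference, text)
        actual = outcome(candidate, text)
        if expected != actual:
            mismatches.append((text, expected, actual))
    return mismatches


# Stand-in parsers (assumptions for this sketch only).
def reference_parse(text: str) -> list:
    return text.split()


def candidate_parse(text: str) -> list:
    return text.split(" ")  # deliberately differs on repeated spaces and empty input


report = find_mismatches(reference_parse, candidate_parse, ["a b", "a  b", ""])
for text, expected, actual in report:
    print(repr(text), expected, actual)
```

In the real comparison, the two callables would wrap the same grammar built once with `pyparsing` and once with `pyparsing_rs`, each returning something comparable such as the result of `as_list()`.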
Success Criteria
All must be true:
- All benchmarks show ≥50x speedup
- 100% of basic pyparsing tests pass
- Drop-in replacement API (same classes, methods, operators)
- Core elements: Literal, Word, Regex, And, Or, ZeroOrMore, Group, Forward
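The first criterion amounts to a ratio check per benchmark. The sketch below shows one way to compute it with the standard library; the function names, benchmark names, and timings are invented for illustration and are not taken from `test_performance.py`.

```python
import timeit
from typing import Callable, Dict


def best_time(func: Callable[[], object], repeat: int = 5, number: int = 100) -> float:
    """Best per-call time in seconds over several timing runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def failing_benchmarks(speedups: Dict[str, float], target: float = 50.0) -> Dict[str, float]:
    """Benchmarks whose speedup is below the target."""
    return {name: value for name, value in speedups.items() if value < target}


# Made-up (baseline, candidate) timings in seconds, for illustration only.
timings = {"literal": (2.0e-4, 3.0e-6), "word": (5.0e-4, 2.0e-5)}
speedups = {name: speedup(base, cand) for name, (base, cand) in timings.items()}
failing = failing_benchmarks(speedups)
print(failing)
```

Taking the minimum over repeated runs reduces noise from other processes; an empty `failing` dict would mean the 50x criterion holds for every benchmark measured.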
Key Performance Optimizations
Level 1 (do first):
- LTO + release builds
- &str instead of String
- Inline small functions
Level 2 (when needed):
- Bitset for character class membership (O(1) lookup)
- Byte operations instead of char for ASCII
- SIMD scanning with memchr crate
Level 3 (if still slow):
- Packrat memoization with FxHashMap
- Arena allocation for ParseResults
- Enum dispatch instead of dyn trait
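The bitset idea from Level 2 can be modelled in a few lines. This is a sketch in Python, with a single integer standing in for the fixed-size bit array a Rust implementation would more likely use; all names here are hypothetical.

```python
import string


def build_ascii_bitset(chars: str) -> int:
    """Pack a set of ASCII characters into one integer, one bit per code point."""
    mask = 0
    for ch in chars:
        code = ord(ch)
        if code >= 128:
            raise ValueError(f"non-ASCII character: {ch!r}")
        mask |= 1 << code
    return mask


def in_bitset(mask: int, ch: str) -> bool:
    """Membership test: one shift and one AND, independent of the set size."""
    code = ord(ch)
    return code < 128 and bool((mask >> code) & 1)


def scan_word(text: str, start: int, mask: int) -> int:
    """Return the index just past the run of member characters beginning at start."""
    pos = start
    while pos < len(text) and in_bitset(mask, text[pos]):
        pos += 1
    return pos


ALPHAS = build_ascii_bitset(string.ascii_letters)
end = scan_word("hello world", 0, ALPHAS)
print(end)
```

Non-ASCII characters fall outside the 128-bit range and are treated as non-members here; a real implementation would need a fallback path for them.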
Important Notes
- Original pyparsing repo: https://github.com/pyparsing/pyparsing
- Test files are in /home/aibrush/pyparsing-original/tests/
- Original pyparsing is editable-installed; use import pyparsing for reference
- import pyparsing_rs for your Rust implementation
- Never sacrifice correctness for speed - tests must pass
- Profile before optimizing - don't guess bottlenecks
pyparsing Key Concepts
Operator Overloading
a + b # And (sequence)
a | b # MatchFirst (first match wins)
a ^ b # Or (longest match wins)
~a # NotAny (negative lookahead)
a * 3 # Exactly 3 repetitions
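To make the operator mapping concrete, here is a toy model of how Python's special methods can route each operator to a combinator class. It only records grammar structure and does no parsing. The class names mirror the ones listed above, but the bodies are assumptions made for this sketch, not pyparsing's or pyparsing-rs's code.

```python
class Element:
    """Toy base class: builds a grammar tree, parses nothing."""

    def __add__(self, other):
        return And([self, _wrap(other)])

    def __radd__(self, other):
        # Handles a plain string on the left, e.g. "(" + expr.
        return And([_wrap(other), self])

    def __or__(self, other):
        return MatchFirst([self, _wrap(other)])

    def __xor__(self, other):
        return Or([self, _wrap(other)])

    def __invert__(self):
        return NotAny(self)

    def __mul__(self, count):
        return And([self] * count)


class Literal(Element):
    def __init__(self, text):
        self.text = text


class _Combinator(Element):
    def __init__(self, exprs):
        self.exprs = list(exprs)


class And(_Combinator):
    """Sequence."""


class MatchFirst(_Combinator):
    """First matching alternative wins."""


class Or(_Combinator):
    """Longest matching alternative wins."""


class NotAny(Element):
    def __init__(self, expr):
        self.expr = expr


def _wrap(value):
    # Promote bare strings to Literal so they can appear in expressions.
    return Literal(value) if isinstance(value, str) else value


a, b = Literal("a"), Literal("b")
kinds = [type(x).__name__ for x in (a + b, a | b, a ^ b, ~a, a * 3, "(" + a)]
print(kinds)
```

Because `+` binds more tightly than `|` in Python, an expression such as `word | "(" + expr + ")"` groups as `word | (("(" + expr) + ")")`, which is why the recursive example earlier needs no extra parentheses.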
ParseResults
Dual list/dict access:
result[0] # First element
result["key"] # Named element
result.key # Attribute access
for item in result: # Iteration
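The dual access pattern can be modelled with a small container that keeps a token list alongside a name-to-value mapping. `ToyResults` below is a hypothetical stand-in written for this illustration; the real ParseResults has more behavior (for example around missing names and nested results) that this sketch does not reproduce.

```python
class ToyResults:
    """Toy container with list-style and dict-style access."""

    def __init__(self, tokens, names=None):
        self._tokens = list(tokens)
        self._names = dict(names or {})

    def __getitem__(self, key):
        # Integers and slices index the token list; strings look up named results.
        if isinstance(key, (int, slice)):
            return self._tokens[key]
        return self._names[key]

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._names[name]
        except KeyError:
            raise AttributeError(name) from None

    def __iter__(self):
        return iter(self._tokens)

    def __len__(self):
        return len(self._tokens)

    def as_list(self):
        return list(self._tokens)

    def as_dict(self):
        return dict(self._names)


result = ToyResults(["x", "=", "42"], names={"key": "x", "value": "42"})
print(result[0], result["key"], result.value, list(result))
```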
Whitespace
pyparsing auto-skips whitespace by default. Respect this behavior.
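A minimal sketch of what skipping whitespace before a match looks like, assuming a default whitespace set of space, tab, newline, and carriage return. The function names and the `skip_ws` flag are hypothetical and exist only for this illustration.

```python
from typing import Optional, Tuple

WHITESPACE = " \t\n\r"  # assumed default set for this sketch


def skip_whitespace(text: str, pos: int, chars: str = WHITESPACE) -> int:
    """Advance pos past any leading whitespace characters."""
    while pos < len(text) and text[pos] in chars:
        pos += 1
    return pos


def match_literal(
    text: str, pos: int, literal: str, skip_ws: bool = True
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the literal, or None if it does not match."""
    start = skip_whitespace(text, pos) if skip_ws else pos
    if text.startswith(literal, start):
        return start, start + len(literal)
    return None


text = "hello   world"
first = match_literal(text, 0, "hello")
second = match_literal(text, first[1], "world")                 # skips the three spaces
strict = match_literal(text, first[1], "world", skip_ws=False)  # fails at the first space
print(first, second, strict)
```

The skip happens before each element attempts its match, so a sequence of two elements tolerates any amount of whitespace between them unless skipping is turned off.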
Parse Actions
User callbacks that transform results:
integer = Word(nums).set_parse_action(lambda t: int(t[0]))
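The callback convention can be sketched as follows. This is a simplified model of the behavior as I understand it in pyparsing: a return value of None leaves the tokens unchanged, and any other value replaces them. The helper name is hypothetical, and real parse actions may also accept the input string and match location.

```python
def apply_parse_action(tokens, action):
    """Run a callback on matched tokens; None means leave the tokens alone."""
    replacement = action(tokens)
    if replacement is None:
        return tokens
    if isinstance(replacement, list):
        return replacement
    # A single value is wrapped so that results stay list-like.
    return [replacement]


converted = apply_parse_action(["42"], lambda t: int(t[0]))
unchanged = apply_parse_action(["42"], lambda t: None)
print(converted, unchanged)
```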
NOTE: Our GitHub repo is: https://github.com/aibrushcomputer/pyparsing-rs
Download files
File details
Details for the file pyparsing_rs-0.2.0.tar.gz.
File metadata
- Download URL: pyparsing_rs-0.2.0.tar.gz
- Upload date:
- Size: 75.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6eef30bf1e2922c0e03f30c6d53a47dbe178f75c8f3833916566bb6c3509b5b2 |
| MD5 | cb9d78006f39a2947c667e4d387adf7c |
| BLAKE2b-256 | fdcf77d228b60ff65b2a7e7e37dcdbc17f20906fd49e0f48dd6a11c262ec2513 |
File details
Details for the file pyparsing_rs-0.2.0-cp313-cp313-win_amd64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp313-cp313-win_amd64.whl
- Upload date:
- Size: 755.2 kB
- Tags: CPython 3.13, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 879471be32fcbc80d6af8d08854011ff90c1ed403997d346a9bda9420119aca8 |
| MD5 | 9908babe08ae8698985f05d4d66b53ca |
| BLAKE2b-256 | b99cec32533587fea9fde6331052671068ba4c530023a0245ee5b5e378acff37 |
File details
Details for the file pyparsing_rs-0.2.0-cp313-cp313-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp313-cp313-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 883.7 kB
- Tags: CPython 3.13, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f85fb35e720fb76133ecf3b133a73b676d60ed0fe4bd19c17826cc09620b447a |
| MD5 | a5c4a48f2c2047fdc6ce8483e9075303 |
| BLAKE2b-256 | aae928057a9057648407387430aef0d8c955cb280462187c9c4303cdd8ee107e |
File details
Details for the file pyparsing_rs-0.2.0-cp313-cp313-macosx_11_0_arm64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp313-cp313-macosx_11_0_arm64.whl
- Upload date:
- Size: 739.3 kB
- Tags: CPython 3.13, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa72968f8784b0f5f10104344096222353b23c32fb04f1004717f8054c2fdc5a |
| MD5 | 71212f44993e487e7be6bb43a5731bce |
| BLAKE2b-256 | 73d4a37db5444dce94c8eef197ff284df5edbf270b3b10317185feb6d81df1dd |
File details
Details for the file pyparsing_rs-0.2.0-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 755.4 kB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 63ca3cf9f3379924982fdd654cffb4c3fae2b6d570234472fc4d23bfe1099022 |
| MD5 | 4254a857fb712fea12cb6f6ed02bd661 |
| BLAKE2b-256 | a3c156a277797833069e0276544df6e43d42a6b03117ed8895fb6945dcd301d8 |
File details
Details for the file pyparsing_rs-0.2.0-cp312-cp312-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp312-cp312-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 883.8 kB
- Tags: CPython 3.12, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cb9e8f2f4ce2f50b7dbd2df831f2d6dbc8cb18dbce78d7168a5ec3d0a9cd57a5 |
| MD5 | 2e99e6e9a8e3286055da0d7392f36f8b |
| BLAKE2b-256 | 7e1acbd0e40626d842c234757afc52883def15709ed3f5a6755fabdc89e2b830 |
File details
Details for the file pyparsing_rs-0.2.0-cp312-cp312-macosx_11_0_arm64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp312-cp312-macosx_11_0_arm64.whl
- Upload date:
- Size: 739.2 kB
- Tags: CPython 3.12, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 347bcafdee5bee6c72f438c516f51afddaa539432125ee68b5931ecd247e7450 |
| MD5 | 80477b52d9a626e1cb284caffaf372ee |
| BLAKE2b-256 | a3c9c2ed2b9a8b6fda6475ae30fb462508aa2ec6c170ee4134369bb3749ba196 |
File details
Details for the file pyparsing_rs-0.2.0-cp311-cp311-win_amd64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 753.3 kB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d4732dababc4d08e230c4bb88ed6802f10ede415c69a0683f94cb6c8eb32971a |
| MD5 | 143f25a2d3c1c73530b82889c4c1b160 |
| BLAKE2b-256 | 524d6ab1f27560ea32fff9823f7b3646af0de752b71a2da3abb8737a3f4b2b74 |
File details
Details for the file pyparsing_rs-0.2.0-cp311-cp311-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp311-cp311-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 883.5 kB
- Tags: CPython 3.11, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e3e695fb188f20d6b52212fc384b10a764a0dad56c1b8d79856e1d1f6613af3b |
| MD5 | 99f69ab463f14b7587bd6a667bf5605c |
| BLAKE2b-256 | 2cbcd2a2693e8af340e7683342e2096c910279f3dbf579689a5e103062bc78c6 |
File details
Details for the file pyparsing_rs-0.2.0-cp311-cp311-macosx_11_0_arm64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp311-cp311-macosx_11_0_arm64.whl
- Upload date:
- Size: 739.7 kB
- Tags: CPython 3.11, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2f4edb8892e52379610d02dfe99a3e6b6fa40278e65dc8e00ed18e5286895cb2 |
| MD5 | 2a679f3d7ef7768f4b00abcab0c4cdae |
| BLAKE2b-256 | 8f404b872dd335f045af79db2f2526a32adf6bfc88281593a0ebeb2bd1685ca7 |
File details
Details for the file pyparsing_rs-0.2.0-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 753.5 kB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ffeff96a463a4d01f957c807b02d33e9f70d804119a99920d1c2ccbb4d48ed0e |
| MD5 | 751e5d94cd3e93d1a1d828d2669ca658 |
| BLAKE2b-256 | 770b821b557dffebe8afc5affa4d44eedda95ec52a600dcdcc9412401b82c771 |
File details
Details for the file pyparsing_rs-0.2.0-cp310-cp310-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp310-cp310-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 883.8 kB
- Tags: CPython 3.10, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 270d42bd9ad42a9d1f01754b2c78f65b79969d9d0a30b4eb961d127f961815a0 |
| MD5 | 2acf97526609c47f13797d3cb2ac1dd5 |
| BLAKE2b-256 | cfefae87a5c1a3280bba6dabed7301b70bdc05edb1fd24a5014b8f9e1ee5cd76 |
File details
Details for the file pyparsing_rs-0.2.0-cp310-cp310-macosx_11_0_arm64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp310-cp310-macosx_11_0_arm64.whl
- Upload date:
- Size: 739.6 kB
- Tags: CPython 3.10, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fd0576bc5c06c7d2fcd036d6ed53adf9d27ff63b1b0ca6dbb41c8efe5530c61e |
| MD5 | ee4282f215c2161f9ae59e8ab26f9627 |
| BLAKE2b-256 | 16463b99aff65f8b00400c55c299121ff9b6274f2e07511dabd574c911c926a6 |
File details
Details for the file pyparsing_rs-0.2.0-cp39-cp39-win_amd64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp39-cp39-win_amd64.whl
- Upload date:
- Size: 753.5 kB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c008f5f9ff3ef20a5ead4702eb06c0f3774361914a7ce867756b42081575b3f3 |
| MD5 | 17b77ea42245b0ceadfd94134f71eb37 |
| BLAKE2b-256 | 7fa78303e4c54dd7aa532d6551ad9418e843ae2becc906d9486ae7075e5af74b |
File details
Details for the file pyparsing_rs-0.2.0-cp39-cp39-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp39-cp39-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 884.1 kB
- Tags: CPython 3.9, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4e7136acb99b52e019652e1acd3322f1c568beb302c119ad5e3c52e0b4ce60ce |
| MD5 | 10cc4cc5dae19481c7e5be75ed5b8043 |
| BLAKE2b-256 | 95f482a40e761e97f5c603833b388a7b97696f5444c7639e8e085bd5fd4ed5e0 |
File details
Details for the file pyparsing_rs-0.2.0-cp39-cp39-macosx_11_0_arm64.whl.
File metadata
- Download URL: pyparsing_rs-0.2.0-cp39-cp39-macosx_11_0_arm64.whl
- Upload date:
- Size: 739.9 kB
- Tags: CPython 3.9, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9bbde9a7eaa6ce91999c89308d16fb00f14f2dbd39210115054cda4e4197025b |
| MD5 | a8cc7d10756b161d535f7300886462ed |
| BLAKE2b-256 | 4fa49f89908fc129ef21b934c5ddd60c36c0b2d60e164b581e630dbd764a737f |