Python bindings for eregex, an advanced regular expression engine inspired by mrab-regex
Project description
eregex (Python bindings)
Python bindings for eregex — an
advanced regular expression engine for Rust inspired by mrab-regex (the Python
regex module).
This package exposes eregex's full API to Python via PyO3 and ships as a wheel built with maturin. All matching logic runs in compiled Rust; the Python layer is a thin adapter.
Features
- Named groups, duplicate group names, repeated captures
- Greedy / lazy / possessive quantifiers, atomic groups (?>...)
- Variable-length lookbehind / lookahead
- Inline scoped flags (?i), (?i-m:...)
- Backreferences \1, \g<name>, (?P=name)
- Partial / end-anchored matching (find_partial)
- find, match_at_start (Python re.match), fullmatch (re.fullmatch)
- replace, replace_all with $1 / ${name} / $$ templates
- split, escape, and more
Installation
The wheel is built from the Rust core:
cd crates/eregex-python
python -m venv .venv
. .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install maturin
maturin develop --release # editable install into the current venv
# or: maturin build --release && pip install target/wheels/eregex-*.whl
maturin develop builds the extension and installs it into the active virtual
environment, so that import eregex works (the extension module is named eregex).
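To confirm that the build landed in the environment you expect, a quick import check can help. The sketch below uses only the standard library; the helper name eregex_available is hypothetical and not part of the package.

```python
import importlib.util


def eregex_available() -> bool:
    """Return True if a top-level module named `eregex` can be located."""
    # find_spec only locates the module; it does not import or execute it.
    return importlib.util.find_spec("eregex") is not None


if __name__ == "__main__":
    if eregex_available():
        print("eregex is importable from this environment")
    else:
        print("eregex not found; activate the venv and re-run `maturin develop`")
```

If the check fails right after a successful build, the usual cause is that maturin ran against a different interpreter than the one executing the script.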
Quick start
import eregex
re = eregex.Regex(r"(\w+)\s+(\w+)")
m = re.find("hello world")
m.matched # 'hello world'
m.group(1) # 'hello'
m.group(2) # 'world'
m[1] # 'hello' (Match is sequence-like)
# Flags: pass a bitset of the module-level constants, or parse a string.
eregex.Regex("hello", eregex.IGNORECASE).is_match("HELLO") # True
eregex.Regex("hello", eregex.parse_flags("i")).is_match("HELLO") # True
# Repeated captures (signature mrab-regex feature).
eregex.Regex(r"(\w)+").find("abc").captures(1) # ['a', 'b', 'c']
# Replace with named groups.
eregex.Regex(r"(?P<a>\d)(?P<b>\d)").replace_all("12 34", "${b}${a}") # '21 43'
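For readers coming from the standard library, it may help to see the same two patterns run through Python's built-in re module. This is a comparison sketch using re only, not eregex: re writes named-group references in replacement strings as \g<name> instead of ${name}, and it keeps only the last repetition of a repeated group.

```python
import re

# Stdlib stand-in for the quick-start patterns; this is `re`, not eregex.
pattern = re.compile(r"(?P<a>\d)(?P<b>\d)")

# `re` spells named-group references as \g<name> rather than ${name}.
swapped = pattern.sub(r"\g<b>\g<a>", "12 34")

# `re` keeps only the last repetition of a repeated group, so there is
# no stdlib counterpart to captures(1) returning every repetition.
last_only = re.search(r"(\w)+", "abc").group(1)

print(swapped)    # 21 43
print(last_only)  # c
```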
Regex
class Regex:
    def __init__(self, pattern: str, flags: int = 0): ...
    @property
    def pattern(self) -> str: ...
    @property
    def flags(self) -> int: ...  # resolved (UNICODE + VERSION1 added)
    @property
    def capture_count(self) -> int: ...  # excluding group 0
    def group_names(self) -> list[str]: ...
    def group_index(self, name: str) -> int | None: ...
    def is_match(self, haystack: str) -> bool: ...
    def find(self, haystack: str) -> Match | None: ...
    def find_at(self, haystack: str, start: int) -> Match | None: ...
    def match_at_start(self, haystack: str) -> Match | None: ...  # re.match
    def fullmatch(self, haystack: str) -> Match | None: ...
    def findall(self, haystack: str) -> list[Match]: ...
    def find_partial(self, haystack: str) -> PartialMatch | None: ...
    def replace(self, haystack: str, repl: str) -> str: ...
    def replace_all(self, haystack: str, repl: str) -> str: ...
    def split(self, haystack: str) -> list[str]: ...
    def dump(self) -> str: ...  # AST debug aid
flags is a bitwise OR of the module-level constants: IGNORECASE,
MULTILINE, DOTALL, UNICODE, ASCII, VERBOSE, FULLCASE, WORD,
LOCALE, VERSION0, VERSION1. parse_flags("ims") parses a flag string
for re-familiar ergonomics.
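The relationship between a flag string and a bitset can be sketched with the standard library's own constants. The code below is not eregex's implementation of parse_flags; the table _FLAG_LETTERS and the function parse_flag_letters are hypothetical names, and the letter set is an assumption modeled on re's inline flags.

```python
import re

# Hypothetical letter-to-flag table built on the stdlib `re` constants.
# eregex's own constants and accepted letters may differ.
_FLAG_LETTERS = {
    "i": re.IGNORECASE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
    "x": re.VERBOSE,
    "a": re.ASCII,
    "u": re.UNICODE,
}


def parse_flag_letters(flag_str: str) -> int:
    """OR together the flag bit for each letter in flag_str."""
    flags = 0
    for letter in flag_str:
        if letter not in _FLAG_LETTERS:
            raise ValueError(f"unknown flag letter: {letter!r}")
        flags |= _FLAG_LETTERS[letter]
    return int(flags)


# Parsing "ims" yields the same bitset as OR-ing the constants by hand.
combined = parse_flag_letters("ims")
print(combined == int(re.IGNORECASE | re.MULTILINE | re.DOTALL))
```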
Match
Match is sequence-like: len(m) is the number of groups (group 0 first),
and m[i] / m["name"] look up by index / name.
class Match:
    @property
    def matched(self) -> str: ...  # whole match (group 0)
    @property
    def group0(self) -> str: ...  # alias of matched
    @property
    def input(self) -> str: ...
    @property
    def start(self) -> int: ...  # byte offset
    @property
    def end(self) -> int: ...
    @property
    def span(self) -> tuple[int, int]: ...
    @property
    def capture_count(self) -> int: ...
    @property
    def groups(self) -> list[str | None]: ...
    @property
    def named_groups(self) -> dict[str, str]: ...
    @property
    def all_captures(self) -> list[list[str | None]]: ...
    @property
    def captures_dict(self) -> dict[str, list[str | None]]: ...
    def group(self, *indices_or_names) -> ...: ...  # re.match.group semantics
    def captures(self, index: int) -> list[str | None]: ...
    def captures_by_name(self, name: str) -> list[str | None]: ...
    def span_of(self, index: int = 0) -> tuple[int, int] | None: ...
    def start_of(self, index: int = 0) -> int: ...
    def end_of(self, index: int = 0) -> int: ...
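All offsets are byte offsets into the UTF-8 encoding of the input, matching the
Rust core. Python's re reports code-point offsets for str input, so the two can
differ once the input contains non-ASCII characters. None is returned for groups
that did not participate.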
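Because Python str objects are indexed by code point, a UTF-8 byte offset has to be converted before it can be used to slice the original string. A minimal sketch of that conversion follows; the helper name byte_to_char_offset is hypothetical and not part of eregex.

```python
def byte_to_char_offset(text: str, byte_offset: int) -> int:
    """Convert an offset into text's UTF-8 bytes to a str index."""
    encoded = text.encode("utf-8")
    if not 0 <= byte_offset <= len(encoded):
        raise ValueError(f"byte offset {byte_offset} is out of range")
    # Decoding the prefix counts the code points that precede the offset.
    # It raises UnicodeDecodeError if the offset splits a multi-byte character.
    return len(encoded[:byte_offset].decode("utf-8"))


text = "héllo wörld"
# "wörld" occupies bytes 7..13 of the UTF-8 encoding ("é" and "ö" take 2 bytes each).
start = byte_to_char_offset(text, 7)
end = byte_to_char_offset(text, 13)
print(text[start:end])  # wörld
```

For long inputs with many matches, re-encoding per lookup is wasteful; encoding once and reusing the bytes would be the obvious refinement.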
PartialMatch
find_partial is end-anchored: the match must consume the input to its end.
class PartialMatch:
    @property
    def status(self) -> str: ...  # "full" | "partial"
    @property
    def is_full(self) -> bool: ...
    @property
    def is_partial(self) -> bool: ...
    @property
    def matched(self) -> str: ...
    @property
    def start(self) -> int: ...
    @property
    def end(self) -> int: ...
    @property
    def capture_count(self) -> int: ...
    def group(self, index: int = 0) -> str | None: ...
    def named_group(self, name: str) -> str | None: ...
    def group_state(self, index: int = 0) -> str: ...  # "matched" | "partial" | "none"
- None from find_partial → the input cannot be a prefix of any match.
- status == "partial" → the input is a valid prefix of some full match (more input could complete it).
re = eregex.Regex(r"token=([a-z]+)([0-9]+)")
p = re.find_partial("x token=abc")
p.is_partial # True
p.group(1) # 'abc'
p.group_state(1) # 'matched'
p.group_state(2) # 'partial' (entered but not completed)
re.find_partial("x token=abc!") # None — '!' rules out any continuation
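The three outcomes described above (None, partial, full) map naturally onto a validate-as-you-type decision. The sketch below relies only on the documented is_full attribute; the function name classify_partial and its return labels are hypothetical, and the SimpleNamespace objects are stand-ins so the snippet runs without eregex installed.

```python
from types import SimpleNamespace
from typing import Any, Optional


def classify_partial(result: Optional[Any]) -> str:
    """Map a find_partial-style result onto a three-way decision."""
    if result is None:
        return "reject"      # no continuation of the input can match
    if result.is_full:
        return "complete"    # the input already forms a full match
    return "incomplete"      # valid prefix; more input could complete it


# Stand-ins that mimic the documented PartialMatch attributes.
partial_like = SimpleNamespace(is_full=False, is_partial=True)
full_like = SimpleNamespace(is_full=True, is_partial=False)

print(classify_partial(None))
print(classify_partial(partial_like))
print(classify_partial(full_like))
```

With the real bindings, the argument would be the return value of Regex.find_partial on the text entered so far.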
Module-level helpers
escape(s: str) -> str
escape_special_only(s: str) -> str
escape_literal_spaces(s: str) -> str
is_match(pattern: str, haystack: str) -> bool # compiles pattern once
compile(pattern: str, flags: int = 0) -> Regex
parse_flags(flag_str: str) -> int
Testing
. .venv/bin/activate
maturin develop --release
python -m unittest test_eregex -v
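For a quick sanity check outside the repository's own test_eregex suite, a small smoke test can reuse the quick-start examples. This is a hypothetical sketch, not a file shipped with the project; the class and function names are assumptions, and the tests are skipped when eregex is not installed.

```python
import importlib.util
import unittest

# Skip everything when the extension has not been built into this environment.
HAVE_EREGEX = importlib.util.find_spec("eregex") is not None


@unittest.skipUnless(HAVE_EREGEX, "eregex is not installed in this environment")
class QuickStartSmokeTest(unittest.TestCase):
    def test_groups(self):
        import eregex

        m = eregex.Regex(r"(\w+)\s+(\w+)").find("hello world")
        self.assertEqual(m.group(1), "hello")
        self.assertEqual(m.group(2), "world")

    def test_replace_all(self):
        import eregex

        regex = eregex.Regex(r"(?P<a>\d)(?P<b>\d)")
        self.assertEqual(regex.replace_all("12 34", "${b}${a}"), "21 43")


def run_smoke_tests() -> unittest.TestResult:
    """Load and run the smoke tests, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(QuickStartSmokeTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_smoke_tests()
```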
Layout
This is one half of eregex's binding story. The same Rust core (eregex)
also ships Node.js bindings via napi-rs. See the project root for the core
crate and its feature matrix.
License
Apache-2.0, matching the upstream mrab-regex project.
Download files
Built Distributions
File details
Details for the file eregex-0.1.3-cp39-abi3-win_amd64.whl.
File metadata
- Download URL: eregex-0.1.3-cp39-abi3-win_amd64.whl
- Upload date:
- Size: 229.6 kB
- Tags: CPython 3.9+, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a1688e91da6668955dbe49fba1ec54ed2757c0936829088273d5ed05b5016632 |
| MD5 | 8a5d0ab49aa782177935ce8f270f424b |
| BLAKE2b-256 | 1c13b6ce607c5d5653ad5c4b49072a7697b1dc696b887b47a0ae5348791fcac0 |
Provenance
The following attestation bundles were made for eregex-0.1.3-cp39-abi3-win_amd64.whl:
Publisher: release.yml on a5i/eregex
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: eregex-0.1.3-cp39-abi3-win_amd64.whl
- Subject digest: a1688e91da6668955dbe49fba1ec54ed2757c0936829088273d5ed05b5016632
- Sigstore transparency entry: 1903053707
- Sigstore integration time:
- Permalink: a5i/eregex@c12b9bfe21330948f5253ae717deb30ee94b8a56
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/a5i
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c12b9bfe21330948f5253ae717deb30ee94b8a56
- Trigger Event: push
File details
Details for the file eregex-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: eregex-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 350.3 kB
- Tags: CPython 3.9+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f9fb2401ac618fe9fb95b2861426ef73354fdee950c5f6b9da5bc704553f0765 |
| MD5 | 0a108fa749dd2fa1d1365646cc3b75ad |
| BLAKE2b-256 | 93b720ae0b93257493c78f1beba29935d9084ed5744946d9639f6496b139fe5d |
Provenance
The following attestation bundles were made for eregex-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:
Publisher: release.yml on a5i/eregex
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: eregex-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Subject digest: f9fb2401ac618fe9fb95b2861426ef73354fdee950c5f6b9da5bc704553f0765
- Sigstore transparency entry: 1903053928
- Sigstore integration time:
- Permalink: a5i/eregex@c12b9bfe21330948f5253ae717deb30ee94b8a56
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/a5i
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c12b9bfe21330948f5253ae717deb30ee94b8a56
- Trigger Event: push
File details
Details for the file eregex-0.1.3-cp39-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: eregex-0.1.3-cp39-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 323.3 kB
- Tags: CPython 3.9+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f120d628c09cb464031bd8d5b9dcdf913ab9d54a31ae5acaba779d27416b9104 |
| MD5 | 20800e855b8b06215bd71503d9b92f8b |
| BLAKE2b-256 | 8ed7f67a432d07e5a658d4c71037bb39d44f3f91974979a7666a64e655705385 |
Provenance
The following attestation bundles were made for eregex-0.1.3-cp39-abi3-macosx_11_0_arm64.whl:
Publisher: release.yml on a5i/eregex
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: eregex-0.1.3-cp39-abi3-macosx_11_0_arm64.whl
- Subject digest: f120d628c09cb464031bd8d5b9dcdf913ab9d54a31ae5acaba779d27416b9104
- Sigstore transparency entry: 1903053822
- Sigstore integration time:
- Permalink: a5i/eregex@c12b9bfe21330948f5253ae717deb30ee94b8a56
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/a5i
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c12b9bfe21330948f5253ae717deb30ee94b8a56
- Trigger Event: push