Transform trie to regular expression
Efficient keyword extraction with regex
This package provides a function that efficiently represents a set of keywords as a single regex. The regex can be used to replace keywords in sentences or to extract keywords from them.
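To illustrate the idea of turning a trie into a regex, here is a minimal sketch using only the standard library. It is not the package's actual implementation: the helper names `build_trie`, `trie_to_regex`, and `compile_keywords` are hypothetical, and the word-boundary anchors are an assumption about how keywords should be delimited.

```python
import re


def build_trie(words):
    """Store words in a nested-dict trie; the key '' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node[''] = {}
    return root


def trie_to_regex(node):
    """Recursively turn a trie node into a regex fragment (hypothetical helper)."""
    is_end = '' in node
    branches = [
        re.escape(char) + trie_to_regex(child)
        for char, child in sorted(node.items())
        if char != ''
    ]
    if not branches:
        return ''
    # Group the alternatives so a shared prefix is written only once.
    body = '(?:' + '|'.join(branches) + ')' if len(branches) > 1 or is_end else branches[0]
    # If a word ends here, everything that follows is optional.
    return body + '?' if is_end else body


def compile_keywords(words):
    """Compile keywords into one pattern; \\b anchors are an assumption."""
    return re.compile(r'\b(?:' + trie_to_regex(build_trie(words)) + r')\b')


fragment = trie_to_regex(build_trie(['baby', 'bat', 'bad']))
hits = compile_keywords(['baby', 'bat', 'bad']).findall(
    'The baby was scared by the bad bat.'
)
```

In this sketch the three keywords collapse into the fragment `ba(?:by|d|t)`, so the shared prefix `ba` is matched once instead of once per keyword.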
Why use tregex?
- Pure Python, no other dependencies
- trex is fast, about 300 times faster than a regex union, and about 2.5 times faster than FlashText
- Plays well with others, can be integrated easily with pandas
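For context, the "regex union" baseline mentioned above is usually written as one alternative per keyword. The sketch below shows such a union next to a hand-factored pattern that shares the common prefix; both are plain `re` patterns written for illustration, not output of the package, and no timings are claimed here.

```python
import re

keywords = ['baby', 'bat', 'bad']
text = 'The baby was scared by the bad bat.'

# Naive regex union: every keyword is a separate alternative.
union = re.compile(r'\b(?:' + '|'.join(map(re.escape, keywords)) + r')\b')

# Hand-factored equivalent that writes the shared prefix "ba" only once.
factored = re.compile(r'\bba(?:by|d|t)\b')

union_hits = union.findall(text)
factored_hits = factored.findall(text)
```

Both patterns find the same keywords. The factored form tries the shared prefix once per position instead of once per alternative, which matters more as the keyword list grows.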
Install trex
Use pip:
pip install tregex
Usage
import tregex as tx
pattern = tx.compile(['baby', 'bat', 'bad'])
hits = pattern.findall('The baby was scared by the bad bat.')
# hits = ['baby', 'bad', 'bat']  (matches are returned in the order they appear in the text)
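The description also mentions replacing keywords. A minimal sketch of that use case follows, assuming the compiled pattern behaves like a standard `re.Pattern` (the source does not confirm this). A hand-written stand-in pattern is used so the example runs without the package installed.

```python
import re

# Stand-in for the pattern returned by tx.compile (an assumption, see above).
pattern = re.compile(r'\bba(?:by|d|t)\b')

sentence = 'The baby was scared by the bad bat.'

# Replace every keyword with a fixed placeholder.
masked = pattern.sub('<KW>', sentence)

# Or compute the replacement per match with a callback.
shouted = pattern.sub(lambda m: m.group(0).upper(), sentence)
```

Here `masked` becomes `The <KW> was scared by the <KW> <KW>.` and `shouted` becomes `The BABY was scared by the BAD BAT.`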
Why the name?
Naming is difficult, but as we had to call it something:
- trex: trie to regex
- trex: Tyrannosaurus rex, a large dinosaur species with small arms (rex meaning "king" in Latin)
Acknowledgments
This project is based on the following resources:
Download files
Source Distributions
No source distribution files available for this release.
Built Distribution
File details
Details for the file tregex-0.0.1-py3-none-any.whl.
File metadata
- Download URL: tregex-0.0.1-py3-none-any.whl
- Upload date:
- Size: 5.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | dd8aecc227c5c79b0995df89ac15cda75eb26d983d2412e2d68da45390889041
MD5 | c21d3d8d97b0e1ac6c7a0dfcd992c415
BLAKE2b-256 | e0e7ea37ca16f3a995823e392662fb515d551e8330b3c26155640b24291dd732