Trie-ized regular expressions in Python
Project description
About
triegex is a library that builds a compact, trie-structured regular expression from a list of words.
Installation
pip install git+https://github.com/ZhukovAlexander/triegex.git
Example usage
>>> import triegex
>>>
>>> t = triegex.Triegex('foo', 'bar', 'baz')
>>>
>>> t.to_regex() # build regular expression
'(?:ba(?:r\\b|z\\b)|foo\\b|~^(?#match nothing))'
>>>
>>> t.add('spam')
>>>
>>> 'spam' in t # check whether the word has been added
True
>>>
>>> import re
>>> re.findall(t.to_regex(), 'spam & eggs')
['spam']
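To illustrate the idea behind a pattern like the one above, here is a minimal, self-contained sketch of how a word list can be folded into a trie and then rendered as a regular expression. It is not triegex's actual implementation: the function names (`build_trie`, `trie_to_regex`, `words_to_regex`) and the nested-dict trie are assumptions made for this example, and unlike the output shown above it does not append a "match nothing" branch.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the key '' marks the end of a word.

    Hypothetical helper for illustration, not part of triegex.
    """
    root = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node[''] = {}  # end-of-word marker
    return root


def trie_to_regex(node):
    """Recursively render a trie node as a regex fragment."""
    # One alternative per child character, in sorted order.
    alternatives = [
        re.escape(char) + trie_to_regex(child)
        for char, child in sorted(node.items())
        if char != ''
    ]
    # If a word ends at this node, allow a word boundary here.
    # It goes last so that longer words are tried first.
    if '' in node:
        alternatives.append(r'\b')
    if len(alternatives) == 1:
        return alternatives[0]
    return '(?:' + '|'.join(alternatives) + ')'


def words_to_regex(words):
    """Return a regex string that matches any of the given words."""
    words = list(words)
    if not words:
        return '(?!)'  # empty word list: a pattern that never matches
    return trie_to_regex(build_trie(words))


pattern = words_to_regex(['foo', 'bar', 'baz'])
print(pattern)
print(re.findall(pattern, 'foo bar bazaar baz'))
```

Because 'bar' and 'baz' share the prefix 'ba', the sketch emits that prefix once and branches only on the final letter, which is what keeps the pattern compact for large word lists.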
Why?
The library was inspired by a need to match a list of valid IANA top-level domain names (which is pretty big).
Also, it’s fun.
triegex was influenced by these projects: frak, regex-trie and Regexp-Trie.
Source Distribution: triegex-0.0.2.tar.gz (5.0 kB)