Wrapper for getting more functionality out of regex parse results.
tregex
tregex is a wrapper around Python regular expressions that makes them smoother and more user-friendly to use.
Install
pip install tregex-tobiasli
Usage
    import tregex as tr

    # Unnamed groups: each match is returned as a tuple of its groups.
    t = tr.to_tuple(pattern=r'([^;]+?)@(.+?)\.([^;]+)',
                    string='john.smith@somewhere.co.uk; hackzor@coolface.com')
    assert t[0][1] == 'somewhere'
    assert t[1][2] == 'com'

    # Named groups: each match is returned as a dict keyed by group name.
    pattern = r'(?P<name>[^;]+?)@(?P<address>.+?)\.(?P<domain>[^;]+)'
    t = tr.to_dict(pattern=pattern,
                   string='john.smith@somewhere.co.uk; hackzor@coolface.com')
    assert t[0]['name'] == 'john.smith'
    assert t[1]['address'] == 'coolface'

    # The same matches as objects, with one attribute per named group.
    t = tr.to_object(pattern=pattern,
                     string='john.smith@somewhere.co.uk; hackzor@coolface.com')
    assert t[0].name == 'john.smith'
    assert t[1].address == 'coolface'
The pattern passed to the functions above can be either a string or a compiled regular expression. TregexCompiled is a class that simply holds a compiled regex and offers the same functions as methods. If a long pattern is reused many times, compiling it once this way can speed things up.
    from tregex import TregexCompiled

    pattern = r'(?P<name>[^;]+?)@(?P<address>.+?)\.(?P<domain>[^;]+)'
    trc = TregexCompiled(pattern)  # compile once, reuse for every call
    t = trc.to_object('john.smith@somewhere.co.uk; hackzor@coolface.com')
    assert t[0].name == 'john.smith'
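For comparison, here is a rough standard-library sketch of what tuple and dict results like these correspond to. It is not tregex's actual implementation. The names to_tuple_sketch and to_dict_sketch are hypothetical, and the only assumption is that each match maps to its groups.

```python
import re
from typing import Dict, List, Optional, Tuple, Union

PatternLike = Union[str, re.Pattern]


def to_tuple_sketch(pattern: PatternLike, string: str) -> List[Tuple[Optional[str], ...]]:
    """Hypothetical helper: one tuple of groups per match."""
    # re.compile accepts a string or an already compiled pattern.
    compiled = re.compile(pattern)
    return [m.groups() for m in compiled.finditer(string)]


def to_dict_sketch(pattern: PatternLike, string: str) -> List[Dict[str, Optional[str]]]:
    """Hypothetical helper: one dict of named groups per match."""
    compiled = re.compile(pattern)
    return [m.groupdict() for m in compiled.finditer(string)]


emails = 'john.smith@somewhere.co.uk; hackzor@coolface.com'
named = r'(?P<name>[^;]+?)@(?P<address>.+?)\.(?P<domain>[^;]+)'

rows = to_dict_sketch(named, emails)
# Passing a compiled pattern gives the same result as passing the string.
compiled_rows = to_dict_sketch(re.compile(named), emails)
tuples = to_tuple_sketch(r'([^;]+?)@(.+?)\.([^;]+)', emails)
```

Accepting either form through re.compile keeps the helpers short, since compiling an already compiled pattern simply returns it.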
tregex also contains some functions for simple fuzzy text matching using difflib.SequenceMatcher:
    from tregex import find_best

    places_in_wales = ['Llanaber', 'Llanaelhaearn', 'Llanbedr', 'Llandbedrgoch',
                       'Llanbedrog', 'Llanberis', 'Llandanwg', 'Llanegryn',
                       'Llandegwning', 'Llandeiniolen', 'Llandwrog']
    best = find_best('Llanberris', places_in_wales)
    assert best == 'Llanberis'
The other functions are find, find_scores (which returns the matched scores along with the candidates) and similarity (which returns the score between a single pair of strings).
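The fuzzy matching described above can be approximated directly with difflib. The sketch below is an illustration, not tregex's own code. The names ending in _sketch are hypothetical, and the return shapes (a float score, a list of candidate and score pairs sorted best first) are assumptions.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Tuple


def similarity_sketch(a: str, b: str) -> float:
    """Score a single pair of strings in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def find_scores_sketch(query: str, candidates: Iterable[str]) -> List[Tuple[str, float]]:
    """Pair every candidate with its score, best match first (assumed shape)."""
    scored = [(c, similarity_sketch(query, c)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def find_best_sketch(query: str, candidates: Iterable[str]) -> str:
    """Return the candidate with the highest similarity to the query."""
    return max(candidates, key=lambda c: similarity_sketch(query, c))


places = ['Llanaber', 'Llanbedr', 'Llanbedrog', 'Llanberis', 'Llanegryn']
best = find_best_sketch('Llanberris', places)
ranked = find_scores_sketch('Llanberris', places)
```

SequenceMatcher.ratio() is 1.0 for identical strings and falls towards 0.0 as the strings share fewer matching characters, so taking the maximum picks the closest candidate.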