Python CLI tool to redact sensitive data
CommonRegex Improved
An improved version of commonly used regular expressions in Python
Inspired by and improved upon CommonRegex
This is a collection of commonly used regular expressions. The library provides a simple API for finding the strings in a text that match each pattern.
Installation
pip install --upgrade commonregex-improved
Usage
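# A hyphen is not valid in a Python module name; the import name is assumed
# to be commonregex_improved, as in the wheel filename.
import commonregex_improved as CommonRegex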
text = "John, please get that article on www.linkedin.com to me by 5:00PM on Jan 9th 2012. 4:00 would be ideal, actually. If you have any questions, You can reach me at (519)-236-2723x341 or get in touch with my associate at harold.smith@gmail.com"
date_list = CommonRegex.dates(text)
# ['Jan 9th 2012']
time_list = CommonRegex.times(text)
# ['5:00PM', '4:00']
url_list = CommonRegex.links(text)
# ['www.linkedin.com', 'harold.smith@gmail.com']
phone_list = CommonRegex.phones_with_exts(text)
# ['(519)-236-2723x341']
email_list = CommonRegex.emails(text)
# ['harold.smith@gmail.com']
identify_all = CommonRegex.find_all(text)
# Note that the patterns can overlap, so find_all may return the same text matched by more than one pattern
# ['Jan 9th 2012', '5:00', '(519)-236-2723', '(519)-236-2723x341', 'harold.smith@gmail.com', 'www.linkedin.com']
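The clash noted for find_all can be illustrated with a minimal sketch that uses only the standard re module. The two patterns below are simplified stand-ins written for this example, not the expressions the library actually uses, and find_all_sketch is a hypothetical helper rather than part of the package.

```python
import re

# Simplified, hypothetical patterns for illustration only;
# they are not the expressions used by commonregex-improved.
PHONE = re.compile(r"\(\d{3}\)-\d{3}-\d{4}")
PHONE_WITH_EXT = re.compile(r"\(\d{3}\)-\d{3}-\d{4}x\d+")


def find_all_sketch(text):
    """Run every pattern over the text and collect all matches."""
    matches = []
    for pattern in (PHONE, PHONE_WITH_EXT):
        matches.extend(pattern.findall(text))
    return matches


sample = "You can reach me at (519)-236-2723x341"
print(find_all_sketch(sample))
# ['(519)-236-2723', '(519)-236-2723x341']
```

Both patterns match the same phone number, once without and once with the extension, which is the kind of duplicate that shows up in the find_all output above.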
⚔️ Performance benchmark
CommonRegex is awesome!
So why re-implement the popular original commonregex project? Calls to each of its regular expressions are slow: a total of 2999 calls to the Dates function takes 12 seconds in the original version of CommonRegex.
The improved version of CommonRegex handles the same number of calls in only 2 seconds.
You can find more detailed results for both the original and improved versions.
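A comparison like this could be timed with the standard timeit module. The sketch below is not the benchmark used for the numbers above; dates_sketch and its pattern are hypothetical stand-ins that only give the timer something to call.

```python
import re
import timeit

# Hypothetical, simplified date pattern used only to have something to time;
# it is not the expression used by either library.
DATE = re.compile(
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
    r"\s\d{1,2}(?:st|nd|rd|th)?\s\d{4}"
)


def dates_sketch(text):
    """Return all substrings that look like 'Jan 9th 2012'."""
    return DATE.findall(text)


text = "Please get that article to me by 5:00PM on Jan 9th 2012."

# Time 2999 calls, matching the call count quoted above.
elapsed = timeit.timeit(lambda: dates_sketch(text), number=2999)
print(f"2999 calls took {elapsed:.4f} seconds")
```

To compare two implementations, swap dates_sketch for each library's function and keep the text and the number of calls the same.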
Supported methods
Download files
Source Distribution
Hashes for commonregex-improved-0.0.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 3f219fcaa699218ab3476b4128ad8736c26979673b0cea45a385a486588355c1
MD5 | 8650acc537547853e8309f6c3921f541
BLAKE2b-256 | 288d0a1cd4e3a96197fbbc46be20da80e40a7bc97136becaed1a5bf965690ca7
Built Distribution
Hashes for commonregex_improved-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 55460e6482da6e0aae55458081914a0d53877bff2418adba0302be526955972c
MD5 | 1c9c8380c4a16d359e8559390b437b64
BLAKE2b-256 | 65225ee9d53c5e2f38d920f2fb4ce58e38898966836aafcf12c7eef1c9e1252d
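A downloaded file can be checked against the published digests with the standard hashlib module. This is a minimal sketch: sha256_of_file and matches_expected are hypothetical helper names, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 listed above for commonregex-improved-0.0.1.tar.gz
EXPECTED = "3f219fcaa699218ab3476b4128ad8736c26979673b0cea45a385a486588355c1"


def matches_expected(path, expected=EXPECTED):
    """Return True when the file's digest equals the published one."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("commonregex-improved-0.0.1.tar.gz") would return True only if the file in the current directory is byte-for-byte the published source distribution.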