A library which extends regex with support for datetime format codes.
datetime-matcher
datetime-matcher is a Python module that provides an extension of regex for matching, extracting, and reformatting stringified datetimes.
Most notably, it provides a function which essentially combines the re.sub, datetime.strptime, and datetime.strftime standard library functions and does all the complicated parsing and wiring for you.
It's mighty useful for doing things like bulk-renaming files with datetimes in their filenames. But don't let us tell you what it's good for—give it a try yourself!
Installation
Get it from PyPI now by running:
pip install datetime-matcher
Example of String Substitution with Datetime Reformatting
Let's say we have several filenames of the following format that we want to rename:
'MyLovelyPicture_2020-Mar-10.jpeg'
We want to change them to look like this string:
'20200310-MyLovelyPicture.jpg'
The Unclean Way to Do It, without datetime-matcher
Using the standard library re.sub, we run into an issue:
import re
text = 'MyLovelyPicture_2020-Mar-10.jpeg'
search = r'(\w+)_([0-9]{4}-\w{3}-[0-9]{2})\.jpe?g' # ❌ messy
replace = r'(??????)-\1.jpg' # ❌ what do we put for ??????
result = re.sub(search, replace, text) # ❌ This doesn't work
We have to manually run datetime.strptime with a custom parser string to extract the date, and then manually insert it back into the replacement string before running a non-generic search-and-replace using the customized replacement string.
Yuck.
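For reference, the manual approach could look roughly like the sketch below, which uses only the standard library. It is one possible way to do it: instead of building a customized replacement string, it passes a callback to re.sub. The helper name reformat_match is made up for illustration.

```python
import re
from datetime import datetime

text = 'MyLovelyPicture_2020-Mar-10.jpeg'
search = re.compile(r'(\w+)_([0-9]{4}-\w{3}-[0-9]{2})\.jpe?g')


def reformat_match(match: re.Match) -> str:
    """Hypothetical helper: parse the captured date and rebuild the filename."""
    # Parse the captured date text with a hand-written strptime format.
    parsed = datetime.strptime(match.group(2), '%Y-%b-%d')
    # Re-emit the date in the target format and stitch the name back together.
    return f"{parsed.strftime('%Y%m%d')}-{match.group(1)}.jpg"


result = search.sub(reformat_match, text)
print(result)
```

Note that the date format is written twice here, once as a regex and once as a strptime format, and the two have to be kept in sync by hand.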
The Clean Way to Do It, with datetime-matcher
We can do the following for a quick and easy substitution with reformatting.
from datetime_matcher import DatetimeMatcher
dtmatcher = DatetimeMatcher()
text = 'MyLovelyPicture_2020-Mar-10.jpeg'
search = r'(\w+)_%Y-%b-%d\.jpe?g' # ✅
replace = r'%Y%m%d-\1.jpg' # ✅
result = dtmatcher.sub(search, replace, text) # ✅
# result == '20200310-MyLovelyPicture.jpg' # ✅
Features
The library features a class DatetimeMatcher which provides the following public-facing methods:
sub
def sub(self, search_dfregex: str, replacement: str, text: str, count: int = 0) -> str
- Replace the matching instances of the search dfregex in the given text with the replacement regex, intelligently transferring the matching date from the original text to the replaced text for each regex match.
- If no matches are found, the original text is returned.
- Use a non-zero count to limit the number of substitutions.
- Use strftime codes within a dfregex string to extract/place datetimes.
match
def match(self, search_dfregex: str, text: str) -> Optional[Match[AnyStr]]
- Determines whether text matches the given dfregex.
- Returns the corresponding match object if found, otherwise returns None.
- Use strftime codes within a dfregex string to extract/place datetimes.
get_regex_from_dfregex
def get_regex_from_dfregex(self, dfregex: str, is_capture_dfs: bool = False) -> str
- Converts a dfregex to its corresponding conventional regex.
- By default, the datetime format groups are NOT captured.
- Use strftime codes within a dfregex string to match datetimes.
extract_datetimes
def extract_datetimes(self, dfregex: str, text: str, count: int = 0) -> Iterable[datetime]
- Extracts the leftmost datetimes from text given a dfregex string.
- Returns an Iterable of datetime objects.
- Use a non-zero count to limit the number of extractions.
- Use strftime codes within a dfregex string to match datetimes.
extract_datetime
def extract_datetime(self, dfregex: str, text: str) -> Optional[datetime]
- Extracts the leftmost datetime from text given a dfregex string.
- Returns the matching datetime object if found, otherwise returns None.
- Use strftime codes within a dfregex string to match datetimes.
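To make the extraction semantics concrete, here is a stdlib-only sketch of what "extract the leftmost datetimes, optionally limited by count" amounts to. It is not the library's implementation: the function name extract_datetimes_sketch is hypothetical, and it takes a conventional regex plus a separate strptime format, whereas a dfregex expresses both in a single string.

```python
import re
from datetime import datetime
from typing import Iterator


def extract_datetimes_sketch(regex: str, fmt: str, text: str,
                             count: int = 0) -> Iterator[datetime]:
    """Hypothetical stand-in: yield datetimes parsed from matches, leftmost first.

    `regex` must capture the date text in group 1, and `fmt` is the strptime
    format describing that text. A count of 0 means no limit.
    """
    for found, match in enumerate(re.finditer(regex, text), start=1):
        yield datetime.strptime(match.group(1), fmt)
        # Stop early once the requested number of datetimes has been produced.
        if count and found >= count:
            break


text = 'a_2020-03-10.txt b_2021-12-25.txt'
date_regex = r'_(\d{4}-\d{2}-\d{2})\.'

all_dates = list(extract_datetimes_sketch(date_regex, '%Y-%m-%d', text))
first_only = list(extract_datetimes_sketch(date_regex, '%Y-%m-%d', text, count=1))
print(all_dates)
print(first_only)
```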
dfregex Syntax
The syntax for dfregex is nearly identical to that of conventional Python regex. There is only one addition and one alteration to support datetime format codes.
The Datetime Format Codes
The percentage character indicates the beginning of a datetime format code. These codes are the standard C-style ones used in the built-in datetime module for strftime.
For a full list of codes, see the Python docs.
NOTE: The following codes are currently not supported: %Z, %c, %x, %X
The Percent Literal (%)
The percentage literal in conventional regex (%) must be escaped in dfregex (\%) because an unescaped one marks the beginning of a datetime format code and otherwise would be ambiguous.
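The two syntax rules can be illustrated with a small stdlib-only sketch that translates a dfregex-like string into a conventional regex. This is an illustration under stated assumptions, not the library's implementation: the function dfregex_to_regex_sketch, its capture parameter, and the tiny CODE_PATTERNS table are all hypothetical, and the real library supports many more codes and may use different patterns.

```python
import re

# Hypothetical, deliberately small translation table (an assumption for
# illustration; not taken from the library).
CODE_PATTERNS = {
    'Y': r'[0-9]{4}',
    'm': r'0[1-9]|1[0-2]',
    'd': r'0[1-9]|[12][0-9]|3[01]',
    'b': r'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec',
}


def dfregex_to_regex_sketch(dfregex: str, capture: bool = False) -> str:
    """Expand format codes into regex groups; unescape an escaped percent sign."""
    def expand(match: re.Match) -> str:
        # First alternative: a backslash-escaped percent becomes a plain '%'.
        if match.group(1) is not None:
            return '%'
        # Second alternative: an unescaped percent starts a format code.
        code = match.group(2)
        if code not in CODE_PATTERNS:
            raise ValueError(f'unsupported format code: %{code}')
        opener = '(' if capture else '(?:'
        return f'{opener}{CODE_PATTERNS[code]})'

    return re.sub(r'\\(%)|%([A-Za-z])', expand, dfregex)


regex = dfregex_to_regex_sketch(r'(\w+)_%Y-%b-%d\.jpe?g')
print(regex)
print(bool(re.fullmatch(regex, 'MyLovelyPicture_2020-Mar-10.jpeg')))
print(dfregex_to_regex_sketch(r'100\%'))
```

The sketch ignores edge cases such as a literal backslash directly before a format code; it is only meant to show why an unescaped percent sign cannot also stand for itself.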
Hashes for datetime_matcher-0.1.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | de247fb7d0590fe26ea6be22b95f456e760c5793cf6e74fc8fba86079097fd6a
MD5 | 09cddbf8a8fb6c8d56e67cc9ed6a3c49
BLAKE2b-256 | e5c5244f84eb0264e8d822e8d1c816695beecbe74b90bd74649ee97ffa7120f8