A simple library for finding dates in any txt, pdf, or rtf source. For documentation see
Project description
Simple to use: date and text parsing from PDF, RTF, and image files (images are handled through a callback function).
This is a simple package provided by Marvsai healthcare LTD. It finds regular dates of any format in a str and returns them as Python datetime objects.
Easy-to-use method
def find_dates(file_contents: str):
    """
    Find any dates in a large Python string, usually taken from a file or PDF.

    Args:
        file_contents (str): The string in which to find dates of any format.

    Returns:
        List[datetime.datetime]: A list of datetime objects. The latest can be found using max().
    """
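The contract described above (string in, list of datetime objects out, latest via max()) can be sketched as follows. `find_dates_sketch` is a hypothetical stand-in written for illustration, not the package's implementation: it only recognises two formats (YYYY-MM-DD and DD/MM/YYYY), whereas the package claims to handle any regular format. The sample text is also made up.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical stand-in, NOT the package's implementation: it only knows
# two date formats, just enough to illustrate the call contract.
_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "%d/%m/%Y"),
]


def find_dates_sketch(file_contents: str) -> List[datetime]:
    """Return every recognised date in file_contents as a datetime."""
    found = []
    for pattern, fmt in _PATTERNS:
        for match in pattern.finditer(file_contents):
            try:
                found.append(datetime.strptime(match.group(), fmt))
            except ValueError:
                # Shaped like a date but not a real one (e.g. 31/02/2023).
                continue
    return found


text = "Admitted 03/01/2023, reviewed 2023-02-14, discharged 31/02/2023."
dates = find_dates_sketch(text)

# max() on an empty list raises ValueError, so guard the no-dates case.
latest = max(dates) if dates else None
```

Guarding the empty case matters in practice: a document with no recognisable dates yields an empty list, and calling max() on it directly would raise.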
Optimised replacement of multiple strings in a string
def replace_multiple_strings(input_string, replacements_dict):
    """
    Replace multiple strings in the input string using a dictionary of replacement pairs.

    Args:
        input_string (str): The string in which to replace the substrings.
        replacements_dict (dict): A dictionary of replacement pairs, where the keys are the
            substrings to be replaced and the values are the replacement strings.

    Returns:
        str: The input string with all instances of the substrings replaced with their
            corresponding replacement strings.
    """
Easy to use extraction of text from PDF or RTF files:
def extract_rtf_pdf(name: str, get_ai_text: Callable = None) -> str:
    """
    Extract text from a PDF or RTF file.

    Args:
        name (str): The name of the file from which to extract text.
        get_ai_text (Callable): Callback function that can call the Google Vision API, or the
            AWS or Azure equivalents, for text extraction. Called for images and image PDFs.

    Returns:
        str: The text extracted from the file.
    """
Download files
Hashes for auto_find_date_pdf-0.1.24.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 7e5d10bab9e3ab6c29f505fc5176230685e0378d516c4205b357f7bf36aa0332
MD5 | e6e380ba3c70a1a332fc58a0d034d7d4
BLAKE2b-256 | 2b9dd057828c3018c7151095aa88c3026cfc2093bf5f9584ab19459d9b8f217a
Hashes for auto_find_date_pdf-0.1.24-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 547e9646561463b501bb58ee27e016c8192085157c1f32a801c7aad46134c731
MD5 | c389fad43e54fc57ad4b02c8f165d6e4
BLAKE2b-256 | eff85a8524e5f3ade5a6d647a58eb536d1cca5a1b274b60d97cf69eb6428ddad
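
A downloaded file can be checked against the published SHA256 digest with the standard library alone. This is a minimal sketch: the helper names are hypothetical, and the local file path is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest listed above for auto_find_date_pdf-0.1.24-py3-none-any.whl.
EXPECTED_WHEEL_SHA256 = (
    "547e9646561463b501bb58ee27e016c8192085157c1f32a801c7aad46134c731"
)


def matches_published_hash(path: str,
                           expected: str = EXPECTED_WHEEL_SHA256) -> bool:
    """Compare a local file's digest with the published one."""
    return sha256_of_file(path) == expected.lower()
```

pip can perform the same check at install time when a requirements file pins the digest and is installed with `--require-hashes`.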