A simple library for finding dates in any txt, pdf, docx or rtf source. For documentation see
Project description
Simple to use: date and text parsing from PDF, RTF and image files (images are handled through a callback function).
This is a simple package provided by Marvsai healthcare LTD. It finds regular dates of any format in a str and returns them as Python datetime objects.
Easy-to-use date finder:
def find_dates(file_contents: str):
    """
    Find all dates in a large Python string, usually read from a text file or PDF.
    Args:
        file_contents (str): The string to search for dates in any format.
    Returns:
        List[datetime.datetime]: A list of datetime objects; the latest can be found using max().
    """
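The "list of datetime objects, then max()" pattern described above can be illustrated with a small stand-in. This is only a sketch: `find_dates_sketch` is a hypothetical helper, not the package's implementation, and it recognises just two formats (ISO `YYYY-MM-DD` and `DD/MM/YYYY`) so that the example stays self-contained.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical stand-in, NOT the package's code: only two date formats
# are recognised here, purely to demonstrate the return shape.
_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "%d/%m/%Y"),
]


def find_dates_sketch(file_contents: str) -> List[datetime]:
    """Return every recognised date in the text as a datetime object."""
    found = []
    for pattern, fmt in _PATTERNS:
        for match in pattern.finditer(file_contents):
            try:
                found.append(datetime.strptime(match.group(), fmt))
            except ValueError:
                # Looks like a date but is not a valid one, e.g. 31/02/2023.
                pass
    return found


text = "Admitted 03/01/2023, reviewed 2023-02-14, discharged 20/02/2023."
dates = find_dates_sketch(text)
# max() picks the most recent date; guard against an empty list.
latest = max(dates) if dates else None
```

Guarding the `max()` call matters in practice, because `max()` raises `ValueError` on an empty list when a document contains no dates.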
Optimised replacement of multiple substrings in a string:
def replace_multiple_strings(input_string, replacements_dict):
    """
    Replace multiple substrings in the input string using a dictionary of replacement pairs.
    Args:
        input_string (str): The string in which to replace the substrings.
        replacements_dict (dict): A dictionary of replacement pairs, where the keys are the
            substrings to be replaced and the values are the replacement strings.
    Returns:
        str: The input string with all instances of the substrings replaced with their
            corresponding replacement strings.
    """
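One common way to replace many substrings efficiently is a single regular-expression pass over the input. The sketch below shows that technique under stated assumptions: `replace_multiple_strings_sketch` is a hypothetical name, and the package's own function may be implemented differently.

```python
import re
from typing import Dict


def replace_multiple_strings_sketch(
    input_string: str, replacements_dict: Dict[str, str]
) -> str:
    """Replace every key of replacements_dict in one pass over the input."""
    # Ignore empty keys (they would match everywhere) and try longer keys
    # first, so "Jan." wins over "Jan" when both are present.
    keys = sorted((k for k in replacements_dict if k), key=len, reverse=True)
    if not keys:
        return input_string
    # re.escape keeps characters such as "." literal inside the pattern.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replacements_dict[m.group(0)], input_string)


cleaned = replace_multiple_strings_sketch(
    "Seen on 1st Jan. 2023", {"1st": "1", "Jan.": "January"}
)
```

Because the input is scanned once, replacement text is never re-examined, so a pair such as `{"a": "b", "b": "a"}` swaps the two letters instead of collapsing them, which chained `str.replace` calls would do.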
Easy-to-use extraction of text from PDF or RTF files:
def extract_rtf_pdf(name: str, get_ai_text: Callable = None) -> str:
    """
    Extract text from a PDF or RTF file.
    Args:
        name (str): The name of the file to extract text from.
        get_ai_text: Callback function that can call the Google Vision API, or the AWS or Azure
            equivalents, for text extraction. Called for images and image PDFs.
    Returns:
        str: The text extracted from the file.
    """
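The callback hand-off described above can be sketched as follows. The callback's exact signature is not stated here, so this example assumes it receives the file name and returns the recognised text. `extract_text_sketch`, `fake_ocr` and `IMAGE_SUFFIXES` are hypothetical names used only for illustration, and no real cloud API is called.

```python
from pathlib import Path
from typing import Callable, Optional

# Assumed set of image extensions; the package's own list is not documented here.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def extract_text_sketch(
    name: str, get_ai_text: Optional[Callable[[str], str]] = None
) -> str:
    """Hypothetical dispatcher that shows only the callback hand-off."""
    if Path(name).suffix.lower() in IMAGE_SUFFIXES:
        if get_ai_text is None:
            return ""  # no OCR callback supplied, so nothing can be extracted
        return get_ai_text(name)
    # Text-based PDF/RTF parsing would happen locally; omitted in this sketch.
    raise NotImplementedError("local PDF/RTF parsing is outside this sketch")


def fake_ocr(name: str) -> str:
    # Stand-in for a request to a cloud OCR service.
    return f"Scanned on 2023-02-14 ({name})"


text = extract_text_sketch("scan.png", get_ai_text=fake_ocr)
```

Passing the OCR step in as a callback keeps cloud credentials and vendor SDKs out of the extraction code, so the same caller can switch providers without changing anything else.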
Hashes for auto_find_date_pdf-0.2.22.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 2e1b3915759eea4313d344ff76ac77b1ea817529d5ca09e112374721d682e536
MD5 | c5e9da1f5e0869a54cc95339f23d6db4
BLAKE2b-256 | 067cb8cd0473fa210b2dc15fd423292ec9f4b7fa1fe12f6a0dc82cf4cc9b0f40
Hashes for auto_find_date_pdf-0.2.22-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | c9e0de9e4cc56582e475f715467f73723f5e00539c995392bc5f1869d325967a
MD5 | caa82a09e705ede3466c00f06eee6ab3
BLAKE2b-256 | e4268b0da77e94bf38e511571e381d36f0a112169ec27ca84594280cd93d8b45