A simple library for finding dates in any txt, pdf, docx, or rtf source. For documentation see
Project description
Simple to use: date and text parsing from PDF, RTF, and images (using a callback function).
This is a simple package provided by Marvsai healthcare LTD. It finds dates written in any regular format in a str and returns them as Python datetime objects.
Easy-to-use method
def find_dates(file_contents: str):
    """
    Find any dates in a large Python string, usually taken from a file or PDF.
    Args:
        file_contents (str): The string in which to find dates of any format.
    Returns:
        List[datetime.datetime]: A list of datetime objects; the latest can be found using max().
    """
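To illustrate the kind of behaviour described by the docstring above, here is a minimal sketch using only the standard library. It is not the package's implementation: the function name `find_dates_sketch` and the three pattern/format pairs are assumptions chosen for illustration, and the real `find_dates` may recognise many more formats.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical pattern/format pairs, chosen only for this sketch.
_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),        # 2021-03-04
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "%d/%m/%Y"),    # 15/06/2022 (day first)
    (re.compile(r"\b\d{1,2} [A-Za-z]+ \d{4}\b"), "%d %B %Y"),  # 7 July 2022
]


def find_dates_sketch(file_contents: str) -> List[datetime]:
    """Return every substring that parses as a date under one of the formats above."""
    found: List[datetime] = []
    for pattern, fmt in _PATTERNS:
        for match in pattern.finditer(file_contents):
            try:
                found.append(datetime.strptime(match.group(0), fmt))
            except ValueError:
                # Looked like a date but is not valid (e.g. 31/02/2020); skip it.
                continue
    return found


text = "Admitted 2021-03-04, reviewed 15/06/2022, discharged 7 July 2022."
dates = find_dates_sketch(text)
latest = max(dates)  # the most recent date in the text
```

As the docstring suggests, `max()` on the returned list gives the latest date. Guard against an empty list (for example with `max(dates, default=None)`) when the text may contain no dates.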
Optimised replacement of multiple strings in a string
def replace_multiple_strings(input_string, replacements_dict):
    """
    Replace multiple strings in the input string using a dictionary of replacement pairs.
    Args:
        input_string (str): The string in which to replace the substrings.
        replacements_dict (dict): A dictionary of replacement pairs, where the keys are the substrings to be replaced and the values are the replacement strings.
    Returns:
        str: The input string with all instances of the substrings replaced with their corresponding replacement strings.
    """
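One common way to make multi-string replacement efficient is to compile all keys into a single regular expression and substitute in one pass. The sketch below shows that technique; it is an assumption about how such a function could work, not the package's actual code, and the name `replace_multiple_strings_sketch` is hypothetical.

```python
import re
from typing import Dict


def replace_multiple_strings_sketch(input_string: str, replacements_dict: Dict[str, str]) -> str:
    """Replace every key of replacements_dict found in input_string in a single pass."""
    # Ignore empty keys, and try longer keys first so "abc" wins over "ab".
    keys = sorted((k for k in replacements_dict if k), key=len, reverse=True)
    if not keys:
        return input_string
    # re.escape keeps characters such as "." literal instead of regex syntax.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replacements_dict[m.group(0)], input_string)


cleaned = replace_multiple_strings_sketch(
    "Jan. 1st, 2020", {"Jan.": "January", "1st": "1"}
)
```

Because the input is scanned once, replacement text is never re-scanned, so a pair such as `{"a": "b", "b": "a"}` swaps the two letters rather than collapsing them.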
Easy to use extraction of text from PDF or RTF files:
def extract_rtf_pdf(name: str, get_ai_text: Callable = None) -> str:
    """
    Extract text from a PDF or RTF file.
    Args:
        name (str): The name of the file from which to extract text.
        get_ai_text: Callback function that can call the Google Vision API, or the AWS or Azure equivalents, for text extraction. Called for images and image PDFs.
    Returns:
        str: The text extracted from the file.
    """
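The callback pattern described above can be sketched as follows. This is only an illustration of the dispatch idea, not the package's implementation: the function name `extract_text_sketch`, the list of image suffixes, and the assumption that the callback receives the file name and returns a string are all hypothetical. Real PDF and RTF parsing needs a third-party library and is left out.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical set of suffixes treated as images in this sketch.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def extract_text_sketch(name: str, get_ai_text: Optional[Callable[[str], str]] = None) -> str:
    """Return text for a file, delegating images to an optional OCR callback."""
    suffix = Path(name).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        # Assumption: the callback takes the file name and returns the recognised text.
        return get_ai_text(name) if get_ai_text is not None else ""
    if suffix == ".txt":
        return Path(name).read_text(encoding="utf-8", errors="replace")
    # PDF and RTF handling is deliberately omitted from this sketch.
    raise ValueError(f"unsupported file type in this sketch: {suffix!r}")


def fake_ocr(path: str) -> str:
    """Stand-in for a call to a cloud OCR service; returns canned text."""
    return f"text recognised in {path}"


result = extract_text_sketch("scan.png", get_ai_text=fake_ocr)
```

Keeping the OCR step behind a callback means the extraction code does not depend on any particular cloud provider; the caller supplies whichever service it uses.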
File details
Details for the file auto_find_date_pdf-0.2.22.tar.gz.
File metadata
- Download URL: auto_find_date_pdf-0.2.22.tar.gz
- Upload date:
- Size: 5.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2e1b3915759eea4313d344ff76ac77b1ea817529d5ca09e112374721d682e536
MD5 | c5e9da1f5e0869a54cc95339f23d6db4
BLAKE2b-256 | 067cb8cd0473fa210b2dc15fd423292ec9f4b7fa1fe12f6a0dc82cf4cc9b0f40
File details
Details for the file auto_find_date_pdf-0.2.22-py3-none-any.whl.
File metadata
- Download URL: auto_find_date_pdf-0.2.22-py3-none-any.whl
- Upload date:
- Size: 5.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | c9e0de9e4cc56582e475f715467f73723f5e00539c995392bc5f1869d325967a
MD5 | caa82a09e705ede3466c00f06eee6ab3
BLAKE2b-256 | e4268b0da77e94bf38e511571e381d36f0a112169ec27ca84594280cd93d8b45