A simple library for finding dates in any txt, pdf, docx, or rtf source. For documentation see

Project description

Simple date and text parsing from PDF, RTF, and image files (images are handled through a callback function).

This is a simple package provided by Marvsai healthcare LTD. It finds regular dates of any format in a str and returns them as Python datetime objects.

Easy to use method

def find_dates(file_contents: str):
  """
  Find any dates in a large Python string, usually taken from a file or PDF.

  Args:
      file_contents (str): The string in which to find dates of any format.

  Returns:
      List[datetime.datetime]: A list of datetime objects; the latest can be found using max().
  """
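
The contract described above (string in, list of datetime objects out, latest via max()) can be illustrated with a minimal sketch. This is not the package's implementation: find_dates_sketch and _PATTERNS are hypothetical names, and only three date formats are assumed here, whereas the package claims to handle any regular format.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical stand-in for find_dates; it only recognises three formats.
# Each entry pairs a regex that locates a candidate with its strptime format.
_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),        # 2021-03-15
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "%d/%m/%Y"),    # 03/02/2021 (day first, an assumption)
    (re.compile(r"\b\d{1,2} [A-Za-z]+ \d{4}\b"), "%d %B %Y"),  # 1 April 2021
]


def find_dates_sketch(file_contents: str) -> List[datetime]:
    """Return every date found in file_contents as a datetime object."""
    found = []
    for pattern, fmt in _PATTERNS:
        for match in pattern.finditer(file_contents):
            try:
                found.append(datetime.strptime(match.group(0), fmt))
            except ValueError:
                # The text looked like a date but is not a valid one (e.g. 31/02/2021).
                continue
    return found


text = "Admitted 03/02/2021, reviewed 2021-03-15, discharged 1 April 2021."
dates = find_dates_sketch(text)
latest = max(dates)  # the most recent date in the text
```

Because the result is a plain list of datetime objects, the usual built-ins (max(), min(), sorted()) work on it directly.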

Optimised replacement of multiple strings in a string

def replace_multiple_strings(input_string, replacements_dict):
  """
  Replace multiple strings in the input string using a dictionary of replacement pairs.

  Args:
      input_string (str): The string in which to replace the substrings.
      replacements_dict (dict): A dictionary of replacement pairs, where the keys are the
          substrings to be replaced and the values are the replacement strings.

  Returns:
      str: The input string with all instances of the substrings replaced with their
          corresponding replacement strings.
  """
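
One common way to replace many substrings in a single pass is to compile all keys into one regular expression and look up each match in the dictionary. The sketch below shows that technique under the interface described above; replace_multiple_strings_sketch is a hypothetical name, and whether the package uses this approach internally is an assumption.

```python
import re


def replace_multiple_strings_sketch(input_string: str, replacements_dict: dict) -> str:
    """Replace every key of replacements_dict found in input_string with its value."""
    # Ignore empty keys, which would otherwise match at every position.
    keys = [key for key in replacements_dict if key]
    if not keys:
        return input_string
    # Longest keys first, so that "Jan." wins over "Jan" when both are present.
    keys.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # One scan of the input; each match is swapped for its dictionary value.
    return pattern.sub(lambda match: replacements_dict[match.group(0)], input_string)


cleaned = replace_multiple_strings_sketch(
    "Jan. 5th, 2021",
    {"Jan.": "January", "5th": "5"},
)
```

A single compiled pattern scans the input once, instead of calling str.replace() once per key, and it avoids one replacement being rewritten by a later one.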

Easy to use extraction of text from PDF or RTF files:

def extract_rtf_pdf(name: str, get_ai_text: Callable = None) -> str:
  """
  Extract text from PDF and RTF files.

  Args:
      name (str): The name of the file from which to extract text.
      get_ai_text (Callable): Callback function that can call the Google Vision API, or the AWS or Azure equivalents, for text extraction.
          Called for images and image PDFs.

  Returns:
      str: The text extracted from the file.
  """
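
The callback idea can be sketched as follows: when a file has no usable text layer (an image or an image-only PDF), the work is handed to a caller-supplied function. Everything here is illustrative. text_or_ocr and fake_ocr are hypothetical names, and the callback signature (file name in, text out) is an assumption, since the description above does not state what get_ai_text receives.

```python
from typing import Callable, Optional


def fake_ocr(name: str) -> str:
    """Hypothetical callback; a real one would call a cloud OCR service."""
    return f"OCR text for {name}: seen on 2021-03-15"


def text_or_ocr(
    name: str,
    embedded_text: str,
    get_ai_text: Optional[Callable[[str], str]] = None,
) -> str:
    """Prefer the file's own text layer; fall back to the callback when it is empty."""
    if embedded_text.strip():
        # The file had a text layer, so no OCR call is needed.
        return embedded_text
    if get_ai_text is None:
        # No text layer and no callback: nothing can be extracted.
        return ""
    return get_ai_text(name)


from_text_layer = text_or_ocr("report.pdf", "Reviewed 03/02/2021", fake_ocr)
from_callback = text_or_ocr("scan.pdf", "", fake_ocr)
without_callback = text_or_ocr("scan.pdf", "")
```

Keeping the OCR step behind a callback means the package itself needs no cloud credentials or provider-specific dependencies; the caller decides which service to use.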

Download files

Source Distribution

auto_find_date_pdf-0.2.22.tar.gz (5.2 kB)

Uploaded Source

Built Distribution

auto_find_date_pdf-0.2.22-py3-none-any.whl (5.7 kB)

Uploaded Python 3

File details

Details for the file auto_find_date_pdf-0.2.22.tar.gz.

File metadata

  • Download URL: auto_find_date_pdf-0.2.22.tar.gz
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.1

File hashes

Hashes for auto_find_date_pdf-0.2.22.tar.gz

Algorithm    Hash digest
SHA256       2e1b3915759eea4313d344ff76ac77b1ea817529d5ca09e112374721d682e536
MD5          c5e9da1f5e0869a54cc95339f23d6db4
BLAKE2b-256  067cb8cd0473fa210b2dc15fd423292ec9f4b7fa1fe12f6a0dc82cf4cc9b0f40

File details

Details for the file auto_find_date_pdf-0.2.22-py3-none-any.whl.

File hashes

Hashes for auto_find_date_pdf-0.2.22-py3-none-any.whl

Algorithm    Hash digest
SHA256       c9e0de9e4cc56582e475f715467f73723f5e00539c995392bc5f1869d325967a
MD5          caa82a09e705ede3466c00f06eee6ab3
BLAKE2b-256  e4268b0da77e94bf38e511571e381d36f0a112169ec27ca84594280cd93d8b45
