A simple library for finding dates in any txt, pdf, docx, or rtf source. For documentation see

Project description

Simple date and text parsing from PDF, RTF, and image files (images are handled through a callback function).

This is a simple package provided by Marvsai healthcare LTD. It finds dates written in any regular format within a str and returns them as Python datetime objects.

Easy to use method

def find_dates(file_contents: str):

  Find any dates in a large Python string, usually taken from a file or PDF.

  Args:
      file_contents (str): The string in which to find dates of any format


  Returns:
      List[datetime.datetime]: A list of datetime objects; the latest can be found using max()
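
The behaviour described above can be illustrated with a small stand-in. This is a minimal sketch and not the package's implementation: `find_dates_sketch` is a hypothetical name, it recognises only three formats, and it assumes day-first ordering for slash-separated dates and English month names. It shows the shape of the result (a list of datetime objects) and how `max()` picks the latest one.

```python
import re
from datetime import datetime
from typing import List

# Hypothetical stand-in for find_dates: each regex is paired with the
# strptime format used to parse whatever it matches.
_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),        # 2023-01-15
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "%d/%m/%Y"),    # 03/02/2023 (day first, an assumption)
    (re.compile(r"\b\d{1,2} [A-Za-z]+ \d{4}\b"), "%d %B %Y"),  # 7 March 2023
]


def find_dates_sketch(file_contents: str) -> List[datetime]:
    """Return every date found in the text as a datetime object."""
    found = []
    for pattern, fmt in _PATTERNS:
        for match in pattern.finditer(file_contents):
            try:
                found.append(datetime.strptime(match.group(0), fmt))
            except ValueError:
                # Looked like a date but is not a valid one (e.g. 31/02/2023).
                continue
    return found


text = "Admitted 2023-01-15, reviewed 03/02/2023, discharged 7 March 2023."
dates = find_dates_sketch(text)
latest = max(dates)  # datetime objects compare chronologically
print(latest.date())
```

With the real package, the same `max()` call would be applied to the list returned by `find_dates`.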

Optimised replacement of multiple strings in a string

def replace_multiple_strings(input_string, replacements_dict):

Replace multiple strings in the input string using a dictionary of replacement pairs.

Args:
    input_string (str): The string in which to replace the substrings.
    replacements_dict (dict): A dictionary of replacement pairs, where the keys are the
        substrings to be replaced and the values are the replacement strings.

Returns:
    str: The input string with all instances of the substrings replaced with their
        corresponding replacement strings.
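
One common way to replace many substrings in a single pass is a compiled regular-expression alternation. The sketch below is an assumption about how such a function might work, not the package's own code; `replace_multiple_strings_sketch` is a hypothetical name that mirrors the documented arguments.

```python
import re


def replace_multiple_strings_sketch(input_string: str, replacements_dict: dict) -> str:
    """Replace every key of replacements_dict found in input_string with its value."""
    if not replacements_dict:
        return input_string
    # Try longer keys first so that "New York" is preferred over "New".
    keys = sorted(replacements_dict, key=len, reverse=True)
    # re.escape keeps characters such as "." from being treated as regex syntax.
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # A single scan over the string; each match is looked up in the dictionary.
    return pattern.sub(lambda match: replacements_dict[match.group(0)], input_string)


cleaned = replace_multiple_strings_sketch(
    "Seen on 1st Jan. and 2nd Feb.",
    {"1st": "1", "2nd": "2", "Jan.": "January", "Feb.": "February"},
)
print(cleaned)
```

A single pass also avoids a pitfall of chained `str.replace` calls, where the output of one replacement can be altered by a later one.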

Easy to use extraction of text from PDF or RTF files:

def extract_rtf_pdf(name: str, get_ai_text: Callable = None) -> str:

  Extract text from a PDF or RTF file

  Args:
      name (str): The name of the file from which to extract text
      get_ai_text (Callable): Callback function that can call the Google Vision API, or the AWS or Azure
      equivalents, for text extraction. Called for images and image PDFs.

  Returns:
      str: The text extracted from the file
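
The callback pattern described above could look roughly like the following. Everything here is an assumption made for illustration: the source does not state what arguments `get_ai_text` receives, so the sketch assumes it takes a file path and returns the recognised text. `extract_text_sketch`, `fake_ocr`, and `IMAGE_SUFFIXES` are hypothetical names, and the sketch does not parse PDF or RTF content at all.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def extract_text_sketch(name: str, get_ai_text: Optional[Callable[[str], str]] = None) -> str:
    """Return text for a file, delegating image files to the OCR callback."""
    suffix = Path(name).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        if get_ai_text is None:
            raise ValueError(f"{name} is an image; an OCR callback is required")
        # Assumed callback contract: file path in, recognised text out.
        return get_ai_text(name)
    # Stand-in for real PDF/RTF parsing: read the file as plain text.
    return Path(name).read_text(encoding="utf-8")


def fake_ocr(path: str) -> str:
    """Placeholder for a call to a cloud OCR service."""
    return f"[OCR text for {Path(path).name}]"


scanned = extract_text_sketch("scan.png", get_ai_text=fake_ocr)
print(scanned)
```

Passing the OCR step in as a callback keeps cloud credentials and vendor SDKs out of the extraction code, so the caller can choose which service to use.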

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

auto_find_date_pdf-0.2.22.tar.gz (5.2 kB)

Uploaded Source

Built Distribution

auto_find_date_pdf-0.2.22-py3-none-any.whl (5.7 kB)

Uploaded Python 3