A package for matching a set of strings and textual patterns in a given text file

Project description

pystringmatcher

description

A small utility for finding substrings and text patterns in an input file. The tool splits the text of the file into chunks and processes each chunk in a separate process using Python's multiprocessing module.
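The chunk-and-search idea described above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual implementation: the names `read_chunks`, `search_chunk` and `search_lines` are hypothetical, and the sketch uses plain `str.find` per line, so a pattern that spans a line break is not detected here.

```python
from itertools import islice
from multiprocessing import Pool


def read_chunks(lines, lines_per_chunk):
    """Yield successive lists of at most `lines_per_chunk` lines."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, lines_per_chunk))
        if not chunk:
            return
        yield chunk


def search_chunk(task):
    """Return (line_number, column, pattern) for every hit in one chunk."""
    first_line_number, chunk, patterns = task
    hits = []
    for line_number, line in enumerate(chunk, start=first_line_number):
        for pattern in patterns:
            column = line.find(pattern)
            while column != -1:
                hits.append((line_number, column, pattern))
                column = line.find(pattern, column + 1)
    return hits


def search_lines(lines, patterns, lines_per_chunk=1000, processes=2):
    """Search every chunk in a worker process and merge the hits in order."""
    tasks = []
    first_line_number = 1
    for chunk in read_chunks(lines, lines_per_chunk):
        tasks.append((first_line_number, chunk, patterns))
        first_line_number += len(chunk)
    with Pool(processes=processes) as pool:
        results = pool.map(search_chunk, tasks)
    return [hit for chunk_hits in results for hit in chunk_hits]


if __name__ == "__main__":
    sample = ["alpha beta", "gamma", "beta alpha beta"]
    print(search_lines(sample, ["alpha", "beta"], lines_per_chunk=2))
```

Each task carries the line number of its first line, so hits from different workers can be reported against the original file without any shared state.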

installation:

pip install pystringmatcher

usage:

  • using the python module
python -m pystringmatcher -h

Finding text patterns in input text file

optional arguments:
  -h, --help            show this help message and exit
  -f FILE_PATH, --file FILE_PATH
                        the input file to search the patterns in
  -p PATTERNS, --patterns PATTERNS
                        the pattern\s to search in the file separated by ,
  -n NUM_LINES_PER_CHUNK, --num-lines NUM_LINES_PER_CHUNK
                        the number of lines per chunk of text from the input file
  • or by using the included console script
stringmatcher -h 
  • In your own program
from pystringmatcher.Algorithms import RabinKarp
from pystringmatcher.Types import TextFile


file_path = "/path/to/file"

try:
    text = TextFile(file_path=file_path)
    algorithm = RabinKarp()
    # Split the file into chunks of 1000 lines each
    chunks = text.divide_into_chunks(num_of_lines_each_chunk=1000)
    patterns = "alpha,beta,charlie,delta,echo,foxtrot".split(",")
    print(f"[X] - Start finding the patterns: {patterns} in the file: {text}")
    matches = text.find_matches(chunks=chunks, patterns=patterns, algorithm=algorithm)

    if matches:
        print("Found matches")
        print(matches)
    else:
        print("No matches were found")
except FileNotFoundError:
    # `text` is unbound if TextFile() itself raised, so report the path instead
    print(f"The file: {file_path} was not found and may not exist")
  • Implementing your own matching algorithm
from pystringmatcher.Algorithms import Algorithm
from pystringmatcher.Types import Match


class MyAlgorithm(Algorithm):

    def preprocess(self, pattern, text, *args, **kwargs):
        """Some preprocessing logic goes here if needed."""

    def run(self, pattern, text, *args, **kwargs):
        matches = []
        # The matching algorithm logic goes here.
        # For any match: matches.append(Match(char_offset=start_index_of_match))
        return matches
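
As an illustration of the kind of logic a `run` method might contain, here is a minimal standalone sketch of Rabin-Karp matching with a rolling hash. It is not the package's own `RabinKarp` class. The `Match` dataclass is a stand-in that assumes only the `char_offset` keyword shown above, and the function name `rabin_karp` and the `base` and `modulus` parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Stand-in for pystringmatcher.Types.Match (assumed shape)."""
    char_offset: int


def rabin_karp(pattern, text, base=256, modulus=1_000_003):
    """Return a Match for every position where `pattern` occurs in `text`."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window: base**(m - 1) mod modulus.
    high_order = pow(base, m - 1, modulus)

    pattern_hash = 0
    window_hash = 0
    for i in range(m):
        pattern_hash = (pattern_hash * base + ord(pattern[i])) % modulus
        window_hash = (window_hash * base + ord(text[i])) % modulus

    matches = []
    for start in range(n - m + 1):
        # Equal hashes only suggest a match; compare the slice to rule out collisions.
        if window_hash == pattern_hash and text[start:start + m] == pattern:
            matches.append(Match(char_offset=start))
        if start < n - m:
            # Roll the window: drop text[start], then append text[start + m].
            window_hash = (window_hash - ord(text[start]) * high_order) % modulus
            window_hash = (window_hash * base + ord(text[start + m])) % modulus
    return matches
```

In a subclass of `Algorithm`, the body of `run` could delegate to a function like this and return its list of matches.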

Download files

Source Distribution

pystringmatcher-0.0.9.tar.gz (9.5 kB)

Built Distribution

pystringmatcher-0.0.9-py3-none-any.whl (14.1 kB)
