A package for matching a set of strings and textual patterns in a given text file
Project description
pystringmatcher
description
A small utility for finding substrings and text patterns in an input file. The tool cuts the text of the file into chunks and processes each chunk in a separate process using Python's multiprocessing module.
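The chunk-per-process idea described above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual implementation: the function names `split_into_chunks`, `search_chunk` and `search_in_parallel` are hypothetical, and plain `str.find` stands in for the real matching algorithm.

```python
import multiprocessing
from typing import List, Tuple


def split_into_chunks(lines: List[str], lines_per_chunk: int) -> List[Tuple[int, List[str]]]:
    """Group lines into (first_line_index, lines) chunks of at most lines_per_chunk lines."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    return [
        (start, lines[start:start + lines_per_chunk])
        for start in range(0, len(lines), lines_per_chunk)
    ]


def search_chunk(task: Tuple[int, List[str], List[str]]) -> List[Tuple[str, int, int]]:
    """Return (pattern, line_index, column) for every occurrence found in one chunk."""
    first_line, chunk_lines, patterns = task
    hits = []
    for offset, line in enumerate(chunk_lines):
        for pattern in patterns:
            column = line.find(pattern)
            while column != -1:
                hits.append((pattern, first_line + offset, column))
                # Continue one character further so overlapping matches are kept.
                column = line.find(pattern, column + 1)
    return hits


def search_in_parallel(lines, patterns, lines_per_chunk=1000, processes=None):
    """Search every chunk in a worker process and merge the per-chunk results."""
    tasks = [
        (first_line, chunk, patterns)
        for first_line, chunk in split_into_chunks(lines, lines_per_chunk)
    ]
    with multiprocessing.Pool(processes=processes) as pool:
        per_chunk = pool.map(search_chunk, tasks)
    return [hit for chunk_hits in per_chunk for hit in chunk_hits]
```

When running this as a script, call `search_in_parallel` from under an `if __name__ == "__main__":` guard, which the multiprocessing module requires on platforms that spawn worker processes. Line indices are zero-based, and because the sketch searches line by line it will not find a pattern that spans a line break.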
installation:
pip install pystringmatcher
usage:
- using the python module
python -m pystringmatcher -h
Finding text patterns in input text file
optional arguments:
  -h, --help            show this help message and exit
  -f FILE_PATH, --file FILE_PATH
                        the input file to search the patterns in
  -p PATTERNS, --patterns PATTERNS
                        the pattern\s to search in the file separated by ,
  -n NUM_LINES_PER_CHUNK, --num-lines NUM_LINES_PER_CHUNK
                        the number of lines per chunk of text from the input file
- or by using the included console script
stringmatcher -h
- In your own program
from pystringmatcher.Algorithms import RabinKarp
from pystringmatcher.Types import TextFile

file_path = "/path/to/file"
try:
    text = TextFile(file_path=file_path)
    algorithm = RabinKarp()
    chunks = text.divide_into_chunks(num_of_lines_each_chunk=1000)
    patterns = "alpha,beta,charlie,delta,echo,foxtrot".split(",")
    print(f"[X] - Start finding the patterns : {patterns} in the file: {text}")
    matches = text.find_matches(chunks=chunks, patterns=patterns, algorithm=algorithm)
    if matches:
        print("Found matches")
        print(matches)
    else:
        print("No matches were found")
except FileNotFoundError:
    # `text` may be unbound if TextFile itself raised, so report the path instead
    print(f"The file: {file_path} was not found and may not exist")
- Implementing your own matching algorithm
from pystringmatcher.Algorithms import Algorithm
from pystringmatcher.Types import Match


class MyAlgorithm(Algorithm):
    def preprocess(self, pattern, text, *args, **kwargs):
        """Some preprocessing logic goes here if needed."""

    def run(self, pattern, text, *args, **kwargs):
        matches = []
        # The matching algorithm logic goes here.
        # For every match: matches.append(Match(char_offset=start_index_of_match))
        return matches
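As an illustration of the kind of logic a `run` method could contain, here is a standalone textbook sketch of Rabin-Karp matching with a rolling hash. It is not the code of the package's `RabinKarp` class. The `Match` dataclass below is a stand-in that assumes only the `char_offset` field shown above, and the `base` and `modulus` values are arbitrary choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    """Stand-in for pystringmatcher.Types.Match; only char_offset is assumed."""
    char_offset: int


def rabin_karp(pattern: str, text: str, base: int = 256,
               modulus: int = 1_000_000_007) -> List[Match]:
    """Return a Match for every (possibly overlapping) occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, modulus)

    # Hash the pattern and the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus

    matches = []
    for start in range(n - m + 1):
        # Equal hashes can be a collision, so confirm with a direct comparison.
        if p_hash == t_hash and text[start:start + m] == pattern:
            matches.append(Match(char_offset=start))
        if start < n - m:
            # Roll the window: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % modulus
    return matches
```

In a custom `Algorithm` subclass, the body of `run` could delegate to a function like this and return its list. Whether the offset should be relative to the chunk or to the whole file depends on how the package treats `char_offset`, which the description above does not specify.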