A package for matching a set of strings and textual patterns in a given text file
pystringmatcher
description
A small utility for finding substrings and text patterns in an input file. The tool cuts the text of the file into chunks and processes each chunk in a separate process using Python's multiprocessing module.
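The chunk-and-dispatch idea described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the names `split_into_chunks`, `search_chunk`, and `find_in_parallel` are hypothetical, and the sketch assumes matching is done line by line with `str.find`, so a pattern that spans a line break is not detected.

```python
from itertools import islice
from multiprocessing import Pool


def split_into_chunks(lines, lines_per_chunk):
    """Yield (first_line_number, chunk) pairs, where chunk is a list of lines."""
    iterator = iter(lines)
    first_line = 0
    while True:
        chunk = list(islice(iterator, lines_per_chunk))
        if not chunk:
            return
        yield first_line, chunk
        first_line += len(chunk)


def search_chunk(task):
    """Find every occurrence of every pattern in one chunk.

    Returns (line_number, char_offset, pattern) tuples; offsets are
    relative to the start of the line.
    """
    first_line, chunk, patterns = task
    found = []
    for index, line in enumerate(chunk):
        for pattern in patterns:
            offset = line.find(pattern)
            while offset != -1:
                found.append((first_line + index, offset, pattern))
                offset = line.find(pattern, offset + 1)
    return found


def find_in_parallel(lines, patterns, lines_per_chunk=1000, processes=None):
    """Search all chunks in a pool of worker processes and merge the results."""
    tasks = [(start, chunk, patterns)
             for start, chunk in split_into_chunks(lines, lines_per_chunk)]
    with Pool(processes=processes) as pool:
        per_chunk = pool.map(search_chunk, tasks)
    return sorted(match for chunk_matches in per_chunk for match in chunk_matches)
```

When trying this out, call `find_in_parallel` from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that start workers by spawning a fresh interpreter.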
installation:
pip install pystringmatcher
usage:
- using the python module
python -m pystringmatcher -h
Finding text patterns in input text file
optional arguments:
-h, --help show this help message and exit
-f FILE_PATH, --file FILE_PATH
the input file to search the patterns in
-p PATTERNS, --patterns PATTERNS
the pattern\s to search in the file separated by ,
-n NUM_LINES_PER_CHUNK, --num-lines NUM_LINES_PER_CHUNK
the number of lines per chunk of text from the input file
- or by using the included console script
stringmatcher -h
- In your own program
from pystringmatcher.Algorithms import RabinKarp
from pystringmatcher.Types import TextFile

# Keep the path in its own variable so the error message below can still
# use it when TextFile() raises before `text` is assigned.
file_path = "/path/to/file"
try:
    text = TextFile(file_path=file_path)
    algorithm = RabinKarp()
    chunks = text.divide_into_chunks(num_of_lines_each_chunk=1000)
    patterns = "alpha,beta,charlie,delta,echo,foxtrot".split(",")
    print(f"[X] - Start finding the patterns : {patterns} in the file: {text}")
    matches = text.find_matches(chunks=chunks, patterns=patterns, algorithm=algorithm)
    if matches:
        print("Found matches")
        print(matches)
    else:
        print("No matches were found")
except FileNotFoundError:
    print(f"The file: {file_path} was not found and may not exist")
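For readers unfamiliar with the technique behind the `RabinKarp` name, here is a generic sketch of Rabin-Karp matching with a rolling hash. It is a textbook-style illustration and makes no claim about how the package's class is written; the function name `rabin_karp` and the `base` and `modulus` defaults are assumptions chosen for this example.

```python
def rabin_karp(pattern, text, base=256, modulus=1_000_000_007):
    """Return the start offset of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window of length m.
    high_order = pow(base, m - 1, modulus)

    # Hash the pattern and the first window of the text.
    pattern_hash = window_hash = 0
    for i in range(m):
        pattern_hash = (pattern_hash * base + ord(pattern[i])) % modulus
        window_hash = (window_hash * base + ord(text[i])) % modulus

    offsets = []
    for start in range(n - m + 1):
        # Compare the strings only when the hashes agree, to rule out collisions.
        if window_hash == pattern_hash and text[start:start + m] == pattern:
            offsets.append(start)
        if start < n - m:
            # Roll the window: drop text[start], append text[start + m].
            window_hash = (window_hash - ord(text[start]) * high_order) % modulus
            window_hash = (window_hash * base + ord(text[start + m])) % modulus
    return offsets
```

The rolling update is what keeps each window's hash cost constant, so the full string comparison runs only on the few windows whose hash equals the pattern's.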
- Implementing your own matching algorithm
from pystringmatcher.Algorithms import Algorithm
from pystringmatcher.Types import Match


class MyAlgorithm(Algorithm):
    def preprocess(self, pattern, text, *args, **kwargs):
        """Some preprocessing logic goes here, if needed."""

    def run(self, pattern, text, *args, **kwargs):
        matches = []
        # The matching algorithm logic goes here.
        # For every match found:
        #     matches.append(Match(char_offset=start_index_of_match))
        return matches
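As a filled-in illustration of the skeleton above, the following sketch implements a brute-force scan. So that it runs without the package installed, it uses a stand-in `Match` dataclass and a plain class instead of the real `Match` and `Algorithm`; both stand-ins are assumptions, and the only detail taken from the skeleton is that `Match` accepts a `char_offset` keyword.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Stand-in for pystringmatcher.Types.Match (assumed to hold a char_offset)."""
    char_offset: int


class NaiveAlgorithm:
    """Stand-in class; in real use this would inherit from Algorithm."""

    def preprocess(self, pattern, text, *args, **kwargs):
        # Nothing to precompute for a brute-force scan.
        return None

    def run(self, pattern, text, *args, **kwargs):
        matches = []
        if not pattern:
            return matches
        last_start = len(text) - len(pattern)
        # Try every possible start position and compare the slice directly.
        for start in range(last_start + 1):
            if text[start:start + len(pattern)] == pattern:
                matches.append(Match(char_offset=start))
        return matches
```

To use it with the package, swap the stand-ins for the real imports and derive the class from `Algorithm`, as in the skeleton.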
Download files
File details
Details for the file pystringmatcher-0.0.9.tar.gz.
File metadata
- Download URL: pystringmatcher-0.0.9.tar.gz
- Upload date:
- Size: 9.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2546e1acbb9381ef7057af4cbb5d9f0d2fe2a0778bb8fef618c2e7b878ea3a3 |
| MD5 | 127f8fa4bf643d1c9828bf0f0a6f0c23 |
| BLAKE2b-256 | c3954244114cc6d2d45d56150a82ab3baab0c418bf0e8b30d343d09e99492882 |
File details
Details for the file pystringmatcher-0.0.9-py3-none-any.whl.
File metadata
- Download URL: pystringmatcher-0.0.9-py3-none-any.whl
- Upload date:
- Size: 14.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bf482c89486a0f237c1bab99e5ee0ba5b5e0382cbf9fbac1c94088104c9d95a0 |
| MD5 | 3ec8f8f329a35e11b94e7866994809c5 |
| BLAKE2b-256 | a156fbe433c70112f829e2ff3ab6f97c283075d6b8c6cfb9d7c15cf403c7d4b2 |