Library providing tools to search files concurrently.
Searchkit
Python library providing tools to perform searches on files in parallel.
Search Types
Different types of search are supported. Add one or more search definitions to a FileSearcher
object, registering each against a file, directory or glob path. Results are collected and returned as a SearchResultsCollection,
which provides different ways to retrieve results.
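To illustrate the general idea of registering a pattern against a glob path and collecting per-file results, here is a minimal standard-library sketch. It is not searchkit's implementation: the names `search_file` and `search_paths` are hypothetical, and a thread pool is an assumption made for brevity (searchkit's own concurrency model may differ).

```python
import glob
import re
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def search_file(path, pattern):
    """Return (line_number, match) pairs for every matching line in one file."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as fd:
        for lineno, line in enumerate(fd, start=1):
            m = pattern.search(line)
            if m:
                hits.append((lineno, m))
    return hits


def search_paths(path_glob, regex, max_workers=4):
    """Search every file matching path_glob concurrently; results keyed by path."""
    pattern = re.compile(regex)
    paths = sorted(p for p in glob.glob(path_glob) if Path(p).is_file())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so zip pairs each path with its results.
        results = pool.map(lambda p: search_file(p, pattern), paths)
        return dict(zip(paths, results))


# Demo on two temporary log files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.log").write_text("ok\nerror: disk full\n")
    Path(tmp, "b.log").write_text("error: timeout\nok\n")
    found = search_paths(str(Path(tmp, "*.log")), r"error: (.+)")
    summary = {Path(p).name: [(n, m.group(1)) for n, m in hits]
               for p, hits in found.items()}
    print(summary)
```

Keying the results by path mirrors the idea of retrieving results per registered file, which the usage examples in this document do with `find_by_path`.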
Simple Search
Uses the SearchDef class and supports matching one or more patterns against each line in a file. Patterns are executed until the first match is found.
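The "first match wins" behaviour can be sketched with the standard `re` module. This is an illustration of the semantics described above, not searchkit code: `simple_search` is a hypothetical helper, and the use of `re.search` (rather than matching from the start of the line) is an assumption.

```python
import re


def simple_search(lines, patterns):
    """Try each pattern against each line, stopping at the first that matches."""
    compiled = [re.compile(p) for p in patterns]
    results = []
    for lineno, line in enumerate(lines, start=1):
        for pattern in compiled:
            m = pattern.search(line)
            if m:
                results.append((lineno, pattern.pattern, m))
                break  # later patterns are not tried for this line
    return results


lines = ["the quick brown fox", "jumps over", "the lazy dog"]
hits = simple_search(lines, [r"quick (\S+)", r"the (\S+)"])
matched = [(lineno, m.group(1)) for lineno, _, m in hits]
print(matched)  # [(1, 'brown'), (3, 'lazy')]
```

Line 1 matches both patterns, but only the first one is recorded, so the result is `brown` rather than `quick`.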
Sequence Search
Uses the SequenceSearchDef class and supports matching strings over multiple lines by matching a start, end and optional body in between.
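A start/body/end sequence match can be pictured as a small state machine over lines. The sketch below uses only the standard library; `sequence_search` is a hypothetical function, and its handling of edge cases (an unterminated sequence is dropped when an end pattern is given) is a choice made for this sketch rather than documented searchkit behaviour.

```python
import re


def sequence_search(lines, start, body=None, end=None):
    """Collect sequences delimited by a start pattern and an optional end pattern."""
    start_re = re.compile(start)
    body_re = re.compile(body) if body else None
    end_re = re.compile(end) if end else None
    sequences, current = [], None

    def close(seq):
        # Without an end pattern, the next start or EOF completes a sequence.
        # With one, an unterminated sequence is dropped (a choice of this sketch).
        if seq is not None and end_re is None:
            sequences.append(seq)

    for line in lines:
        if current is not None and end_re is not None and end_re.search(line):
            current["end"] = line
            sequences.append(current)
            current = None
        elif start_re.search(line):
            close(current)
            current = {"start": line, "body": [], "end": None}
        elif current is not None and body_re is not None and body_re.search(line):
            current["body"].append(line)
    close(current)
    return sequences


log = [
    "Traceback (most recent call last):",
    '  File "<stdin>", line 1, in <module>',
    "ValueError: invalid literal for int() with base 10: 'foo'",
    "Traceback (most recent call last):",
    "KeyError: 'bar'",
]
seqs = sequence_search(log, start=r"Traceback", body=r".+")
for seq in seqs:
    print(seq["body"])

# With an explicit end pattern, the trailing unterminated sequence is dropped.
bounded = sequence_search(["BEGIN", "a", "END", "BEGIN", "b"],
                          start=r"BEGIN", body=r".+", end=r"END")
```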
Installation
searchkit is published on PyPI and can be installed as follows:
sudo apt install python3-pip
pip install searchkit
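One way to confirm the install succeeded is to query the package metadata from Python. This is a generic standard-library check (Python 3.8+), not something specific to searchkit; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed searchkit version, or None if it is not installed.
print(installed_version("searchkit"))
```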
Example Usage
An example simple search is as follows:
from searchkit import FileSearcher, SearchDef
fname = 'foo.txt'
open(fname, 'w').write('the quick brown fox')
fs = FileSearcher()
fs.add(SearchDef(r'.+ \S+ (\S+) .+'), fname)
results = fs.run()
for r in results.find_by_path(fname):
    print(r.get(1))
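To see what that pattern captures, the same regular expression can be tried directly with the standard `re` module. This assumes that `r.get(1)` corresponds to the first capture group of the pattern, which is how the example above reads.

```python
import re

# The pattern needs three literal spaces, so each part maps to one word:
# '.+' -> 'the', '\S+' -> 'quick', '(\S+)' -> 'brown', '.+' -> 'fox'.
m = re.match(r'.+ \S+ (\S+) .+', 'the quick brown fox')
captured = m.group(1)
print(captured)  # brown
```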
An example sequence search is as follows:
from searchkit import FileSearcher, SequenceSearchDef, SearchDef
content = """
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'foo'"""
fname = 'my.log'
open(fname, 'w').write(content)
start = SearchDef(r'Traceback')
body = SearchDef(r'.+')
# terminate sequence with start of next or EOF so no end def needed.
fs = FileSearcher()
fs.add(SequenceSearchDef(start, tag='myseq', body=body), fname)
results = fs.run()
# use a separate name for the per-sequence results to avoid shadowing `results`
for seq, seq_results in results.find_sequence_by_tag('myseq').items():
    for r in seq_results:
        if 'body' in r.tag:
            print(r.get(0))
Built Distribution
Hashes for searchkit-0.1.21-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 9f684ce42d2be2c44a9fe59fd10e6e3cb4d7f89655c7ae89ee4f1f16efd5645e
MD5 | cf6fe35d4da6b7f742e60c7251fec7a8
BLAKE2b-256 | b23f133f6d10dbf4859f3c1b95a91b68a4fe473717329134abb0e094e33f4f69