Python/Cython Boyer-Moore string-search algorithm
Project description
pybmoore
Python/Cython implementation of Boyer-Moore string-search algorithm.
Installing
Install and update using uv:
uv pip install pybmoore
notice:
gcc
must be available on the system.
Usage
Single term
The search method in the pybmoore
module will return a list of tuples with all occurrences, where the tuple have the initial and final position. For example:
import pybmoore
TEXT = """The Boyer–Moore string-search algorithm is
an efficient string-searching algorithm that is the
standard benchmark for practical string-search literature.
"""
matches = pybmoore.search('string', TEXT)
print(f"Occurrences: {len(matches)}")
# output: Occurrences: 3
print(matches)
# output: [(16, 22), (57, 63), (130, 136)]
for x, y in matches:
print(f"({x},{y}) - {TEXT[x:y]}")
notice: search method it's case sensitive.
import pybmoore
TEXT = """The algorithm preprocesses the string being searched for (the pattern),
but not the string being searched in (the text). It is thus well-suited for
applications in which the pattern is much shorter than the text or where it
persists across multiple searches.
"""
pybmoore.search('algorithm', TEXT)
# output: [(4, 13)]
pybmoore.search('Algorithm', TEXT)
# output: []
Multiple terms
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import pybmoore
TEXT = """The Boyer-Moore algorithm searches for occurrences of P in T by
performing explicit character comparisons at different alignments. Instead of a
brute-force search of all alignments (of which there are m − n + 1, Boyer-Moore
uses information gained by preprocessing P to skip as many alignments as possible.
"""
# Using a list of patterns
pybmoore.search_m(['brute-force', 'Boyer-Moore'], TEXT, ProcessPoolExecutor)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}
# Using a set of patterns
pybmoore.search_m({'brute-force', 'Boyer-Moore'}, TEXT, ThreadPoolExecutor)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}
# Using a tuple of patterns
pybmoore.search_m(('brute-force', 'Boyer-Moore'), TEXT, ThreadPoolExecutor, max_workers=4)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}
Details
For granular control of the pool, use the parameters listed in the module documentation. For example:
Development
To build pybmoore locally first install requirements-dev.txt
dependencies and run:
make build # without Cython
make build USE_CYTHON=1 # with Cython
in some cases it's necesary run
make clean
beforemake build
.
Type make
in the command line to see all available targets.
Links
- License: Apache License
- Code: https://github.com/amenezes/pybmoore
- Issue tracker: https://github.com/amenezes/pybmoore/issues
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file pybmoore-2.0.1.tar.gz
.
File metadata
- Download URL: pybmoore-2.0.1.tar.gz
- Upload date:
- Size: 66.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.13.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | f5ba84536cf95bab2f27006dacfe9d85d7a31fe50b052982d64de027a891f0b4 |
|
MD5 | 8ff0490e4f7e2c1e16a1eeacf923e0d4 |
|
BLAKE2b-256 | d1baf8587e97b51a7ad12f858f227bcb7f75b72ad84d9dbae39a70e80f306ad0 |