Skip to main content

Python/Cython Boyer-Moore string-search algorithm

Project description

Build Status codecov PyPI version PyPI - Python Version Code style: black

pybmoore

Python/Cython implementation of Boyer-Moore string-search algorithm.

Installing

Install and update using uv:

uv pip install pybmoore

notice: gcc must be available on the system.

Usage

Single term

The search method in the pybmoore module will return a list of tuples with all occurrences, where the tuple have the initial and final position. For example:

import pybmoore


TEXT = """The Boyer–Moore string-search algorithm is 
an efficient string-searching algorithm that is the 
standard benchmark for practical string-search literature.
"""

matches = pybmoore.search('string', TEXT)
print(f"Occurrences: {len(matches)}")
# output: Occurrences: 3

print(matches)
# output: [(16, 22), (57, 63), (130, 136)]

for x, y in matches:
    print(f"({x},{y}) - {TEXT[x:y]}")

notice: search method it's case sensitive.

import pybmoore


TEXT = """The algorithm preprocesses the string being searched for (the pattern), 
but not the string being searched in (the text). It is thus well-suited for 
applications in which the pattern is much shorter than the text or where it 
persists across multiple searches.
"""

pybmoore.search('algorithm', TEXT)
# output: [(4, 13)]

pybmoore.search('Algorithm', TEXT)
# output: []

Multiple terms

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import pybmoore


TEXT = """The Boyer-Moore algorithm searches for occurrences of P in T by 
performing explicit character comparisons at different alignments. Instead of a 
brute-force search of all alignments (of which there are m − n + 1, Boyer-Moore 
uses information gained by preprocessing P to skip as many alignments as possible.
"""

# Using a list of patterns
pybmoore.search_m(['brute-force', 'Boyer-Moore'], TEXT, ProcessPoolExecutor)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}

# Using a set of patterns
pybmoore.search_m({'brute-force', 'Boyer-Moore'}, TEXT, ThreadPoolExecutor)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}

# Using a tuple of patterns
pybmoore.search_m(('brute-force', 'Boyer-Moore'), TEXT, ThreadPoolExecutor, max_workers=4)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}

Details

For granular control of the pool, use the parameters listed in the module documentation. For example:

Development

To build pybmoore locally first install requirements-dev.txt dependencies and run:

make build # without Cython

make build USE_CYTHON=1 # with Cython

in some cases it's necesary run make clean before make build.

Type make in the command line to see all available targets.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pybmoore-2.0.1.tar.gz (66.8 kB view details)

Uploaded Source

File details

Details for the file pybmoore-2.0.1.tar.gz.

File metadata

  • Download URL: pybmoore-2.0.1.tar.gz
  • Upload date:
  • Size: 66.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.13.0

File hashes

Hashes for pybmoore-2.0.1.tar.gz
Algorithm Hash digest
SHA256 f5ba84536cf95bab2f27006dacfe9d85d7a31fe50b052982d64de027a891f0b4
MD5 8ff0490e4f7e2c1e16a1eeacf923e0d4
BLAKE2b-256 d1baf8587e97b51a7ad12f858f227bcb7f75b72ad84d9dbae39a70e80f306ad0

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page