Skip to main content

Python/Cython Boyer-Moore string-search algorithm

Project description

Build Status codecov PyPI version PyPI - Python Version Code style: black

pybmoore

Python/Cython implementation of Boyer-Moore string-search algorithm.

Installing

Install and update using uv:

uv pip install pybmoore

notice: gcc must be available on the system.

Usage

Single term

The search method in the pybmoore module will return a list of tuples with all occurrences, where the tuple have the initial and final position. For example:

import pybmoore


TEXT = """The Boyer–Moore string-search algorithm is 
an efficient string-searching algorithm that is the 
standard benchmark for practical string-search literature.
"""

matches = pybmoore.search('string', TEXT)
print(f"Occurrences: {len(matches)}")
# output: Occurrences: 3

print(matches)
# output: [(16, 22), (57, 63), (130, 136)]

for x, y in matches:
    print(f"({x},{y}) - {TEXT[x:y]}")

notice: search method it's case sensitive.

import pybmoore


TEXT = """The algorithm preprocesses the string being searched for (the pattern), 
but not the string being searched in (the text). It is thus well-suited for 
applications in which the pattern is much shorter than the text or where it 
persists across multiple searches.
"""

pybmoore.search('algorithm', TEXT)
# output: [(4, 13)]

pybmoore.search('Algorithm', TEXT)
# output: []

Multiple terms

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import pybmoore


TEXT = """The Boyer-Moore algorithm searches for occurrences of P in T by 
performing explicit character comparisons at different alignments. Instead of a 
brute-force search of all alignments (of which there are m − n + 1, Boyer-Moore 
uses information gained by preprocessing P to skip as many alignments as possible.
"""

# Using a list of patterns
pybmoore.search_m(['brute-force', 'Boyer-Moore'], TEXT, ProcessPoolExecutor)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}

# Using a set of patterns
pybmoore.search_m({'brute-force', 'Boyer-Moore'}, TEXT, ThreadPoolExecutor)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}

# Using a tuple of patterns
pybmoore.search_m(('brute-force', 'Boyer-Moore'), TEXT, ThreadPoolExecutor, max_workers=4)
# output: {'brute-force': [(146, 157)], 'Boyer-Moore': [(4, 15), (214, 225)]}

Details

For granular control of the pool, use the parameters listed in the module documentation. For example:

Development

To build pybmoore locally first install requirements-dev.txt dependencies and run:

make build # without Cython

make build USE_CYTHON=1 # with Cython

in some cases it's necesary run make clean before make build.

Type make in the command line to see all available targets.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pybmoore-2.1.0.tar.gz (61.2 kB view details)

Uploaded Source

File details

Details for the file pybmoore-2.1.0.tar.gz.

File metadata

  • Download URL: pybmoore-2.1.0.tar.gz
  • Upload date:
  • Size: 61.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for pybmoore-2.1.0.tar.gz
Algorithm Hash digest
SHA256 d95371311e60b2e59a8dab609838799ca22831fc6f9d14199a560f5d9b5e7d1d
MD5 cb1ace58d24d49fb1eb2f0d08cb47eb5
BLAKE2b-256 0d6a420a0ee7459bd141e81e049527574eecaf184e382dfdceabdd1c4a07de2b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page