fuzzquery

A lightweight package for fuzzy word/phrase searches in a body of text, using a very simple token system.


Queries:

Tokens are used to represent unknown/fuzzy data. The 3 types of tokens are:

token   type     description                          example          matches
{x}     range    0 to x non-whitespace characters     "home{5}"        home, homestead, homeward
{!x}    strict   exactly x non-whitespace characters  "{1}ward{!2}"    warden, awarded
{?}     unknown  0 or more unknown words              "thou {?} kill"  thou shalt not kill

The unknown token must stand on its own, with a space separating it from the term on either side, exactly as in the example above.
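To make the token semantics concrete, here is a minimal sketch that translates a query into a standard-library regular expression. It is only an illustration of the table above and not the package's actual implementation. The helper name `query_to_regex` is hypothetical, and the sketch does not attempt whole-word matching or punctuation handling, which fuzzquery may treat differently.

```python
import re

# Matches the three token forms: " {?} " (with its surrounding spaces),
# "{!x}" (strict) and "{x}" (range).
_TOKEN = re.compile(r' \{\?\} |\{(!?)(\d+)\}')

def query_to_regex(query: str) -> str:
    """Hypothetical helper: turn a token query into a regex pattern string."""
    parts, pos = [], 0
    for m in _TOKEN.finditer(query):
        # Literal text between tokens is escaped so it matches itself.
        parts.append(re.escape(query[pos:m.start()]))
        if m.group(0) == ' {?} ':
            # unknown: whitespace, then zero or more whole words
            parts.append(r'\s+(?:\S+\s+)*?')
        elif m.group(1):
            # strict: exactly x non-whitespace characters
            parts.append(r'\S{' + m.group(2) + '}')
        else:
            # range: 0 to x non-whitespace characters
            parts.append(r'\S{0,' + m.group(2) + '}')
        pos = m.end()
    parts.append(re.escape(query[pos:]))
    return ''.join(parts)

pattern = query_to_regex('{1}ward{!2}')
print(bool(re.fullmatch(pattern, 'awarded')))  # True
print(bool(re.fullmatch(pattern, 'ward')))     # False: the strict token needs 2 more characters
```

A `{?}` that is not surrounded by spaces is left as literal text in this sketch, which mirrors the placement rule stated above.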


Documentation:

Note: list|tuple|set is abbreviated as Iter below to simplify the documentation. The fuzzquery package does not define an Iter type.


finditer(text, query, skip, ci)

Yields every (span, match) pair found for a single query.

arg      description                                                        type
text     the text to search                                                 str
query    the query to search for                                            str
skip     terms and/or characters that trigger a skip when found in results  Iter|None
ci       case-insensitive matching                                          bool

findany(text, queries, skip, ci)

ORs the queries together and yields every (span, match) pair for "whatever-is-next", that is, whichever query matches next in the text.

arg      description                                                        type
text     the text to search                                                 str
queries  queries to combine for "whatever-is-next" search                   Iter
skip     terms and/or characters that trigger a skip when found in results  Iter|None
ci       case-insensitive matching                                          bool
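The "whatever-is-next" idea can be illustrated with plain regular expressions: when several patterns are joined by alternation, matches come back in text order regardless of which pattern produced them. The sketch below uses only the standard library. The helper `find_any` and the two patterns are assumptions made for illustration and say nothing about how fuzzquery implements `findany` internally.

```python
import re

def find_any(text, patterns, ci=False):
    """Hypothetical helper: yield (span, match) for whichever pattern matches next."""
    flags = re.IGNORECASE if ci else 0
    # Wrap each pattern in a non-capturing group, then OR them together.
    combined = re.compile('|'.join(f'(?:{p})' for p in patterns), flags)
    for m in combined.finditer(text):
        yield m.span(), m.group()

text = "classical music by classy musicians"
for span, match in find_any(text, [r'class\w*', r'music\w*']):
    print(span, match)
```

The results from the two patterns interleave (`classical`, `music`, `classy`, `musicians`) instead of being grouped per pattern, which is the behaviour the description of `findany` suggests.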

iterall(text, queries, skip, ci)

Yields every (query, span, match) triple found for multiple queries.

arg      description                                                        type
text     the text to search                                                 str
queries  queries to search for                                              Iter
skip     terms and/or characters that trigger a skip when found in results  Iter|None
ci       case-insensitive matching                                          bool

Examples:

import fuzzquery as fq

data = """ 
I headed homeward to meet with the Wardens. 
When I arrived, I was greeted by a homely man that told me the homestead was awarded 5 million dollars.
We intend to use some of the homage to create a homeless ward. 
The first piece of furniture will be my late-friend Homer's wardrobe.
"""
queries = ('hom{5} {?} wa{!1}{5}', 
           'home{5}', 
           '{1}ward{!2}{2}', 
           'home{4} ward{4}')

for query, span, match in fq.iterall(data, queries, ci=True):
    if query: print(f'\n{query.upper()}')
    print(f'  {match}')

Output:

HOM{5} {?} WA{!1}{5}
  homeward to meet with the Wardens
  homely man that told me the homestead was
  homage to create a homeless ward
  Homer's wardrobe

HOME{5}
  homeward
  homely
  homestead
  homeless
  Homer's

{1}WARD{!2}{2}
  Wardens
  awarded
  wardrobe

HOME{4} WARD{4}
  homeless ward
  Homer's wardrobe

import fuzzquery as fq

data = """ 
I would classify music as one of my favorite hobbies. 
I love classical music played by classy musicians for a classic musical. 
Beethoven can not be out-classed, music-wise - a man of class, musically gifted.
"""
query = 'class{4} music{4}'

print(f'\n{query.upper()} with skip')
for span, match in fq.finditer(data, query, skip=('classify', ','), ci=True):
    print(f'  {match}')
    
print(f'\n{query.upper()} no skip')
for span, match in fq.finditer(data, query, ci=True):
    print(f'  {match}')

Output:

CLASS{4} MUSIC{4} with skip
  classical music
  classy musicians
  classic musical

CLASS{4} MUSIC{4} no skip
  classify music
  classical music
  classy musicians
  classic musical
  classed, music
  class, musically
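The difference between the two outputs above is consistent with a simple rule: a result is dropped when it contains any of the skip terms. The sketch below reproduces that filtering on the listed results using plain Python. The helper `drop_skipped` is hypothetical, and the substring test is an assumption inferred from the example, not a description of the package's internals.

```python
def drop_skipped(matches, skip=None):
    """Hypothetical helper: keep only matches that contain none of the skip terms."""
    skip = tuple(skip or ())
    return [m for m in matches if not any(term in m for term in skip)]

# The "no skip" results from the example above.
no_skip = [
    'classify music',
    'classical music',
    'classy musicians',
    'classic musical',
    'classed, music',
    'class, musically',
]

for match in drop_skipped(no_skip, skip=('classify', ',')):
    print(f'  {match}')
```

Passing `skip=None` leaves the list unchanged, matching the `Iter|None` type given for the `skip` argument.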
