fuzzquery
A lightweight package for fuzzy word/phrase searches in a body of text, using a very simple token system.
Queries:
Tokens are used to represent unknown/fuzzy data. The 3 types of tokens are:
token | type | description | example | result-like |
---|---|---|---|---|
{x} | range | 0 to x non-whitespace characters | "home{5}" | home, homestead, homeward |
{!x} | strict | exactly x non-whitespace characters | "{1}ward{!2}" | warden, awarded |
{?} | unknown | 0 or more unknown words | "thou {?} kill" | thou shalt not kill |
The unknown token must stand alone between two terms, separated from each by a space, exactly as illustrated in the example above.
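One way to read the token table is as shorthand for regular-expression fragments. The sketch below is an illustration only, not fuzzquery's implementation: `to_regex` is a hypothetical helper, and mapping `{?}` to a lazy run of whitespace-separated words is an assumption drawn from the table's wording. It does not handle word boundaries, skip, or case-insensitivity.

```python
import re

# Hypothetical helper, NOT part of fuzzquery: translates the three token
# types from the table into regular-expression fragments.
_TOKEN = re.compile(r' \{\?\} |\{(!?)(\d+)\}')

def to_regex(query: str) -> str:
    parts, pos = [], 0
    for m in _TOKEN.finditer(query):
        # Literal text between tokens is escaped so it matches verbatim.
        parts.append(re.escape(query[pos:m.start()]))
        if m.group(2) is None:
            # " {?} ": whitespace, then zero or more whole words (assumed lazy).
            parts.append(r'\s+(?:\S+\s+)*?')
        elif m.group(1):
            # {!x}: exactly x non-whitespace characters.
            parts.append(r'\S{%s}' % m.group(2))
        else:
            # {x}: 0 to x non-whitespace characters.
            parts.append(r'\S{0,%s}' % m.group(2))
        pos = m.end()
    parts.append(re.escape(query[pos:]))
    return ''.join(parts)

# Check the translated patterns against the "result-like" column.
for word in ('home', 'homestead', 'homeward'):
    assert re.fullmatch(to_regex('home{5}'), word)
for word in ('warden', 'awarded'):
    assert re.fullmatch(to_regex('{1}ward{!2}'), word)
assert re.fullmatch(to_regex('thou {?} kill'), 'thou shalt not kill')
```

Under this reading, the strict token is the only one that can make a match fail for being too short: "ward" alone does not satisfy "{1}ward{!2}" because two more characters are required.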
Documentation:
note: list|tuple|set is aliased as Iter to simplify documentation. There is no Iter type in the fuzzquery package.
finditer(text, query, skip, ci)
Yields every (span, match) pair for a single query.
arg | description | type |
---|---|---|
text | the text to search | str |
query | the query to search for | str |
skip | terms and/or characters that trigger a skip when found in results | Iter\|None |
ci | case-insensitive matching | bool |
findany(text, queries, skip, ci)
ORs the queries together and yields every (span, match) pair of "whatever-is-next", that is, whichever query matches next in the text.
arg | description | type |
---|---|---|
text | the text to search | str |
queries | queries to combine for "whatever-is-next" search | Iter |
skip | terms and/or characters that trigger a skip when found in results | Iter\|None |
ci | case-insensitive matching | bool |
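The "whatever-is-next" behavior can be pictured as a single scan over the text with all queries joined by alternation. The sketch below shows that idea with plain regular expressions; it is not fuzzquery's code. `findany_sketch` is a hypothetical name, the patterns are hand-written regexes rather than fuzzquery queries, and skip handling is left out.

```python
import re
from typing import Iterable, Iterator, Tuple

def findany_sketch(text: str, patterns: Iterable[str],
                   ci: bool = False) -> Iterator[Tuple[Tuple[int, int], str]]:
    """Hypothetical illustration: OR the patterns and scan the text once."""
    flags = re.IGNORECASE if ci else 0
    # Each pattern is wrapped in a non-capturing group so "|" binds correctly.
    combined = re.compile('|'.join(f'(?:{p})' for p in patterns), flags)
    for m in combined.finditer(text):
        # Matches arrive in text order, regardless of which pattern hit.
        yield m.span(), m.group()

text = 'a homeless ward for Wardens'
results = list(findany_sketch(text, [r'home\S{0,5}', r'ward\S{0,3}'], ci=True))
for span, match in results:
    print(span, match)
```

Because the scan is a single pass, results from different patterns are interleaved in the order they appear in the text rather than grouped by pattern.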
iterall(text, queries, skip, ci)
Yields every (query, span, match) triple for multiple queries.
arg | description | type |
---|---|---|
text | the text to search | str |
queries | queries to search for | Iter |
skip | terms and/or characters that trigger a skip when found in results | Iter\|None |
ci | case-insensitive matching | bool |
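The iterall example under Examples guards its header print with "if query:", and its output shows each header once, which suggests the query is reported only alongside its first match. The sketch below mirrors that reading with plain regular expressions. It is an inference from the example, not fuzzquery's implementation: `iterall_sketch` is a hypothetical name, the patterns are hand-written regexes, and yielding None after the first match is an assumption.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

def iterall_sketch(text: str, patterns: Iterable[str], ci: bool = False
                   ) -> Iterator[Tuple[Optional[str], Tuple[int, int], str]]:
    """Hypothetical illustration: run each pattern in turn over the text."""
    flags = re.IGNORECASE if ci else 0
    for pattern in patterns:
        label: Optional[str] = pattern
        for m in re.finditer(pattern, text, flags):
            yield label, m.span(), m.group()
            # Assumption: the pattern is reported with its first match only.
            label = None

p1, p2 = r'home\S{0,5}', r'ward\S{0,2}'
results = list(iterall_sketch('homeward homestead warden', (p1, p2)))
for label, span, match in results:
    if label:
        print(f'\n{label}')
    print(f' {match}')
```

Unlike a combined "whatever-is-next" scan, this loop finishes one pattern before starting the next, so results are grouped by pattern.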
Examples:
import fuzzquery as fq
data = """
I headed homeward to meet with the Wardens.
When I arrived, I was greeted by a homely man that told me the homestead was awarded 5 million dollars.
We intend to use some of the homage to create a homeless ward.
The first piece of furniture will be my late-friend Homer's wardrobe.
"""
queries = ('hom{5} {?} wa{!1}{5}',
           'home{5}',
           '{1}ward{!2}{2}',
           'home{4} ward{4}')

for query, span, match in fq.iterall(data, queries, ci=True):
    if query: print(f'\n{query.upper()}')
    print(f' {match}')
output
HOM{5} {?} WA{!1}{5}
homeward to meet with the Wardens
homely man that told me the homestead was
homage to create a homeless ward
Homer's wardrobe
HOME{5}
homeward
homely
homestead
homeless
Homer's
{1}WARD{!2}{2}
Wardens
awarded
wardrobe
HOME{4} WARD{4}
homeless ward
Homer's wardrobe
import fuzzquery as fq
data = """
I would classify music as one of my favorite hobbies.
I love classical music played by classy musicians for a classic musical.
Beethoven can not be out-classed, music-wise - a man of class, musically gifted.
"""
query = 'class{4} music{4}'
print(f'\n{query.upper()} with skip')
for span, match in fq.finditer(data, query, skip=('classify', ','), ci=True):
    print(f' {match}')

print(f'\n{query.upper()} no skip')
for span, match in fq.finditer(data, query, ci=True):
    print(f' {match}')
output
CLASS{4} MUSIC{4} with skip
classical music
classy musicians
classic musical
CLASS{4} MUSIC{4} no skip
classify music
classical music
classy musicians
classic musical
classed, music
class, musically
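Comparing the two outputs above suggests that skip drops any result containing one of the listed terms or characters. A minimal sketch of that filtering rule follows, using the six "no skip" results as input. `apply_skip` is a hypothetical helper, and plain substring containment is an assumption about how fuzzquery applies skip internally.

```python
from typing import Iterable, List, Optional

def apply_skip(matches: Iterable[str],
               skip: Optional[Iterable[str]] = None) -> List[str]:
    """Hypothetical illustration: drop matches containing any skip term."""
    terms = tuple(skip or ())
    return [m for m in matches if not any(term in m for term in terms)]

# The "no skip" results listed in the output above.
no_skip = ['classify music', 'classical music', 'classy musicians',
           'classic musical', 'classed, music', 'class, musically']

with_skip = apply_skip(no_skip, skip=('classify', ','))
print(with_skip)
```

Here 'classify' removes the first result and ',' removes the last two, leaving the same three results as the "with skip" output.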