cupy_fft_match

A fast fuzzy matching algorithm based on FFT.

Installation

pip install cupy_fft_match

What is Fuzzy Matching

Let $A$ and $B$ be arrays, and let $P$ be a weight array of the same length as $B$.

For every $0 \leq k \leq \text{len}(A) - \text{len}(B)$, we want to efficiently calculate:

$$ \text{Match}(A, B, P)_k = \sum_{i=0}^{\text{len}(B)-1} (A_{k + i} - B_{i})^2\cdot P_i $$

where $\text{Match}(A, B, P)$ is the resulting array, of length $\text{len}(A)-\text{len}(B)+1$.
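
To make the definition concrete, here is a minimal reference sketch that evaluates the sum directly for 1-D inputs. It uses NumPy rather than CuPy so that it runs without a GPU, and the function name `match_reference` is hypothetical: it is not part of `cupy_fft_match`. It is slow (one pass over $B$ for every position $k$) and is only meant as a ground truth to compare faster results against.

```python
import numpy as np

def match_reference(a, b, p):
    """Evaluate the Match definition directly for 1-D inputs.

    Hypothetical helper for illustration; cost is O(len(a) * len(b)).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p = np.asarray(p, dtype=float)
    n_out = len(a) - len(b) + 1  # number of positions where b fits inside a
    out = np.empty(n_out)
    for k in range(n_out):
        window = a[k:k + len(b)]  # the slice A[k], ..., A[k + len(B) - 1]
        out[k] = np.sum((window - b) ** 2 * p)
    return out

scores = match_reference([1, 2, 3, 4, 5], [2, 3], [1, 1])
print(scores)  # [2. 0. 2. 8.]
```

The zero at index 1 marks the position where the template `[2, 3]` occurs exactly in the text.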

In practical terms, let $A$ be a string or a 2D image, and let $B$ and $P$ describe a template string/image that you want to find in $A$. The positions where $\text{Match}(A, B, P)$ is small are the positions in $A$ that match $B$ well.

If you don't need wildcards, set $P_i=1$ for every $i$. Any position $i$ with $P_i=0$ acts as a wildcard, because it contributes nothing to the sum.
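
As an illustration of the wildcard weights, the sketch below searches a short string for the pattern `ab?`, where `?` may be any character. It again uses NumPy and direct evaluation; the helper `weighted_scores` and the choice of encoding characters by their code points are assumptions made for this example, not something the package prescribes.

```python
import numpy as np

def weighted_scores(text, template, weights):
    """Direct evaluation of the weighted squared-difference score (1-D).

    Hypothetical helper for illustration only.
    """
    # Each row of `windows` is one length-len(template) slice of `text`.
    windows = np.lib.stride_tricks.sliding_window_view(text, len(template))
    return ((windows - template) ** 2 * weights).sum(axis=1)

# Encode characters as integer code points so they can be subtracted.
text = np.array([ord(c) for c in "abcabdabe"], dtype=np.int64)
template = np.array([ord(c) for c in "ab?"], dtype=np.int64)
# Weight 1 = position must match, weight 0 = wildcard (the "?" slot).
weights = np.array([1, 1, 0], dtype=np.int64)

scores = weighted_scores(text, template, weights)
hits = np.flatnonzero(scores == 0)  # exact matches have score 0
print(hits)  # [0 3 6]
```

Because the third weight is 0, the value stored in the template's `?` slot never influences the score, so `abc`, `abd`, and `abe` all match.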

The interfaces provided in this project support arrays of arbitrary dimension. You can therefore pass multi-dimensional arrays for $A$, $B$, and $P$; the algorithm works correctly as long as they have the same number of dimensions and $A$ is at least as long as $B$ in every dimension.
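
One way to see why an FFT helps is to expand the square in the definition:

$$ \text{Match}(A, B, P)_k = \sum_i A_{k+i}^2 P_i - 2\sum_i A_{k+i} B_i P_i + \sum_i B_i^2 P_i $$

The first two sums are cross-correlations (of $A^2$ with $P$, and of $A$ with $P \cdot B$), and the third is a constant. The sketch below computes them with N-dimensional FFTs. It is an illustration of the general technique under stated assumptions, not necessarily how `cupy_fft_match` is implemented: it uses NumPy so it runs without a GPU, and the names `cross_correlate_valid` and `match_fft` are hypothetical.

```python
import numpy as np

def cross_correlate_valid(x, y):
    """corr[k] = sum_i x[k + i] * y[i], kept only where y fits inside x."""
    axes = tuple(range(x.ndim))
    fx = np.fft.rfftn(x, s=x.shape, axes=axes)
    fy = np.fft.rfftn(y, s=x.shape, axes=axes)  # y is zero-padded to x's shape
    full = np.fft.irfftn(fx * np.conj(fy), s=x.shape, axes=axes)
    # The circular result is free of wrap-around for the first n - m + 1 offsets.
    valid = tuple(slice(0, n - m + 1) for n, m in zip(x.shape, y.shape))
    return full[valid]

def match_fft(a, b, p):
    """Weighted squared-difference scores via the three-term expansion."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    p = np.asarray(p, dtype=np.float64)
    term_aa = cross_correlate_valid(a * a, p)   # sum of A^2 * P
    term_ab = cross_correlate_valid(a, p * b)   # sum of A * B * P
    term_bb = np.sum(p * b * b)                 # sum of B^2 * P (constant)
    return term_aa - 2.0 * term_ab + term_bb

# A 6x7 "image" with distinct values, and a 2x3 template cut out of it.
image = (np.arange(42, dtype=np.float64) ** 2).reshape(6, 7)
template = image[2:4, 3:6].copy()
weights = np.ones_like(template)

scores = match_fft(image, template, weights)
best = tuple(int(i) for i in np.unravel_index(np.argmin(scores), scores.shape))
print(best)  # (2, 3)
```

The result has shape `(5, 5)`, that is, $\text{len}(A)-\text{len}(B)+1$ along each dimension. Because of floating-point round-off in the FFT, the score at an exact match is close to zero rather than exactly zero, so compare against a small tolerance instead of testing for equality.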

Since CuPy's ndarray is inherently different from NumPy's ndarray, please ensure that the data has been converted to the cupy.ndarray type before invoking the algorithms in this project.

Usage

import cupy as cp
import cupy_fft_match as cm

vec_a = cp.array([ ... ]) # vec_a is the text string
vec_b = cp.array([ ... ]) # vec_b is the template string; it is usually "smaller" than vec_a
vec_p = cp.array([ ... ]) # vec_p is the weight string, which has the same size as vec_b
match_ans = cm.match_arr(vec_a, vec_b, vec_p)
