In-process fzf/skim-style fuzzy finder for Python, implemented in Rust.

skimmatch

skimmatch is an in-process fzf/skim-style fuzzy finder for Python, implemented in Rust.

It is designed for ranked abbreviation matching over a fixed list of candidate strings. You give it strings such as filenames, references, titles, symbols, or command labels; users type short abbreviation-style queries; skimmatch returns the best candidates, scores, and optional highlight positions.

from skimmatch import Matcher

candidates = [
    "Follmer and Schied, Stochastic Finance, 2011",
    "Mildenhall and Major, Pricing Insurance Risk",
    "Wang distortion risk measures",
    "Archive reference catalogue",
]

matcher = Matcher(candidates)

for result in matcher.search("wang distortion", limit=3):
    print(result)

Example result:

{
    "index": 2,
    "score": 260,
    "text": "Wang distortion risk measures",
    "matches": [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
}

Scores come from the selected backend; higher is better. Treat the numeric value as ranking information only, not as a metric that is stable across versions.

What This Is

skimmatch solves the same broad problem as interactive fuzzy finders such as fzf and skim: finding good abbreviation matches quickly.

For example, a query like:

fs sf 2011

can match:

Follmer and Schied, Stochastic Finance, 2011

because the characters of each query token appear in the candidate in the right order and at useful positions, such as the starts of words.
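As a rough illustration of the idea (not the scoring that the Rust backends perform), the check below treats a candidate as a match when every whitespace-separated token occurs in it as a case-insensitive, in-order subsequence. The helper names `is_subsequence` and `matches_all_tokens` are made up for this sketch and are not part of skimmatch.

```python
def is_subsequence(token: str, text: str) -> bool:
    """True if the characters of token occur in text in order (gaps allowed)."""
    remaining = iter(text.lower())
    # `ch in remaining` consumes the iterator up to the first hit, so each
    # character has to be found after the previous one.
    return all(ch in remaining for ch in token.lower())


def matches_all_tokens(query: str, text: str) -> bool:
    """True if every whitespace-separated token is a subsequence of text."""
    return all(is_subsequence(token, text) for token in query.split())


candidates = [
    "Follmer and Schied, Stochastic Finance, 2011",
    "Wang distortion risk measures",
]
for text in candidates:
    print(matches_all_tokens("fs sf 2011", text), text)
```

The first candidate passes and the second does not, because it contains no "f". A real backend goes further and scores how good each alignment is.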

This is different from edit-distance fuzzy matching. Libraries such as RapidFuzz, Levenshtein, or token-ratio matchers are excellent for typo correction, deduplication, OCR cleanup, and record linkage. skimmatch is aimed at fast candidate selection, interactive search, and highlightable abbreviation matching.
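To make the contrast concrete, the sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for an edit-style similarity measure (it is not RapidFuzz and not part of skimmatch). A typo keeps the similarity high, while an abbreviation scores poorly even though it is a perfectly reasonable fzf-style query.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Edit-style similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


title = "Stochastic Finance"

# A one-letter typo still looks very similar to the title.
typo_ratio = similarity("Stochastik Finance", title)

# An abbreviation shares only two characters with the title.
abbrev_ratio = similarity("sf", title)

print(f"typo: {typo_ratio:.2f}  abbreviation: {abbrev_ratio:.2f}")
```

Edit-style measures are therefore the better choice for cleaning up near-duplicates, and a poor choice for ranking short abbreviation queries.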

Features

In-process Python extension: no external fzf executable required.
Rust matching backends using SkimMatcherV2, nucleo-matcher, and frizbee.
Preloaded candidate lists for fast repeated queries.
Single-token and multi-token search modes.
Optional highlight indices for UI rendering.
Legacy tuple-returning APIs for compatibility with the earlier rustfuzz shape.
Structured Matcher.search(...) API for new code.
Backend argument already present, so future backends can be added without changing the public matcher classes.

Installation

From PyPI:

pip install skimmatch

From a local checkout:

uv pip install -e .

or build with maturin:

uv run maturin develop

The current package metadata targets Python 3.13 or newer.

Quick Start

Use Matcher for new code.

from skimmatch import Matcher

candidates = [
    "Buhlmann, Mathematical Methods in Risk Theory",
    "Cramer, Collective Risk Theory",
    "Mildenhall and Major, Pricing Insurance Risk",
    "Kaas, Goovaerts, Dhaene, and Denuit, Modern Actuarial Risk Theory",
]

matcher = Matcher(candidates)
results = matcher.search("risk theory", limit=5)

for result in results:
    print(result["index"], result["score"], result["text"])

By default, search:

splits the query on whitespace;
requires every query token to match;
returns up to 20 results;
includes candidate text;
includes highlight positions.

Structured API

matcher = Matcher(candidates, backend="nucleo")  # or "skim" or "frizbee"
results = matcher.search(
    query,
    limit=20,
    highlights=True,
    include_text=True,
    multi=True,
)

Each result is a dictionary containing:

{
    "index": 0,         # original candidate index
    "score": 123,       # backend score, higher is better
    "text": "...",      # included when include_text=True
    "matches": [0, 3],  # included when highlights=True
}
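One way to use `matches` in a UI is to wrap each run of matched characters in markers. The sketch below assumes that the positions are character offsets into `text`; the helper `render_highlights` and the sample result are hypothetical and only follow the dictionary shape shown above.

```python
def render_highlights(text, matches, open_mark="[", close_mark="]"):
    """Wrap every run of consecutive matched positions in markers."""
    marked = set(matches)
    out = []
    inside = False
    for i, ch in enumerate(text):
        if i in marked and not inside:
            out.append(open_mark)   # a highlighted run starts here
            inside = True
        elif i not in marked and inside:
            out.append(close_mark)  # the run ended on the previous character
            inside = False
        out.append(ch)
    if inside:
        out.append(close_mark)      # close a run that reaches the end of text
    return "".join(out)


# Hand-written sample in the documented result shape, not real matcher output.
result = {
    "index": 2,
    "score": 123,
    "text": "Wang distortion risk measures",
    "matches": [0, 1, 2, 3, 5, 6, 7],
}
print(render_highlights(result["text"], result["matches"]))
# prints: [Wang] [dis]tortion risk measures
```

For HTML output the same loop could emit `<mark>` and `</mark>`, with the text escaped first.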

Parameters

query

The search string. In multi-token mode, whitespace-separated tokens are matched independently and every token must match the candidate.

limit

The maximum number of results to return. limit=0 returns an empty list.

highlights

When true, results include matches, a sorted and deduplicated list of matched positions. Turn this off when you only need ranking; score-only matching does less work.

include_text

When true, each result includes the original candidate string. Turn this off if you already have the candidate list and want smaller result objects.

multi

When true, the query is split on whitespace and all tokens are required. When false, the whole query is sent to the matcher as one pattern.
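The practical effect of `multi` is easiest to see with the tokens in the "wrong" order. The toy below is not how the backends work: it uses a regular expression for in-order matching and assumes that a single pattern must match all of its characters, including the space, in sequence. `fuzzy_regex` and `toy_match` are names invented for this sketch.

```python
import re


def fuzzy_regex(pattern: str) -> re.Pattern:
    """Regex matching the characters of pattern in order, gaps allowed."""
    body = ".*?".join(re.escape(ch) for ch in pattern)
    return re.compile(body, re.IGNORECASE | re.DOTALL)


def toy_match(query: str, text: str, multi: bool = True) -> bool:
    """multi=True: every token must match; multi=False: one whole pattern."""
    patterns = query.split() if multi else [query]
    return all(fuzzy_regex(p).search(text) for p in patterns)


text = "Cramer, Collective Risk Theory"
print(toy_match("theory risk", text, multi=True))   # tokens matched independently
print(toy_match("theory risk", text, multi=False))  # one in-order pattern
```

Under these assumptions the first call succeeds and the second fails, because nothing follows "Theory" that could match the rest of the pattern.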

Legacy APIs

The package also exports compatibility classes with tuple return shapes:

from skimmatch import FuzzyMatcher, FuzzyMatcherMulti, FuzzyMatcherMultiHi

`FuzzyMatcher`

Treats the whole query as one pattern.

matcher = FuzzyMatcher(candidates)
indices, scores = matcher.query("sf", top_k=10)

`FuzzyMatcherMulti`

Splits the query on whitespace. Every token must match.

matcher = FuzzyMatcherMulti(candidates)
indices, scores = matcher.query("pricing insurance", top_k=10)

`FuzzyMatcherMultiHi`

Like FuzzyMatcherMulti, but also returns highlight positions.

matcher = FuzzyMatcherMultiHi(candidates)
indices, scores, highlights = matcher.query("pricing insurance", top_k=10)
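When migrating to the structured shape, the legacy tuples can be zipped into dictionaries. This sketch assumes that `indices`, `scores`, and `highlights` are parallel sequences (one entry per returned candidate); that is inferred from the unpacking above rather than stated by the package. `to_structured` is a hypothetical helper.

```python
def to_structured(indices, scores, highlights=None, candidates=None):
    """Convert legacy parallel sequences into a list of result dictionaries."""
    results = []
    for pos, (index, score) in enumerate(zip(indices, scores)):
        item = {"index": index, "score": score}
        if candidates is not None:
            item["text"] = candidates[index]         # look up the original string
        if highlights is not None:
            item["matches"] = list(highlights[pos])  # positions for this result
        results.append(item)
    return results


# Placeholder values in the legacy shape, not real matcher output.
candidates = ["Cramer, Collective Risk Theory", "Pricing Insurance Risk"]
indices, scores, highlights = [1, 0], [90, 40], [[0, 1, 2], [19, 20]]

for row in to_structured(indices, scores, highlights, candidates):
    print(row)
```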

Matching Behavior

The available backends are:

backend="skim"
backend="nucleo"
backend="frizbee"

backend="skim" uses SkimMatcherV2 from the Rust fuzzy-matcher crate and is kept for compatibility.

backend="nucleo" is the default. It uses nucleo-matcher, the lower-level matcher from the nucleo ecosystem, a modern fzf-like matcher that may rank candidates differently from skim. Scores are backend-specific and should not be compared between backends.

backend="frizbee" uses frizbee, a SIMD matcher with typo-resistant matching support. skimmatch currently runs it with typo tolerance disabled for a closer comparison with the other fzf-style backends. It matches against bytes, so highlight lists are intentionally empty for this backend until Unicode offset semantics are defined.
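The byte versus character distinction matters as soon as a candidate contains non-ASCII text. The sketch below only illustrates that general point with UTF-8 and plain Python strings; it makes no claim about which offsets frizbee reports. `byte_to_char_positions` is a hypothetical helper.

```python
def byte_to_char_positions(text: str, byte_positions):
    """Map UTF-8 byte offsets to character indices.

    Offsets that fall in the middle of a multi-byte character are skipped.
    """
    char_at_byte = {}
    offset = 0
    for index, ch in enumerate(text):
        char_at_byte[offset] = index        # byte offset where this character starts
        offset += len(ch.encode("utf-8"))
    return [char_at_byte[b] for b in byte_positions if b in char_at_byte]


text = "Bühlmann"
print(len(text), len(text.encode("utf-8")))   # 8 characters, 9 bytes
print(byte_to_char_positions(text, [0, 3]))   # byte 3 is "h", character 2
```

Using byte offsets directly as string indices would highlight the wrong character for every position after the "ü".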

Scoring tends to reward:

characters appearing in order;
compact alignments;
word-boundary matches;
punctuation-separated and camel-case transitions;
early matches;
consecutive query-character matches;
candidates that match every query token in multi-token mode.

skimmatch returns candidates sorted by descending score. Ties are ordered by the original candidate index for deterministic output.
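The ordering rule can be sketched in a few lines of Python. This is an illustration of "descending score, then ascending index", not the Rust implementation; `top_results` is a hypothetical name and the scores are placeholders.

```python
import heapq


def top_results(scored, limit=20):
    """Return up to `limit` (index, score) pairs, best first.

    Higher scores come first; equal scores keep the lower candidate index
    first, which makes the output deterministic.
    """
    return heapq.nsmallest(limit, scored, key=lambda pair: (-pair[1], pair[0]))


scored = [(0, 50), (1, 90), (2, 50), (3, 90)]
print(top_results(scored, limit=3))
# prints: [(1, 90), (3, 90), (0, 50)]
```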

When To Use It

skimmatch is a good fit for:

command palettes;
file pickers;
bibliography and reference search;
symbol search;
autocomplete over known labels;
terminal or web UI candidate selection;
fast repeated queries over a preloaded list.

It is probably not the right tool for:

typo correction;
deduplication;
record linkage;
token-sort similarity;
OCR cleanup;
semantic search;
embedding-based retrieval.

Those are useful problems, but they are different from fzf/skim-style abbreviation matching.

Performance Notes

Candidate strings are copied into Rust once when the matcher is constructed. Repeated calls to query or search scan that Rust-owned list and return only the final top results to Python.

For best performance:

construct one matcher and reuse it across queries;
set highlights=False when you only need indices and scores;
set include_text=False when you already have the candidate strings;
use limit to keep returned result objects small.
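A simple way to follow the first recommendation is to keep matchers in a small registry so that each candidate list is loaded only once. The sketch below is generic Python; `get_matcher` and `CountingMatcher` are invented for the example, and the stand-in class only counts how often it is constructed.

```python
_matchers = {}


def get_matcher(name, build):
    """Return the matcher registered under `name`, building it on first use."""
    if name not in _matchers:
        _matchers[name] = build()   # the expensive construction happens once
    return _matchers[name]


class CountingMatcher:
    """Stand-in for a real matcher; records how many instances were built."""
    built = 0

    def __init__(self, candidates):
        CountingMatcher.built += 1
        self.candidates = list(candidates)


labels = ["open file", "save file", "close tab"]
first = get_matcher("commands", lambda: CountingMatcher(labels))
second = get_matcher("commands", lambda: CountingMatcher(labels))
print(first is second, CountingMatcher.built)
# prints: True 1
```

In an application the `build` callable would construct a `Matcher` over the real candidate list instead of the stand-in.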

Development

This project is a Python package with a Rust extension built by maturin.

Run the tests:

uv run pytest tests/test_skimmatch.py -q

Check Rust formatting:

cargo fmt --check

Important files:

src/lib.rs: Rust/PyO3 extension implementation.
python/skimmatch/__init__.py: Python re-exports.
tests/test_skimmatch.py: API and behavior tests.
pyproject.toml: Python packaging and maturin configuration.
Cargo.toml: Rust crate configuration.

Backend Roadmap

The public API accepts a backend argument. Today "skim", "nucleo", and "frizbee" are implemented. frizbee is experimental and currently exposes score/ranking behavior without highlight positions.

Unknown backend names currently raise ValueError.
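If the backend name comes from user configuration, an application may want to check it before building a matcher and to know whether highlight positions can be expected. The sketch below is an application-side guard based only on the backend names and the frizbee note in this README; `check_backend` and `has_highlights` are hypothetical helpers, not part of skimmatch.

```python
KNOWN_BACKENDS = ("skim", "nucleo", "frizbee")


def check_backend(name: str) -> str:
    """Return `name` unchanged, or raise ValueError for an unknown backend."""
    if name not in KNOWN_BACKENDS:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {', '.join(KNOWN_BACKENDS)}"
        )
    return name


def has_highlights(name: str) -> bool:
    """frizbee currently returns empty highlight lists, per this README."""
    return check_backend(name) != "frizbee"


for name in ("nucleo", "frizbee", "fzy"):
    try:
        print(name, has_highlights(name))
    except ValueError as exc:
        print("rejected:", exc)
```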

License

MIT.

Release history

0.2.1

May 18, 2026

This version

0.2.0

May 17, 2026

0.1.0

May 16, 2026

