Composable Python search with per-field match strategies and a Q expression DSL

Project description

Srxy

Smart, composable search for Python — and your filesystem.

Pass any list of objects (dicts, dataclasses, Pydantic models) and find what you mean, not just what you typed. Fuzzy, phonetic, and composite matching out of the box. Search files by name or content from Python or the terminal.

pip install srxy

Why Srxy?

Feature	What it does
Magic search	One function call. Auto-discovers fields, blends matchers, ranks by score.
Field search + AND/OR	Per-field strategies with a fluent `Q` DSL; combine conditions with `&` and `|`.
File search + CLI	Search paths by file name and/or content. Same smart matching, plus a `srxy` command.

Magic search

The fastest path to good results. magic_search auto-discovers fields from your items, runs composite matching on each, and keeps the best score (OR semantics). Typos, phonetic near-misses, and partial matches are handled for you.

from srxy import magic_search

items = [
    {"name": "salt"},
    {"name": "salty"},
    {"name": "salad"},
]

# Match across specific fields
results = magic_search(items, "salat", fields=["name"])
print(results[0].item["name"])  # salad
print(results[0].score)

# Or search every discoverable field (default)
results = magic_search(items, "salat")

Works with dicts, dataclasses, and Pydantic models. Default threshold is 0.25; tune it when you need stricter or looser matches.
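To make the "score every field, keep the best" idea concrete, here is a minimal standard-library sketch. It is not Srxy's implementation: `difflib.SequenceMatcher` stands in for the composite matcher, and the names `discover_fields`, `similarity`, `best_field_search`, and `City` are hypothetical. Because the stand-in scorer differs from Srxy's composite blend, rankings here will not necessarily match what `magic_search` returns.

```python
from dataclasses import asdict, dataclass, is_dataclass
from difflib import SequenceMatcher


def discover_fields(item):
    """Return a {field: value} mapping for a dict or a dataclass instance."""
    if isinstance(item, dict):
        return item
    if is_dataclass(item) and not isinstance(item, type):
        return asdict(item)
    raise TypeError(f"unsupported item type: {type(item).__name__}")


def similarity(query, value):
    """Stand-in scorer (assumption): case-insensitive difflib ratio in [0, 1]."""
    return SequenceMatcher(None, query.lower(), str(value).lower()).ratio()


def best_field_search(items, query, threshold=0.25):
    """Score each string field, keep the best score per item, rank descending."""
    scored = []
    for item in items:
        fields = discover_fields(item)
        scores = [similarity(query, v) for v in fields.values() if isinstance(v, str)]
        best = max(scores, default=0.0)  # OR semantics: the best field wins
        if best >= threshold:
            scored.append((best, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


@dataclass
class City:
    name: str
    country: str


items = [
    City("Lima", "Peru"),
    {"name": "Bern", "country": "Switzerland"},
    City("Berlin", "Germany"),
]
ranked = best_field_search(items, "berlin")
for score, item in ranked:
    print(round(score, 2), item)
```

Raising the threshold drops weak matches, which is the same trade-off as tuning the threshold in the library.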

Field search with AND / OR

When you need precision, use search with the Q expression DSL. Pick a match strategy per field, then wire them together with boolean logic.

from srxy import search, Q, FieldConfig, MatchType

# OR — match if any field scores well
search(items, "salat", where=Q.composite("name") | Q.contains("tags"))

# AND — every branch must clear the threshold
search(items, "spatial", where=Q.all(Q.composite("name"), Q.exact("status")))

# Nested — (sku OR barcode) AND label
search(
    items,
    "ABC-123",
    where=Q.any(Q.exact("sku"), Q.exact("barcode")) & Q.exact("label"),
)

Boolean scoring: OR uses max(child scores), AND uses min(child scores).
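The max/min rule can be illustrated with a tiny expression tree. This is only a sketch of the scoring rule stated above, not Srxy's `Q` class: `Expr` and `field` are hypothetical names, and per-field scores are supplied directly instead of being computed by a matcher.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Scores = Mapping[str, float]


@dataclass(frozen=True)
class Expr:
    """A node that turns per-field scores into one combined score."""

    evaluate: Callable[[Scores], float]

    def __or__(self, other: "Expr") -> "Expr":
        # OR: the best child score wins
        return Expr(lambda s: max(self.evaluate(s), other.evaluate(s)))

    def __and__(self, other: "Expr") -> "Expr":
        # AND: the weakest child score limits the result
        return Expr(lambda s: min(self.evaluate(s), other.evaluate(s)))


def field(name: str) -> Expr:
    """Leaf node: look up the already-computed score for one field."""
    return Expr(lambda s: s.get(name, 0.0))


# (sku OR barcode) AND label, with made-up per-field scores
scores = {"sku": 0.2, "barcode": 0.9, "label": 0.6}
expr = (field("sku") | field("barcode")) & field("label")
print(expr.evaluate(scores))  # min(max(0.2, 0.9), 0.6) = 0.6
```

Under this rule an AND expression clears a threshold only when every branch does, while an OR expression needs just one strong branch.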

Prefer explicit config over the DSL? Pass a list of FieldConfig instead:

# `people` is any list of items with `role` and `name` fields
search(
    people,
    "engineer",
    fields=[
        FieldConfig("role", MatchType.EXACT, weight=2.0),
        FieldConfig("name", MatchType.CONTAINS, weight=1.0),
    ],
    threshold=0.5,
)

File search

Search filesystem paths by file name, file content, or both — no ML required. Directories are walked recursively. By default, dot-prefixed hidden entries and noise folders (__pycache__, node_modules) are skipped. Content search scores each line and returns matching line numbers.

Supported content formats: plain text, .pdf, .docx, .xlsx, and .pptx (text extracted automatically).

from pathlib import Path
from srxy import magic_file_search

results = magic_file_search(Path("./src"), "registry", threshold=0.3)
for result in results:
    print(result.path, result.score, result.breakdown)
    for line in result.lines:
        print(f"  line {line.line_number}: {line.text}")

# Include hidden directories and files (e.g. .git)
results = magic_file_search(Path("."), "token", skip_hidden_folders=False)

# Include noise directories (e.g. __pycache__, node_modules)
results = magic_file_search(Path("."), "token", skip_noise_folders=False)

# Search everywhere — disable both skip flags
results = magic_file_search(
    Path("."),
    "token",
    skip_hidden_folders=False,
    skip_noise_folders=False,
)
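For readers curious how the skip rules and line-level matching might work, here is a rough standard-library sketch. It is not Srxy's code: `iter_files`, `matching_lines`, and `NOISE_DIRS` are hypothetical names, plain substring matching stands in for the scored matchers, and only UTF-8 text files are read.

```python
import os
import tempfile
from pathlib import Path

NOISE_DIRS = {"__pycache__", "node_modules"}  # assumed noise list, from the text above


def iter_files(root, skip_hidden=True, skip_noise=True):
    """Walk root recursively, pruning hidden and noise directories in place."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [
            d for d in dirnames
            if not (skip_hidden and d.startswith("."))
            and not (skip_noise and d in NOISE_DIRS)
        ]
        for name in filenames:
            if skip_hidden and name.startswith("."):
                continue
            yield Path(dirpath) / name


def matching_lines(path, query):
    """Return (line_number, text) pairs whose text contains the query."""
    try:
        text = path.read_text(encoding="utf-8")
    except (UnicodeDecodeError, OSError):
        return []  # unreadable or binary file: no content matches
    needle = query.lower()
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for folder in ("src", ".git", "node_modules"):
        (root / folder).mkdir()
    (root / "src" / "app.py").write_text("import os\nregistry = {}\n", encoding="utf-8")
    (root / ".git" / "config").write_text("token = abc\n", encoding="utf-8")
    (root / "node_modules" / "lib.js").write_text("// registry\n", encoding="utf-8")

    default_names = sorted(p.name for p in iter_files(root))
    all_names = sorted(
        p.name for p in iter_files(root, skip_hidden=False, skip_noise=False)
    )
    hits = matching_lines(root / "src" / "app.py", "registry")

print(default_names, all_names, hits)
```

Pruning `dirnames` in place is what stops `os.walk` from descending into skipped folders, so large trees such as `node_modules` are never visited at all.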

CLI

Search from the terminal after install:

# Search names and contents (grouped output)
srxy registry ./src

# Content only — shows line numbers
srxy revenue ./docs --content-only

# Flat, pipe-friendly output
srxy token ./src --format flat

# JSON for scripting
srxy budget . --json

# Search hidden directories and files (e.g. .git)
srxy token . --include-hidden

# Search noise directories (e.g. __pycache__, node_modules)
srxy token . --include-noise

# Search everywhere
srxy token . --include-hidden --include-noise

Options: --names-only, --content-only, --include-hidden, --include-noise, --threshold, --max-file-size, --max-line-matches, --semantic (opt-in ML).

Exit codes: 0 (matches found), 1 (no matches), 2 (usage or path error).
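The exit codes make the command easy to script. The sketch below wraps the `srxy <query> <path> --json` invocation shown above with `subprocess`; the helper names `run_srxy` and `describe_exit` are hypothetical, and the JSON output is returned as raw text because its schema is not described here.

```python
import subprocess

# Exit-code meanings as listed in the CLI section above
EXIT_MEANINGS = {
    0: "matches found",
    1: "no matches",
    2: "usage or path error",
}


def describe_exit(code: int) -> str:
    """Translate an srxy exit code into a human-readable status."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_srxy(query: str, path: str) -> tuple[int, str]:
    """Run the srxy CLI (assumed to be on PATH) and return (exit code, stdout)."""
    completed = subprocess.run(
        ["srxy", query, path, "--json"],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 means "no matches", not a crash
    )
    return completed.returncode, completed.stdout


# Example usage (requires srxy to be installed):
#   code, output = run_srxy("budget", ".")
#   print(describe_exit(code))
```

Passing `check=False` matters here: a "no matches" result exits with 1, which `subprocess.run` would otherwise raise as an error.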

Match types

Type	Behavior
`EXACT`	Case-insensitive full string equality
`CONTAINS`	Substring match
`PARTIAL`	Prefix or suffix match
`FUZZY`	Character-level similarity (rapidfuzz)
`PHONETIC`	Sounds-alike (metaphone, soundex, NYSIIS with graduated scoring)
`SEMANTIC`	Meaning similarity (optional; see below)
`COMPOSITE`	Weighted blend of available atomic matchers (default smart mode)
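As a rough illustration of the three simplest rows in the table, the functions below implement exact, contains, and partial matching with the standard library. They are not Srxy's matchers: the binary 1.0/0.0 scores and the case-insensitive handling of `contains` and `partial` are assumptions made for this sketch.

```python
def exact(query: str, value: str) -> float:
    """Case-insensitive full string equality."""
    return 1.0 if query.casefold() == value.casefold() else 0.0


def contains(query: str, value: str) -> float:
    """Query appears anywhere inside the value."""
    return 1.0 if query.casefold() in value.casefold() else 0.0


def partial(query: str, value: str) -> float:
    """Query is a prefix or a suffix of the value."""
    q, v = query.casefold(), value.casefold()
    return 1.0 if v.startswith(q) or v.endswith(q) else 0.0


for query in ("Salad", "sal", "ala"):
    print(query, exact(query, "salad"), contains(query, "salad"), partial(query, "salad"))
```

The query "ala" shows the difference between the last two: it is a substring of "salad" but neither a prefix nor a suffix.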

Default composite weights: fuzzy 35%, semantic 20%, partial 15%, phonetic 12%, contains 10%, exact 8%. When semantic is disabled, composite skips it and renormalizes the remaining weights. Override per field via composite_weights on Q.composite(...) or FieldConfig.
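The renormalization step can be worked through by hand. The sketch below uses the default weights listed above; `renormalize` and `composite_score` are hypothetical helper names, and the weighted-sum formula is an assumption about how the blend is computed.

```python
DEFAULT_WEIGHTS = {
    "fuzzy": 0.35,
    "semantic": 0.20,
    "partial": 0.15,
    "phonetic": 0.12,
    "contains": 0.10,
    "exact": 0.08,
}


def renormalize(weights, disabled=frozenset()):
    """Drop disabled matchers and rescale the rest so they sum to 1."""
    active = {name: w for name, w in weights.items() if name not in disabled}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("no active matchers with positive weight")
    return {name: w / total for name, w in active.items()}


def composite_score(scores, weights):
    """Weighted sum of per-matcher scores; missing matchers count as 0."""
    return sum(w * scores.get(name, 0.0) for name, w in weights.items())


# Without semantic, the remaining weights sum to 0.80, so each is divided by 0.80
weights = renormalize(DEFAULT_WEIGHTS, disabled={"semantic"})
for name, w in weights.items():
    print(f"{name}: {w:.4f}")
```

For example, fuzzy rises from 0.35 to 0.35 / 0.80 = 0.4375, so the relative importance of the remaining matchers is preserved.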

Semantic matching (optional)

Semantic search is off by default. Opt in when you need meaning-based similarity:

export SRXY_SEMANTIC=1
pip install 'srxy[semantic]'

With SRXY_SEMANTIC=1, composite matching includes semantic similarity. Explicit Q.semantic(...) or MatchType.SEMANTIC raises a clear error if semantic is not enabled.

Default model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (downloaded from Hugging Face on first use). For a local cache:

./scripts/download_semantic_model.sh
export SRXY_SEMANTIC_MODEL_PATH=~/.cache/srxy/semantic-model

Core dependencies (always installed): rapidfuzz (fuzzy matching) and jellyfish (phonetic matching).

Development

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,semantic]"
./scripts/quality/checks.sh --fix
./scripts/quality/checks.sh

Quality gate: Ruff → ShellCheck/shfmt → basedpyright → pip-audit → build → pytest.

Local (./scripts/quality/checks.sh): runs all tests (unit + integration).
CI: runs only pytest -m unit (fast tests; no semantic model required).

Integration tests require pip install -e ".[semantic]" and SRXY_SEMANTIC=1 (set automatically in tests/integration/conftest.py):

pytest -m integration

Integration tests load a curated news-style corpus from tests/fixtures/search_corpus.json and measure top-k hit rates.

Project details

Release history

1.2.0 (Jun 26, 2026)
1.1.0 (Jun 25, 2026)
1.0.0 (Jun 25, 2026), this version

Download files

Source Distribution

srxy-1.0.0.tar.gz (19.0 kB view details)

Uploaded Jun 25, 2026 Source

Built Distribution

srxy-1.0.0-py3-none-any.whl (20.3 kB view details)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file srxy-1.0.0.tar.gz.

File metadata

Download URL: srxy-1.0.0.tar.gz
Upload date: Jun 25, 2026
Size: 19.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for srxy-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`667ae0ae1093c5a40dfac9fef8f0fb38d8ae8ef6f88748bece63c4681d280e43`
MD5	`18d9287c4c3ab9450e4034041170d55b`
BLAKE2b-256	`338911cfecd8ae1f026a0cd236ee09e720e2a46189f090072fe774bfa699290b`

File details

Details for the file srxy-1.0.0-py3-none-any.whl.

File metadata

Download URL: srxy-1.0.0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 20.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for srxy-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e5c13b7feb8b6225a056214069aa9405f65783412200e3290842afbfeb18fee3`
MD5	`2d9a4584101e18b2d754b7657d3981d2`
BLAKE2b-256	`ddd4948b4ee75311e27e9204a08e06de454193b6989edef2a363eab345e15c84`
