Composable Python search with per-field match strategies and a Q expression DSL
Srxy
Smart, composable search for Python — and your filesystem.
Pass any list of objects (dicts, dataclasses, Pydantic models) and find what you mean, not just what you typed. Fuzzy, phonetic, and composite matching out of the box. Search files by name or content from Python or the terminal.
Installation
Use as a library (in a project or virtualenv):
pip install srxy
Use the CLI globally (recommended for terminal use):
pipx install srxy
If you don't have pipx yet, see the pipx installation guide.
Why Srxy?
| Feature | Summary |
|---|---|
| Magic search | One function call. Auto-discovers fields, blends matchers, ranks by score. |
| Field search + AND/OR | Per-field strategies with a fluent Q DSL — combine conditions with & and |. |
| File search + CLI | Search paths by file name and/or content. Same smart matching, plus a srxy command. |
Magic search
The fastest path to good results. magic_search auto-discovers fields from your items, runs composite matching on each, and keeps the best score (OR semantics). Typos, phonetic near-misses, and partial matches are handled for you.
from srxy import magic_search
items = [
    {"name": "salt"},
    {"name": "salty"},
    {"name": "salad"},
]
# Match across specific fields
results = magic_search(items, "salat", fields=["name"])
print(results[0].item["name"]) # salad
print(results[0].score)
# Or search every discoverable field (default)
results = magic_search(items, "salat")
Works with dicts, dataclasses, and Pydantic models. Default threshold is 0.25; tune it when you need stricter or looser matches.
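To make the "keep the best score across fields" rule concrete, here is a minimal sketch. It is not Srxy's implementation: it uses `difflib.SequenceMatcher` from the standard library as a stand-in for composite matching, so its scores and ranking will differ from what `magic_search` returns. The helper names `field_score` and `best_field_search` are hypothetical; only the OR semantics and the 0.25 default threshold come from the description above.

```python
from difflib import SequenceMatcher


def field_score(query: str, value: str) -> float:
    """Stand-in similarity in [0, 1]; Srxy's composite matcher is more elaborate."""
    return SequenceMatcher(None, query.lower(), value.lower()).ratio()


def best_field_search(items, query, fields=None, threshold=0.25):
    """Score each field, keep the best score per item (OR semantics), rank descending."""
    ranked = []
    for item in items:
        # With no explicit fields, fall back to every string-valued key (assumed discovery rule).
        names = fields if fields is not None else [k for k, v in item.items() if isinstance(v, str)]
        scores = [field_score(query, item[n]) for n in names if isinstance(item.get(n), str)]
        best = max(scores, default=0.0)
        if best >= threshold:
            ranked.append((best, item))
    # Sort on the score only, so dicts are never compared with each other.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


books = [
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": "Herbal Remedies", "author": "Jane Doe"},
]
for score, book in best_field_search(books, "herbert"):
    print(f"{score:.2f}", book["title"])
```

Because the per-item score is the maximum over fields, a strong match in a single field is enough for an item to rank well, even when every other field scores zero.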
Field search with AND / OR
When you need precision, use search with the Q expression DSL. Pick a match strategy per field, then wire them together with boolean logic.
from srxy import search, Q, FieldConfig, MatchType
# OR — match if any field scores well
search(items, "salat", where=Q.composite("name") | Q.contains("tags"))
# AND — every branch must clear the threshold
search(items, "spatial", where=Q.all(Q.composite("name"), Q.exact("status")))
# Nested — (sku OR barcode) AND label
search(
    items,
    "ABC-123",
    where=Q.any(Q.exact("sku"), Q.exact("barcode")) & Q.exact("label"),
)
Boolean scoring: OR uses max(child scores), AND uses min(child scores).
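The max/min rule can be sketched with a tiny expression tree. This is an illustration of the scoring rule only, not Srxy's internals: the `Leaf`, `AnyOf`, and `AllOf` classes and the `evaluate` function are hypothetical names, and the per-field scores are made-up inputs rather than the output of a real matcher.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    field: str  # score is looked up by field name


@dataclass(frozen=True)
class AnyOf:
    children: Tuple["Node", ...]  # OR branch


@dataclass(frozen=True)
class AllOf:
    children: Tuple["Node", ...]  # AND branch


Node = Union[Leaf, AnyOf, AllOf]


def evaluate(node: Node, field_scores: Dict[str, float]) -> float:
    """Combine per-field scores: OR takes the max, AND takes the min."""
    if isinstance(node, Leaf):
        return field_scores.get(node.field, 0.0)
    child_scores = [evaluate(child, field_scores) for child in node.children]
    if isinstance(node, AnyOf):
        return max(child_scores)
    return min(child_scores)


# (sku OR barcode) AND label, mirroring the nested example above.
expr = AllOf((AnyOf((Leaf("sku"), Leaf("barcode"))), Leaf("label")))
scores = {"sku": 0.2, "barcode": 0.9, "label": 0.6}
print(evaluate(expr, scores))  # min(max(0.2, 0.9), 0.6) = 0.6
```

Taking the minimum for AND means the weakest branch decides the final score, which is why every branch has to clear the threshold for the item to be returned.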
Prefer explicit config over the DSL? Pass a list of FieldConfig instead:
search(
    people,
    "engineer",
    fields=[
        FieldConfig("role", MatchType.EXACT, weight=2.0),
        FieldConfig("name", MatchType.CONTAINS, weight=1.0),
    ],
    threshold=0.5,
)
File search
Search filesystem paths by file name, file content, or both — no ML required. Directories are walked recursively. By default, dot-prefixed hidden entries and noise folders (__pycache__, node_modules) are skipped. Content search scores each line and returns matching line numbers.
Supported content formats: plain text, .pdf, .docx, .xlsx, and .pptx (text extracted automatically).
from pathlib import Path
from srxy import magic_file_search
results = magic_file_search(Path("./src"), "registry", threshold=0.3)
for result in results:
    print(result.path, result.score, result.breakdown)
    for line in result.lines:
        print(f" line {line.line_number}: {line.text}")
# Include hidden directories and files (e.g. .git)
results = magic_file_search(Path("."), "token", skip_hidden_folders=False)
# Include noise directories (e.g. __pycache__, node_modules)
results = magic_file_search(Path("."), "token", skip_noise_folders=False)
# Search everywhere — disable both skip flags
results = magic_file_search(
    Path("."),
    "token",
    skip_hidden_folders=False,
    skip_noise_folders=False,
)
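For readers curious how the default skipping and the line-numbered content matches could work, here is a standard-library sketch. It is not Srxy's code: `walk_files`, `matching_lines`, and `NOISE_DIRS` are hypothetical names, the noise list contains only the two folders named above (the real list may be longer), and a plain case-insensitive substring test stands in for Srxy's scored matching.

```python
import os
from pathlib import Path
from typing import Iterator, List, Tuple

# Assumption: only the two noise folders mentioned in the text.
NOISE_DIRS = {"__pycache__", "node_modules"}


def walk_files(root: Path, skip_hidden: bool = True, skip_noise: bool = True) -> Iterator[Path]:
    """Yield files under root recursively, pruning hidden and noise entries."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [
            d for d in dirnames
            if not (skip_hidden and d.startswith("."))
            and not (skip_noise and d in NOISE_DIRS)
        ]
        for name in filenames:
            if skip_hidden and name.startswith("."):
                continue
            yield Path(dirpath) / name


def matching_lines(path: Path, query: str) -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs containing the query; line numbers start at 1."""
    needle = query.lower()
    with path.open(encoding="utf-8", errors="ignore") as handle:
        return [
            (number, line.rstrip("\n"))
            for number, line in enumerate(handle, start=1)
            if needle in line.lower()
        ]
```

Calling `matching_lines(path, "token")` for each `path` from `walk_files(Path("."))` gives a rough plain-text equivalent of a content search. Pruning `dirnames` in place, rather than filtering results afterwards, avoids walking large trees such as node_modules at all.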
CLI
Install with pipx for a global srxy command (pipx install srxy), then search from the terminal:
# Search names and contents (grouped output)
srxy registry ./src
# Content only — shows line numbers
srxy revenue ./docs --content-only
# Flat, pipe-friendly output
srxy token ./src --format flat
# JSON for scripting
srxy budget . --json
# Search hidden directories and files (e.g. .git)
srxy token . --include-hidden
# Search noise directories (e.g. __pycache__, node_modules)
srxy token . --include-noise
# Search everywhere
srxy token . --include-hidden --include-noise
Options: --names-only, --content-only, --include-hidden, --include-noise, --threshold, --max-file-size, --max-line-matches, --semantic (opt-in ML). Exit codes: 0 matches found, 1 no matches, 2 usage/path error.
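When calling the CLI from a script, the exit codes above are what distinguishes "nothing found" from a real failure. The wrapper below is a sketch under assumptions: `describe_exit` and `run_srxy` are hypothetical helper names, and the code assumes the srxy command is on PATH. It uses only the documented exit codes and passes flags through unchanged.

```python
import subprocess
from typing import List, Tuple

# Exit codes as documented above.
EXIT_MEANINGS = {0: "matches found", 1: "no matches", 2: "usage or path error"}


def describe_exit(code: int) -> str:
    """Translate an srxy exit code into a readable status."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_srxy(query: str, path: str, *extra: str) -> Tuple[str, str]:
    """Run the srxy CLI and return (status description, captured stdout)."""
    command: List[str] = ["srxy", query, path, *extra]
    # check=False: exit code 1 means "no matches", which is not an error for the caller.
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return describe_exit(completed.returncode), completed.stdout
```

For example, `run_srxy("budget", ".", "--json")` returns the status together with the raw output. If srxy is not installed, `subprocess.run` raises `FileNotFoundError` instead of returning an exit code.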
Match types
| Type | Behavior |
|---|---|
| EXACT | Case-insensitive full string equality |
| CONTAINS | Substring match |
| PARTIAL | Prefix or suffix match |
| FUZZY | Character-level similarity (rapidfuzz) |
| PHONETIC | Sounds-alike (metaphone, soundex, NYSIIS with graduated scoring) |
| SEMANTIC | Meaning similarity (optional; see below) |
| COMPOSITE | Weighted blend of available atomic matchers (default smart mode) |
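The three simplest match types can be approximated in a few lines of plain Python. These are stand-ins to show how the behaviors differ, not Srxy's matchers: the real ones return scores rather than booleans, and the sketch assumes that CONTAINS and PARTIAL are case-insensitive and that the query is the prefix or suffix of the field value. The function names are hypothetical.

```python
def exact(query: str, value: str) -> bool:
    """Case-insensitive full string equality."""
    return query.casefold() == value.casefold()


def contains(query: str, value: str) -> bool:
    """Query appears anywhere inside the value (case-insensitivity is an assumption)."""
    return query.casefold() in value.casefold()


def partial(query: str, value: str) -> bool:
    """Query is a prefix or a suffix of the value (direction is an assumption)."""
    q, v = query.casefold(), value.casefold()
    return v.startswith(q) or v.endswith(q)


for query in ("Salt", "al", "sa"):
    print(query, exact(query, "salt"), contains(query, "salt"), partial(query, "salt"))
```

Note how "al" satisfies `contains` but not `partial`, because it sits in the middle of "salt" rather than at either end.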
Default composite weights: fuzzy 35%, semantic 20%, partial 15%, phonetic 12%, contains 10%, exact 8%. When semantic is disabled, composite skips it and renormalizes the remaining weights. Override per field via composite_weights on Q.composite(...) or FieldConfig.
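As a worked example of the renormalization, the sketch below drops the semantic weight and rescales the rest so they sum to 1 again. The `renormalize` helper and the dictionary layout are hypothetical; only the six default percentages come from the text above.

```python
DEFAULT_WEIGHTS = {
    "fuzzy": 0.35,
    "semantic": 0.20,
    "partial": 0.15,
    "phonetic": 0.12,
    "contains": 0.10,
    "exact": 0.08,
}


def renormalize(weights, disabled=()):
    """Drop disabled matchers and rescale the remaining weights to sum to 1."""
    active = {name: w for name, w in weights.items() if name not in disabled}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("at least one matcher must keep a positive weight")
    return {name: w / total for name, w in active.items()}


without_semantic = renormalize(DEFAULT_WEIGHTS, disabled={"semantic"})
for name, weight in without_semantic.items():
    print(f"{name}: {weight:.4f}")
```

The remaining weights total 80%, so each is divided by 0.8: fuzzy rises from 35% to 43.75%, partial to 18.75%, phonetic to 15%, contains to 12.5%, and exact to 10%.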
Semantic matching (optional)
Semantic search is off by default. Opt in when you need meaning-based similarity:
export SRXY_SEMANTIC=1
pip install 'srxy[semantic]' # or: pipx install 'srxy[semantic]'
With SRXY_SEMANTIC=1, composite matching includes semantic similarity. Explicit Q.semantic(...) or MatchType.SEMANTIC raises a clear error if semantic is not enabled.
Default model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (downloaded from Hugging Face on first use). For a local cache:
./scripts/download_semantic_model.sh
export SRXY_SEMANTIC_MODEL_PATH=~/.cache/srxy/semantic-model
Core dependencies (always installed): rapidfuzz and jellyfish (phonetic matching).
Development
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,semantic]"
./scripts/quality/checks.sh --fix
./scripts/quality/checks.sh
Quality gate: Ruff → ShellCheck/shfmt → basedpyright → pip-audit → build → pytest.
- Local (./scripts/quality/checks.sh): runs all tests (unit + integration).
- CI: runs only pytest -m unit (fast tests; no semantic model required).
Integration tests (these require pip install -e ".[semantic]" and SRXY_SEMANTIC=1, which is set automatically in tests/integration/conftest.py):
pytest -m integration
Integration tests load a curated news-style corpus from tests/fixtures/search_corpus.json and measure top-k hit rates.