Semantic (hybrid dense + sparse) search backend plugin for fmql.
fmql-semantic
Hybrid semantic search backend plugin for fmql.
- Dense retrieval via LiteLLM embeddings + sqlite-vec.
- Sparse retrieval via SQLite FTS5 (BM25).
- Fusion via reciprocal rank fusion (RRF).
- Optional reranking via LiteLLM rerank providers (Cohere, Voyage, etc.).
- Single-file SQLite index. No server.
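To illustrate the fusion step, here is a minimal sketch of reciprocal rank fusion over a dense and a sparse result list. The function name and the constant `k=60` are assumptions for illustration (60 is a commonly used value); the plugin's actual constant and tie-breaking are not documented here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) per list it appears in (rank starts at 1).
    k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["a", "b", "c"]   # hypothetical dense (embedding) ranking
sparse = ["b", "d", "a"]  # hypothetical sparse (BM25) ranking
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Because only ranks are used, the dense and sparse scores never need to be on the same scale.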
Install
pip install fmql-semantic
fmql-semantic requires a Python build with sqlite3 loadable-extension support
(macOS/Linux/Windows builds from python.org, pyenv, uv, and official Docker
images all qualify; the macOS system Python at /usr/bin/python3 does not). If
the extension loader is unavailable, the backend fails fast with a clear error.
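To check a given interpreter before installing, something like the following sketch should work. It relies only on the standard-library `sqlite3` module; the helper name is hypothetical and this is not necessarily the check the backend itself performs.

```python
import sqlite3


def supports_loadable_extensions() -> bool:
    """Return True if this Python's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is missing when CPython was built without extension support.
        if not hasattr(conn, "enable_load_extension"):
            return False
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.Error):
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("loadable extensions supported:", supports_loadable_extensions())
```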
Configure
fmql-semantic reads configuration from three channels, in increasing
precedence:
- Process environment.
- A dotenv file pointed to by --option env=path/to/.env.
- --option KEY=VALUE flags on the command line.
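The precedence order above can be sketched as a layered merge in which later channels overwrite earlier ones. This is a simplified illustration: the helper names are hypothetical, the dotenv parser handles only plain KEY=VALUE lines, and all three channels are assumed to share one key space (the real mapping between option keys and variable names is not shown).

```python
def parse_dotenv(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(environ, dotenv_values, cli_options):
    """Merge the three channels; later updates win (increasing precedence)."""
    merged = dict(environ)        # lowest precedence: process environment
    merged.update(dotenv_values)  # then the dotenv file
    merged.update(cli_options)    # highest precedence: --option flags
    return merged


environ = {"model": "env-model", "batch_size": "100"}
dotenv_values = parse_dotenv("model=dotenv-model\n# a comment\nconcurrency=2\n")
cli_options = {"model": "cli-model"}
resolved = resolve(environ, dotenv_values, cli_options)
print(resolved)
```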
Environment variables
| Variable | Purpose |
|---|---|
| FMQL_EMBEDDING_MODEL | LiteLLM embedding model string (required). |
| FMQL_EMBEDDING_API_BASE | Override provider API base URL. |
| FMQL_EMBEDDING_API_KEY | Override provider API key. |
| FMQL_EMBEDDING_BATCH_SIZE | Packets per embedding call (default 100). |
| FMQL_EMBEDDING_CONCURRENCY | Max concurrent embedding requests (default 4). |
| FMQL_EMBEDDING_MAX_TOKENS | Per-packet token budget before truncation (default 8000). |
| FMQL_RERANKER_MODEL | LiteLLM rerank model. Enables reranking when set. |
| FMQL_RERANKER_TOP_N | Candidates sent to reranker (default 50). |
Standard LiteLLM provider env vars (OPENAI_API_KEY, VOYAGE_API_KEY,
OLLAMA_API_BASE, …) are read by LiteLLM directly.
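As a rough illustration of how the documented variables and defaults fit together, the sketch below reads them from a mapping. The function name, the returned dictionary shape, and the `rerank_enabled` key are assumptions made for this example, not the plugin's internal API.

```python
import os

# Defaults taken from the table of environment variables.
DEFAULTS = {
    "FMQL_EMBEDDING_BATCH_SIZE": 100,
    "FMQL_EMBEDDING_CONCURRENCY": 4,
    "FMQL_EMBEDDING_MAX_TOKENS": 8000,
    "FMQL_RERANKER_TOP_N": 50,
}


def read_settings(environ=os.environ):
    """Collect typed settings from an environment-like mapping."""
    model = environ.get("FMQL_EMBEDDING_MODEL")
    if not model:
        raise ValueError("FMQL_EMBEDDING_MODEL is required")
    settings = {"FMQL_EMBEDDING_MODEL": model}
    for name, default in DEFAULTS.items():
        settings[name] = int(environ.get(name, default))
    # Reranking is enabled only when a reranker model is configured.
    settings["rerank_enabled"] = bool(environ.get("FMQL_RERANKER_MODEL"))
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the behaviour easy to try out without touching the real environment.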
--option keys
Build: model, api_base, api_key, batch_size, concurrency,
max_tokens, fields, force, env.
Query: model, api_base, api_key, reranker_model, reranker_top_n,
rerank_required, no_rerank, dense_only, sparse_only, fetch_k, env.
Use
export FMQL_EMBEDDING_MODEL=openai/text-embedding-3-small
export OPENAI_API_KEY=...
# Build once:
fmql index ./my-notes --backend semantic
# Query:
fmql search "quarterly planning" --backend semantic --workspace ./my-notes -k 10
# Dense-only / sparse-only / disable rerank for this query:
fmql search q --backend semantic --workspace ./my-notes --option dense_only=true
fmql search q --backend semantic --workspace ./my-notes --option sparse_only=true
fmql search q --backend semantic --workspace ./my-notes --option no_rerank=true
The default index location is <workspace>/.fmql/semantic.db. Override with
--out (for fmql index) or --index (for fmql search).
Indexing
For each packet, the backend indexes:
<first present frontmatter field from --option fields=title,summary,name>
<body>
Frontmatter field values are otherwise not indexed, since they're already queryable via fmql's structured layer.
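A minimal sketch of how the indexed text for one packet could be assembled from that rule. The function name is hypothetical, the field list is taken from the example option value, and treating an empty value as "not present" and joining with a newline are assumptions.

```python
def build_indexed_text(frontmatter, body, fields=("title", "summary", "name")):
    """Return the first present frontmatter field followed by the body."""
    for field in fields:
        value = frontmatter.get(field)
        if value:  # assumption: "present" means set and non-empty
            return f"{value}\n{body}"
    # No listed field is present: index the body alone.
    return body


text = build_indexed_text({"summary": "Q3 plan", "tags": ["planning"]}, "Body text")
print(text)
```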
Builds are incremental: packets whose content hash hasn't changed since the last build are skipped. Packets removed from the workspace are removed from the index. The index is committed per batch via SQLite WAL, so a crashed build leaves a queryable index that the next run picks up.
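The incremental behaviour can be pictured as a comparison of content hashes between the workspace and the previous build. The sketch below assumes SHA-256 and hypothetical function names; the hash algorithm and storage layout the backend actually uses are not specified here.

```python
import hashlib


def content_hash(text):
    """Hash packet text; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_build(workspace, indexed_hashes):
    """Decide what an incremental build has to do.

    workspace: packet id -> current text
    indexed_hashes: packet id -> hash stored by the previous build
    """
    to_embed, to_skip = [], []
    for packet_id, text in workspace.items():
        if indexed_hashes.get(packet_id) == content_hash(text):
            to_skip.append(packet_id)   # unchanged since the last build
        else:
            to_embed.append(packet_id)  # new or modified
    # Packets that disappeared from the workspace leave the index.
    to_delete = [pid for pid in indexed_hashes if pid not in workspace]
    return sorted(to_embed), sorted(to_skip), sorted(to_delete)
```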
Model pinning
An index is pinned to the embedding model that built it. A rebuild with a
different model is refused unless you pass --force (which drops the existing
tables). Dimension mismatches are caught the same way.
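One way to picture the pinning check is a small metadata table that records the model name and vector dimension. This is only a sketch under stated assumptions: the `meta` table, the `ModelMismatch` exception, and `check_pin` are hypothetical names, not the plugin's schema or API.

```python
import sqlite3


class ModelMismatch(Exception):
    """Raised when a build uses a different model or dimension than the index."""


def check_pin(conn, model, dim, force=False):
    """Pin the index to (model, dim), refusing a mismatch unless force is set."""
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    pinned = dict(conn.execute("SELECT key, value FROM meta"))
    wanted = {"model": model, "dim": str(dim)}
    if pinned and pinned != wanted:
        if not force:
            raise ModelMismatch(f"index pinned to {pinned}, got {wanted}")
        # With force, a real build would also drop the vector and FTS tables.
        conn.execute("DELETE FROM meta")
    conn.executemany("INSERT OR REPLACE INTO meta VALUES (?, ?)", wanted.items())
    conn.commit()
```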
Provider notes
- OpenAI (openai/text-embedding-3-small, openai/text-embedding-3-large): batch caps at 2048; default 100 is fine.
- Voyage (voyage/voyage-3): batch caps at 128. Set --option batch_size=128 (or lower) for large indexes.
- Cohere rerank (cohere/rerank-v3.5): works as a reranker model out of the box once COHERE_API_KEY is set.
- Ollama (ollama/nomic-embed-text): set OLLAMA_API_BASE or --option api_base=http://localhost:11434.
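To see why the batch size matters against a provider cap, the following sketch splits packet texts into batches no larger than a chosen size. The `batched` helper is a hypothetical illustration and makes no embedding calls.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"packet {i}" for i in range(300)]  # stand-in packet texts
sizes = [len(batch) for batch in batched(texts, 128)]
print(sizes)
```

With a cap of 128, 300 packets become three requests, none of which exceeds the provider limit.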
Licensing
MIT. See LICENSE.