Keyword, vector hybrid search with semantic understanding
Project description
kv-search
Keyword, vector, and semantic hybrid search library with a pluggable backend model.
What it does
kv-search provides a SearchEngine that combines three complementary search strategies over a corpus of documents (identified by path strings):
- Keyword search — exact/BM25-style matching via a
KeywordSearchBackend - Vector search — embedding similarity via a
VectorSearchBackend, optionally constrained to an allowlist of paths from prior keyword results - Semantic search — LLM-assisted reranking/reasoning via a
SemanticSearchBackend, which receives all configured backends and an LLM callable to do whatever hybrid logic it needs
Backends are optional and pluggable — pass only what you have. The engine raises RuntimeError if you call a search method without a configured backend.
Each search operation runs against a SearchSession, which accumulates and deduplicates hits across calls. When keyword results exist in a session, vector search automatically narrows its scope to those paths (and tightens the minimum score threshold).
Installation
pip install kv-search # core only
pip install "kv-search[elasticsearch]" # with Elasticsearch keyword backend
Usage
import asyncio
from kv_search import SearchEngine, ElasticsearchKeywordBackend
keyword_backend = ElasticsearchKeywordBackend(
hosts="http://localhost:9200",
index="my-docs",
)
engine = SearchEngine(keyword_backend=keyword_backend)
session = engine.new_session()
hits = asyncio.run(engine.keyword_search(session, ["transformer attention"]))
# hits: list[SearchHit] with .path and .score
Hybrid keyword + vector
from kv_search import SearchEngine
engine = SearchEngine(
keyword_backend=my_keyword_backend,
vector_backend=my_vector_backend,
)
session = engine.new_session()
# keyword results populate the session allowlist
await engine.keyword_search(session, ["attention mechanism"])
# vector search is automatically scoped to those paths
hits = await engine.vector_search(session, "how does self-attention work?")
Semantic search with an LLM
engine = SearchEngine(
keyword_backend=my_keyword_backend,
vector_backend=my_vector_backend,
semantic_backend=my_semantic_backend,
llm=my_llm_fn, # async (messages, *, system) -> str
)
session = engine.new_session()
results = await engine.semantic_search(session, "explain positional encoding")
# results: list[SemanticResult] with .path, .score, .reasoning
Implementing custom backends
Subclass the relevant ABC from kv_search:
from kv_search import KeywordSearchBackend, SearchHit
class MyKeywordBackend(KeywordSearchBackend):
async def keyword_search(self, queries: list[str]) -> list[SearchHit]:
...
Same pattern for VectorSearchBackend and SemanticSearchBackend. The LLMCompletionFn is a Protocol — any async callable with the right signature works.
Requirements
Python 3.11–3.14. The elasticsearch extra requires pydantic >= 2.12.5.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file kv_search-0.2.tar.gz.
File metadata
- Download URL: kv_search-0.2.tar.gz
- Upload date:
- Size: 30.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9e04fbf7bac7b2ca9e29af225e298540cd10b1f93edb1b2d053623daee77b785
|
|
| MD5 |
877db30bba6eddd1f1e8c3a120e29b62
|
|
| BLAKE2b-256 |
ed7990ac6acdf89029b4dd86a3e9869e0f5ce4521c07aa6ed31991ec599eee0d
|
Provenance
The following attestation bundles were made for kv_search-0.2.tar.gz:
Publisher:
release-pypi.yml on antolu/kv-search
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
kv_search-0.2.tar.gz -
Subject digest:
9e04fbf7bac7b2ca9e29af225e298540cd10b1f93edb1b2d053623daee77b785 - Sigstore transparency entry: 1450423584
- Sigstore integration time:
-
Permalink:
antolu/kv-search@6a28223618eb1675f06164b100b46014e5c4ffc1 -
Branch / Tag:
refs/tags/v0.2 - Owner: https://github.com/antolu
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-pypi.yml@6a28223618eb1675f06164b100b46014e5c4ffc1 -
Trigger Event:
push
-
Statement type:
File details
Details for the file kv_search-0.2-py3-none-any.whl.
File metadata
- Download URL: kv_search-0.2-py3-none-any.whl
- Upload date:
- Size: 12.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ed39b2ad8fbe16b50ee8c6160ecc6b2ac3db9e2d8fc567b456cf8094f3aac583
|
|
| MD5 |
1c7d97e22f8c8f1530592d68c4945fa4
|
|
| BLAKE2b-256 |
2d65b512639e12539ce012b6dffe9508e4162325a8733b69c626cc60d3a136ae
|
Provenance
The following attestation bundles were made for kv_search-0.2-py3-none-any.whl:
Publisher:
release-pypi.yml on antolu/kv-search
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
kv_search-0.2-py3-none-any.whl -
Subject digest:
ed39b2ad8fbe16b50ee8c6160ecc6b2ac3db9e2d8fc567b456cf8094f3aac583 - Sigstore transparency entry: 1450423661
- Sigstore integration time:
-
Permalink:
antolu/kv-search@6a28223618eb1675f06164b100b46014e5c4ffc1 -
Branch / Tag:
refs/tags/v0.2 - Owner: https://github.com/antolu
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-pypi.yml@6a28223618eb1675f06164b100b46014e5c4ffc1 -
Trigger Event:
push
-
Statement type: