Skip to main content

Keyword, vector hybrid search with semantic understanding

Project description

kv-search

Keyword, vector, and semantic hybrid search library with a pluggable backend model.

What it does

kv-search provides a SearchEngine that combines three complementary search strategies over a corpus of documents (identified by path strings):

  • Keyword search — exact/BM25-style matching via a KeywordSearchBackend
  • Vector search — embedding similarity via a VectorSearchBackend, optionally constrained to an allowlist of paths from prior keyword results
  • Semantic search — LLM-assisted reranking/reasoning via a SemanticSearchBackend, which receives all configured backends and an LLM callable to do whatever hybrid logic it needs

Backends are optional and pluggable — pass only what you have. The engine raises RuntimeError if you call a search method without a configured backend.

Each search operation runs against a SearchSession, which accumulates and deduplicates hits across calls. When keyword results exist in a session, vector search automatically narrows its scope to those paths (and tightens the minimum score threshold).

Installation

pip install kv-search                        # core only
pip install "kv-search[elasticsearch]"       # with Elasticsearch keyword backend

Usage

import asyncio
from kv_search import SearchEngine, ElasticsearchKeywordBackend

keyword_backend = ElasticsearchKeywordBackend(
    hosts="http://localhost:9200",
    index="my-docs",
)

engine = SearchEngine(keyword_backend=keyword_backend)
session = engine.new_session()

hits = asyncio.run(engine.keyword_search(session, ["transformer attention"]))
# hits: list[SearchHit] with .path and .score

Hybrid keyword + vector

from kv_search import SearchEngine

engine = SearchEngine(
    keyword_backend=my_keyword_backend,
    vector_backend=my_vector_backend,
)
session = engine.new_session()

# keyword results populate the session allowlist
await engine.keyword_search(session, ["attention mechanism"])
# vector search is automatically scoped to those paths
hits = await engine.vector_search(session, "how does self-attention work?")

Semantic search with an LLM

engine = SearchEngine(
    keyword_backend=my_keyword_backend,
    vector_backend=my_vector_backend,
    semantic_backend=my_semantic_backend,
    llm=my_llm_fn,  # async (messages, *, system) -> str
)
session = engine.new_session()
results = await engine.semantic_search(session, "explain positional encoding")
# results: list[SemanticResult] with .path, .score, .reasoning

Implementing custom backends

Subclass the relevant ABC from kv_search:

from kv_search import KeywordSearchBackend, SearchHit

class MyKeywordBackend(KeywordSearchBackend):
    async def keyword_search(self, queries: list[str]) -> list[SearchHit]:
        ...

Same pattern for VectorSearchBackend and SemanticSearchBackend. The LLMCompletionFn is a Protocol — any async callable with the right signature works.

Requirements

Python 3.11–3.14. The elasticsearch extra requires pydantic >= 2.12.5.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kv_search-0.2.tar.gz (30.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kv_search-0.2-py3-none-any.whl (12.4 kB view details)

Uploaded Python 3

File details

Details for the file kv_search-0.2.tar.gz.

File metadata

  • Download URL: kv_search-0.2.tar.gz
  • Upload date:
  • Size: 30.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kv_search-0.2.tar.gz
Algorithm Hash digest
SHA256 9e04fbf7bac7b2ca9e29af225e298540cd10b1f93edb1b2d053623daee77b785
MD5 877db30bba6eddd1f1e8c3a120e29b62
BLAKE2b-256 ed7990ac6acdf89029b4dd86a3e9869e0f5ce4521c07aa6ed31991ec599eee0d

See more details on using hashes here.

Provenance

The following attestation bundles were made for kv_search-0.2.tar.gz:

Publisher: release-pypi.yml on antolu/kv-search

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kv_search-0.2-py3-none-any.whl.

File metadata

  • Download URL: kv_search-0.2-py3-none-any.whl
  • Upload date:
  • Size: 12.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kv_search-0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 ed39b2ad8fbe16b50ee8c6160ecc6b2ac3db9e2d8fc567b456cf8094f3aac583
MD5 1c7d97e22f8c8f1530592d68c4945fa4
BLAKE2b-256 2d65b512639e12539ce012b6dffe9508e4162325a8733b69c626cc60d3a136ae

See more details on using hashes here.

Provenance

The following attestation bundles were made for kv_search-0.2-py3-none-any.whl:

Publisher: release-pypi.yml on antolu/kv-search

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page