Keyword and vector hybrid search with semantic understanding

kv-search

Keyword, vector, and semantic hybrid search library with a pluggable backend model.

What it does

kv-search provides a SearchEngine that combines three complementary search strategies over a corpus of documents (identified by path strings):

  • Keyword search — exact/BM25-style matching via a KeywordSearchBackend
  • Vector search — embedding similarity via a VectorSearchBackend, optionally constrained to an allowlist of paths from prior keyword results
  • Semantic search — LLM-assisted reranking/reasoning via a SemanticSearchBackend, which receives all configured backends and an LLM callable to do whatever hybrid logic it needs

Backends are optional and pluggable — pass only what you have. The engine raises RuntimeError if you call a search method without a configured backend.
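
The optional-backend behaviour can be pictured with a small stand-in. This is a hedged sketch and not kv-search's own code: `TinyEngine`, `EchoKeywordBackend`, and `Hit` are hypothetical names invented for illustration, and the only detail taken from the description above is that a search method raises RuntimeError when its backend was not configured.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for a search hit (path plus score)."""
    path: str
    score: float


class EchoKeywordBackend:
    """Hypothetical backend: returns one fake hit per query."""

    async def keyword_search(self, queries: list[str]) -> list[Hit]:
        return [Hit(path=f"docs/{q.replace(' ', '-')}.md", score=1.0) for q in queries]


class TinyEngine:
    """Hypothetical engine that accepts each backend as an optional argument."""

    def __init__(self, keyword_backend=None, vector_backend=None):
        self._keyword = keyword_backend
        self._vector = vector_backend

    async def keyword_search(self, queries: list[str]) -> list[Hit]:
        if self._keyword is None:
            # Mirrors the documented behaviour: no backend, no search.
            raise RuntimeError("no keyword backend configured")
        return await self._keyword.keyword_search(queries)

    async def vector_search(self, query: str) -> list[Hit]:
        if self._vector is None:
            raise RuntimeError("no vector backend configured")
        return await self._vector.vector_search(query)
```

With only a keyword backend passed in, `keyword_search` works and `vector_search` fails fast with RuntimeError instead of silently returning nothing.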

Each search operation runs against a SearchSession, which accumulates and deduplicates hits across calls. When keyword results exist in a session, vector search automatically narrows its scope to those paths (and tightens the minimum score threshold).
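
The session behaviour described above can be sketched roughly as follows. Everything here is an assumption made for illustration: `SketchSession`, `filter_vector_hits`, the keep-the-highest-score deduplication policy, and the threshold values 0.5 and 0.7 are all hypothetical, and "tightens" is assumed to mean a higher minimum score. The library's real SearchSession may work differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for a search hit (path plus score)."""
    path: str
    score: float


@dataclass
class SketchSession:
    """Hypothetical session: keeps one keyword hit per path."""
    keyword_hits: dict[str, Hit] = field(default_factory=dict)

    def add_keyword_hits(self, hits: list[Hit]) -> None:
        for hit in hits:
            current = self.keyword_hits.get(hit.path)
            # Deduplicate by path, keeping the higher score (assumed policy).
            if current is None or hit.score > current.score:
                self.keyword_hits[hit.path] = hit

    def vector_scope(self, default_min_score: float = 0.5,
                     narrowed_min_score: float = 0.7):
        """Return (allowlist, min_score) for the next vector search."""
        if not self.keyword_hits:
            # No keyword results yet: search the whole corpus.
            return None, default_min_score
        # Keyword results exist: restrict to their paths, raise the bar.
        return set(self.keyword_hits), narrowed_min_score


def filter_vector_hits(candidates, allowlist, min_score):
    """Apply the allowlist and the score threshold to raw vector hits."""
    return [
        h for h in candidates
        if (allowlist is None or h.path in allowlist) and h.score >= min_score
    ]
```

The point of the sketch is the ordering effect: calling keyword search first changes what a later vector search in the same session is allowed to return.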

Installation

pip install kv-search                        # core only
pip install "kv-search[elasticsearch]"       # with Elasticsearch keyword backend

Usage

import asyncio
from kv_search import SearchEngine, ElasticsearchKeywordBackend

keyword_backend = ElasticsearchKeywordBackend(
    hosts="http://localhost:9200",
    index="my-docs",
)

engine = SearchEngine(keyword_backend=keyword_backend)
session = engine.new_session()

hits = asyncio.run(engine.keyword_search(session, ["transformer attention"]))
# hits: list[SearchHit] with .path and .score

Hybrid keyword + vector

import asyncio
from kv_search import SearchEngine

async def main():
    # my_keyword_backend / my_vector_backend are your configured backend instances
    engine = SearchEngine(
        keyword_backend=my_keyword_backend,
        vector_backend=my_vector_backend,
    )
    session = engine.new_session()

    # keyword results populate the session allowlist
    await engine.keyword_search(session, ["attention mechanism"])
    # vector search is automatically scoped to those paths
    return await engine.vector_search(session, "how does self-attention work?")

hits = asyncio.run(main())

Semantic search with an LLM

import asyncio
from kv_search import SearchEngine

async def main():
    engine = SearchEngine(
        keyword_backend=my_keyword_backend,
        vector_backend=my_vector_backend,
        semantic_backend=my_semantic_backend,
        llm=my_llm_fn,  # async (messages, *, system) -> str
    )
    session = engine.new_session()
    return await engine.semantic_search(session, "explain positional encoding")

results = asyncio.run(main())
# results: list[SemanticResult] with .path, .score, .reasoning
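
For local testing, the `llm` argument can be any coroutine function with the shape noted in the comment above, `async (messages, *, system) -> str`. The following is a minimal sketch of such a stand-in. The name `canned_llm` is hypothetical, and the message shape (a list of role/content dicts) is an assumption; check the library's LLMCompletionFn definition for the actual type.

```python
import asyncio


async def canned_llm(messages: list[dict[str, str]], *, system: str) -> str:
    """Hypothetical LLM stand-in: echoes the last message, no network calls.

    Assumes each message is a dict with a "content" key.
    """
    last = messages[-1]["content"] if messages else ""
    return f"[{system}] {last}"
```

Because `system` is keyword-only, a call such as `canned_llm(messages, "prompt")` raises TypeError, which helps catch callers that pass the system prompt positionally.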

Implementing custom backends

Subclass the relevant ABC from kv_search:

from kv_search import KeywordSearchBackend, SearchHit

class MyKeywordBackend(KeywordSearchBackend):
    async def keyword_search(self, queries: list[str]) -> list[SearchHit]:
        ...

The same pattern applies to VectorSearchBackend and SemanticSearchBackend. LLMCompletionFn is a Protocol, so any async callable with the right signature works.
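
As a fuller illustration of the subclassing pattern, here is a hedged sketch of an in-memory keyword backend. To keep it runnable without the package installed, it defines local stand-ins for `SearchHit` and `KeywordSearchBackend` that copy only what this README shows (the `.path` and `.score` fields and the `keyword_search` signature); in real code you would import both from kv_search instead. `InMemoryKeywordBackend` is a hypothetical name, and its scoring is plain term overlap, not BM25.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    """Local stand-in with the two fields the README mentions."""
    path: str
    score: float


class KeywordSearchBackend(ABC):
    """Local stand-in for the library ABC, so the sketch runs on its own."""

    @abstractmethod
    async def keyword_search(self, queries: list[str]) -> list[SearchHit]: ...


class InMemoryKeywordBackend(KeywordSearchBackend):
    """Scores each document by the fraction of query terms it contains."""

    def __init__(self, documents: dict[str, str]):
        self._documents = documents  # maps path -> document text

    async def keyword_search(self, queries: list[str]) -> list[SearchHit]:
        # Collect the distinct lowercase terms across all queries.
        terms = {t for q in queries for t in q.lower().split()}
        if not terms:
            return []
        hits = []
        for path, text in self._documents.items():
            words = set(text.lower().split())
            matched = len(terms & words)
            if matched:
                hits.append(SearchHit(path=path, score=matched / len(terms)))
        # Highest score first; ties keep insertion order.
        return sorted(hits, key=lambda h: h.score, reverse=True)
```

A backend like this is mainly useful in unit tests, where it lets engine-level code run without a live Elasticsearch instance.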

Requirements

Python 3.11–3.14. The elasticsearch extra requires pydantic >= 2.12.5.
