QMD-Py — Query Markup Documents

Chinese documentation

An on-device hybrid search engine for Markdown documents. Python port of qmd.

Combines BM25 full-text search, vector semantic search (Qwen3-Embedding-0.6B), and LLM re-ranking (Qwen3-Reranker-0.6B). Everything runs locally, with documents and vectors stored in SQLite + sqlite-vec.

Install

pip install -e .               # core
pip install -e ".[dev]"        # + dev/test deps
pip install -e ".[testing]"    # + contract test deps (rank-bm25, deepdiff)

Quick Start — Python API

from qmd import connect

client = connect("my_docs.sqlite")
col = client.collection("notes")

# Add documents
col.add_document("doc1", "# Meeting Notes\n\nDiscussed project timeline.", {"tag": "meeting"})
col.add_documents([
    {"document_id": "doc2", "markdown": "# API Design\n\nREST endpoints..."},
    {"document_id": "doc3", "markdown": "# Deployment\n\nDocker setup..."},
])

# Search
results = col.hybrid_search("project timeline", top_k=5)
for r in results:
    print(f"{r.chunk_ref.document_id}: {r.score:.3f}  {r.text[:80]}")

# Search with reranking
results = col.hybrid_search("deployment", top_k=5, rerank=True)

client.close()
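
To load a whole folder with add_documents, the payload can be built from the files on disk. The following is a minimal sketch, not part of qmd: the helper name load_markdown_dir and the choice of the relative path without the .md suffix as document_id are assumptions made for illustration. Only the "document_id" and "markdown" keys come from the example above.

```python
import tempfile
from pathlib import Path


def load_markdown_dir(root):
    """Return a list of {"document_id", "markdown"} dicts for every .md file under root.

    Hypothetical helper: the document ID is the path relative to root,
    without the .md suffix, using forward slashes.
    """
    root = Path(root)
    docs = []
    for path in sorted(root.rglob("*.md")):
        docs.append({
            "document_id": path.relative_to(root).with_suffix("").as_posix(),
            "markdown": path.read_text(encoding="utf-8"),
        })
    return docs


# Demo on a throwaway directory so the sketch runs without any real notes.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.md").write_text("# A\n\nFirst note.", encoding="utf-8")
    (root / "sub" / "b.md").write_text("# B\n\nSecond note.", encoding="utf-8")
    docs = load_markdown_dir(root)

print([d["document_id"] for d in docs])
```

The resulting list has the same shape as the one passed to col.add_documents in the quick start, so it should be usable as col.add_documents(load_markdown_dir("notes/")).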

Quick Start — CLI

# Add a document
python -m qmd document add --collection notes --document-id doc1 --markdown-file notes.md

# List documents
python -m qmd document list --collection notes

# Search
python -m qmd search --collection notes --query "project timeline" --top-k 5

# List collections
python -m qmd collection list
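
When the CLI is driven from a script, building the argument list in Python avoids shell quoting problems with multi-word queries. This is a sketch only: search_command is a hypothetical helper, and it uses just the flags shown above. The output format of the search command is not documented here, so the sketch does not try to parse it.

```python
import sys


def search_command(collection, query, top_k=5):
    """Build the argv list for `python -m qmd search` (flags as in the CLI quick start)."""
    return [
        sys.executable, "-m", "qmd", "search",
        "--collection", collection,
        "--query", query,
        "--top-k", str(top_k),
    ]


cmd = search_command("notes", "project timeline")
print(cmd[1:])
```

The list can then be handed to subprocess.run(cmd, check=True, capture_output=True, text=True) in an environment where qmd is installed.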

Architecture

  • Storage: SQLite + sqlite-vec (single-file database)
  • Embedding: Qwen3-Embedding-0.6B (sentence-transformers, 1024-dim)
  • Reranker: Qwen3-Reranker-0.6B (transformers, yes/no softmax)
  • Query Expansion: Qwen3-0.6B (optional, configurable)
  • Fusion: BM25 + Vector → Reciprocal Rank Fusion (RRF, k=60)
  • Blending: Position-aware blending (optional, configurable weights)
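
The fusion and reranker scoring steps listed above can be sketched in a few lines. This is an illustration of the general techniques, not qmd's internal code: the function names are hypothetical, and the sketch assumes each retriever returns document IDs ordered best-first and that the reranker exposes one logit each for "yes" and "no".

```python
import math
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse best-first ID lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def yes_probability(yes_logit, no_logit):
    """Two-way softmax over the yes/no logits, returning P(yes)."""
    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


bm25_hits = ["doc1", "doc2", "doc3"]
vector_hits = ["doc3", "doc1", "doc4"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print([doc_id for doc_id, _ in fused])  # ['doc1', 'doc3', 'doc2', 'doc4']
```

With k=60, a document that appears near the top of both lists (doc1, doc3) outranks one that appears in only one list, which is the reason RRF works without having to normalise BM25 and cosine scores against each other.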

Configuration

Place qmd.yaml next to your .sqlite file:

chunking:
  size: 512
  overlap: 64

embedding:
  batch_size: "auto"     # GPU=64, CPU=16

rerank:
  enabled: false
  top_k_candidates: 40

expansion:
  enabled: false         # Query expansion (Qwen3-0.6B)

retrieval:
  rrf_k: 60
  blending_mode: "pure_rerank"   # or "position_aware"
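
To make the chunking and batch_size settings concrete, here is a rough sketch of how such values are commonly interpreted. It is an assumption, not qmd's implementation: the helper names are hypothetical, and the unit of size and overlap (tokens or characters) is not stated in this README, so the sketch works on abstract unit positions.

```python
def chunk_spans(n_units, size=512, overlap=64):
    """Return (start, end) spans of a sliding window with the given size and overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # each new chunk starts this far after the previous one
    spans = []
    start = 0
    while start < n_units:
        end = min(start + size, n_units)
        spans.append((start, end))
        if end == n_units:
            break
        start += step
    return spans


def resolve_batch_size(value, has_gpu):
    """Map "auto" to 64 on GPU and 16 on CPU (per the config comment); pass numbers through."""
    if value == "auto":
        return 64 if has_gpu else 16
    return int(value)


print(chunk_spans(1000))  # [(0, 512), (448, 960), (896, 1000)]
print(resolve_batch_size("auto", has_gpu=False))  # 16
```

Under these assumptions, size: 512 with overlap: 64 advances the window by 448 units per chunk, so consecutive chunks share 64 units of context.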

Requirements

  • Python >= 3.11
  • GPU (optional): CUDA for accelerated embedding/reranking

License

MIT
