
Fast LLM-free search-to-context engine for AI agents


llm-context-search


Turn a search query into ranked, token-bounded context for your LLM - no LLM required for retrieval, MCP server included.


pip install llm-context-search
docker compose up -d          # start SearXNG
llm-context build "your query"

Quick example

Search for Python asyncio best practices, fetch the top 5 pages, extract main content, rank the most relevant passages, and pack them into a 4 000-token context block. One command.

llm-context build "Python asyncio best practices" \
  --max-sources 5 --budget 4000 -o context.md
  Searching…  done   10 results → 5 unique
  Fetching…   done   5/5 pages fetched
  Extracting… done   5 ok, 0 failed
  Ranking…    done   38 passages → 11 selected

Context saved to context.md - ~3 912 tokens, 4 831 ms

The output is Markdown, grouped by source with relevance scores - ready to paste into any LLM prompt.

Or call it from your agent as an MCP tool:

{
  "tool": "build_context",
  "arguments": {
    "query": "Python asyncio best practices",
    "budget_tokens": 4000
  }
}
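On the wire, MCP clients typically wrap a tool call like this in a JSON-RPC 2.0 `tools/call` request. The sketch below shows that framing, assuming the standard MCP message shape; the helper name `make_tool_call` and the `id` value are illustrative and not part of this package.

```python
import json


def make_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Frame a tool invocation as an MCP JSON-RPC 2.0 `tools/call` request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,  # arbitrary; the client picks it to match responses
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


request = make_tool_call(
    "build_context",
    {"query": "Python asyncio best practices", "budget_tokens": 4000},
)
print(request)
```

In practice an MCP client library or the host application (Cursor, Claude Desktop) builds this envelope for you.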

Why llm-context-search?

  • No LLM needed for retrieval. The full search → fetch → extract → rank → pack pipeline is pure Python. No API keys, no inference costs, no rate limits during context building.
  • MCP server out of the box. Drop the llm-context-mcp command into Cursor or Claude Desktop and your agent can call build_context, search, and collect_sources directly - without writing any integration code.
  • Token budget, not page count. The packer selects the highest-scoring passages that fit within your token budget, so the context stays within the limit you set no matter how many pages were fetched.
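  • Self-hosted, privacy-first. Connects to your own SearXNG instance, so queries go through infrastructure you control rather than a third-party search API. SearXNG still forwards queries to its configured upstream engines, and pages are fetched from their origin servers.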
  • Every stage is swappable. Provider, fetcher, extractor, chunker, ranker, scorer and packer are all Protocols. Pass your own implementation to ContextSearchEngine and the rest of the pipeline keeps working.
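To make the token-budget idea concrete, here is a minimal sketch of greedy packing. It is not the library's MarkdownPacker: the `ScoredPassage` class, the `pack` function, and the 4-characters-per-token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredPassage:  # hypothetical stand-in, not the library's Passage model
    text: str
    score: float


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def pack(passages: list[ScoredPassage], budget_tokens: int) -> list[ScoredPassage]:
    """Greedily keep the highest-scoring passages that still fit the budget."""
    selected: list[ScoredPassage] = []
    used = 0
    for passage in sorted(passages, key=lambda p: p.score, reverse=True):
        cost = estimate_tokens(passage.text)
        if used + cost <= budget_tokens:  # skip anything that would overflow
            selected.append(passage)
            used += cost
    return selected


passages = [
    ScoredPassage("a" * 400, 0.9),  # ~100 tokens
    ScoredPassage("b" * 800, 0.8),  # ~200 tokens
    ScoredPassage("c" * 200, 0.5),  # ~50 tokens
]
picked = pack(passages, budget_tokens=160)
print([p.score for p in picked])
```

With a 160-token budget the second passage is skipped because it would overflow, while the smaller third passage still fits. Greedy selection is simple but not optimal; a real packer may weigh score against length differently.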

Install

# from PyPI
pip install llm-context-search

# with uv
uv add llm-context-search

# for development, from source
git clone https://github.com/rorlikowski/llm-context-search && cd llm-context-search
uv sync --extra dev

Requires Python 3.11+.


The 60-second tour

CLI

# Search only – no page fetching
llm-context search "Python GIL" --max-results 10

# Fetch pages, show extraction status per source
llm-context collect "Python GIL" --max-sources 5 --verbose

# Full pipeline – print context to stdout
llm-context build "Python GIL" --budget 4000

# Save context to file, output JSON stats
llm-context build "Python GIL" --budget 4000 -o context.md --json

MCP server (Cursor / Claude Desktop)

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "llm-context-search": {
      "command": "llm-context-mcp",
      "env": { "SEARXNG_URL": "http://localhost:8888" }
    }
  }
}

Your agent now has three tools: search, collect_sources, build_context.

For remote deployments, run the HTTP transport instead:

LCS_MCP_TRANSPORT=http FASTMCP_PORT=9000 llm-context-mcp

Pipeline

Every query goes through the same stages:

query
  → SearXNGProvider           search n results
  → URL normalisation          lowercase, strip tracking params, remove default ports
  → deduplication              drop exact and near-duplicate URLs
  → PageFetcher                async concurrent fetch with SSRF protection
  → TrafilaturaExtractor       extract main article text (BS4 fallback)
  → SourceQualityScorer        heuristic score per source (https, length, title match…)
  → ParagraphChunker           split into overlapping passage windows
  → LexicalRanker              score by query-term coverage + source quality
  → MarkdownPacker             select top-N passages within token budget
  → ContextBundle              context_text + stats
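The normalisation and deduplication stages can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the tracking-parameter list is a guess, only scheme and host are lowercased (paths are case-sensitive), and only exact duplicates after normalisation are dropped, not near-duplicates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; the library's actual list is not shown here.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalise_url(url: str) -> str:
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:  # IPv6 literal: restore the brackets urlsplit removed
        host = f"[{host}]"
    # Keep the port only when it is not the default for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    ]
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, host, parts.path or "/", urlencode(query), ""))


def dedupe(urls: list[str]) -> list[str]:
    """Return normalised URLs in first-seen order, without repeats."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        key = normalise_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


print(normalise_url("HTTPS://Example.COM:443/Docs?utm_source=x&id=7#top"))
```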

All stages run in a single asyncio event loop; page fetching is concurrent (bounded by fetch_concurrency). CPU-bound extraction runs in a thread-pool executor to keep the loop free.
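The concurrency pattern described here can be sketched as follows. This is a generic asyncio illustration, not the package's PageFetcher: `fetch` and `extract` are placeholder functions, and only the `fetch_concurrency` name is taken from the text above.

```python
import asyncio


async def fetch(url: str) -> str:
    # Placeholder for a real HTTP request.
    await asyncio.sleep(0.01)
    return f"<html><p>{url}</p></html>"


def extract(html: str) -> str:
    # Placeholder for CPU-bound content extraction.
    return html.removeprefix("<html><p>").removesuffix("</p></html>")


async def fetch_and_extract(urls: list[str], fetch_concurrency: int = 3) -> list[str]:
    semaphore = asyncio.Semaphore(fetch_concurrency)

    async def worker(url: str) -> str:
        async with semaphore:  # at most fetch_concurrency fetches in flight
            html = await fetch(url)
        # Run blocking work in the default thread pool so the loop stays responsive.
        return await asyncio.to_thread(extract, html)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(worker(url) for url in urls))


urls = [f"https://example.com/{i}" for i in range(5)]
texts = asyncio.run(fetch_and_extract(urls))
print(texts)
```

The semaphore is released before extraction starts, so a slow parse does not hold up the next fetch.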


Replace any component

Every stage sits behind a Protocol. Pass your own implementation for any stage and the rest of the pipeline keeps working unchanged:

from llm_context_search import ContextSearchEngine
from llm_context_search.models import SearchResult, Passage, SourceDocument

class MyProvider:
    name = "brave"

    async def search(
        self, query: str, *, language: str = "en", max_results: int = 10
    ) -> list[SearchResult]:
        ...  # call Brave Search API

class MyRanker:
    def rank(
        self, query: str, passages: list[Passage], sources: dict[str, SourceDocument]
    ) -> list[Passage]:
        ...  # BM25, bi-encoder embeddings, cross-encoder reranker, etc.

engine = ContextSearchEngine(provider=MyProvider(), ranker=MyRanker())

| Protocol | Default | Swap to add |
| --- | --- | --- |
| SearchProvider | SearXNGProvider | Brave, DuckDuckGo, Tavily, … |
| PageFetcherProtocol | PageFetcher | Playwright, Firecrawl, cache layer, … |
| ContentExtractor | TrafilaturaExtractor | Jina Reader, custom parser, … |
| PassageChunker | ParagraphChunker | Sentence splitter, semantic chunker, … |
| PassageRanker | LexicalRanker | BM25, embeddings, cross-encoder, … |
| SourceScorer | SourceQualityScorer | Domain allow-list, freshness score, … |
| ContextPacker | MarkdownPacker | XML, JSON, custom template, … |
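Because the stages are typed as Protocols, a replacement only needs matching method signatures; it does not have to inherit from anything. The sketch below shows that structural typing with a toy query-term coverage ranker. `Chunk`, `Ranker`, and `CoverageRanker` are simplified, hypothetical stand-ins, and the scoring is not the library's LexicalRanker.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class Chunk:  # hypothetical stand-in for the library's Passage model
    text: str
    score: float = 0.0


@runtime_checkable
class Ranker(Protocol):  # illustrative; the real protocol is PassageRanker
    def rank(self, query: str, chunks: list[Chunk]) -> list[Chunk]: ...


class CoverageRanker:
    """Scores each chunk by the fraction of query terms it contains."""

    def rank(self, query: str, chunks: list[Chunk]) -> list[Chunk]:
        terms = set(query.lower().split())
        for chunk in chunks:
            words = set(chunk.text.lower().split())
            chunk.score = len(terms & words) / len(terms) if terms else 0.0
        return sorted(chunks, key=lambda c: c.score, reverse=True)


ranker = CoverageRanker()
print(isinstance(ranker, Ranker))  # structural check: no inheritance needed
ranked = ranker.rank(
    "python gil",
    [Chunk("the gil is a lock"), Chunk("python has a gil"), Chunk("unrelated text")],
)
print([(c.text, c.score) for c in ranked])
```

A static type checker such as mypy performs the same structural check at type-check time, which is what lets a custom class slot into the engine without subclassing.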

Documentation

| Section | What's inside |
| --- | --- |
| Installation | Docker + SearXNG setup, verification |
| Quickstart | All CLI commands, flags and examples |
| MCP Server | Cursor, Claude Desktop, HTTP transport, all tools |
| Python SDK | Engine API, data models, custom components |
| Configuration | Every config option and environment variable |
| API Reference | Auto-generated from docstrings |

The full documentation is published at rorlikowski.github.io/llm-context-search.

Browse locally with uv run mkdocs serve.


Development

uv sync --extra dev

uv run ruff check src/ tests/      # lint
uv run ruff format src/ tests/     # format
uv run mypy src                    # type-check
uv run pytest                      # test

Publishing a new release

  1. Bump version in pyproject.toml and src/llm_context_search/__init__.py.
  2. Commit and tag: git tag v0.2.0 && git push --tags.
  3. The release workflow builds, creates a GitHub Release, and publishes to PyPI via Trusted Publishing - no API tokens needed.

License

MIT - see LICENSE.
