Fast LLM-free search-to-context engine for AI agents
llm-context-search
Turn a search query into ranked, token-bounded context for your LLM - no LLM required for retrieval, MCP server included.
pip install llm-context-search
docker compose up -d # start SearXNG
llm-context build "your query"
Quick example
Search for Python asyncio best practices, fetch the top 5 pages, extract main content, rank the most relevant passages, and pack them into a 4 000-token context block. One command.
llm-context build "Python asyncio best practices" \
--max-sources 5 --budget 4000 -o context.md
Searching… done 10 results → 5 unique
Fetching… done 5/5 pages fetched
Extracting… done 5 ok, 0 failed
Ranking… done 38 passages → 11 selected
Context saved to context.md - ~3 912 tokens, 4 831 ms
The output is Markdown, grouped by source with relevance scores - ready to paste into any LLM prompt.
Or call it from your agent as an MCP tool:
{
"tool": "build_context",
"arguments": {
"query": "Python asyncio best practices",
"budget_tokens": 4000
}
}
Why llm-context-search?
- No LLM needed for retrieval. The full search → fetch → extract → rank → pack pipeline is pure Python. No API keys, no inference costs, no rate limits during context building.
- MCP server out of the box. Drop the llm-context-mcp command into Cursor or Claude Desktop and your agent can call build_context, search, and collect_sources directly - without writing any integration code.
- Token budget, not page count. The packer selects the highest-scoring passages that fit within your token budget, so the context is always the right size for your model.
- Self-hosted, privacy-first. Connects to your own SearXNG instance. Queries and page content never leave your infrastructure.
- Every stage is swappable. Provider, fetcher, extractor, chunker, ranker, scorer and packer are all Protocols. Pass your own implementation to ContextSearchEngine and the rest of the pipeline keeps working.
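To make the "token budget, not page count" idea concrete, here is a minimal sketch of greedy budget packing. It is not the library's MarkdownPacker: the ScoredPassage type, the estimate_tokens heuristic (about four characters per token) and pack_within_budget are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredPassage:
    """Hypothetical stand-in for a ranked passage (not the library's model)."""
    text: str
    score: float


def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly one token per four characters."""
    return max(1, len(text) // 4)


def pack_within_budget(
    passages: list[ScoredPassage], budget_tokens: int
) -> list[ScoredPassage]:
    """Greedily keep the highest-scoring passages that still fit the budget."""
    selected: list[ScoredPassage] = []
    used = 0
    for passage in sorted(passages, key=lambda p: p.score, reverse=True):
        cost = estimate_tokens(passage.text)
        if used + cost > budget_tokens:
            continue  # too large for the remaining budget; try smaller passages
        selected.append(passage)
        used += cost
    return selected


passages = [
    ScoredPassage("a" * 400, 0.9),  # ~100 tokens
    ScoredPassage("b" * 800, 0.8),  # ~200 tokens
    ScoredPassage("c" * 200, 0.5),  # ~50 tokens
]
packed = pack_within_budget(passages, budget_tokens=160)
print([p.score for p in packed])
```

With a 160-token budget the second passage is skipped because it alone would overshoot, while the smaller third passage still fits. A page-count limit could not make that trade-off.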
Install
# from PyPI
pip install llm-context-search
# with uv
uv add llm-context-search
# for development, from source
git clone https://github.com/rorlikowski/llm-context-search && cd llm-context-search
uv sync --extra dev
Requires Python 3.11+.
The 60-second tour
CLI
# Search only – no page fetching
llm-context search "Python GIL" --max-results 10
# Fetch pages, show extraction status per source
llm-context collect "Python GIL" --max-sources 5 --verbose
# Full pipeline – print context to stdout
llm-context build "Python GIL" --budget 4000
# Save context to file, output JSON stats
llm-context build "Python GIL" --budget 4000 -o context.md --json
MCP server (Cursor / Claude Desktop)
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"llm-context-search": {
"command": "llm-context-mcp",
"env": { "SEARXNG_URL": "http://localhost:8888" }
}
}
}
Your agent now has three tools: search, collect_sources, build_context.
For remote deployments, run the HTTP transport instead:
LCS_MCP_TRANSPORT=http FASTMCP_PORT=9000 llm-context-mcp
Pipeline
Every query goes through the same stages:
query
→ SearXNGProvider search n results
→ URL normalisation lowercase, strip tracking params, remove default ports
→ deduplication drop exact and near-duplicate URLs
→ PageFetcher async concurrent fetch with SSRF protection
→ TrafilaturaExtractor extract main article text (BS4 fallback)
→ SourceQualityScorer heuristic score per source (https, length, title match…)
→ ParagraphChunker split into overlapping passage windows
→ LexicalRanker score by query-term coverage + source quality
→ MarkdownPacker select top-N passages within token budget
→ ContextBundle context_text + stats
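The URL normalisation and deduplication stages listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: the tracking-parameter list is invented, only the scheme and host are lowercased (paths can be case-sensitive), fragments are dropped, and only exact matches after normalisation are treated as duplicates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; the library's actual list is not shown here.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalise_url(url: str) -> str:
    """Lowercase scheme/host, drop default ports, tracking params and fragment."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    query = urlencode(
        [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in TRACKING_NAMES and not key.startswith(TRACKING_PREFIXES)
        ]
    )
    return urlunsplit((scheme, host, parts.path, query, ""))


def dedupe(urls: list[str]) -> list[str]:
    """Keep the first occurrence of each normalised URL, preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        key = normalise_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


print(normalise_url("HTTPS://Example.COM:443/Path?utm_source=x&id=7#top"))
```

Near-duplicate detection, which the pipeline also mentions, would need an extra comparison step that this sketch does not attempt.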
All stages run in a single asyncio event loop; page fetching is concurrent
(bounded by fetch_concurrency). CPU-bound extraction runs in a thread-pool
executor to keep the loop free.
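The concurrency model described above (fetches bounded by a limit, CPU-bound extraction moved off the event loop) can be sketched with asyncio alone. fake_fetch and fake_extract are hypothetical stand-ins that make no network calls. Only the fetch_concurrency name comes from this README, and how the library wires it internally is an assumption.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fake_fetch(url: str) -> str:
    """Stand-in for a network call; no real HTTP request is made."""
    await asyncio.sleep(0.01)
    return f"<html><body>{url}</body></html>"


def fake_extract(html: str) -> str:
    """Stand-in for CPU-bound main-content extraction."""
    return html.removeprefix("<html><body>").removesuffix("</body></html>")


async def fetch_and_extract(urls: list[str], fetch_concurrency: int = 2) -> list[str]:
    semaphore = asyncio.Semaphore(fetch_concurrency)
    loop = asyncio.get_running_loop()

    with ThreadPoolExecutor() as pool:

        async def process(url: str) -> str:
            async with semaphore:  # at most fetch_concurrency fetches in flight
                html = await fake_fetch(url)
            # Run the CPU-bound step in a worker thread so the loop stays free.
            return await loop.run_in_executor(pool, fake_extract, html)

        # gather preserves input order regardless of completion order.
        return await asyncio.gather(*(process(url) for url in urls))


texts = asyncio.run(
    fetch_and_extract(["https://a.example", "https://b.example", "https://c.example"])
)
print(texts)
```

Releasing the semaphore before extraction means a slow parse does not hold up the next fetch.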
Replace any component
Every stage sits behind a Protocol. Pass your own implementation and the rest
of the pipeline adapts automatically:
from llm_context_search import ContextSearchEngine
from llm_context_search.models import SearchResult, Passage, SourceDocument


class MyProvider:
    name = "brave"

    async def search(
        self, query: str, *, language: str = "en", max_results: int = 10
    ) -> list[SearchResult]:
        ...  # call Brave Search API


class MyRanker:
    def rank(
        self, query: str, passages: list[Passage], sources: dict[str, SourceDocument]
    ) -> list[Passage]:
        ...  # BM25, bi-encoder embeddings, cross-encoder reranker, etc.


engine = ContextSearchEngine(provider=MyProvider(), ranker=MyRanker())
| Protocol | Default | Swap to add |
|---|---|---|
| SearchProvider | SearXNGProvider | Brave, DuckDuckGo, Tavily, … |
| PageFetcherProtocol | PageFetcher | Playwright, Firecrawl, cache layer, … |
| ContentExtractor | TrafilaturaExtractor | Jina Reader, custom parser, … |
| PassageChunker | ParagraphChunker | Sentence splitter, semantic chunker, … |
| PassageRanker | LexicalRanker | BM25, embeddings, cross-encoder, … |
| SourceScorer | SourceQualityScorer | Domain allow-list, freshness score, … |
| ContextPacker | MarkdownPacker | XML, JSON, custom template, … |
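Swapping components works because Python Protocols are structural: a class qualifies by having the right methods, not by inheriting from a base class. The sketch below shows this with a toy ranker that combines query-term coverage with source quality, the two signals the pipeline attributes to LexicalRanker. DemoPassage, DemoRanker and CoverageRanker are hypothetical. The real Passage model and PassageRanker signature (which also takes a sources mapping) may differ, and the 0.2 quality weight is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class DemoPassage:
    """Hypothetical passage type; the library's Passage model may differ."""
    text: str
    source_quality: float  # assumed to lie in [0, 1]


@runtime_checkable
class DemoRanker(Protocol):
    def rank(self, query: str, passages: list[DemoPassage]) -> list[DemoPassage]: ...


class CoverageRanker:
    """Scores by the fraction of query terms present, plus a quality bonus."""

    def __init__(self, quality_weight: float = 0.2) -> None:
        self.quality_weight = quality_weight

    def _score(self, terms: set[str], passage: DemoPassage) -> float:
        words = set(passage.text.lower().split())
        coverage = len(terms & words) / len(terms) if terms else 0.0
        return coverage + self.quality_weight * passage.source_quality

    def rank(self, query: str, passages: list[DemoPassage]) -> list[DemoPassage]:
        terms = set(query.lower().split())
        return sorted(passages, key=lambda p: self._score(terms, p), reverse=True)


ranker = CoverageRanker()
ranked = ranker.rank(
    "python gil",
    [
        DemoPassage("The GIL serialises bytecode execution in Python", 0.5),
        DemoPassage("Rust has no global lock", 0.9),
        DemoPassage("Python threads", 0.1),
    ],
)
# CoverageRanker never subclasses DemoRanker, yet it satisfies the Protocol.
print(isinstance(ranker, DemoRanker), [p.text for p in ranked])
```

Note that a passage from a high-quality source still ranks last when it matches no query terms, because coverage dominates the score at this weight.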
Documentation
| Section | What's inside |
|---|---|
| Installation | Docker + SearXNG setup, verification |
| Quickstart | All CLI commands, flags and examples |
| MCP Server | Cursor, Claude Desktop, HTTP transport, all tools |
| Python SDK | Engine API, data models, custom components |
| Configuration | Every config option and environment variable |
| API Reference | Auto-generated from docstrings |
The full documentation is published at rorlikowski.github.io/llm-context-search.
Browse locally with uv run mkdocs serve.
Development
uv sync --extra dev
uv run ruff check src/ tests/ # lint
uv run ruff format src/ tests/ # format
uv run mypy src # type-check
uv run pytest # test
Publishing a new release
- Bump version in pyproject.toml and src/llm_context_search/__init__.py.
- Commit and tag: git tag v0.2.0 && git push --tags.
- The release workflow builds, creates a GitHub Release, and publishes to PyPI via Trusted Publishing - no API tokens needed.
License
MIT - see LICENSE.