Searchlight
Free, open-source web search MCP server for AI coding tools.
Works with Claude Code, Cursor, Windsurf, VS Code Copilot, and any MCP-compatible AI tool. Zero API keys required — install, add one line to your MCP config, and start searching.
Features
- Zero cost: Free search via native HTTP scraping (Bing, Baidu, Yandex, Brave, DuckDuckGo)
- Zero config: Works out of the box with the auto backend and language-aware routing
- 7 search engines with automatic failover and reachability probing
- 4 MCP tools: web_search, web_read, web_search_and_read, search_config
- Quality Site Library: Auto-enhances queries with authoritative sources (Anthropic, OpenAI, MCP docs, LangChain, etc.)
- Smart content extraction: trafilatura → readability → BeautifulSoup fallback chain with quality scoring
- JS page rendering: Automatic Jina AI proxy fallback for JavaScript-rendered pages
- Smart caching: Async SQLite with dynamic TTL (time-sensitive queries cache shorter)
- Auto-learning: Automatically discovers and adds high-quality websites from your reading patterns
- Security: Automatic API key/secret detection and redaction in queries
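To illustrate the "dynamic TTL" idea from the caching feature, here is a minimal sketch of how a cache lifetime could be shortened for time-sensitive queries. The function name, the keyword list, and the 1-hour value are assumptions made for this example, not Searchlight's actual rules; only the 24-hour figure mirrors the documented SEARCHLIGHT_CACHE_TTL value.

```python
import re
from typing import Optional

# Assumed values: 24 hours mirrors SEARCHLIGHT_CACHE_TTL; the 1-hour TTL and
# the keyword list below are illustrative guesses, not Searchlight's own.
DEFAULT_TTL_HOURS = 24.0
TIME_SENSITIVE_TTL_HOURS = 1.0
TIME_SENSITIVE_PATTERN = re.compile(
    r"\b(today|latest|news|breaking|current|price|release|20\d{2})\b",
    re.IGNORECASE,
)


def cache_ttl_hours(query: str, time_range: Optional[str] = None) -> float:
    """Return a shorter TTL when the query looks time-sensitive."""
    # An explicit short time range is the strongest signal of freshness needs.
    if time_range in {"day", "week"}:
        return TIME_SENSITIVE_TTL_HOURS
    # Otherwise fall back to a keyword heuristic on the query text.
    if TIME_SENSITIVE_PATTERN.search(query):
        return TIME_SENSITIVE_TTL_HOURS
    return DEFAULT_TTL_HOURS
```

A keyword heuristic like this is cheap but coarse; a real implementation may weigh more signals.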
Installation
Option 1: PyPI (Recommended)
pip install searchlight-mcp
Or with uv:
uv pip install searchlight-mcp
Option 2: Install from GitHub
pip install git+https://github.com/McKenzieIT/smart-web-search.git
Quick Start — One-Click MCP Setup
Claude Code
Add to ~/.claude.json or project .mcp.json:
{
  "mcpServers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"]
    }
  }
}
Or use the CLI one-liner:
claude mcp add searchlight -- python -m searchlight
Cursor
Add to .cursor/mcp.json in your project root:
{
  "mcpServers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"]
    }
  }
}
Or, for a global setup, add it to ~/.cursor/mcp.json.
VS Code Copilot
Add to .vscode/mcp.json:
{
  "servers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"]
    }
  }
}
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
  "mcpServers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"]
    }
  }
}
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"]
    }
  }
}
Generic MCP Client
Any MCP-compatible tool can use this configuration:
{
  "mcpServers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"]
    }
  }
}
Restart your AI tool after adding the config. Searchlight's 4 tools are immediately available.
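If you would rather script the setup than edit JSON by hand, the following sketch merges the searchlight entry into an existing config file without touching other servers. The helper name `add_searchlight` is hypothetical and not part of the package; the entry itself is the one shown in the configs above.

```python
import json
from pathlib import Path

# The server entry used throughout the Quick Start section.
SEARCHLIGHT_ENTRY = {"command": "python", "args": ["-m", "searchlight"]}


def add_searchlight(config_path: Path, servers_key: str = "mcpServers") -> dict:
    """Merge the searchlight entry into config_path, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        # Treat an empty file the same as a missing one.
        config = json.loads(text) if text else {}
    config.setdefault(servers_key, {})["searchlight"] = dict(SEARCHLIGHT_ENTRY)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_searchlight(Path(".mcp.json"))` covers the project-level Claude Code file, while VS Code Copilot needs `servers_key="servers"` with `.vscode/mcp.json`.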
Agent Prompts — Teach Your AI to Search Well
After installing searchlight, paste the appropriate prompt into your agent's system prompt or custom instructions so it knows when and how to search.
Claude Code
Add to your CLAUDE.md or project .claude/instructions.md:
## Web Search Guidelines
- Use `web_search` to find documentation, error solutions, current events, and comparisons.
- Use `web_read` to extract detailed content from a specific URL.
- Use `web_search_and_read` for deep research that requires reading multiple pages.
- For quick lookups, `web_search` alone is sufficient — no need to read every result.
- Use `time_range="month"` or `"week"` for current events or recent changes.
- Use `mode="preview"` to check if a page is relevant before reading the full content.
- The Quality Site Library automatically prioritizes authoritative sources for AI/developer topics.
Cursor
Add to .cursorrules or Cursor's custom instructions:
## Web Search
When you need current information, documentation, or solutions not in your training data:
- Use `web_search(query)` to find relevant results quickly.
- Use `web_read(url)` to read a specific page's content.
- Use `web_search_and_read(query)` for comprehensive research.
- Prefer `web_search` for quick answers; use `web_search_and_read` for in-depth analysis.
- Use `time_range="month"` for recent information.
VS Code Copilot
Add to .github/copilot-instructions.md:
## Web Search
- `web_search(query)`: Find information on the web. Returns titles, URLs, snippets.
- `web_read(url)`: Read a web page's content as Markdown.
- `web_search_and_read(query)`: Search and read top results in one call.
- Use web_search as the first choice. Use web_search_and_read for research tasks.
- Use time_range parameter for current events.
Windsurf
Add to .windsurfrules:
## Web Search
- Use web_search to find docs, error solutions, and current info.
- Use web_read to extract content from specific URLs.
- Use web_search_and_read for deep research.
- The searchlight MCP server auto-boosts results from authoritative AI/developer sources.
MCP Tools
web_search
Search the web using multiple engines. Returns Markdown-formatted results.
web_search(query="Python asyncio tutorial", max_results=10, time_range="month")
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search terms (max 500 chars) |
| max_results | int | 10 | Number of results (1-20) |
| language | string | null | Language code (zh, en, ja, etc.) |
| time_range | string | null | "day", "week", "month", "year" |
| backend | string | null | Override backend for this search |
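The limits in the table can be checked on the client side before a call is made. The sketch below is a hypothetical helper (`validate_web_search_args` is not part of Searchlight) that mirrors the documented constraints; how the server itself reacts to out-of-range values is not described here, so this version simply raises.

```python
from typing import Optional

VALID_TIME_RANGES = {"day", "week", "month", "year"}


def validate_web_search_args(
    query: str, max_results: int = 10, time_range: Optional[str] = None
) -> dict:
    """Check arguments against the documented web_search limits."""
    if not query.strip():
        raise ValueError("query is required")
    if len(query) > 500:
        raise ValueError("query must be at most 500 characters")
    if not 1 <= max_results <= 20:
        raise ValueError("max_results must be between 1 and 20")
    if time_range is not None and time_range not in VALID_TIME_RANGES:
        raise ValueError(f"time_range must be one of {sorted(VALID_TIME_RANGES)}")
    # Only include optional arguments that were actually set.
    args = {"query": query, "max_results": max_results}
    if time_range is not None:
        args["time_range"] = time_range
    return args
```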
web_read
Read and extract clean Markdown content from a web page.
web_read(url="https://docs.python.org/3/library/asyncio.html", max_length=10000, mode="full")
| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | required | URL to read |
| max_length | int | 10000 | Maximum characters to return |
| mode | string | "full" | "preview" (headings only) or "full" |
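As a rough picture of what a headings-only preview involves, the sketch below pulls the heading lines out of a Markdown document while skipping anything inside fenced code blocks. It is an illustration only: `markdown_outline` is a hypothetical name, and Searchlight's actual preview output may differ.

```python
import re
from typing import List

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # Markdown code-fence marker, built indirectly for readability


def markdown_outline(markdown: str) -> List[str]:
    """Return the ATX headings of a Markdown document, in order."""
    outline = []
    in_fence = False
    for line in markdown.splitlines():
        # Toggle fence state so '#' comments inside code are not headings.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            outline.append(f"{match.group(1)} {match.group(2)}")
    return outline
```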
web_search_and_read
Search + read top results in one call. Best for deep research.
web_search_and_read(query="FastAPI vs Flask comparison", max_read=2)
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| max_results | int | 5 | Max search results |
| max_read | int | 2 | Pages to read (auto-extends on failure) |
| max_length | int | 10000 | Max chars per page |
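One plausible reading of "auto-extends on failure" is that a failed read does not count toward max_read, so the next-ranked result is tried instead. The sketch below shows that interpretation; `read_top_results` and the `read_page` callable are hypothetical stand-ins, not Searchlight's internals.

```python
from typing import Callable, List, Optional, Tuple


def read_top_results(
    urls: List[str],
    read_page: Callable[[str], Optional[str]],
    max_read: int = 2,
) -> List[Tuple[str, str]]:
    """Read results in rank order until max_read pages have succeeded."""
    pages = []
    for url in urls:
        if len(pages) >= max_read:
            break
        content = read_page(url)
        # A falsy result (None or empty text) counts as a failed read,
        # so the loop moves on to the next search result.
        if content:
            pages.append((url, content))
    return pages
```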
search_config
View and manage searchlight configuration.
search_config(action="status") # Show config, cache, QSL stats
search_config(action="health_check") # Test engine connectivity
search_config(action="clear_cache") # Clear all cached results
search_config(action="set_backend", backend="bing") # Switch default backend
Search Backends
All backends use direct HTTP scraping — no API keys needed.
| Backend | Description | Language |
|---|---|---|
| auto | Best available (default) | Auto-detect |
| bing | Bing International | English |
| bing_cn | Bing China | Chinese |
| baidu | Baidu Search | Chinese |
| yandex | Yandex Search | English |
| brave | Brave Search | English |
| duckduckgo | DuckDuckGo | English |
The auto backend automatically detects Chinese characters and routes to Baidu/Bing CN for Chinese queries, and uses Brave/DuckDuckGo/Bing for English queries. Engines are probed for reachability and failed engines are skipped.
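The routing described above can be sketched in a few lines. This is a simplified illustration, not the project's code: the function name is hypothetical, the priority order within each group is assumed from the order in which the engines are listed, and "Chinese characters" is approximated by the CJK Unified Ideographs block.

```python
import re
from typing import Iterable, List

# Assumption: CJK Unified Ideographs stand in for "Chinese characters".
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")

# Assumed priority orders, taken from the order the engines are listed in.
CHINESE_ORDER = ["baidu", "bing_cn"]
ENGLISH_ORDER = ["brave", "duckduckgo", "bing"]


def candidate_backends(query: str, reachable: Iterable[str]) -> List[str]:
    """Return backends to try, in order, skipping unreachable engines."""
    preferred = CHINESE_ORDER if CJK_PATTERN.search(query) else ENGLISH_ORDER
    reachable_set = set(reachable)
    return [name for name in preferred if name in reachable_set]
```

The caller would try each returned backend in turn and move on when one fails, which gives the failover behaviour.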
Quality Site Library
Searchlight includes a built-in Quality Site Library that enhances search results for AI/developer topics:
- Query Enhancement — Automatically adds authoritative keywords when searching for LLM, MCP, agent, Python topics
- Result Boosting — Moves results from quality domains (official docs, research papers) higher in rankings
- Auto-Learning — Tracks websites you read and automatically adds high-quality ones to the library
Built-in categories with curated sources:
| Category | Quality Sources |
|---|---|
| LLM | OpenAI Platform, Anthropic Docs, Google AI, Hugging Face |
| MCP | modelcontextprotocol.io, Anthropic MCP Docs |
| Agents | LangChain, CrewAI, LlamaIndex |
| Anthropic | anthropic.com, docs.anthropic.com |
| Google AI | ai.google.dev, Google Cloud |
| Python | docs.python.org, PyPI, uv/Real Python |
The library is stored at ~/.searchlight/sites.json and can be manually edited.
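Result boosting can be pictured as a stable sort that moves results from quality domains to the front while preserving the engine's order within each group. The sketch below assumes a plain list of domain names; the real schema of sites.json is not described here, and `boost_quality_results` is a hypothetical name.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def boost_quality_results(
    results: List[Dict[str, str]], quality_domains: Iterable[str]
) -> List[Dict[str, str]]:
    """Move results hosted on quality domains ahead of the rest."""
    domains = tuple(d.lower() for d in quality_domains)

    def is_quality(result: Dict[str, str]) -> bool:
        host = (urlparse(result["url"]).hostname or "").lower()
        # Match the domain itself or any of its subdomains.
        return any(host == d or host.endswith("." + d) for d in domains)

    # sorted() is stable, so the original ranking is kept inside each group.
    return sorted(results, key=lambda r: not is_quality(r))
```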
Content Extraction Pipeline
URL → HTTP Fetch (SSL progressive degradation)
    → trafilatura (Markdown mode)
    → readability + markdownify (fallback)
    → BeautifulSoup cleanup (fallback)
    → Quality Report (5-dimension assessment)
    → Section-aware truncation
    → Cached in SQLite
Quality scores measure five dimensions: text density, structure quality, freedom from noise, completeness, and HTML cleanliness. JavaScript-rendered pages automatically fall back to the Jina AI proxy for rendering.
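The fallback chain above follows a common pattern: try each extractor in order and accept the first result that is good enough. The sketch below shows that pattern with extractors passed in as plain callables, so it does not depend on any library's signature. The `min_chars` length check is a deliberately crude stand-in for the 5-dimension quality report, and all names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    html: str,
    extractors: List[Tuple[str, Extractor]],
    min_chars: int = 200,
) -> Tuple[Optional[str], str]:
    """Return (extractor name, text) from the first good-enough extractor."""
    best_name, best_text = None, ""
    for name, extractor in extractors:
        try:
            text = extractor(html) or ""
        except Exception:
            # A crashing extractor just means "try the next one".
            continue
        if len(text) >= min_chars:
            return name, text
        # Remember the longest weak result in case nothing passes.
        if len(text) > len(best_text):
            best_name, best_text = name, text
    return best_name, best_text
```

In practice the list might hold wrappers around trafilatura, readability, and BeautifulSoup, in that order.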
Configuration
Set via MCP config env field or shell environment:
SEARCHLIGHT_BACKEND=auto # Default backend
SEARCHLIGHT_CACHE_TTL=24 # Cache TTL in hours
SEARCHLIGHT_CACHE_MAX_SIZE=100 # Max cache size in MB
SEARCHLIGHT_MAX_CONTENT=10000 # Max content length in chars
SEARCHLIGHT_TIMEOUT=15 # HTTP timeout in seconds
SEARCHLIGHT_VERBOSE=true # Enable debug logging
Example with custom backend:
{
  "mcpServers": {
    "searchlight": {
      "command": "python",
      "args": ["-m", "searchlight"],
      "env": {
        "SEARCHLIGHT_BACKEND": "bing"
      }
    }
  }
}
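For readers curious how environment-based settings like these are typically consumed, here is a minimal sketch of a loader. The class and function names are hypothetical, the defaults are assumed to equal the values shown in the listing above, and the verbose default of off is a guess.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SearchlightConfig:
    # Defaults assumed from the documented example values.
    backend: str = "auto"
    cache_ttl_hours: int = 24
    cache_max_size_mb: int = 100
    max_content_chars: int = 10000
    timeout_seconds: int = 15
    verbose: bool = False  # assumption: logging is off unless enabled


def load_config(env: Optional[Mapping[str, str]] = None) -> SearchlightConfig:
    """Build a config object from SEARCHLIGHT_* environment variables."""
    env = os.environ if env is None else env
    verbose = env.get("SEARCHLIGHT_VERBOSE", "").strip().lower()
    return SearchlightConfig(
        backend=env.get("SEARCHLIGHT_BACKEND", "auto"),
        cache_ttl_hours=int(env.get("SEARCHLIGHT_CACHE_TTL", "24")),
        cache_max_size_mb=int(env.get("SEARCHLIGHT_CACHE_MAX_SIZE", "100")),
        max_content_chars=int(env.get("SEARCHLIGHT_MAX_CONTENT", "10000")),
        timeout_seconds=int(env.get("SEARCHLIGHT_TIMEOUT", "15")),
        verbose=verbose in {"1", "true", "yes"},
    )
```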
Architecture
searchlight/
├── server.py           # MCP server + 4 tools
├── sites_library.py    # Quality Site Library (QSL)
├── config.py           # Environment-based config
├── search/
│   ├── base.py         # SearchBackend ABC + SearchResult
│   └── native.py       # Native HTTP search (7 engines)
├── reader/
│   ├── fetcher.py      # HTTP fetching + SSL fallback + Jina proxy
│   └── extractor.py    # Content extraction + QualityReport
├── processing/
│   ├── filter.py       # Dedup + spam filtering
│   └── truncator.py    # Section-aware Markdown truncation
├── cache/
│   └── sqlite.py       # Async SQLite with smart TTL
├── security/
│   └── sanitizer.py    # Secret detection in queries
└── utils/
    ├── logger.py       # Logging setup
    └── health.py       # Backend health checks
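The tree mentions a SearchBackend ABC and a SearchResult type in search/base.py. The sketch below shows what such a pair commonly looks like, to make the layering concrete. Everything beyond the two class names is assumed: the fields come from the "titles, URLs, snippets" description of web_search, the method signature is a guess, and the real classes may well be async.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SearchResult:
    # Assumed fields, based on what web_search is documented to return.
    title: str
    url: str
    snippet: str


class SearchBackend(ABC):
    """Common interface that every search engine adapter would implement."""

    name: str = "base"

    @abstractmethod
    def search(self, query: str, max_results: int = 10) -> List[SearchResult]:
        """Return up to max_results results for query."""


class StaticBackend(SearchBackend):
    """Toy backend that serves canned results, useful in tests."""

    name = "static"

    def __init__(self, results: Iterable[SearchResult]) -> None:
        self._results = list(results)

    def search(self, query: str, max_results: int = 10) -> List[SearchResult]:
        return self._results[:max_results]
```

Keeping engines behind one interface is what lets failover swap one backend for another without changing the tool code.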
Publishing
pip install build twine
python -m build
twine upload dist/*
Development
pip install -e ".[dev]"
pytest tests/ -v
Run with verbose logging:
SEARCHLIGHT_VERBOSE=true python -m searchlight
License
MIT
File details
Details for the file searchlight_mcp-4.1.0.tar.gz.
File metadata
- Download URL: searchlight_mcp-4.1.0.tar.gz
- Upload date:
- Size: 70.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bc1d4913e511ff0d4e18fa5672d0f3f058bfff1379176604b30de475e62f0779 |
| MD5 | d6cc08b32286e14d9e67e2b13e36b9c5 |
| BLAKE2b-256 | 7ce6a0b7c279f05d0d70dade899e2f26de4e3bc97fcb42c3fb24f22d076f3f17 |
File details
Details for the file searchlight_mcp-4.1.0-py3-none-any.whl.
File metadata
- Download URL: searchlight_mcp-4.1.0-py3-none-any.whl
- Upload date:
- Size: 36.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5d5e76a21021706ad32df68b3f6dcf83803f163f6bec9be6cbe8ef1e4f46cdf0 |
| MD5 | 1cdfd7f6a0b298b76cd8f9e749c6174e |
| BLAKE2b-256 | d71b04915591311148178da3934630abdfbb063e8d786604baec65a4a4fed416 |