MCP Web Search service for AI ecosystem
MCP Web Search
An MCP service for web search and content extraction, built on the Model Context Protocol with FastMCP.
Features
Four MCP tools:
- search: web search with smart filtering and a fallback chain
- content: clean text extraction from URLs with SSRF protection
- webfetch: agent-based search via LangGraph StateGraph + LLM-as-Judge
- llm_health: health status of the LLM models in the failover chain
Architecture
FastMCP (primary server)
├── search tool → DuckDuckGo + fallback chain + smart filtering
├── content tool → Trafilatura + SSRF protection + cache
└── webfetch tool → LangGraph StateGraph (8 nodes) + LLM-as-Judge
Installation
# Clone the repository
git clone https://github.com/M0M0S/mcp-webs.git
cd mcp-webs
# Install dependencies
uv sync
# Configure environment variables
cp .env.example .env
# Fill in .env (LLM_API_KEY, LLM_BASE_URL, etc.)
Usage
Start MCP Server
uv run python -m app.main
Connect to Claude Desktop (example)
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "web-search": {
      "command": "uv",
      "args": ["run", "python", "-m", "app.main"],
      "env": {
        "LLM_API_KEY": "your-key",
        "LLM_BASE_URL": "https://api.openai.com/v1"
      }
    }
  }
}
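If claude_desktop_config.json already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch, not part of this project: `add_mcp_server` and `WEB_SEARCH_ENTRY` are hypothetical names, and the config path is passed in by the caller because its location depends on the platform.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into the config file, keeping existing servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the "mcpServers" section if it is missing, then add or replace the entry.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Same values as the JSON example above; replace "your-key" with a real key.
WEB_SEARCH_ENTRY = {
    "command": "uv",
    "args": ["run", "python", "-m", "app.main"],
    "env": {
        "LLM_API_KEY": "your-key",
        "LLM_BASE_URL": "https://api.openai.com/v1",
    },
}
```

Calling `add_mcp_server(path, "web-search", WEB_SEARCH_ENTRY)` leaves any other entries under `mcpServers` untouched.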
MCP Tools
| Tool | Description | Parameters |
|---|---|---|
| search | Web search with fallback chain | query, max_results, provider |
| content | Extract text content from URL | url, token_limit |
| webfetch | Agent-based search via LangGraph | query, max_concurrent |
Development
Project Standards
- CONTRIBUTING.md — how to contribute, process, standards
- SECURITY.md — security policy, SSRF, secret handling
- docs/standards/ — detailed standards reference
Commands
# Tests
uv run pytest tests/ -v
# Coverage
uv run pytest tests/ --cov=app --cov-report=term-missing
# Linting
uv run ruff check app/ tests/
# Formatting
uv run ruff format app/ tests/
# Type checking
uv run mypy app/
# Security scan
uv run bandit -r app/
Configuration
Environment variables are documented in docs/standards/configuration.md.
Search Logic
search — search with fallback chain:
- Cache lookup (Redis cache-aside)
- DuckDuckGo → SearxNG → Tavily → Google (fallback chain)
- Smart filtering (SEO spam, clickbait, blacklist)
- Cache write of the filtered results
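The search flow above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: a plain dict stands in for Redis, providers are passed in as callables, and `search_with_fallback` and `is_spam` are hypothetical names. The sketch also assumes that a provider whose results are all filtered out counts as a miss, so the chain moves on to the next provider.

```python
from typing import Callable

# A provider takes (query, max_results) and returns a list of result dicts.
Provider = Callable[[str, int], list[dict]]


def search_with_fallback(
    query: str,
    providers: list[tuple[str, Provider]],
    cache: dict,
    is_spam: Callable[[dict], bool],
    max_results: int = 10,
) -> list[dict]:
    """Cache-aside lookup, then try providers in order until one yields results."""
    key = (query, max_results)
    if key in cache:  # cache hit: skip the providers entirely
        return cache[key]
    for _name, provider in providers:
        try:
            raw = provider(query, max_results)
        except Exception:
            continue  # provider failed: fall through to the next one
        results = [r for r in raw if not is_spam(r)]
        if results:
            cache[key] = results  # store only filtered, non-empty results
            return results
    return []  # every provider failed or returned nothing usable
```

A real deployment would add a TTL to the cache entries and record per-provider failures. Both are omitted here to keep the control flow visible.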
content — content extraction:
- SSRF protection (whitelist + private IP check)
- Trafilatura → readability-lxml → bs4 (fallback chain)
- HTML sanitization (bleach)
- Caching (TTL: 24h)
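The SSRF step (whitelist plus private-IP check) might look like the sketch below. It uses only the standard library and is an assumption about the general technique, not the project's implementation. `is_url_allowed` and `resolve_host` are hypothetical names, and the resolver is injectable so the check can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, Iterable, Optional
from urllib.parse import urlsplit


def resolve_host(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses via the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_url_allowed(
    url: str,
    allowed_hosts: Optional[Iterable[str]] = None,
    resolver: Callable[[str], list[str]] = resolve_host,
) -> bool:
    """Return True only for http(s) URLs whose host resolves to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname.lower()
    # Optional whitelist: when given, the host must be listed explicitly.
    if allowed_hosts is not None and host not in set(allowed_hosts):
        return False
    try:
        addresses = resolver(host)
    except OSError:  # includes socket.gaierror for unresolvable names
        return False
    if not addresses:
        return False
    for raw in addresses:
        try:
            ip = ipaddress.ip_address(raw.split("%")[0])  # drop any IPv6 scope id
        except ValueError:
            return False
        # Reject loopback, private, link-local, and other non-public ranges.
        if not ip.is_global:
            return False
    return True
```

A check like this covers only the address seen at validation time. A fetcher that resolves the name again later can still be redirected, so connecting to the validated IP (or re-checking after redirects) is the safer design.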
webfetch — agent-based search:
- Stage 1: Generate queries via LLM
- Stage 2: Parallel searches (6 concurrent)
- Stage 3: Select URLs for extraction
- Stage 4: Judge URLs (LLM-as-Judge, threshold ≥0.85)
- Stage 5: Fetch content (Trafilatura)
- Stage 6: Generate features (Pydantic models)
- Stage 7: Judge Features (threshold ≥0.92)
- Fallback: Simple search on agent failure
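Two mechanics from the stages above, the bounded parallel search (Stage 2) and the score threshold (Stage 4), can be sketched with asyncio alone. This is a simplified illustration: the project itself uses a LangGraph StateGraph, and `run_searches`, `judge_filter`, and the `search` callable are hypothetical stand-ins.

```python
import asyncio
from typing import Awaitable, Callable

MAX_CONCURRENT = 6     # Stage 2: parallel searches
URL_THRESHOLD = 0.85   # Stage 4: minimum judge score for a URL


async def run_searches(
    queries: list[str],
    search: Callable[[str], Awaitable[list[str]]],
    max_concurrent: int = MAX_CONCURRENT,
) -> list[str]:
    """Run one search per query, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(query: str) -> list[str]:
        async with semaphore:
            return await search(query)

    batches = await asyncio.gather(
        *(bounded(q) for q in queries), return_exceptions=True
    )
    urls: list[str] = []
    for batch in batches:
        if isinstance(batch, Exception):
            continue  # one failed query does not abort the others
        for url in batch:
            if url not in urls:  # de-duplicate while keeping order
                urls.append(url)
    return urls


def judge_filter(scores: dict[str, float], threshold: float = URL_THRESHOLD) -> list[str]:
    """Keep only the items whose judge score meets the threshold."""
    return [item for item, score in scores.items() if score >= threshold]
```

The same `judge_filter` shape would apply to Stage 7 with `threshold=0.92`. How the scores are produced by the LLM judge is outside this sketch.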
Prometheus Metrics
Implemented metrics (via app/core/metrics.py):
| Metric | Type | Description |
|---|---|---|
| provider_search_total | Counter | Search attempts per provider |
| provider_search_failure_total | Counter | Failed searches per provider |
| provider_health_score | Gauge | Provider health (0.0–1.0) |
| provider_chain_position | Gauge | Provider position in fallback chain |
| llm_failover_total | Counter | LLM failover events (from→to model) |
| llm_failover_duration_seconds | Histogram | Failover duration |
| llm_model_health_score | Gauge | LLM model health (0.0–1.0) |
| llm_active_model_index | Gauge | Active LLM model index |
| webfetch_checkpoint_save_total | Counter | WebFetch checkpoint saves |
| webfetch_checkpoint_resume_total | Counter | WebFetch checkpoint resumes |
| webfetch_checkpoint_size_bytes | Histogram | Checkpoint payload size |
| webfetch_active_checkpoints | Gauge | Active checkpoints per tenant |
| cache_ttl_distribution_seconds | Histogram | Cache TTL distribution |
| cache_stale_invalidations_total | Counter | Cache stale invalidations |
| cache_freshness_avg | Gauge | Average cache freshness |
| knowledge_graph_concepts_count | Gauge | KG concepts count |
| knowledge_graph_terms_count | Gauge | KG related terms count |
| kg_expansion_applied_total | Counter | KG expansion events |
| kg_enriched_concepts_total | Counter | KG enriched concepts |
See Also
- CHANGELOG.md — version history
- pyproject.toml — dependencies and configuration