
Search MCP Server

License: MIT | Python 3.10+

A powerful Model Context Protocol (MCP) server providing AI-enhanced Baidu search with intelligent reranking and comprehensive web content extraction capabilities.

✨ Features

  • ๐Ÿ” Baidu Search Integration: Fast and reliable search results from Baidu
  • ๐Ÿค– AI-Powered Reranking: Uses multiple AI agents (Qwen) to intelligently rerank search results by relevance
  • ๐Ÿ“„ Web Content Extraction: Extract clean, readable text from web pages with pagination support
  • ๐ŸŽฏ Batch Processing: Extract content from multiple URLs simultaneously
  • ๐ŸŒ MCP Standard: Fully compliant with Model Context Protocol for seamless integration

🚀 Quick Start

Prerequisites

  • Python 3.10 or higher
  • uv (recommended) or pip
  • DashScope API key (for AI search features)

Installation

Using uv (Recommended)

# Clone the repository
git clone https://github.com/Vist233/Google-Search-Tool.git
cd search-mcp

# Install with uv
uv pip install -e .

Using pip

pip install -e .

Environment Setup

Create a .env file or set environment variables for AI features:

export DASHSCOPE_API_KEY="your-api-key-here"
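A missing key is the usual cause of AI features silently failing. A minimal fail-fast check, as a sketch (the helper name `require_dashscope_key` is illustrative, not part of this package):

```python
import os

def require_dashscope_key() -> str:
    """Return the DashScope API key, raising early if it is not configured."""
    key = os.environ.get("DASHSCOPE_API_KEY", "")
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; AI search features will be unavailable."
        )
    return key
```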

📖 Usage

As an MCP Server

Add to your MCP client configuration (e.g., Claude Desktop):

For macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

For Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "aiwebsearcher": {
      "command": "uvx",
      "args": [
        "aiwebsearcher"
      ]
    }
  }
}

Note: the API key is read from the DASHSCOPE_API_KEY environment variable. Set it before running:

# macOS/Linux
export DASHSCOPE_API_KEY="your-api-key-here"

# Windows (PowerShell)
$env:DASHSCOPE_API_KEY="your-api-key-here"

Standalone Testing

# Install the package
pip install aiwebsearcher

# Set API key
export DASHSCOPE_API_KEY="your-key"

# Run the server
aiwebsearcher

๐Ÿ› ๏ธ Available Tools

1. search_baidu

Execute a basic Baidu search and return structured results.

Parameters:

  • query (str): Search keyword
  • max_results (int, optional): Maximum results to return (default: 5)
  • language (str, optional): Search language (default: "zh")

Returns: JSON string with title, url, and abstract for each result.

Example:

{
  "query": "ไบบๅทฅๆ™บ่ƒฝๅ‘ๅฑ•็Žฐ็Šถ",
  "max_results": 5
}
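For reference, the result shape looks roughly like the following (a JSON array with one object per hit; the values below are placeholders, not real search results):

```json
[
  {
    "title": "Example result title",
    "url": "https://example.com/page",
    "abstract": "Short summary of the page content..."
  }
]
```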

2. AI_search_baidu

AI-enhanced search with intelligent reranking and content extraction. Takes ~3x longer but provides higher quality, ranked results with full page content.

Parameters:

  • query (str): Search keyword
  • max_results (int, optional): Initial results to fetch (default: 5, recommended 5+)
  • language (str, optional): Search language (default: "zh")

Returns: JSON string with rank, title, url, and Content (full page text) for each result.

Example:

{
  "query": "AIๅ‘ๅฑ•่ถ‹ๅŠฟ 2025",
  "max_results": 12
}
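The returned JSON looks roughly like the following (one object per reranked hit; the values are placeholders, not real results):

```json
[
  {
    "rank": 1,
    "title": "Example result title",
    "url": "https://example.com/page",
    "Content": "Full extracted page text..."
  }
]
```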

3. extractTextFromUrl

Extract clean, readable text from a single webpage.

Parameters:

  • url (str): Target webpage URL
  • follow_pagination (bool, optional): Follow rel="next" links (default: true)
  • pagination_limit (int, optional): Max pagination depth (default: 3)
  • timeout (float, optional): HTTP timeout in seconds (default: 10.0)
  • user_agent (str, optional): Custom User-Agent header
  • regular_expressions (list[str], optional): Regex patterns to filter text

Returns: Extracted text content as string.
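An example invocation (the URL is a placeholder):

```json
{
  "url": "https://example.com/article",
  "follow_pagination": true,
  "pagination_limit": 2,
  "timeout": 15.0
}
```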

4. extractTextFromUrls

Extract text from multiple webpages in batch.

Parameters: Same as extractTextFromUrl, plus:

  • urls (list[str]): List of target URLs

Returns: Combined text from all URLs, separated by double newlines.
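An example batch invocation (the URLs are placeholders):

```json
{
  "urls": [
    "https://example.com/a",
    "https://example.com/b"
  ],
  "follow_pagination": false
}
```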

๐Ÿ—๏ธ Project Structure

search-mcp/
├── searcher/
│   └── src/
│       ├── server.py              # MCP server entry point
│       ├── FetchPage/
│       │   └── fetchWeb.py        # Web content extraction
│       ├── WebSearch/
│       │   ├── baiduSearchTool.py # Baidu search implementation
│       │   └── SearchAgent.py     # AI agent definitions (legacy)
│       └── useAI2Search/
│           └── SearchAgent.py     # AI-powered search orchestration
├── tests/                         # Test files
├── pyproject.toml                 # Project configuration
├── requirements.txt               # Dependencies
└── README.md                      # This file

🔧 Development

Install Development Dependencies

uv pip install -e ".[dev]"

Run Tests

pytest

Code Formatting

# Format with black
black searcher/

# Lint with ruff
ruff check searcher/

๐Ÿ“ Configuration

MCP Client Configuration Examples

Minimal configuration:

{
  "mcpServers": {
    "search": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/search-mcp/searcher/src"
    }
  }
}

With uv for dependency isolation:

{
  "mcpServers": {
    "search": {
      "command": "uv",
      "args": ["--directory", "/path/to/search-mcp/searcher/src", "run", "python", "server.py"]
    }
  }
}

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

๐Ÿ“ฎ Contact

โš ๏ธ Disclaimer

This tool is for educational and research purposes. Please respect website terms of service and rate limits when scraping content.
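One way to respect rate limits is to enforce a minimum interval between outgoing requests. A minimal throttling sketch (the names `throttled` and `fetch` are illustrative and not part of this package; a real `fetch` would issue an HTTP request):

```python
import time

def throttled(min_interval: float):
    """Decorator that enforces a minimum interval (seconds) between calls."""
    def wrap(fn):
        last_call = [0.0]  # mutable cell so the closure can update it

        def inner(*args, **kwargs):
            wait = min_interval - (time.monotonic() - last_call[0])
            if wait > 0:
                time.sleep(wait)  # back off until the interval has elapsed
            last_call[0] = time.monotonic()
            return fn(*args, **kwargs)

        return inner
    return wrap

@throttled(0.5)
def fetch(url: str) -> str:
    # Placeholder: a real implementation would issue an HTTP request here.
    return f"fetched {url}"
```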
