
Search MCP Server

License: MIT · Python 3.10+

A powerful Model Context Protocol (MCP) server providing AI-enhanced Baidu search with intelligent reranking and comprehensive web content extraction capabilities.

✨ Features

  • ๐Ÿ” Baidu Search Integration: Fast and reliable search results from Baidu
  • ๐Ÿค– AI-Powered Reranking: Uses multiple AI agents (Qwen) to intelligently rerank search results by relevance
  • ๐Ÿ“„ Web Content Extraction: Extract clean, readable text from web pages with pagination support
  • ๐ŸŽฏ Batch Processing: Extract content from multiple URLs simultaneously
  • ๐ŸŒ MCP Standard: Fully compliant with Model Context Protocol for seamless integration

🚀 Quick Start

Prerequisites

  • Python 3.10 or higher
  • uv (recommended) or pip
  • DashScope API key (for AI search features)

Installation

Using uv (Recommended)

# Clone the repository
git clone https://github.com/Vist233/Google-Search-Tool.git
cd search-mcp

# Install with uv
uv pip install -e .

Using pip

pip install -e .

Environment Setup

Create a .env file or set environment variables for AI features:

export DASHSCOPE_API_KEY="your-api-key-here"
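To confirm the key is actually visible to the server process before wiring it into a client, a quick check from Python can help. A minimal sketch (the helper name is ours, not part of the package):

```python
import os

def ai_features_enabled(env=os.environ):
    # The AI search tools need a DashScope key (variable name per the
    # setup step above); the plain Baidu search does not.
    return bool(env.get("DASHSCOPE_API_KEY"))

print(ai_features_enabled({"DASHSCOPE_API_KEY": "sk-test"}))
```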

📖 Usage

As an MCP Server

Add to your MCP client configuration (e.g., Claude Desktop):

For macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

For Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "search-tools": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/search-mcp/searcher/src",
        "run",
        "python",
        "server.py"
      ]
    }
  }
}

Alternatively, if installed globally:

{
  "mcpServers": {
    "search-tools": {
      "command": "python",
      "args": [
        "-m",
        "searcher.src.server"
      ],
      "cwd": "/path/to/search-mcp"
    }
  }
}

Standalone Testing

cd searcher/src
python server.py

๐Ÿ› ๏ธ Available Tools

1. search_baidu

Execute basic Baidu search and return structured results.

Parameters:

  • query (str): Search keyword
  • max_results (int, optional): Maximum results to return (default: 5)
  • language (str, optional): Search language (default: "zh")

Returns: JSON string with title, url, and abstract for each result.

Example:

{
  "query": "人工智能发展现状",
  "max_results": 5
}
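The tool returns a JSON string, so the results can be parsed directly with the standard library. A sketch with invented result values (only the field names title, url, and abstract come from the docs above):

```python
import json

# Hypothetical search_baidu response; the values are made up, the field
# names follow the documented return shape.
raw = json.dumps([
    {"title": "Result one", "url": "https://one.example", "abstract": "…"},
    {"title": "Result two", "url": "https://two.example", "abstract": "…"},
])

results = json.loads(raw)
for r in results:
    print(f"{r['title']}: {r['url']}")
```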

2. AI_search_baidu

AI-enhanced search with intelligent reranking and content extraction. Takes roughly three times longer than search_baidu, but returns higher-quality, relevance-ranked results with full page content.

Parameters:

  • query (str): Search keyword
  • max_results (int, optional): Initial results to fetch (default: 5, recommended 5+)
  • language (str, optional): Search language (default: "zh")

Returns: JSON string with rank, title, url, and Content (full page text) for each result.

Example:

{
  "query": "AI发展趋势 2025",
  "max_results": 12
}
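Because AI_search_baidu adds a rank field, the parsed results can be restored to relevance order even if the JSON array arrives unsorted. A sketch with invented values (only the field names rank, title, url, and Content come from the docs above):

```python
import json

# Hypothetical AI_search_baidu response, deliberately out of order.
raw = json.dumps([
    {"rank": 2, "title": "B", "url": "https://b.example", "Content": "…"},
    {"rank": 1, "title": "A", "url": "https://a.example", "Content": "…"},
])

# Sort by the AI-assigned relevance rank (1 = most relevant).
results = sorted(json.loads(raw), key=lambda r: r["rank"])
print([r["title"] for r in results])  # → ['A', 'B']
```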

3. extractTextFromUrl

Extract clean, readable text from a single webpage.

Parameters:

  • url (str): Target webpage URL
  • follow_pagination (bool, optional): Follow rel="next" links (default: true)
  • pagination_limit (int, optional): Max pagination depth (default: 3)
  • timeout (float, optional): HTTP timeout in seconds (default: 10.0)
  • user_agent (str, optional): Custom User-Agent header
  • regular_expressions (list[str], optional): Regex patterns to filter text

Returns: Extracted text content as string.
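The pagination-following behavior described above can be sketched with the standard library. This is not the server's implementation, just an illustration of the follow_pagination and pagination_limit parameters; the fetch callable stands in for whatever HTTP client is used:

```python
from html.parser import HTMLParser

class TextAndNext(HTMLParser):
    """Collect visible text and the rel="next" pagination link, if any."""
    def __init__(self):
        super().__init__()
        self.chunks, self.next_url = [], None
        self._skip = 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        if tag in ("link", "a") and ("rel", "next") in attrs:
            self.next_url = dict(attrs).get("href")
    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def extract_text(fetch, url, follow_pagination=True, pagination_limit=3):
    """fetch(url) -> HTML string; parameters mirror the documented ones."""
    pages = []
    for _ in range(pagination_limit if follow_pagination else 1):
        parser = TextAndNext()
        parser.feed(fetch(url))
        pages.append(" ".join(parser.chunks))
        if not (follow_pagination and parser.next_url):
            break
        url = parser.next_url
    return "\n\n".join(pages)

# Usage with a fake in-memory fetcher standing in for HTTP.
pages_html = {
    "a": '<p>Page one</p><link rel="next" href="b">',
    "b": "<p>Page two</p>",
}
print(extract_text(pages_html.get, "a"))
```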

4. extractTextFromUrls

Extract text from multiple webpages in batch.

Parameters: Same as extractTextFromUrl, plus:

  • urls (list[str]): List of target URLs

Returns: Combined text from all URLs, separated by double newlines.
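The batch contract above amounts to joining per-URL extractions with blank lines. A self-contained sketch where extract_one stands in for any single-URL extractor (the names are ours):

```python
def extract_text_from_urls(extract_one, urls):
    # Per the docs, batch results are the per-URL texts joined by
    # double newlines.
    return "\n\n".join(extract_one(u) for u in urls)

# Usage with a fake in-memory extractor standing in for real fetching.
texts = {"https://a.example": "Text from A", "https://b.example": "Text from B"}
print(extract_text_from_urls(texts.get, list(texts)))
```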

๐Ÿ—๏ธ Project Structure

search-mcp/
├── searcher/
│   └── src/
│       ├── server.py              # MCP server entry point
│       ├── FetchPage/
│       │   └── fetchWeb.py        # Web content extraction
│       ├── WebSearch/
│       │   ├── baiduSearchTool.py # Baidu search implementation
│       │   └── SearchAgent.py     # AI agent definitions (legacy)
│       └── useAI2Search/
│           └── SearchAgent.py     # AI-powered search orchestration
├── tests/                         # Test files
├── pyproject.toml                 # Project configuration
├── requirements.txt               # Dependencies
└── README.md                      # This file

🔧 Development

Install Development Dependencies

uv pip install -e ".[dev]"

Run Tests

pytest

Code Formatting

# Format with black
black searcher/

# Lint with ruff
ruff check searcher/

๐Ÿ“ Configuration

MCP Client Configuration Examples

Minimal configuration:

{
  "mcpServers": {
    "search": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/search-mcp/searcher/src"
    }
  }
}

With uv for dependency isolation:

{
  "mcpServers": {
    "search": {
      "command": "uv",
      "args": ["--directory", "/path/to/search-mcp/searcher/src", "run", "python", "server.py"]
    }
  }
}

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

📮 Contact

โš ๏ธ Disclaimer

This tool is for educational and research purposes. Please respect website terms of service and rate limits when scraping content.
