
Search MCP Server

License: MIT | Python 3.10+

A powerful Model Context Protocol (MCP) server providing AI-enhanced Baidu search with intelligent reranking and comprehensive web content extraction capabilities.

✨ Features

  • 🔍 Baidu Search Integration: Fast and reliable search results from Baidu
  • 🤖 AI-Powered Reranking: Uses multiple AI agents (Qwen) to intelligently rerank search results by relevance
  • 📄 Web Content Extraction: Extract clean, readable text from web pages with pagination support
  • 🎯 Batch Processing: Extract content from multiple URLs simultaneously
  • 🌐 MCP Standard: Fully compliant with the Model Context Protocol for seamless integration

🚀 Quick Start

Prerequisites

  • Python 3.10 or higher
  • uv (recommended) or pip
  • DashScope API key (for AI search features)

Installation

Using uv (Recommended)

# Clone the repository
git clone https://github.com/Vist233/Google-Search-Tool.git
cd search-mcp

# Install with uv
uv pip install -e .

Using pip

pip install -e .

Environment Setup

Create a .env file or set environment variables for AI features:

export DASHSCOPE_API_KEY="your-api-key-here"
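The server reads the key from the environment at startup. A minimal sketch of the same check in Python (the variable name DASHSCOPE_API_KEY is the one documented above; the helper function is ours, for illustration only):

```python
import os

def get_dashscope_key() -> str:
    """Return the DashScope API key, failing fast with a clear message."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; AI search features will be unavailable."
        )
    return key
```

Failing early like this makes a missing key show up as one clear error instead of an opaque authentication failure deep inside a search call.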

📖 Usage

As an MCP Server

Add to your MCP client configuration (e.g., Claude Desktop):

For macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

For Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "aiwebsearcher": {
      "command": "uvx",
      "args": [
        "aiwebsearcher"
      ]
    }
  }
}

Note: the API key is read from the DASHSCOPE_API_KEY environment variable. Set it before running:

# macOS/Linux
export DASHSCOPE_API_KEY="your-api-key-here"

# Windows (PowerShell)
$env:DASHSCOPE_API_KEY="your-api-key-here"

Standalone Testing

# Install the package
pip install aiwebsearcher

# Set API key
export DASHSCOPE_API_KEY="your-key"

# Run the server
aiwebsearcher

🛠️ Available Tools

1. search_baidu

Execute a basic Baidu search and return structured results.

Parameters:

  • query (str): Search keyword
  • max_results (int, optional): Maximum results to return (default: 5)
  • language (str, optional): Search language (default: "zh")

Returns: JSON string with title, url, and abstract for each result.

Example:

{
  "query": "ไบบๅทฅๆ™บ่ƒฝๅ‘ๅฑ•็Žฐ็Šถ",
  "max_results": 5
}
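The tool returns its results as a JSON string, so a client can decode them with the standard library. A sketch assuming the documented title/url/abstract fields (the sample payload below is fabricated for illustration):

```python
import json

def parse_search_results(payload: str) -> list[dict]:
    """Decode the JSON string returned by search_baidu into a list of dicts."""
    results = json.loads(payload)
    # Each entry is documented to carry title, url, and abstract.
    return [
        {
            "title": r.get("title", ""),
            "url": r.get("url", ""),
            "abstract": r.get("abstract", ""),
        }
        for r in results
    ]

# Fabricated payload matching the documented shape:
sample = '[{"title": "Example", "url": "https://example.com", "abstract": "..."}]'
print(parse_search_results(sample)[0]["url"])  # prints "https://example.com"
```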

2. AI_search_baidu

AI-enhanced search with intelligent reranking and content extraction. Takes roughly 3× longer than search_baidu but returns higher-quality, relevance-ranked results with full page content.

Parameters:

  • query (str): Search keyword
  • max_results (int, optional): Initial results to fetch (default: 5, recommended 5+)
  • language (str, optional): Search language (default: "zh")

Returns: JSON string with rank, title, url, and Content (full page text) for each result.

Example:

{
  "query": "AIๅ‘ๅฑ•่ถ‹ๅŠฟ 2025",
  "max_results": 12
}
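Because each entry carries a rank field, a client can re-sort or truncate the reranked results after decoding. A sketch assuming the documented rank/title/url/Content fields (sample payload fabricated for illustration):

```python
import json

def top_ranked(payload: str, n: int = 3) -> list[dict]:
    """Return the n best entries from AI_search_baidu output, ordered by rank."""
    results = json.loads(payload)
    # Lower rank means higher relevance; sort ascending and keep the first n.
    return sorted(results, key=lambda r: r["rank"])[:n]

sample = (
    '[{"rank": 2, "title": "B", "url": "u2", "Content": "..."},'
    ' {"rank": 1, "title": "A", "url": "u1", "Content": "..."}]'
)
print([r["title"] for r in top_ranked(sample, 2)])  # prints ['A', 'B']
```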

3. extractTextFromUrl

Extract clean, readable text from a single webpage.

Parameters:

  • url (str): Target webpage URL
  • follow_pagination (bool, optional): Follow rel="next" links (default: true)
  • pagination_limit (int, optional): Max pagination depth (default: 3)
  • timeout (float, optional): HTTP timeout in seconds (default: 10.0)
  • user_agent (str, optional): Custom User-Agent header
  • regular_expressions (list[str], optional): Regex patterns to filter text

Returns: Extracted text content as string.
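The regular_expressions parameter's exact semantics are not specified above; one plausible reading is that each pattern selects matching lines from the extracted text. A sketch of that assumed behavior, for orientation only:

```python
import re

def filter_text(text: str, patterns: list[str]) -> str:
    """Keep only lines matching at least one regex pattern.

    Note: this is our assumed semantics for regular_expressions;
    the tool's actual filtering behavior may differ.
    """
    compiled = [re.compile(p) for p in patterns]
    kept = [line for line in text.splitlines()
            if any(c.search(line) for c in compiled)]
    return "\n".join(kept)

print(filter_text("foo\nbar\nbaz", ["ba."]))  # prints "bar" and "baz"
```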

4. extractTextFromUrls

Extract text from multiple webpages in batch.

Parameters: Same as extractTextFromUrl, plus:

  • urls (list[str]): List of target URLs

Returns: Combined text from all URLs, separated by double newlines.
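Since the batch tool joins per-URL text with double newlines, a client can split the combined string back into chunks. Note the caveat in the comment: the split is only reliable if individual pages contain no blank lines of their own.

```python
def split_batch_output(combined: str) -> list[str]:
    """Split combined extractTextFromUrls output on double newlines.

    Caveat: if one page's text itself contains blank lines, this naive
    split will over-segment; treat the result as approximate.
    """
    return [chunk for chunk in combined.split("\n\n") if chunk.strip()]

print(split_batch_output("page one\n\npage two"))  # prints ['page one', 'page two']
```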

๐Ÿ—๏ธ Project Structure

search-mcp/
├── searcher/
│   └── src/
│       ├── server.py              # MCP server entry point
│       ├── FetchPage/
│       │   └── fetchWeb.py        # Web content extraction
│       ├── WebSearch/
│       │   ├── baiduSearchTool.py # Baidu search implementation
│       │   └── SearchAgent.py     # AI agent definitions (legacy)
│       └── useAI2Search/
│           └── SearchAgent.py     # AI-powered search orchestration
├── tests/                         # Test files
├── pyproject.toml                 # Project configuration
├── requirements.txt               # Dependencies
└── README.md                      # This file

🔧 Development

Install Development Dependencies

uv pip install -e ".[dev]"

Run Tests

pytest

Code Formatting

# Format with black
black searcher/

# Lint with ruff
ruff check searcher/

📝 Configuration

MCP Client Configuration Examples

Minimal configuration:

{
  "mcpServers": {
    "search": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/search-mcp/searcher/src"
    }
  }
}

With uv for dependency isolation:

{
  "mcpServers": {
    "search": {
      "command": "uv",
      "args": ["--directory", "/path/to/search-mcp/searcher/src", "run", "python", "server.py"]
    }
  }
}

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.

⚠️ Disclaimer

This tool is for educational and research purposes. Please respect website terms of service and rate limits when scraping content.
