Google Alerts MCP Server for fetching topic-based news and information

Google Alerts MCP Server

An MCP (Model Context Protocol) server for Google Alerts that retrieves news articles on a given topic by simulating the browser workflow. It extracts state parameters from the Google Alerts page at request time instead of hardcoding them, which reduces the chance of requests being detected and blocked.

Features

  • Dynamic State Extraction: Automatically extracts window.STATE parameters from Google Alerts pages
  • Anti-Detection Mechanism: Uses fresh tokens and session cookies for each request to avoid blocking
  • Multi-Language Support: Supports searches in Chinese, English, and other languages
  • Browser Workflow Simulation: Follows the same steps a browser takes: visit homepage → extract cookies/state → build preview URL
  • No Hardcoded Parameters: All authentication tokens and state parameters are extracted dynamically at request time
  • URL Cleaning: Optionally removes Google redirect parameters so results contain direct news links

Workflow

  1. Get Initial Cookies: Visit https://www.google.com/alerts?hl={language} to get initial cookies
  2. State Parameter Extraction: Extract window.STATE parameters from the page, including authentication tokens
  3. Build Preview URL: Use extracted parameters and search query to build preview URL
  4. Get Content: Request the preview page using the cookies and state parameters obtained in the previous steps
  5. Article Parsing: Extract article titles, URLs, snippets, and sources from HTML response
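
Step 2 above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes the page assigns a JSON-compatible literal to window.STATE inside a script tag, and the helper name extract_state and the sample HTML are hypothetical.

```python
import json
import re

# Assumption: the page contains an assignment of the form `window.STATE = <JSON>;`.
STATE_ASSIGNMENT = re.compile(r"window\.STATE\s*=\s*")


def extract_state(html: str):
    """Return the parsed window.STATE value, or None if it is missing or unparsable."""
    match = STATE_ASSIGNMENT.search(html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it (the trailing `;` and the rest of the page).
        value, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return value


# Hypothetical sample page; the real layout of window.STATE is not documented here.
sample_html = '<script>window.STATE = [null, ["com", "en", "US"], "tok123"];</script>'
state = extract_state(sample_html)
```

Returning None instead of raising keeps the caller in control of what happens when the page layout changes, which fits the "fail gracefully" behavior described elsewhere in this README.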

Installation

Using UV package manager:

# Clone the project
git clone https://github.com/ycrao/google-alerts-mcp.git
cd google-alerts-mcp

# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate  # Windows: .venv\Scripts\activate
# Or run directly
uv run python src/google_alerts_mcp/server.py

Usage

Running as MCP Server

Add to your MCP client configuration:

{
  "mcpServers": {
    "google-alerts": {
      "command": "uv",
      "args": [
        "--directory",
        "/ABSOLUTE/PATH/TO/google-alerts-mcp/src/google_alerts_mcp/",
        "run",
        "server.py"
      ],
      "env": {}
    }
  }
}

Or, to run the published package with uvx:

{
  "mcpServers": {
    "google-alerts": {
      "command": "uvx",
      "args": ["google-alerts-mcp"],
      "env": {}
    }
  }
}

Available Tools

search_google_alerts

Search Google Alerts news articles for specific topics.

Parameters:

  • query (required): Search query/topic (e.g., "silver", "bitcoin", "artificial intelligence")
  • language (optional): Language code (default: "zh-CN")
  • region (optional): Region code (default: "US")
  • clean_urls (optional): Whether to clean Google redirect parameters for direct links (default: true)

Example:

{
  "query": "bitcoin",
  "language": "en-US",
  "region": "US",
  "clean_urls": true
}
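
For reference, an MCP client normally wraps these arguments in a JSON-RPC tools/call request on your behalf. The sketch below only illustrates that envelope; the helper name build_tool_call is hypothetical, and the request id is arbitrary.

```python
import json


def build_tool_call(request_id: int, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes the search_google_alerts tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_google_alerts",
            "arguments": arguments,
        },
    }


request = build_tool_call(
    1,
    {"query": "bitcoin", "language": "en-US", "region": "US", "clean_urls": True},
)
print(json.dumps(request, indent=2))
```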

URL Cleaning Feature:

  • When clean_urls=true, automatically removes Google redirect parameters and returns direct news website links
  • When clean_urls=false, preserves original Google redirect URLs
  • Before cleaning: https://www.google.com/url?q=https://example.com/article&sa=U&ved=...
  • After cleaning: https://example.com/article
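
The cleaning step can be approximated with the standard library alone. This is a sketch under stated assumptions, not the server's implementation: the function name clean_google_url is hypothetical, and checking the url parameter in addition to q is an assumption about how Google redirect links may be shaped.

```python
from urllib.parse import parse_qs, urlsplit


def clean_google_url(url: str) -> str:
    """Return the redirect target of a Google /url link, or the input unchanged."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    is_google = host == "google.com" or host.endswith(".google.com")
    if is_google and parts.path == "/url":
        params = parse_qs(parts.query)
        # "q" matches the example above; "url" is an assumed alternative name.
        for key in ("q", "url"):
            if params.get(key):
                return params[key][0]
    # Not a recognizable redirect link: leave it alone.
    return url


print(clean_google_url("https://www.google.com/url?q=https://example.com/article&sa=U&ved=abc"))
# prints https://example.com/article
```

Links that are not Google redirects pass through untouched, so the function is safe to apply to every article URL.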

Technical Details

Dynamic Parameter Extraction

The server extracts the following parameters from window.STATE:

  • domain: Google domain (usually "com")
  • language: Language code from state
  • region: Region code from state
  • number_param: Numeric parameter (varies by language)
  • locale_format: Locale format string
  • token: Authentication token (key to avoiding detection)

Anti-Detection Features

  • Fresh token extraction for each request
  • Session cookie persistence
  • Proper browser headers
  • No hardcoded authentication parameters
  • Graceful degradation when token extraction fails

Testing

Run the test suite to verify functionality:

# Test complete MCP server functionality
python test_mcp_server.py

Test Examples

# Test search functionality
import asyncio
from google_alerts_mcp.server import GoogleAlertsClient


def print_articles(articles):
    # Print the fields returned for each article
    for article in articles:
        print(f"Title: {article.title}")
        print(f"URL: {article.url}")
        print(f"Snippet: {article.snippet}")
        print(f"Source: {article.source}")
        print("-" * 50)


async def search(query, language, **client_kwargs):
    # Create a client, run one search, and close the client even if the search fails
    client = GoogleAlertsClient(**client_kwargs)
    try:
        return await client.get_preview_content(query, language)
    finally:
        await client.close()


async def test():
    # Chinese search for "silver" (default client settings)
    print_articles(await search("白银", "zh-CN"))

    # English search (with URL cleaning enabled): direct links, no Google redirect parameters
    print_articles(await search("bitcoin", "en-US", clean_urls=True))

    # English search (preserve original URLs): links contain Google redirect parameters
    print_articles(await search("bitcoin", "en-US", clean_urls=False))


asyncio.run(test())

Important Notes

  1. Dynamic Tokens: The server relies entirely on dynamically extracted tokens and uses no hardcoded values.
  2. URL Cleaning: URL cleaning is enabled by default. Set clean_urls=false to preserve the original Google redirect URLs.
  3. Request Frequency: Avoid sending requests too frequently; leave a reasonable interval between searches.
  4. Error Handling: If token extraction fails, the request fails gracefully instead of falling back to outdated hardcoded values.
  5. Real-time State: Each search fetches fresh state parameters rather than reusing values from earlier requests, which is what the anti-detection approach depends on.
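
One way to follow the request-frequency advice in note 3 is to enforce a minimum interval between searches on the caller's side. The sketch below is illustrative only: the class MinIntervalThrottle is hypothetical and not part of this package, and a suitable interval is left to the caller.

```python
import asyncio
import time


class MinIntervalThrottle:
    """Ensure that successive calls to wait() complete at least min_interval seconds apart."""

    def __init__(self, min_interval: float):
        self._min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last_call = None  # monotonic timestamp of the previous call, if any

    async def wait(self) -> None:
        # The lock serializes concurrent callers so their intervals do not overlap.
        async with self._lock:
            if self._last_call is not None:
                elapsed = time.monotonic() - self._last_call
                remaining = self._min_interval - elapsed
                if remaining > 0:
                    await asyncio.sleep(remaining)
            self._last_call = time.monotonic()
```

Create the throttle inside a running coroutine and call await throttle.wait() before each search; the first call returns immediately and later calls pause only as long as needed.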

Dependencies

  • mcp: Model Context Protocol support
  • httpx: Async HTTP client
  • beautifulsoup4: HTML parsing
  • pydantic: Data validation and serialization

License

MIT License
