
MCP server for searching Google Scholar — papers, authors, citations, and BibTeX


google-scholar-search-mcp

An MCP (Model Context Protocol) server for searching Google Scholar, built for AI assistants and automation workflows that need papers, authors, citations, and BibTeX entries.

Table of Contents

  1. Features
  2. Installation
  3. Configuration
  4. Usage
  5. Rate Limiting
  6. Troubleshooting
  7. Contributing
  8. Support
  9. License

Features

  • Paper Search: Query Google Scholar by keyword with filtering, sorting, and pagination
  • Author Lookup: Find researcher profiles with publication lists and h-index metrics
  • Citation Tracking: Retrieve papers that cite a given work
  • Paper Details: Get full metadata, citations-per-year graphs, and public access info
  • BibTeX Export: Generate citation entries in BibTeX format
  • Bulk Search: Batch search multiple queries with automatic rate limiting
  • Rate Limiting: Built-in delays between requests to avoid being blocked
  • Proxy Support: Optional proxy configuration (free, single, or ScraperAPI)

Installation

Requirements

  • Python 3.11 or later
  • Dependencies: mcp[cli]>=1.4.0, scholarly>=1.7.11, pydantic>=2.0 (see pyproject.toml)
    • The project uses uv for dependency management

Install from PyPI

pip install google-scholar-search-mcp

Build from Source

git clone https://github.com/LWaetzig/google-scholar-search-mcp.git
cd google-scholar-search-mcp
pip install -e .

Note: This server uses the scholarly library to access Google Scholar. Respect Google's Terms of Service and use rate limiting appropriately to avoid being blocked.

Configuration

Configure the MCP server via environment variables:

Variable            Default   Description
GS_MIN_DELAY        5.0       Minimum seconds between requests
GS_MAX_DELAY        15.0      Maximum seconds between requests
GS_MAX_RETRIES      3         Number of retries on failure
GS_PROXY_TYPE       none      Proxy mode: none, free, single, scraperapi
GS_PROXY_HTTP       -         HTTP proxy URL (for single mode)
GS_PROXY_HTTPS      -         HTTPS proxy URL (for single mode)
GS_SCRAPERAPI_KEY   -         ScraperAPI key (for scraperapi mode)
GS_TIMEOUT          30        Request timeout in seconds
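
To make the defaults in the table concrete, here is a minimal sketch of how the variables could be read in Python. The names ScholarSettings and load_settings are hypothetical and are not taken from the server's source; only the variable names and default values come from the table above. The check that the minimum delay does not exceed the maximum is an added assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScholarSettings:
    """Hypothetical container for the GS_* variables (not the server's own class)."""
    min_delay: float = 5.0
    max_delay: float = 15.0
    max_retries: int = 3
    proxy_type: str = "none"
    proxy_http: Optional[str] = None
    proxy_https: Optional[str] = None
    scraperapi_key: Optional[str] = None
    timeout: float = 30.0


def load_settings(env=None) -> ScholarSettings:
    """Read GS_* values from a mapping (os.environ by default) using the documented defaults."""
    env = os.environ if env is None else env
    settings = ScholarSettings(
        min_delay=float(env.get("GS_MIN_DELAY", 5.0)),
        max_delay=float(env.get("GS_MAX_DELAY", 15.0)),
        max_retries=int(env.get("GS_MAX_RETRIES", 3)),
        proxy_type=env.get("GS_PROXY_TYPE", "none").strip().lower(),
        proxy_http=env.get("GS_PROXY_HTTP") or None,
        proxy_https=env.get("GS_PROXY_HTTPS") or None,
        scraperapi_key=env.get("GS_SCRAPERAPI_KEY") or None,
        timeout=float(env.get("GS_TIMEOUT", 30)),
    )
    # Assumption: a minimum above the maximum is a configuration mistake.
    if settings.min_delay > settings.max_delay:
        raise ValueError("GS_MIN_DELAY must not exceed GS_MAX_DELAY")
    return settings
```

Calling load_settings() with no argument reads the current process environment; passing a plain dict makes the parsing easy to check without touching real environment variables.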

Proxy Configuration Examples

No Proxy (Default)

export GS_PROXY_TYPE=none

Free Proxy

export GS_PROXY_TYPE=free

Single Proxy

export GS_PROXY_TYPE=single
export GS_PROXY_HTTP=http://proxy.example.com:8080
export GS_PROXY_HTTPS=https://proxy.example.com:8080

ScraperAPI

export GS_PROXY_TYPE=scraperapi
export GS_SCRAPERAPI_KEY=your_key_here
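
Before starting the server, it can help to check that the variables a proxy mode relies on are actually set. The sketch below is a hypothetical pre-flight check, not part of the package. It assumes that single mode needs both GS_PROXY_HTTP and GS_PROXY_HTTPS and that scraperapi mode needs GS_SCRAPERAPI_KEY, which is how the table and examples above read.

```python
import os

# Assumed requirements per proxy mode, inferred from the configuration table.
REQUIRED_BY_MODE = {
    "none": [],
    "free": [],
    "single": ["GS_PROXY_HTTP", "GS_PROXY_HTTPS"],
    "scraperapi": ["GS_SCRAPERAPI_KEY"],
}


def missing_proxy_vars(env):
    """Return the names of variables the selected proxy mode needs but env lacks."""
    mode = env.get("GS_PROXY_TYPE", "none").strip().lower()
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"Unknown GS_PROXY_TYPE: {mode!r}")
    # Empty strings count as missing, just like unset variables.
    return [name for name in REQUIRED_BY_MODE[mode] if not env.get(name)]


def check_current_environment():
    """Print a short report for the current process environment."""
    missing = missing_proxy_vars(os.environ)
    if missing:
        print("Missing proxy settings:", ", ".join(missing))
    else:
        print("Proxy settings look complete.")
```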

Usage

Detailed documentation for the individual tools is available in the project repository: docs/tool-references.md

Integration with Claude Desktop

Add the server to your Claude Desktop configuration:

Platform   Path
macOS      ~/Library/Application Support/Claude/claude_desktop_config.json
Windows    %APPDATA%\Claude\claude_desktop_config.json

Add a google-scholar entry under mcpServers. If python on your PATH is not the interpreter where the package is installed, replace the command with the absolute path to that interpreter:

{
  "mcpServers": {
    "google-scholar": {
      "command": "python",
      "args": ["-m", "google_scholar_mcp.server"],
      "env": {
        "GS_MIN_DELAY": "5.0",
        "GS_MAX_DELAY": "15.0",
        "GS_PROXY_TYPE": "none"
      }
    }
  }
}

After updating the config, restart Claude Desktop. The Google Scholar tools will appear in the MCP Tools panel.
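
If the configuration file already lists other servers, editing it by hand risks overwriting them. The following is a small sketch of merging the entry with Python's json module. The function names add_server and update_config_file are hypothetical helpers, and the entry simply mirrors the JSON shown above.

```python
import json
from pathlib import Path

# Mirrors the entry from the JSON example above.
SERVER_ENTRY = {
    "command": "python",
    "args": ["-m", "google_scholar_mcp.server"],
    "env": {
        "GS_MIN_DELAY": "5.0",
        "GS_MAX_DELAY": "15.0",
        "GS_PROXY_TYPE": "none",
    },
}


def add_server(config: dict, name: str = "google-scholar") -> dict:
    """Return a copy of config with the server entry added under mcpServers."""
    merged = dict(config)
    servers = dict(config.get("mcpServers", {}))  # keep servers that are already configured
    servers[name] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Load the config at path (or start from an empty one), add the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_server(config), indent=2) + "\n", encoding="utf-8")
```

On macOS this could be called as update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"). Keeping a backup of the file first is a sensible precaution, since the sketch does not create one.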

Integration with Other MCP Clients

Any MCP client (e.g., Cline, Continue, or custom tools) can use this server. Configure the connection as follows:

Command: python -m google_scholar_mcp.server
Transport: stdio
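
For a custom client, the same command and transport can be used from the MCP Python SDK. The sketch below starts the server over stdio and lists the tool names it reports. It assumes the mcp package and this server are installed in the environment that python resolves to; server_environment and list_tool_names are illustrative helper names, and no particular tool names are assumed.

```python
import asyncio
import os


def server_environment(overrides=None):
    """Copy the current environment and apply optional GS_* overrides for the server process."""
    env = dict(os.environ)
    env.update(overrides or {})
    return env


async def list_tool_names(overrides=None):
    """Start the server over stdio and return the names of the tools it advertises."""
    # Imported here so the helper above stays usable where the mcp package is absent.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command="python",
        args=["-m", "google_scholar_mcp.server"],
        env=server_environment(overrides),
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


def main():
    # Example: run with a longer minimum delay for this session only.
    print(asyncio.run(list_tool_names({"GS_MIN_DELAY": "10.0"})))
```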

Rate Limiting

The server automatically enforces rate limiting between requests to avoid overloading Google Scholar's servers:

  • Min Delay (default 5s): Minimum wait between consecutive requests
  • Max Delay (default 15s): Maximum wait (randomized to avoid patterns)
  • Max Retries (default 3): Retry failed requests up to this many times

These settings help prevent blocks by Google Scholar. Adjust them via environment variables if needed, for example to slow requests down before a large batch job:

export GS_MIN_DELAY=10.0
export GS_MAX_DELAY=20.0
export GS_MAX_RETRIES=5
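
The idea of a randomized wait between a minimum and a maximum can be sketched in a few lines. This is an illustration of the technique, not the server's actual implementation: the class name RateLimiter is hypothetical, and drawing the wait uniformly between the two bounds is an assumption about what "randomized" means here.

```python
import random
import time


class RateLimiter:
    """Illustrative limiter: spaces calls by a random delay between min_delay and max_delay."""

    def __init__(self, min_delay=5.0, max_delay=15.0, clock=time.monotonic, sleep=time.sleep):
        if min_delay > max_delay:
            raise ValueError("min_delay must not exceed max_delay")
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last = None        # time of the previous request, if any

    def wait(self):
        """Sleep until a randomly chosen delay has passed since the last call; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            target = random.uniform(self.min_delay, self.max_delay)
            elapsed = self._clock() - self._last
            slept = max(0.0, target - elapsed)
            if slept > 0:
                self._sleep(slept)
        self._last = self._clock()
        return slept
```

Calling wait() before each request means the first request goes out immediately, while every later one is held back until the randomly drawn gap has passed. Time already spent on other work counts toward the gap.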

⚠️ IP Blocking Warning

If you exceed Google Scholar's rate limits despite the rate limiter:

  • Your IP may be temporarily blocked (usually 24-48 hours)
  • All requests will fail with connection errors or 429 responses
  • Requests from a blocked IP keep failing, even through otherwise valid proxies in the same IP range
  • Repeated violations may trigger permanent blocks or require CAPTCHA solving

Recommended Practices:

  1. Never decrease delays below 5 seconds — the defaults are tuned for reliability
  2. Use the bulk_search tool instead of rapid sequential searches — it includes built-in delays
  3. Add extra buffer during bulk operations — consider setting GS_MIN_DELAY=10.0 for large jobs
  4. Use a proxy service (free proxy or ScraperAPI) to distribute requests across multiple IPs
  5. Monitor for 429 errors — if you see them, increase delays immediately and wait before retrying
  6. Spread requests over time — don't run 100 queries in 5 minutes, even with delays

Recovery from IP Blocks

If your IP gets blocked:

  • Wait 24-48 hours for the temporary block to expire
  • Use a proxy — enable GS_PROXY_TYPE=free or scraperapi to route through different IPs
  • Change your network — use a different WiFi/ISP temporarily if possible
  • Contact support — for persistent blocks, escalate to Google Scholar support

Choosing Appropriate Delays

Scenario          GS_MIN_DELAY   GS_MAX_DELAY   Notes
Single searches   5.0            15.0           Default; safe for occasional queries
Bulk operations   10.0           20.0           Use for batch jobs; prevents rapid-fire requests
Heavy load        15.0           30.0           Use with proxy for large-scale research
Aggressive ⚠️      <5.0           <10.0          Not recommended; high risk of IP blocking
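
The table can also be expressed as a small lookup, which is convenient when a script launches the server with different settings per job. The preset names and both helper functions below are hypothetical. Treating a configuration as aggressive when either value falls below the thresholds in the last row is one interpretation of the table.

```python
# Delay presets copied from the table above: (GS_MIN_DELAY, GS_MAX_DELAY).
DELAY_PRESETS = {
    "single": (5.0, 15.0),
    "bulk": (10.0, 20.0),
    "heavy": (15.0, 30.0),
}


def delay_env(scenario):
    """Return the environment variables for a named scenario as strings."""
    min_delay, max_delay = DELAY_PRESETS[scenario]
    return {"GS_MIN_DELAY": str(min_delay), "GS_MAX_DELAY": str(max_delay)}


def is_aggressive(min_delay, max_delay):
    """Flag settings that fall into the 'Aggressive' row (interpreted as either value too low)."""
    return min_delay < 5.0 or max_delay < 10.0
```

The dictionary returned by delay_env("bulk") can be merged into the env block of a client configuration or passed to a subprocess that starts the server.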

Troubleshooting

"Error: 429 Too Many Requests"

You've hit Google Scholar's rate limit. Solutions:

  1. Increase delays: Set higher GS_MIN_DELAY and GS_MAX_DELAY
  2. Use a proxy: Set GS_PROXY_TYPE=free, or set GS_PROXY_TYPE=scraperapi together with GS_SCRAPERAPI_KEY
  3. Wait and retry: Google Scholar may be temporarily blocking; try again later
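
In scripts that drive searches themselves, "wait and retry" is often implemented as exponential backoff. The sketch below shows that pattern in isolation. RateLimitedError is a hypothetical stand-in for whatever exception a 429 response surfaces as in your client, and the 30-second base wait is an arbitrary starting point rather than a value from this project.

```python
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for the error raised when a 429 response is received."""


def call_with_backoff(func, max_retries=3, base_wait=30.0, sleep=time.sleep):
    """Call func(); on RateLimitedError wait base_wait, then double the wait, up to max_retries retries."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitedError:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller decide what to do
            sleep(base_wait * 2 ** attempt)
```

With the defaults, a call that keeps failing waits 30, 60, and 120 seconds before giving up. For a block that lasts hours, stopping the job and resuming later is more useful than extending the retry loop.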

"No results found"

  • Check your query syntax (Google Scholar supports advanced search operators)
  • Ensure the author/paper name is spelled correctly
  • Try a simpler query with fewer keywords

"Connection timeout"

  • Increase GS_TIMEOUT if your network is slow
  • Check your internet connection
  • Verify proxy settings if using a proxy

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes with clear messages
  4. Push to your fork
  5. Open a pull request

Support

For issues, questions, or feature requests, please open an issue on GitHub.

License

See the LICENSE file.
