An MCP server for searching and retrieving articles from Google Scholar
Google Scholar MCP Server
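🔍 Let AI assistants search and access Google Scholar papers over the MCP protocol.
Google Scholar MCP Server is a server built on the Model Context Protocol (MCP) that gives AI assistants the ability to search academic papers on Google Scholar.
✨ Core Features
- 🔎 Paper search: keyword search with optional author and year-range filters
- 📄 Full abstracts: when searching for a single paper, the complete abstract is retrieved
- 🛡️ CAPTCHA handling: improves the scholarly library's CAPTCHA handling; when a CAPTCHA appears, a browser window opens automatically for manual verification
- 📚 BibTeX support: retrieve the BibTeX citation for a paper
- 🚀 Efficient retrieval: fetch paper metadata quickly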
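🛠️ MCP Tools
search_google_scholar
Searches for academic articles on Google Scholar.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Search keywords (paper title, topic, or keywords) |
| author | string | ❌ | Filter by author name |
| year_low | int | ❌ | Start year |
| year_high | int | ❌ | End year |
| num_results | int | ❌ | Number of results to return (default: 5) |
Returns: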
{
  "bib": {
    "title": "paper title",
    "author": "authors",
    "pub_year": "publication year",
    "venue": "journal or conference",
    "abstract": "abstract"
  },
  "pub_url": "link to the paper",
  "num_citations": "citation count",
  "citedby_url": "link to citing works",
  "eprint_url": "PDF link"
}
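To show how a client might consume this structure, here is a minimal sketch that turns one result into a one-line summary. The function name `summarize_result` and the sample data are hypothetical and not part of the package; the sketch also assumes `author` may be either a string or a list of names.

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> str:
    """Build a one-line summary from a single search result dict."""
    bib = result.get("bib", {})
    # Assumption: "author" may arrive as one string or as a list of names.
    author = bib.get("author", "Unknown author")
    if isinstance(author, list):
        author = ", ".join(author)
    title = bib.get("title", "Untitled")
    year = bib.get("pub_year", "n.d.")
    venue = bib.get("venue", "")
    citations = result.get("num_citations", 0)

    summary = f"{author} ({year}). {title}."
    if venue:
        summary += f" {venue}."
    return f"{summary} Cited by {citations}."


# Hypothetical sample shaped like the structure above (not real output).
sample = {
    "bib": {
        "title": "Example Paper",
        "author": ["A. Author", "B. Author"],
        "pub_year": "2020",
        "venue": "Example Conf",
    },
    "num_citations": 12,
}
print(summarize_result(sample))
```

Missing fields fall back to placeholders, so a partially filled result still produces a readable line.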
📚 Usage Examples
🤖 Calling from an MCP client
Keyword search:
result = await mcp.use_tool("search_google_scholar", {
    "query": "deep learning",
    "num_results": 5
})
Filtering by author:
result = await mcp.use_tool("search_google_scholar", {
    "query": "convolutional neural networks",
    "author": "Yann LeCun",
    "num_results": 3
})
Filtering by year range:
result = await mcp.use_tool("search_google_scholar", {
    "query": "transformer",
    "year_low": 2020,
    "year_high": 2024,
    "num_results": 5
})
Combined search:
result = await mcp.use_tool("search_google_scholar", {
    "query": "neural networks",
    "author": "Geoffrey Hinton",
    "year_low": 2015,
    "year_high": 2023,
    "num_results": 10
})
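The calls above differ only in which optional parameters they pass. As a rough sketch, a client could build the arguments dict with a small helper that drops unset filters and rejects obviously invalid input before the request is sent. The helper `build_search_arguments` is hypothetical; the server's own validation rules are not documented here, so the checks below are assumptions.

```python
from typing import Any, Optional


def build_search_arguments(
    query: str,
    author: Optional[str] = None,
    year_low: Optional[int] = None,
    year_high: Optional[int] = None,
    num_results: int = 5,
) -> dict[str, Any]:
    """Return an arguments dict for search_google_scholar, omitting unset filters."""
    # Assumed client-side checks; the server may enforce different rules.
    if not query.strip():
        raise ValueError("query must not be empty")
    if num_results < 1:
        raise ValueError("num_results must be at least 1")
    if year_low is not None and year_high is not None and year_low > year_high:
        raise ValueError("year_low must not exceed year_high")

    arguments: dict[str, Any] = {"query": query, "num_results": num_results}
    optional = {"author": author, "year_low": year_low, "year_high": year_high}
    # Only include filters the caller actually set.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


print(build_search_arguments("transformer", year_low=2020, year_high=2024))
```

The returned dict can be passed as the second argument of the tool call in place of a hand-written literal.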
🐍 Calling directly as a Python package
You can import and call it directly from Python or Jupyter:
from google_scholar_mcp import search_google_scholar

results = search_google_scholar(
    "attention is all you need",
    author="Vaswani",
    year_low=2017,
    year_high=2018,
    num_results=2,
)
print(results)
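When post-processing results in bulk, each entry's `bib` fields can be turned into a simple BibTeX record. The sketch below is a hand-rolled formatter, not the package's own BibTeX export. It assumes every result is a dict with the `bib` and `pub_url` fields described earlier, and the helper names and sample data are hypothetical.

```python
import re
from typing import Any


def make_cite_key(bib: dict[str, Any]) -> str:
    """Derive a key such as 'example2020a' from first author, year, and title."""
    authors = bib.get("author") or ["unknown"]
    if isinstance(authors, str):
        authors = authors.split(" and ")
    name_parts = authors[0].split()
    surname = name_parts[-1] if name_parts else "unknown"
    title_words = str(bib.get("title", "")).split()
    first_word = title_words[0] if title_words else ""
    raw = f"{surname}{bib.get('pub_year', '')}{first_word}"
    # Keep only characters that are safe in a BibTeX key.
    return re.sub(r"[^A-Za-z0-9]", "", raw).lower() or "unknown"


def to_bibtex(result: dict[str, Any]) -> str:
    """Format one search result as an @article entry, skipping empty fields."""
    bib = result.get("bib", {})
    authors = bib.get("author", "")
    if isinstance(authors, list):
        authors = " and ".join(authors)
    fields = {
        "title": bib.get("title", ""),
        "author": authors,
        "year": str(bib.get("pub_year", "")),
        "journal": bib.get("venue", ""),
        "url": result.get("pub_url", ""),
    }
    lines = [f"@article{{{make_cite_key(bib)},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items() if value]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical results list shaped like the documented return structure.
sample_results = [
    {
        "bib": {
            "title": "A Hypothetical Paper",
            "author": ["Ada Example", "Bob Sample"],
            "pub_year": "2020",
            "venue": "Journal of Examples",
        },
        "pub_url": "https://example.org/paper",
    }
]
for entry in sample_results:
    print(to_bibtex(entry))
```

Every entry is emitted as `@article` for simplicity; a real exporter would pick the entry type from the venue.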
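To fetch BibTeX in bulk or post-process the results, iterate over results.
Silencing logs
By default only CRITICAL logs are emitted, so info logs stay silent without any changes to the source. To debug, enable info logging at runtime:
from google_scholar_mcp.scholarly import set_logger
set_logger(True) # enable info logs
set_logger(False) # restore silent mode
🚀 Quick Start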
Option 1: Install from PyPI (Coming Soon)
uv tool install google_scholar_mcp
Option 2: Install from GitHub
uv tool install git+https://github.com/arrogant-R/google_scholar_mcp.git
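Option 3: Local installation
- Clone the repository (into the directory name used throughout this document):
git clone https://github.com/arrogant-R/google_scholar_mcp.git Google-Scholar-MCP-Server
cd Google-Scholar-MCP-Server
- Create a virtual environment and install dependencies:
# Using uv (recommended)
uv venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -e .
# Or using pip
pip install -r requirements.txt
- Start the server:
# Run as a module
python -m google_scholar_mcp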
⚙️ Configuring MCP Clients
Option 1: Using uv (after installing from GitHub)
{
  "mcpServers": {
    "google-scholar": {
      "command": "uv",
      "args": [
        "tool",
        "run",
        "google_scholar_mcp"
      ]
    }
  }
}
Option 2: Local development mode (uv)
{
  "mcpServers": {
    "google-scholar": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/Google-Scholar-MCP-Server",
        "run",
        "google_scholar_mcp"
      ]
    }
  }
}
Option 3: Using a local Python interpreter
{
  "mcpServers": {
    "google-scholar": {
      "command": "/path/to/python",
      "args": [
        "/path/to/Google-Scholar-MCP-Server/google_scholar_server.py"
      ]
    }
  }
}
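Client config files often already list other servers, so the entry has to be merged in rather than pasted over the whole file. The sketch below shows one way to do that in memory. The helper `add_server` and the pre-existing `other-server` entry are hypothetical, and reading or writing the actual config file is left out because its location depends on the client.

```python
import json
from typing import Any


def add_server(
    config: dict[str, Any], name: str, command: str, args: list[str]
) -> dict[str, Any]:
    """Return a copy of config with one entry added under "mcpServers"."""
    merged = dict(config)
    # Copy the existing server map so the caller's dict is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config containing an unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_server(
    existing, "google-scholar", "uv", ["tool", "run", "google_scholar_mcp"]
)
print(json.dumps(updated, indent=2))
```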
VS Code (GitHub Copilot)
Add the following to the VS Code mcp.json configuration file:
Using uv:
{
  "servers": {
    "google_scholar": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "google_scholar_mcp"
      ]
    }
  }
}
Using a local Python interpreter:
{
  "servers": {
    "google_scholar": {
      "type": "stdio",
      "command": "D:/path/to/python.exe",
      "args": [
        "D:/path/to/Google-Scholar-MCP-Server/google_scholar_server.py"
      ]
    }
  }
}
Claude Desktop
Using uv (recommended):
{
  "mcpServers": {
    "google-scholar": {
      "command": "uv",
      "args": [
        "tool",
        "run",
        "google_scholar_mcp"
      ]
    }
  }
}
Or using Python:
macOS:
{
  "mcpServers": {
    "google-scholar": {
      "command": "python",
      "args": ["-m", "google_scholar_mcp"]
    }
  }
}
Windows:
{
  "mcpServers": {
    "google-scholar": {
      "command": "C:\\Users\\YOUR\\PATH\\python.exe",
      "args": [
        "D:\\path\\to\\Google-Scholar-MCP-Server\\google_scholar_server.py"
      ]
    }
  }
}
Cursor
Add the following under Settings → Cursor Settings → MCP → Add new server:
{
  "google-scholar": {
    "command": "uv",
    "args": [
      "tool",
      "run",
      "google-scholar-mcp"
    ]
  }
}
Cline
{
  "mcpServers": {
    "google-scholar": {
      "command": "uv",
      "args": [
        "tool",
        "run",
        "google-scholar-mcp"
      ]
    }
  }
}
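Optimizations
🛡️ CAPTCHA handling
This project modifies the scholarly library to fix the problem of the program hanging when Google Scholar presents a CAPTCHA:
- Automatic detection: when a CAPTCHA is detected, a browser window opens automatically
- Manual verification: complete the verification by hand in that browser window
- Cookie sync: after verification, subsequent requests use the automatically synced cookies, which avoids triggering the CAPTCHA repeatedly
- Cookie persistence: cookies are saved to a local file and loaded automatically on the next start, so CAPTCHAs appear less often
⚠️ If a CAPTCHA appears, complete it manually in the browser window that opens; the program waits and then continues automatically.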
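The cookie persistence step can be pictured roughly as follows. This is an illustrative sketch using only the standard library, not the project's actual implementation: the function names, the JSON file format, and the name-to-value dict layout are all assumptions.

```python
import json
from pathlib import Path


def save_cookies(cookies: dict[str, str], path: Path) -> None:
    """Write cookies (name -> value) to a JSON file, creating parent folders."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cookies), encoding="utf-8")


def load_cookies(path: Path) -> dict[str, str]:
    """Load previously saved cookies; return an empty dict if none are usable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # First run or corrupted file: start without cookies.
        return {}
    return data if isinstance(data, dict) else {}
```

On startup a client would call `load_cookies` and attach the result to its HTTP session; after a successful manual verification it would call `save_cookies` with the cookies copied from the browser.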
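📄 Full abstract retrieval
For an exact search (num_results=1), the system automatically fetches the paper's complete abstract instead of a truncated snippet. This is useful when you need the details of a single paper.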
📁 Project Structure
Google-Scholar-MCP-Server/
├── src/
│   └── google_scholar_mcp/
│       ├── __init__.py          # Package entry point
│       ├── __main__.py          # Main entry point
│       ├── server.py            # MCP server implementation
│       ├── search.py            # Search logic
│       └── scholarly/           # Modified scholarly library
│           ├── _navigator.py
│           ├── _proxy_generator.py
│           ├── _scholarly.py
│           └── ...
├── requirements.txt             # Dependency list
└── pyproject.toml               # Project configuration (supports uv/pip installation)
🔧 Dependencies
- Python 3.10+
- mcp
- requests
- beautifulsoup4
- selenium
- httpx
- fake_useragent
- and others (see requirements.txt)
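🤝 Contributing
Pull requests and issues are welcome!
📄 License
This project is released under the MIT License.
⚠️ Disclaimer
This tool is intended for academic research only. Please comply with Google Scholar's terms of service and use the tool responsibly.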