AI-powered Zotero research assistant — 29 MCP tools for literature search, reading analysis, citation management, and review writing.

Zotero Research Assistant

Preface / 写在前面

This project was built to help graduate students and researchers — especially those without a computer science background — leverage AI-enhanced Zotero for more efficient academic workflows. The documentation is deliberately detailed and step-by-step. Cherry Studio was chosen as the primary interaction interface because it provides a user-friendly GUI that doesn't require any terminal expertise. We believe powerful research tools should be accessible to everyone, not just developers.

If you have no programming experience, go directly to docs/cherry-studio-setup.md and follow the instructions step by step. Try to complete it independently — if you get stuck at any point, paste the error message to any AI chatbot (ChatGPT, DeepSeek, Kimi, etc.) and ask for help. Consider this your first step into the world of programming and AI tools. It's easier than you think.

Turn your Zotero library into an AI-powered research engine.

Search by meaning, discover related papers across 200M+ works, get personalized reading recommendations, and manage your entire academic workflow — all through natural language.

Works with Cursor, Claude Desktop, Cherry Studio, Trae, OpenAI Codex CLI, and any MCP-compatible client.

Highlights


29 MCP Tools	One intent per tool, so the LLM has an unambiguous choice for each request
Hybrid RAG Search	Keyword + semantic (bge-m3, 100+ languages) + cross-encoder reranking
Multi-Source Discovery	OpenAlex + CrossRef + Semantic Scholar in parallel, Three-Index Verification to prevent fabricated citations
Citation Network Expansion	Corpus-First strategy + forward/backward citations + OpenAlex Related Works
Anti-Hallucination	Zero-fabrication policy with `[MATERIAL GAP]` structural tags; every paper has a verifiable source link
Personalized Recommendations	Learns from your reading activity and annotations to suggest what to read next
Literature Review Generator	Select papers → extract evidence with citations → AI synthesizes thematic review
Smart Tag Suggestions	Auto-analyze metadata to recommend methodology/domain/data tags (confirm before apply)
Argument Finder	Find supporting & opposing evidence for your thesis from your library
CNKI Integration	Optional Chinese literature search with journal-level tags (CSSCI/PKU Core/CSCD)
OA PDF Waterfall	arXiv → Unpaywall → OpenAlex → S2 → CORE → PMC automatic full-text retrieval
Write Safety	All destructive operations require explicit user approval (dry-run by default)

Table of Contents

Features
Requirements
Quick Start
Client Setup
Example Workflows
MCP Tools (29)
Configuration
CNKI Setup (Optional)
Updating
Troubleshooting
Architecture
Development
Acknowledgments
Disclaimer / 免责声明
License

Features

Local Library Intelligence

Hybrid search — Zotero keyword search + ChromaDB semantic search, merged with Reciprocal Rank Fusion; fallback to Zotero full-text index
Filter-only search — list papers by year, tags, or collection with an empty query
Cross-encoder reranking — optional ms-marco-MiniLM-L-6-v2 for higher precision
Multilingual — BAAI/bge-m3 embedding (1024-dim, 100+ languages including Chinese and English)
Page-level traceability — retrieved passages include exact PDF page numbers
Full-text & outline — read complete paper text or PDF table of contents
Incremental index sync — version-based diff; auto-sync on MCP startup
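
As a rough illustration of the hybrid merge described above, the sketch below applies Reciprocal Rank Fusion to two ranked lists of item keys. The function name, the sample keys, and the constant `k=60` (a commonly used default for RRF) are assumptions made for illustration; the project's actual merge code may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of item keys into one fused ranking.

    Each list contributes 1 / (k + rank) to an item's score, so items
    that rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, key in enumerate(ranking, start=1):
            scores[key] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by key for determinism.
    return sorted(scores, key=lambda key: (-scores[key], key))


keyword_hits = ["A", "B", "C"]    # hypothetical Zotero keyword results
semantic_hits = ["B", "D", "A"]   # hypothetical ChromaDB semantic results
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['B', 'A', 'D', 'C']
```

Items found by both retrievers ("A" and "B") outrank items found by only one, which is the main reason to fuse ranks rather than raw scores that live on different scales.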

Online Literature Discovery

Multi-source search — queries OpenAlex, CrossRef, and Semantic Scholar in parallel with publisher-diverse ranking
Corpus-First strategy — when a paper's reference list is available, the system expands citation networks from those known references as the PRIMARY search strategy, yielding the most relevant results
Discipline filtering — optional fields_of_study parameter constrains results to relevant academic fields (Business, Economics, Sociology, etc.), preventing cross-domain noise
Related paper discovery — provide a paper's title/abstract/keywords → automatically generates tiered pairwise queries → searches all sources → post-filters irrelevant results → returns deduplicated hits in a single call
Three-Index Verification — every result with a DOI is cross-checked against CrossRef, OpenAlex, and Semantic Scholar; papers not findable in ANY index are filtered out to prevent fabricated citations
Source verification — every returned paper includes a verifiable link (DOI URL, Semantic Scholar URL, or CNKI link) so users can independently check authenticity
Anti-hallucination guardrails — structural [MATERIAL GAP] tags in tool outputs when search returns zero results; the AI is instructed to never fabricate citations and must report gaps honestly
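
The verification and gap-tagging ideas above can be sketched as follows. This is a minimal illustration under stated assumptions: the lookup callables stand in for real CrossRef, OpenAlex, and Semantic Scholar queries, the function names are hypothetical, and letting hits without a DOI pass through unchecked is an assumption rather than documented behavior.

```python
MATERIAL_GAP = "[MATERIAL GAP]"


def verify_hits(hits, indexes):
    """Keep only hits whose DOI is found in at least one index.

    `indexes` maps an index name to a callable that returns True when
    the DOI is known to that index (stubbed here, no network access).
    """
    verified = []
    for hit in hits:
        doi = hit.get("doi")
        if doi is None:
            # Assumption: hits without a DOI skip the cross-check.
            verified.append(hit)
            continue
        found_in = [name for name, has_doi in indexes.items() if has_doi(doi)]
        if found_in:
            verified.append({**hit, "verified_by": found_in})
    return verified


def render(hits):
    """Return titles, or a structural gap tag when nothing survived."""
    if not hits:
        return f"{MATERIAL_GAP} No verifiable papers were found for this query."
    return "\n".join(hit["title"] for hit in hits)


# Stub indexes for the example; real lookups would call the three APIs.
indexes = {
    "crossref": lambda doi: doi == "10.1/real",
    "openalex": lambda doi: False,
    "semantic_scholar": lambda doi: doi == "10.1/real",
}
hits = [
    {"title": "Real paper", "doi": "10.1/real"},
    {"title": "Fabricated paper", "doi": "10.9/fake"},
]
print(render(verify_hits(hits, indexes)))
print(render([]))
```

Returning an explicit tag instead of an empty string gives the model something concrete to report, which makes it harder to paper over an empty result with invented references.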

CNKI (Chinese Literature)

CNKI integration — optional Chinese journal search via browser automation (disabled by default, enabled on demand)
Journal-level tags — search results include indexing status badges (CSSCI, PKU Core, CSCD, SCI, EI)
Direct Zotero import — export papers from CNKI to Zotero without manual DOI lookup
Paper detail extraction — full metadata (abstract, keywords, DOI, affiliations) from CNKI detail pages
Smart pagination — AI proactively fetches more results when thorough coverage is needed

Reading Insight & Recommendations

Reading status detection — heuristic classification (deep_read / browsed / unread) based on annotation count, notes, and PDF open history (Zotero 7 reader saves reading position, updating attachment timestamps)
Personalized recommendations — identifies your most-engaged papers → queries OpenAlex Related Works + S2 Recommendations in parallel → deduplicates, excludes already-in-library → ranks by cross-seed frequency
Focus topic extraction — surfaces your active research themes from recent reading tags
Literature review generation — select multiple papers → extract relevant passages with page-level citations → structured output for AI to synthesize into a thematic review
Smart tag suggestions — analyzes title/abstract to recommend methodology, domain, and data-type tags; matches against existing library tags; suggest-only (never auto-applies)
Argument finder — given a thesis/claim, searches library for evidence grouped by stance (support/oppose/neutral); heuristic pre-classification with textual signals; designed for writing Discussion sections
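
To make "ranks by cross-seed frequency" concrete, here is a small sketch. It assumes each seed paper maps to a list of related DOIs already fetched from the online sources; the function name and the sample data are hypothetical.

```python
from collections import Counter


def rank_by_cross_seed_frequency(related_by_seed, library_dois):
    """Rank candidate papers by how many seed papers point to them."""
    counts = Counter()
    for related in related_by_seed.values():
        # set() so a paper counts once per seed, even if two sources
        # returned it for the same seed.
        for doi in set(related):
            if doi not in library_dois:  # exclude papers already owned
                counts[doi] += 1
    # Most frequently suggested first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


related_by_seed = {
    "seedA": ["d1", "d2", "d2"],
    "seedB": ["d2", "d3"],
    "seedC": ["d2", "d1", "own"],
}
print(rank_by_cross_seed_frequency(related_by_seed, library_dois={"own"}))
# [('d2', 3), ('d1', 2), ('d3', 1)]
```

A paper suggested by several of your most-engaged papers is more likely to sit at the center of your current interests than one suggested by a single seed.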

Library Management

Add papers — DOI, arXiv, ISBN, BibTeX, or publisher URL (ScienceDirect, Springer, Wiley, …)
Open-access PDF waterfall — arXiv → Unpaywall → OpenAlex → Semantic Scholar → CORE → PMC
Duplicate merge — find by DOI/title, merge with dry-run preview
Annotations — search highlights across the library; create highlights on PDFs
Write safety — all write/delete operations preview first; requires explicit user approval
Hybrid Zotero mode — fast local reads + web API writes (when API key is set)
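
The open-access waterfall listed above follows a simple "first source that answers wins" pattern. The sketch below shows that pattern only; the resolver callables are hypothetical stand-ins for the real arXiv, Unpaywall, OpenAlex, Semantic Scholar, CORE, and PMC lookups, and the URL is a placeholder.

```python
def first_open_access_pdf(doi, resolvers):
    """Try each (name, resolver) pair in order; return the first PDF URL.

    A resolver returns a URL string or None. A resolver that raises is
    treated as a miss so one failing source does not stop the waterfall.
    """
    for name, resolve in resolvers:
        try:
            url = resolve(doi)
        except Exception:
            url = None
        if url:
            return name, url
    return None


def unpaywall_stub(doi):
    # Simulates a source that is temporarily unreachable.
    raise TimeoutError("source unavailable")


resolvers = [
    ("arXiv", lambda doi: None),
    ("Unpaywall", unpaywall_stub),
    ("OpenAlex", lambda doi: "https://example.org/paper.pdf"),
    ("Semantic Scholar", lambda doi: "https://example.org/other.pdf"),
]
print(first_open_access_pdf("10.1016/j.cities.2025.105902", resolvers))
# ('OpenAlex', 'https://example.org/paper.pdf')
```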

Requirements

Component	Version / Note
Python	3.11 – 3.13
Zotero	7+ desktop app, running with local API enabled
MCP client	Cursor, Claude Desktop, Cherry Studio, Trae, Codex CLI, etc.
LLM	Any model with tool/function calling (Claude, GPT-4o, DeepSeek, Qwen, Gemini, …)
Disk	~2.5 GB for embedding model (`bge-m3`) on first run
Git	Only needed for Option B (clone from source)

Path tip: Install in a short path without spaces or non-ASCII characters, e.g. ~/zotero-research-assistant (macOS/Linux) or C:\Dev\zotero-research-assistant (Windows).

Quick Start

1. Install

Option A: pip install (recommended for most users)

pip install zotero-research-assistant

With CNKI (Chinese literature) support:

pip install "zotero-research-assistant[cnki]"

After installing, run zra-mcp to start the MCP server. Skip to Step 2.

Option B: Clone from source (for development or customization)

git clone https://github.com/qiobn/zotero-research-assistant.git
cd zotero-research-assistant

Install uv (fast Python package manager) if not already present:

# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
irm https://astral.sh/uv/install.ps1 | iex

Create a virtual environment and install:

uv venv .venv --python 3.13      # use 3.12 or 3.11 if unavailable
uv pip install -e .

Verify installation:

# macOS / Linux
source .venv/bin/activate
python -c "from project_a_mcp.server import mcp; print('OK')"

# Windows (PowerShell)
.venv\Scripts\activate
python -c "from project_a_mcp.server import mcp; print('OK')"

The first run downloads the embedding model (~2.3 GB). If the download is slow, set HF_ENDPOINT=https://hf-mirror.com and retry.

2. Configure Zotero

Enable local API (required):

Open Zotero → Edit → Settings → Advanced
Check "Allow other applications on this computer to communicate with Zotero"
Verify: http://localhost:23119/api/ should return JSON
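
If you prefer to script the verification step, a minimal sketch could look like this. The function name and the injectable `fetch` parameter are assumptions added for illustration (the parameter makes the check testable without a running Zotero); only the URL comes from the step above.

```python
import json
from urllib.request import urlopen


def zotero_local_api_ok(url="http://localhost:23119/api/", fetch=None):
    """Return True if the endpoint answers with parseable JSON."""
    if fetch is None:
        # Default: a real HTTP request with a short timeout.
        fetch = lambda target: urlopen(target, timeout=5).read()
    try:
        json.loads(fetch(url))
    except (OSError, ValueError):
        # OSError covers refused connections and HTTP errors;
        # ValueError covers responses that are not valid JSON.
        return False
    return True


# Offline demonstration with stubbed responses:
print(zotero_local_api_ok(fetch=lambda target: b"{}"))       # True
print(zotero_local_api_ok(fetch=lambda target: b"<html>"))   # False
```

Calling `zotero_local_api_ok()` with no arguments performs the real request; a `False` result usually means Zotero is not running or the local API checkbox is not enabled.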

Set environment variables:

If you used Option B (clone), create a .env file in the project folder:

cp .env.example .env

If you used Option A (pip install), set environment variables in your shell or create a .env file in your working directory.

Minimum for read-only mode (search, read, cite):

ZOTERO_LOCAL=true

For write operations (add papers, notes, tags, collections), also set your Zotero API key:

ZOTERO_LOCAL=true
ZOTERO_LIBRARY_ID=12345678
ZOTERO_API_KEY=your_api_key_here
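
To see which mode a given `.env` file would put you in, the following sketch parses the file contents and applies the rule described above. The parser and function names are hypothetical, and treating a library ID of `0` as "not set" is an assumption based on the default shown in the Configuration table.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def zotero_mode(env):
    """Return 'read-write' only when both write credentials are present."""
    has_key = bool(env.get("ZOTERO_API_KEY"))
    # Assumption: "0" is the placeholder default and counts as unset.
    has_library_id = env.get("ZOTERO_LIBRARY_ID", "0") not in ("", "0")
    return "read-write" if has_key and has_library_id else "read-only"


read_only = parse_env("ZOTERO_LOCAL=true\n")
hybrid = parse_env(
    "ZOTERO_LOCAL=true\nZOTERO_LIBRARY_ID=12345678\nZOTERO_API_KEY=abc\n"
)
print(zotero_mode(read_only))  # read-only
print(zotero_mode(hybrid))     # read-write
```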

3. Build the vector index (first time)

The MCP server auto-syncs on startup (ZRA_AUTO_SYNC=true by default). On first launch it will parse all your PDFs and build the semantic index automatically.

If you cloned from source and want to build the index manually:

python scripts/index_library.py

The index is stored in .chroma_db/ (local only). Typical time: ~3–5 min for 100 papers, ~10–15 min for 500 papers.

4. Connect your AI client

See the Client Setup section below for your specific tool.

5. Test the connection

Start Zotero desktop
Open a new chat in your MCP client
Ask: "List all collections in my Zotero library"

If you see your collections, setup is complete.

Client Setup

All clients use stdio transport to connect to the MCP server.

If you installed via pip (Option A): use zra-mcp as the command directly — no path configuration needed.

If you cloned from source (Option B): you need the full path to the Python binary:

Value	macOS / Linux	Windows
Python binary	`<project>/.venv/bin/python`	`<project>\.venv\Scripts\python.exe`
Working directory	`<project>` (full path)	`<project>` (full path)

Replace <project> with your clone path (e.g. /Users/you/zotero-research-assistant or C:\Dev\zotero-research-assistant).

Quick path helper (run inside the project folder):

# macOS / Linux
echo "$(pwd)/.venv/bin/python"

# Windows (PowerShell)
echo "$PWD\.venv\Scripts\python.exe"

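The Cursor examples below show both the pip and the source configuration. The other client examples show the source configuration only; if you installed via pip, replace the "command" value with zra-mcp and drop "args" and "cwd", as in the Cursor pip example.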

Cursor

Settings → MCP → Add new MCP server, or add to .cursor/mcp.json:

pip install users:

{
  "mcpServers": {
    "zra-mcp": {
      "command": "zra-mcp"
    }
  }
}

Source install users (macOS/Linux):

{
  "mcpServers": {
    "zra-mcp": {
      "command": "/Users/you/zotero-research-assistant/.venv/bin/python",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "/Users/you/zotero-research-assistant"
    }
  }
}

Source install users (Windows):

{
  "mcpServers": {
    "zra-mcp": {
      "command": "C:\\Dev\\zotero-research-assistant\\.venv\\Scripts\\python.exe",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "C:\\Dev\\zotero-research-assistant"
    }
  }
}

Restart Cursor after adding the config. The MCP tools will appear in Agent mode.

Claude Desktop

Edit claude_desktop_config.json:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "zra-mcp": {
      "command": "/Users/you/zotero-research-assistant/.venv/bin/python",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "/Users/you/zotero-research-assistant"
    }
  }
}

Restart Claude Desktop. You should see the MCP tools icon (hammer) in the chat input area.

Cherry Studio

Settings → MCP Servers → Add → JSON mode:

{
  "mcpServers": {
    "zra-mcp": {
      "name": "zra-mcp",
      "type": "stdio",
      "isActive": true,
      "command": "/Users/you/zotero-research-assistant/.venv/bin/python",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "/Users/you/zotero-research-assistant"
    }
  }
}

Windows:

{
  "mcpServers": {
    "zra-mcp": {
      "name": "zra-mcp",
      "type": "stdio",
      "isActive": true,
      "command": "C:\\Dev\\zotero-research-assistant\\.venv\\Scripts\\python.exe",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "C:\\Dev\\zotero-research-assistant"
    }
  }
}

Configure an LLM under Settings → Model Services (DeepSeek, GPT-4o, Claude, Qwen, etc.). Enable the MCP toggle in the chat interface to activate tools.

For a detailed step-by-step guide (including screenshots), see docs/cherry-studio-setup.md.

Trae

Trae supports MCP servers via its settings panel.

Settings → MCP → Add Server:

Field	Value
Name	`zra-mcp`
Transport	stdio
Command	Full path to `.venv/bin/python` (or `.venv\Scripts\python.exe` on Windows)
Arguments	`-m project_a_mcp.server`
Working Directory	Full path to the project root

Or add to your Trae MCP configuration file (.trae/mcp.json in your workspace or global config):

{
  "mcpServers": {
    "zra-mcp": {
      "command": "/Users/you/zotero-research-assistant/.venv/bin/python",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "/Users/you/zotero-research-assistant"
    }
  }
}

Restart Trae after configuration. MCP tools become available in AI chat (Agent mode).

OpenAI Codex CLI

Codex CLI supports MCP servers. Add to your ~/.codex/config.json (or project-level .codex/config.json):

{
  "mcpServers": {
    "zra-mcp": {
      "command": "/Users/you/zotero-research-assistant/.venv/bin/python",
      "args": ["-m", "project_a_mcp.server"],
      "cwd": "/Users/you/zotero-research-assistant"
    }
  }
}

Then run Codex normally — it will discover and use the tools automatically:

codex "Find papers about urban accessibility in my Zotero library"

Other MCP Clients

Any client that supports the MCP stdio transport can connect. The universal config is:

Parameter	Value
Transport	`stdio`
Command	`<project>/.venv/bin/python`
Arguments	`["-m", "project_a_mcp.server"]`
Working directory	`<project>`
Environment	Reads from `<project>/.env` automatically
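
Rather than typing paths by hand, the universal configuration can be generated from the clone path. The helper below is a hypothetical convenience sketch, not part of the package; it only assembles the same `command`, `args`, and `cwd` values shown in the tables above.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def mcp_config(project, windows=False):
    """Build the mcpServers entry for a source install at `project`."""
    if windows:
        root = PureWindowsPath(project)
        python = root / ".venv" / "Scripts" / "python.exe"
    else:
        root = PurePosixPath(project)
        python = root / ".venv" / "bin" / "python"
    return {
        "mcpServers": {
            "zra-mcp": {
                "command": str(python),
                "args": ["-m", "project_a_mcp.server"],
                "cwd": str(root),
            }
        }
    }


# json.dumps escapes Windows backslashes, so the output can be pasted
# into a client's JSON config as-is.
print(json.dumps(mcp_config("/Users/you/zotero-research-assistant"), indent=2))
print(json.dumps(mcp_config(r"C:\Dev\zotero-research-assistant", windows=True), indent=2))
```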

Example Workflows

Research Discovery

User: Find papers about 15-minute cities published after 2020
  → search_papers (local library)

User: Search online for recent studies on urban green infrastructure
  → search_online_literature (OpenAlex + CrossRef + S2)

User: I'm reading this paper [title, keywords]. Find me related literature.
  → find_related_literature (5 parallel strategies, verified results)

User: Show me who cites this paper and what it references
  → expand_citation_network (forward + backward citations)

Reading & Analysis

User: What does this paper say about the research methodology?
  → get_paper_content (semantic search within paper)

User: Summarize these 5 papers into a literature review about "method evolution"
  → generate_review_note → AI synthesizes thematic review with citations

User: My thesis is "public services are unevenly distributed" — find evidence
  → find_arguments (returns supporting + opposing passages with stance labels)

User: What should I read next?
  → recommend_papers (based on your annotation activity)

Writing & Citing

User: I'm writing: "Walkability is a key indicator of urban quality..." — suggest citations
  → suggest_citations (matches your draft to library evidence)

User: Export BibTeX for the top 3 results
  → export_bibliography

User: Add this paper: 10.1016/j.cities.2025.105902
  → add_paper (preview → confirm → auto-downloads OA PDF)

Library Organization

User: Analyze these papers and suggest tags
  → suggest_tags (methodology/domain/data classification, suggest-only)

User: Tag these papers as "core reading"
  → edit_tags (preview → confirm)

User: Which papers have I actually read? Which are unread?
  → reading_status (heuristic: annotations, notes, PDF open history)

Write safety: every write operation (adding papers, notes, or tags; merging duplicates) previews its changes first, and the assistant asks for explicit confirmation before executing.
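
The preview-then-confirm behavior can be pictured as a function whose `dry_run` flag defaults to `True`. The sketch below is a hypothetical stand-in written for illustration; it is not the project's `edit_tags` tool, and the item structure is invented.

```python
def edit_tags_sketch(item, add, dry_run=True):
    """Preview a tag change; apply it only when dry_run is False."""
    new_tags = sorted(set(item["tags"]) | set(add))
    preview = {
        "item_key": item["key"],
        "before": sorted(item["tags"]),
        "after": new_tags,
    }
    if dry_run:
        # Nothing is modified; the caller shows the preview to the user.
        return {"applied": False, "preview": preview}
    item["tags"] = new_tags
    return {"applied": True, "preview": preview}


paper = {"key": "ABC123", "tags": ["gis"]}
print(edit_tags_sketch(paper, ["core reading"]))           # preview only
print(edit_tags_sketch(paper, ["core reading"], dry_run=False))
print(paper["tags"])  # ['core reading', 'gis']
```

Because the safe path is the default, a model that forgets to pass the flag produces a preview rather than a write.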

MCP Tools (29)

Category	Tools
Discover	`search_papers`, `search_online_literature`, `search_cnki_literature`, `find_related_literature`, `expand_citation_network`, `cnki_paper_detail`, `cnki_navigate_pages`, `find_similar_papers`, `browse_library`, `find_duplicates`, `merge_duplicates`
Read	`get_paper`, `get_paper_content`, `search_annotations`, `create_annotation`
Write	`suggest_citations`, `export_bibliography`, `add_paper`, `cnki_add_to_zotero`
Manage	`add_note`, `edit_tags`, `manage_collections`
Insight	`reading_status`, `recommend_papers`, `generate_review_note`, `generate_reading_note`, `suggest_tags`, `find_arguments`
Admin	`sync_index`

Tool details

Discover

search_papers — Primary search in your local library. Hybrid keyword + semantic. Use query="" with year_from / tags for filter-only listing.
search_online_literature — Online discovery (English/international: OpenAlex, CrossRef, Semantic Scholar). Supports fields_of_study for discipline filtering. Default for online search unless user explicitly requests Chinese literature.
search_cnki_literature — CNKI Chinese journal search (optional module, disabled by default). Only triggered when user explicitly requests Chinese papers / 中文文献 / CNKI. Returns journal-level tags (CSSCI, PKU Core, etc.).
find_related_literature — Multi-strategy related paper search. Supports Corpus-First mode (reference_dois parameter), keyword search, citation network expansion, and Semantic Scholar recommendations — all in parallel. Provide a paper's metadata → get deduplicated, Three-Index-Verified results in one call.
expand_citation_network — Find papers via citation relationships (forward & backward citations via OpenAlex). Accepts multiple DOIs for multi-seed expansion.
cnki_paper_detail — Full metadata (abstract, keywords, DOI, affiliations) from a CNKI paper page.
cnki_navigate_pages — Pagination & re-sorting for CNKI results. Used proactively when user needs many papers or deeper search.
find_similar_papers — Similar papers to a known item (by item_key).
browse_library — Collections, tags, recent items.
find_duplicates / merge_duplicates — Detect and merge duplicates (dry-run by default).

Read

get_paper — Metadata + abstract.
get_paper_content — Modes: semantic query, page range, fulltext, outline; optional annotations overlay.
search_annotations — Search highlights/comments across all papers.
create_annotation — Highlight text on a PDF (dry-run by default).

Write & Manage

suggest_citations — Match your draft text to library evidence.
export_bibliography — BibTeX or formatted citations.
add_paper — Import by DOI / arXiv / ISBN / BibTeX / URL (dry-run by default).
cnki_add_to_zotero — Import CNKI papers directly (no DOI needed). Uses CNKI export API + Zotero Connector.
add_note, edit_tags, manage_collections — Library organization (dry-run by default).

Insight

reading_status — Analyze reading progress. Classifies papers as deep_read (≥3 annotations or notes), browsed (PDF opened recently in Zotero reader), or unread. Filter by scope.
recommend_papers — Personalized recommendations. Identifies your most-engaged papers, finds related literature via OpenAlex + S2, deduplicates, and excludes already-in-library papers.
generate_review_note — Extract evidence from multiple papers for literature review. Provide item keys + optional focus topic → returns passages with inline citations (Author, Year, p.X) ready for AI synthesis.
generate_reading_note — Structured reading note for ONE paper. Auto-extracts research question, methodology, data, findings, limitations, and contribution from the PDF. Produces a template the AI refines into a concise note.
suggest_tags — Analyze paper metadata to suggest methodology, domain, and data-type tags. Suggest-only — never auto-applies; user confirms via edit_tags.
find_arguments — Given a claim/thesis, find supporting and opposing evidence from your library. Classifies passages by stance (support/oppose/neutral) with citations. For writing Discussion sections.
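
The reading-status heuristic described for reading_status can be approximated as below. Several details are assumptions: the threshold is applied to annotations and notes combined, "recently" is taken to mean 30 days, and the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def classify_reading_status(annotations, notes, last_opened, now, recent_days=30):
    """Classify one paper as deep_read, browsed, or unread."""
    # Assumption: annotations and notes are counted together.
    if annotations + notes >= 3:
        return "deep_read"
    # Assumption: "recently" means within `recent_days` days.
    if last_opened is not None and now - last_opened <= timedelta(days=recent_days):
        return "browsed"
    return "unread"


now = datetime(2026, 6, 1)
print(classify_reading_status(5, 0, None, now))                       # deep_read
print(classify_reading_status(1, 0, now - timedelta(days=2), now))    # browsed
print(classify_reading_status(0, 0, now - timedelta(days=90), now))   # unread
```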

Admin

sync_index — Incremental vector index sync. Also runs automatically on MCP startup.

Configuration

Copy .env.example to .env and adjust:

Variable	Default	Description
`ZOTERO_LOCAL`	`true`	Read from local Zotero API (fast)
`ZOTERO_API_KEY`	—	Required for write operations (hybrid mode)
`ZOTERO_LIBRARY_ID`	`0`	Your Zotero user ID
`EMBEDDING_MODEL`	`BAAI/bge-m3`	Sentence-transformer for semantic search
`RERANKER_MODEL`	`cross-encoder/ms-marco-MiniLM-L-6-v2`	Reranker (`none` to disable)
`CHROMA_PERSIST_DIR`	`.chroma_db`	Local vector database path
`ZRA_AUTO_SYNC`	`true`	Auto incremental sync on MCP startup
`SEMANTIC_SCHOLAR_API_KEY`	—	Optional; higher rate limits for online search
`OPENALEX_MAILTO`	—	Optional; polite pool for OpenAlex API
`UNPAYWALL_EMAIL`	—	Optional; Unpaywall OA PDF lookup
`CORE_API_KEY`	—	Optional; CORE repository full-text
`CNKI_ENABLED`	`false`	Enable CNKI browser search (see below)
`CNKI_CDP_URL`	—	Chrome remote debugging URL

All data stays on your machine: Zotero library, .chroma_db/, and HuggingFace model cache (~/.cache/huggingface/).

CNKI Setup (Optional)

CNKI (China National Knowledge Infrastructure) is disabled by default. It is only needed for searching Chinese-language journal papers. When you first ask the AI for Chinese literature (e.g., "search CNKI for…" or "检索中文文献"), it will prompt you to complete the setup below.

CNKI has no public API. This project uses Playwright to connect to your logged-in Chrome browser via CDP (Chrome DevTools Protocol), following the same approach as cookjohn/cnki-skills.

Step 1: Install optional dependencies

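# Source install
uv pip install -e ".[cnki]"

# pip install
pip install "zotero-research-assistant[cnki]"

# Both install methods
playwright install chromium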

Step 2: Start Chrome with remote debugging

# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222

Step 3: Log in to CNKI

Open https://www.cnki.net/ in that Chrome window and log in (typically requires institutional VPN or campus network).

Step 4: Enable in `.env`

CNKI_ENABLED=true
CNKI_CDP_URL=http://127.0.0.1:9222

Step 5: Restart the MCP server

Reopen a chat window or restart your MCP client.

Verify

Ask the AI: "Search CNKI for highly-cited papers on geodetector since 2020"

If results appear (with title, authors, journal, citations, and journal level tags like CSSCI/PKU Core), the setup is working.

How it works

search_cnki_literature or find_related_literature(scope="cnki") → returns hits with export_id and journal_level
You select papers → AI calls cnki_add_to_zotero(export_ids=[...]) → papers appear in Zotero
No DOI lookup needed; metadata is fetched from CNKI's internal export API

Notes

Trigger: CNKI tools are only called when you explicitly mention Chinese literature, CNKI, 知网, 核心期刊, CSSCI, etc. Regular online search uses OpenAlex/CrossRef/S2.
Captcha: If a Tencent slider captcha appears, solve it in the Chrome window and retry.
Zotero import: Requires Zotero desktop running (uses localhost:23119 Connector API).
Compliance: Requires legitimate institutional CNKI access.
Before each session: Ensure the Chrome window from Step 2 is still running and the CNKI login is active.

Known Issues & Limitations

⚠️ The CNKI module is currently unstable and disabled by default. It relies on browser automation which is inherently fragile. Known issues include:

Issue	Cause	Workaround
Timeout on search	CNKI pages load slowly; anti-bot throttling	Simplify your query (fewer characters); retry after a few seconds
Chrome connection refused	Chrome was not started with `--remote-debugging-port`, or an existing session conflicted	Close ALL Chrome windows, then restart with `--remote-debugging-port=9222 --user-data-dir="/tmp/chrome-debug-profile"`
Stale login session	CNKI sessions expire after ~30 min of inactivity	Re-login in the Chrome window before retrying
Consecutive timeouts	Rate limiting by CNKI (>3 queries in quick succession)	The tool auto-aborts after 2 consecutive timeouts; wait 30s and retry
Export to Zotero fails	Zotero desktop not running or Connector API port changed	Ensure Zotero is running; verify http://localhost:23119/api/ responds
`incorrect profile type` errors in Chrome log	Normal Chrome warning when using a temporary `--user-data-dir`	Harmless — does not affect functionality
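
The "auto-aborts after 2 consecutive timeouts" behavior in the table above corresponds to a small guard loop. The sketch below shows the pattern with a stubbed search function; the names are hypothetical and the real tool may count or reset timeouts differently.

```python
def run_with_timeout_guard(queries, search, max_consecutive_timeouts=2):
    """Run queries in order; stop after too many timeouts in a row.

    Returns (results, aborted). A successful query resets the counter.
    """
    results = {}
    consecutive = 0
    for query in queries:
        try:
            results[query] = search(query)
        except TimeoutError:
            consecutive += 1
            if consecutive >= max_consecutive_timeouts:
                return results, True
        else:
            consecutive = 0
    return results, False


def fake_search(query):
    # Stub: pretend the site throttles the queries "b" and "c".
    if query in {"b", "c"}:
        raise TimeoutError(query)
    return [query.upper()]


print(run_with_timeout_guard(["a", "b", "c", "d"], fake_search))
# ({'a': ['A']}, True)
```

Stopping early avoids piling more requests onto a site that is already rate limiting, which is why waiting before retrying is the suggested workaround.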

If CNKI consistently fails, fall back to the English-language online search (search_online_literature / find_related_literature) which is stable and does not require browser automation.

Updating

pip users:

pip install --upgrade zotero-research-assistant

Source install users:

cd ~/zotero-research-assistant       # or your clone path
git pull
uv pip install -e .              # if dependencies changed

If using CNKI:

uv pip install -e ".[cnki]"
playwright install chromium

Restart your MCP client to reload the server.

Troubleshooting

Problem	Fix
Connection refused / no results	Ensure Zotero desktop is running and local API is enabled
New papers not found	Say "sync my index" or restart MCP (auto-sync on startup)
Write operations fail	Set `ZOTERO_API_KEY` + `ZOTERO_LIBRARY_ID` in `.env`
Slow first start	Embedding model download (~2.3 GB); use `HF_ENDPOINT=https://hf-mirror.com`
Windows: script blocked	`Set-ExecutionPolicy -Scope CurrentUser RemoteSigned` in PowerShell
MCP tools not called	Use a model with function calling; enable MCP/tools in client settings
AI executes writes without asking	Add to system prompt: "Always wait for explicit confirmation before executing writes"
CNKI: "search is disabled"	Complete the CNKI Setup steps
CNKI: captcha	Solve the slider in the Chrome window, then retry the search

Architecture

research_core/          # Shared library — Zotero client, RAG pipeline, search adapters, tools
project_a_mcp/          # MCP server entry point (stdio transport)
project_b_agent/        # Full-stack agent scaffold (planned)
scripts/                # CLI utilities (index_library.py, etc.)
tests/                  # Unit + integration tests
docs/                   # Detailed setup guides

Each tool maps to one user intent — discovery tools return item_key, read/write tools consume it.

Development

uv pip install -e ".[dev]"
pytest tests/ -v
ruff check .
ruff format .

Run CNKI integration tests (requires active CNKI session):

CNKI_ENABLED=true CNKI_CDP_URL=http://127.0.0.1:9222 pytest tests/mcp/test_cnki.py -v

Acknowledgments

This project was inspired by and built upon ideas from:

zotero-mcp — Pioneering work on connecting Zotero with AI assistants via MCP.
cnki-skills — Elegant approach to CNKI browser automation via Chrome DevTools Protocol.
academic-research-skills — Inspiration for the Corpus-First search strategy and structured anti-hallucination patterns ([MATERIAL GAP] tagging).
nature-skills — Inspiration for the Three-Index Verification approach (cross-checking citations against multiple bibliographic databases).

Thank you to the authors of these projects for sharing their work with the community.

Disclaimer / 免责声明

English

AI output quality depends on the connected model. Although this project implements multiple anti-hallucination mechanisms (Three-Index Verification, [MATERIAL GAP] tagging, source provenance), the final quality of literature reviews, summaries, and recommendations is ultimately determined by the LLM you connect. Always verify AI-generated citations against the original sources before using them in academic work. AI can and does fabricate references — treat all outputs as drafts requiring human verification.
For learning and research purposes only. This project is open-source and intended solely for personal academic research and educational use. It is not commercialized and no profit is derived from it. If any content or functionality inadvertently infringes on intellectual property or terms of service of third-party platforms (CNKI, publishers, etc.), please open an issue and we will address it promptly.
CNKI module compliance. The CNKI browser automation module is provided for convenience only. Users must have legitimate institutional access to CNKI. Automated access may violate CNKI's Terms of Service — use at your own risk and responsibility. This module is disabled by default for this reason.
Data privacy. All processing happens locally by default. Your PDFs are parsed and embedded on your machine (.chroma_db/). However, if you configure a cloud-based embedding model or connect to a cloud LLM, paper content (text chunks, queries) will be sent to those external services. Users working with sensitive or unpublished research should be aware of this.
Trademark notice. "Zotero" is a registered trademark of the Corporation for Digital Scholarship. This project is an independent community tool and is not affiliated with, endorsed by, or officially connected to Zotero or the Corporation for Digital Scholarship.

License

MIT

