AI-operable research workspace for Zotero, Obsidian, and NotebookLM. Use any two, or all three, through CLI, MCP, REST, and dashboard.
research-hub
Turn your research stack into an AI-operable workspace. Use Zotero, Obsidian, and NotebookLM together, or start with any two. research-hub gives your AI assistant a real CLI, MCP server, REST API, and dashboard for repeatable literature workflows.
Traditional Chinese: README.zh-TW.md | Watch the full-res mp4
📚 Part of the agentic AI learning roadmap — a 7-stage curated path for building agentic AI, multilingual (zh-TW · zh-Hans · English). This workspace is referenced in §13 (research workflow skills).
🧪 Real-use signal: in daily use by 1 PhD researcher (Lehigh CEE) tracking 7+ research clusters across Zotero + Obsidian + NotebookLM. Shipping since Apr 2026, docs updated for v0.89.0.
Install + first run
Pick the path that matches the operator: a human researcher or the autonomous agent itself.
Personae
research-hub supports two primary user personae:
- Human researcher (Wei-Ling persona): a hydrology postdoc who knows Python, pip, and DOIs but has never used Claude, MCP, or Obsidian. Start with Human quickstart.
- Autonomous agent (Claude Cowork / OpenClaw / Hermes host): the AI itself is the operator, not a human. Start with Autonomous agent quickstart.
Required env vars
| Name | Required | Purpose |
|---|---|---|
| ZOTERO_API_KEY | yes | Zotero web API auth, required for paper ingestion |
| ZOTERO_LIBRARY_ID | yes | Zotero library identifier |
| SEMANTIC_SCHOLAR_API_KEY | no | Lifts S2 rate limit from shared anonymous to 1 req/sec dedicated |
| TAVILY_API_KEY | no | Web search backend (alternative to DDG) |
| BRAVE_API_KEY | no | Web search backend (alternative to DDG) |
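A pre-flight check can catch missing variables before the first run. The sketch below is illustrative only: the variable names come from the table above, while the `missing_env` helper is hypothetical and not part of research-hub.

```python
import os

# Variable names are taken from the table above; the helper is hypothetical.
REQUIRED = ("ZOTERO_API_KEY", "ZOTERO_LIBRARY_ID")
OPTIONAL = ("SEMANTIC_SCHOLAR_API_KEY", "TAVILY_API_KEY", "BRAVE_API_KEY")


def missing_env(names, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env(REQUIRED)
    if missing:
        print("Missing required variables:", ", ".join(missing))
    for name in missing_env(OPTIONAL):
        print("Optional variable not set:", name)
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.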
Autonomous agent quickstart
For Cowork-style hosts:
pip install research-hub-pipeline
python -m research_hub describe > capabilities.json
python -m research_hub setup --autonomous --vault ./vault --persona agent
# emits BootstrapReport JSON; exit code 0 if ready, 1 otherwise
Then drive operations via CLI --json mode or the bundled MCP server (research-hub-mcp). All report-shaped commands accept --json; capability introspection lives in research-hub describe.
Note: NotebookLM upload still requires a one-time, human-driven browser sign-in (Google OAuth) via research-hub notebooklm login. Fully headless completion by an agent is blocked upstream by Google's auth flow.
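An agent host might wrap the setup command from the quickstart above roughly as follows. This is a minimal sketch: the command line and the exit-code meaning (0 if ready, 1 otherwise) come from the quickstart, while the function names are hypothetical and the assumption that the BootstrapReport JSON arrives on stdout is not confirmed by the source. No report fields are assumed.

```python
import json
import subprocess
import sys


def bootstrap_command(vault="./vault"):
    """Build the setup command shown in the quickstart above."""
    return [sys.executable, "-m", "research_hub", "setup",
            "--autonomous", "--vault", vault, "--persona", "agent"]


def interpret(returncode, stdout):
    """Map the documented exit code (0 ready, 1 otherwise) to (ready, report)."""
    try:
        report = json.loads(stdout)  # assumption: the report is printed to stdout
    except json.JSONDecodeError:
        report = None  # output was not valid JSON; rely on the exit code only
    return returncode == 0, report


def run_bootstrap(vault="./vault"):
    """Run setup; requires research-hub-pipeline to be installed."""
    proc = subprocess.run(bootstrap_command(vault), capture_output=True, text=True)
    return interpret(proc.returncode, proc.stdout)
```

Keeping `interpret` separate from the subprocess call lets the host decide readiness from the exit code even when the report cannot be parsed.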
Human quickstart
| You already have | First command |
|---|---|
| Zotero + Obsidian + NotebookLM | pip install research-hub-pipeline[playwright,secrets] then research-hub setup |
| Zotero + Obsidian, no NotebookLM | pip install research-hub-pipeline[secrets] then research-hub setup --skip-login |
| Obsidian + local PDFs only | pip install research-hub-pipeline[import,secrets] then research-hub setup --persona analyst |
| Nothing yet | pip install research-hub-pipeline then research-hub dashboard --sample |
Python 3.10+ is required. Add [mcp] if you want standalone MCP server dependencies.
| Persona | Best for | Install extra |
|---|---|---|
| Researcher | STEM papers, DOI/arXiv, Zotero-first workflows | [playwright,secrets] |
| Humanities | books, quotes, URL-only sources, Zotero + Obsidian | [playwright,secrets] |
| Analyst | industry research, local PDFs/reports, no Zotero required | [import,secrets] |
| Internal KM | lab/company knowledge bases, mixed file types | [import,secrets] |
Field presets for discover new, search, and related planning flows are cs, bio, med, physics, math, social, econ, chem, astro, edu, and general. There is no hydrology preset; for a field without its own preset, select general explicitly.
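The fallback to general can be made explicit in a wrapper script. The sketch below only encodes the preset list stated above; `resolve_field` is a hypothetical helper, and how the chosen field is passed to research-hub is not shown here.

```python
# Preset names are copied from the paragraph above.
FIELD_PRESETS = ("cs", "bio", "med", "physics", "math", "social",
                 "econ", "chem", "astro", "edu", "general")


def resolve_field(requested):
    """Return the preset if it is listed, otherwise fall back to "general"."""
    key = requested.strip().lower()
    return key if key in FIELD_PRESETS else "general"
```

For example, `resolve_field("hydrology")` returns `"general"`, matching the guidance for fields without a preset.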
Why this exists
Most research tools are good at one part of the workflow:
- Zotero stores citations, metadata, and PDFs.
- Obsidian stores notes, links, and synthesis.
- NotebookLM turns source bundles into AI-readable briefs.
The painful part is the handoff. research-hub connects those handoffs so an AI agent can search, ingest, tag, summarize, repair, brief, and inspect your workspace without turning your library into an opaque RAG box.
You do not need all three tools on day one.
| Your current stack | What research-hub gives you first |
|---|---|
| Zotero + Obsidian | Paper search, Zotero metadata, Markdown notes, tags, Obsidian Bases dashboards |
| Obsidian + NotebookLM | Local PDF/DOCX/MD/TXT ingest, cluster dashboards, NotebookLM bundles and briefs |
| Zotero + NotebookLM | Zotero-backed paper selection, namespaced tags, NotebookLM upload/generate/download |
| Zotero + Obsidian + NotebookLM | Full loop: discover -> ingest -> organize -> brief -> answer -> maintain |
| No accounts yet | Sample dashboard and local smoke tests before connecting anything |
What it does
research-hub is a local-first orchestration layer for research workflows:
- CLI: research-hub auto, import-folder, ask, doctor, tidy, clusters, zotero, notebooklm, crystal, and more.
- MCP server: lets Claude Desktop, Claude Code, Cursor, Continue.dev, Cline, Roo Code, OpenClaw, and other MCP hosts operate the same workflow.
- REST API: exposes /api/v1/* for browser-only or HTTP-capable assistants.
- Dashboard: gives humans a live view of clusters, papers, diagnostics, briefs, writing support, and management actions.
- Vault format: writes normal Markdown, frontmatter, .base dashboards, cache files, and logs that you can inspect directly.
The core loop:
topic or source folder
-> discover or import sources
-> enrich metadata
-> write Zotero tags/notes when enabled
-> write Obsidian Markdown notes and cluster dashboards
-> bundle/upload/generate with NotebookLM when enabled
-> cache answers as crystals and structured memory
Is this for me? — vs alternatives
research-hub does not replace Zotero, Obsidian, or NotebookLM. It connects them so an AI agent can operate the workflow.
| What you can do | Zotero alone | NotebookLM alone | Generic RAG | Obsidian-Zotero plugin | research-hub |
|---|---|---|---|---|---|
| Search arXiv + Semantic Scholar in one command | No | No | DIY | No | Yes |
| Ingest into Zotero and Obsidian and NotebookLM | No | No | DIY | Partial | Yes |
| AI brief from your collection | No | Manual | DIY | No | Yes |
| Cached canonical answers | No | No | Re-fetches | No | Yes |
| Structured memory layer | No | No | Usually chunks | No | Yes |
| Direct AI-agent control via MCP | No | No | DIY | No | Yes |
| Live dashboard with action buttons | No | No | No | No | Yes |
| Per-cluster Obsidian Bases dashboard | No | No | No | No | Yes |
| No OpenAI/Anthropic API key required | n/a | Yes | Usually no | n/a | Yes |
| Local-first vault you own | Partial | No | Depends | Yes | Yes |
The practical fit: research-hub is most useful if you already use at least two of Zotero, Obsidian, and NotebookLM and want your AI assistant to run the repetitive steps.
Connect your AI host
For Claude Desktop, Cursor, Continue.dev, Cline, VS Code Copilot, OpenClaw, or another MCP host:
{ "mcpServers": { "research-hub": { "command": "research-hub", "args": ["serve"] } } }
Restart the host. Then ask naturally:
Find me 5 papers on agent-based modeling and put them in a notebook.
The AI can call auto_research_topic(topic="agent-based modeling", max_papers=5) to ingest papers, generate a NotebookLM brief, and update the vault.
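Many MCP hosts already have other servers configured, so the JSON entry shown earlier in this section usually needs to be merged rather than pasted over an existing file. A minimal sketch, assuming the host stores its servers under a top-level `mcpServers` key as in that snippet; both helper names are hypothetical, and the config file location depends on the host and is not assumed here.

```python
import json


def mcp_config(command="research-hub", args=("serve",)):
    """Return the mcpServers entry shown in this section as a dict."""
    return {"mcpServers": {"research-hub": {"command": command, "args": list(args)}}}


def merge_config(existing, addition):
    """Merge mcpServers entries without dropping servers already configured."""
    merged = dict(existing)
    servers = dict(existing.get("mcpServers", {}))
    servers.update(addition["mcpServers"])
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    print(json.dumps(mcp_config(), indent=2))
```

The merge copies the existing dict instead of mutating it, so the caller can compare before and after prior to writing the file.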
Install host-specific skill files:
research-hub install --platform claude-code
research-hub install --platform cursor
research-hub install --platform codex
research-hub install --platform gemini
Browser-only or HTTP-capable AIs can use the REST API:
curl -X POST http://127.0.0.1:8765/api/v1/plan \
-H "Content-Type: application/json" \
-d "{\"intent\":\"research harness engineering\"}"
Full reference: MCP tools and AI integrations.
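The curl call above translates to the Python standard library as sketched below. The URL, header, and `intent` field come from that curl example; the function names are hypothetical, and treating the response body as JSON is an assumption, so no response fields are relied on.

```python
import json
import urllib.request


def plan_request(intent, base_url="http://127.0.0.1:8765"):
    """Build the POST request equivalent to the curl call above."""
    body = json.dumps({"intent": intent}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/plan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request; needs the local server running.

    Assumption: the endpoint answers with a JSON body.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or log before anything touches the network.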
Dashboard tour
research-hub serve --dashboard opens http://127.0.0.1:8765/.
Overview: treemap over clusters, storage map, and health summary.
Library: per-cluster drill-down with papers, sub-topics, and per-paper actions.
Diagnostics: grouped drift alerts and readiness checks.
Manage: CLI actions as buttons, inline result drawer, confirmation modal, and per-paper row actions.
Briefings and Writing tabs are also available. See the dashboard walkthrough and persona variants.
Inside Obsidian
Every ingested paper becomes a real Markdown note with structured frontmatter. Every cluster can also get an Obsidian Bases dashboard.
Cluster Bases dashboard: generated .base file with sortable paper metadata.
Per-paper note: title, authors, year, DOI, Zotero key, tags, status, cluster, and verification metadata.
Crystals are plain Markdown notes under hub/<cluster>/crystals/*.md, so they can be linked, searched, and read by MCP tools at very low token cost.
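Because crystals are ordinary files, a script can enumerate them without going through research-hub at all. A small sketch, assuming the `hub/<cluster>/crystals/` path is relative to the vault root (the source does not state the base directory); `list_crystals` is a hypothetical helper.

```python
from pathlib import Path


def list_crystals(vault, cluster):
    """Return crystal notes under <vault>/hub/<cluster>/crystals/, sorted by path."""
    folder = Path(vault) / "hub" / cluster / "crystals"
    if not folder.is_dir():
        return []  # the cluster has no crystals directory yet
    return sorted(folder.glob("*.md"))
```

Only `*.md` files are returned, so other files in the same folder are ignored.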
Inside Zotero
Every ingested paper gets a namespaced tag set so you can filter your library by research-hub context:
| Tag | Meaning |
|---|---|
| research-hub | Ingested through this pipeline |
| cluster/<slug> | Which research cluster the paper belongs to |
| category/<arxiv-code> | arXiv category like cs.AI or econ.GN |
| type/<publication-type> | Review, JournalArticle, etc. from Semantic Scholar |
| src/<backend> | Search backend that discovered it: arxiv, semantic_scholar, crossref, zotero |
Every paper can also get a child note with Summary / Key Findings / Methodology / Relevance, derived from the Obsidian frontmatter. Papers that were in Zotero before research-hub existed can be backfilled with:
research-hub zotero backfill --tags --notes --apply
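The tag namespaces in the table above are simple enough to build and parse in a few lines, which can help when filtering a Zotero export by research-hub context. This is an illustrative sketch: the tag formats come from the table, but both helpers are hypothetical and do not reflect how research-hub itself assembles tags.

```python
def hub_tags(cluster, backend, arxiv_category=None, publication_type=None):
    """Build the namespaced tag list described in the table above."""
    tags = ["research-hub", f"cluster/{cluster}"]
    if arxiv_category:
        tags.append(f"category/{arxiv_category}")
    if publication_type:
        tags.append(f"type/{publication_type}")
    tags.append(f"src/{backend}")
    return tags


def split_tag(tag):
    """Split "namespace/value" into a pair; the bare marker tag has no namespace."""
    namespace, sep, value = tag.partition("/")
    return (namespace, value) if sep else (None, tag)
```

Using `partition` splits on the first slash only, so a value that itself contains a slash stays intact.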
Feature matrix
| Capability | Command or MCP tool | Notes |
|---|---|---|
| One-shot setup | research-hub setup | init + install + optional NotebookLM login + guided sample run |
| Lazy research pipeline | research-hub auto "topic" / auto_research_topic | Search, ingest, bundle, upload, generate, download |
| Plan before running | research-hub plan "intent" / plan_research_workflow | Suggests field, cluster slug, and max papers |
| Zotero hygiene | research-hub zotero backfill --tags --notes [--apply] | Fills missing tags and notes on legacy items |
| Cluster cascade delete | research-hub clusters delete <slug> [--apply --force] | Preview impact on Obsidian, Zotero, dedup, memory, and crystals |
| No-NotebookLM smoke test | research-hub auto "topic" --no-nlm | Validates search and vault ingest without browser automation |
| Local file ingest | research-hub import-folder <folder> --cluster <slug> | PDF, DOCX, MD, TXT, URL |
| Ad-hoc cluster Q&A | research-hub ask <cluster> "question" / ask_cluster_notebooklm | Top-level CLI takes cluster first, then question |
| NotebookLM operations | research-hub notebooklm upload --cluster <slug> | Browser automation with persistent Chrome |
| Pre-computed crystals | research-hub crystal emit --cluster <slug> | Canonical answers cached as Markdown |
| Structured memory | research-hub memory emit --cluster <slug> | Entities, claims, methods |
| Live dashboard | research-hub serve --dashboard | HTTP dashboard with action buttons |
| Sample preview | research-hub dashboard --sample | Temporary bundled vault, no accounts |
| Lazy maintenance | research-hub tidy | Doctor, dedup, bases refresh, cleanup preview |
| Garbage collection | research-hub cleanup --all --apply | Bundles, debug logs, stale artifacts |
| Cluster repair | research-hub clusters rebind --emit then --apply | Rebinds orphaned notes |
| Obsidian Bases | research-hub bases emit --cluster <slug> | Generated .base dashboard |
| Web search | research-hub websearch "query" / web_search | Tavily, Brave, Google CSE, DDG fallback |
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| research-hub init reports Chrome warnings | Chrome is missing or patchright cannot find it | Install Chrome, then run research-hub doctor |
| research-hub notebooklm login opens a browser but Google blocks login | New-device or bot challenge | Complete the visible browser sign-in and phone challenge |
| research-hub auto finds 0 papers | Topic too narrow or search backend transient issue | Re-run with --max-papers 20 or rephrase |
| NotebookLM upload or generate fails | NotebookLM UI changed or login expired | Run research-hub notebooklm login; then resume with research-hub notebooklm bundle/upload/generate/download --cluster <slug> |
| auto --with-crystals cannot find an LLM CLI | claude, codex, or gemini is not on PATH | Install one, or use crystal emit and crystal apply manually |
| Claude Desktop cannot see the MCP server | MCP config is in the wrong file or host was not restarted | Check the host config path and restart Claude Desktop |
| init reports Zotero warnings but you do not use Zotero | Persona expects Zotero | Re-run research-hub setup --persona analyst or --persona internal |
| research-hub clusters delete refuses to delete | Cluster has papers, notes, or Zotero items | Re-run with --apply --force after reviewing the cascade preview |
| research-hub auto errors "cluster already has N papers" | Cluster is non-empty and you ran auto --cluster <slug> without a flag | Add --append to add more, or --force to overwrite |
| Zotero items miss research-hub tags or notes | Items were created before v0.61 or pipeline failed mid-run | research-hub zotero backfill --tags --notes --apply |
For broader checks, run:
research-hub doctor --autofix
Docs + Status + Dev
Docs: First 10 minutes, lazy mode, dashboard walkthrough, MCP tools, personas, NotebookLM setup, import folder, CLI reference, CHANGELOG.
Status:
- Current docs target: v0.89.0; see CHANGELOG for package history.
- MCP tools: inspect the live list with python -m research_hub describe --filter mcp_tools.
- REST endpoints: 12 at /api/v1/*.
- Bundled skills: inspect the live list with python -m research_hub describe --filter skills.
Developer setup:
git clone https://github.com/WenyuChiou/research-hub.git
cd research-hub
pip install -e ".[dev,playwright]"
python -m pytest -q
Contributing: CONTRIBUTING.md. Package on PyPI: research-hub-pipeline. CLI entry point: research-hub.
License
MIT. See LICENSE.