
AI-powered research mentor that guides you through your research journey.


Personal Research Mentor

AI-powered research mentor that guides you through your research journey, helping you grow your own skills and judgment. Features a conversational office for open-ended mentoring, a Question Workshop with 10 approaches for developing research questions, and a guided sharing flow for making your research visible. Integrates with academic databases, Retraction Watch, and web search. Runs locally with a browser-based interface and three LLM backends (Claude, local vLLM, or any OpenAI-compatible API).

Developed at Authentic Research Partners by Sergey Samsonau and Olga Vine.


Research is how we advance human knowledge — but getting started is surprisingly hard.

Middle and high school students are curious and capable, yet most have no access to research mentorship. Science fairs scratch the surface, but there's no one to guide a student through forming a real hypothesis, designing a rigorous study, or navigating the iterative mess that actual research is.

College students face a catch-22: labs want prior research experience, but you can't get experience without getting into a lab. Many talented students never break through this gate.

Amateur scientists, citizen scientists, and professionals sometimes want to explore a research question on the side — for intellectual fulfillment, to contribute to a field they care about, or just for fun. But without institutional structure, they don't know where to start or how to stay rigorous.

Personal Research Mentor is an AI research mentor that fills this gap. It doesn't do your research for you — it teaches you how to do research. It asks questions, challenges your thinking, helps you design experiments, and adapts to where you are in the process. Think of it as a patient, always-available advisor who meets you at your level.

Runs locally as a single Python process with SQLite for storage. Three LLM backends: Claude Code CLI (default — flat-fee subscription, no per-call API costs, conversations processed by Anthropic), a local vLLM server (fully offline, nothing leaves your machine, requires GPU), or a remote API (bring your own key).

Free for individual use. Not designed for commercial use. If you want a similar solution for your organization, contact Authentic Research Partners.

Features

Mentoring

  • Adaptive research mentoring — 21-node LangGraph agent that guides students through the research process
  • Question Workshop — 9 pipelines + Hypothesis chat for developing research questions
  • Research sharing — guided publication flow
  • Semantic memory — remembers context across sessions within each project
  • Assessment tracking — project-level and student-level progress evaluation
  • Multiple teaching personas — different mentoring styles
  • Content safety filtering — input and output safety review

Infrastructure and Interface

  • Three LLM backends — Claude Code CLI, local vLLM, remote API
  • Browser-based interface built with React — conversational office, question workshop, research sharing, project management, progress tracking, usage monitoring, and settings
  • vLLM container management — start, stop, restart, status, health checks
  • API usage budget tracking
  • Tool usage tracking with real-time activity reporting
  • CLI — serve, init, doctor, reset, export, migrate, vllm-container

Integration with External Resources

  • Academic search — OpenAlex, Semantic Scholar, PubMed, arXiv, Europe PMC
  • Researcher and institution discovery — OpenAlex, ORCID, ROR
  • Paper fetching — introduction extraction from 8 open-access sources
  • Retraction Watch — local semantic search over retracted papers
  • Web search — Brave Search or Tavily

Installation

Windows

  1. Open PowerShell and run: powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  2. Close and reopen PowerShell, then: uv tool install research-mentor --python 3.13

Mac / Linux

  1. curl -LsSf https://astral.sh/uv/install.sh | sh
  2. uv tool install research-mentor --python 3.13

See INSTALLATION.md for detailed step-by-step instructions, including uninstall.

Quick Start

research-mentor doctor  # check system dependencies
research-mentor         # start the server (creates database on first run)

To use local vLLM instead of Claude Code CLI (requires Podman + NVIDIA GPU):

research-mentor vllm-container start   # start vLLM in a Podman container
research-mentor serve --backend vllm   # start the app server

vLLM runs in a container (Podman or Docker) — it is not a Python dependency of this package. The container handles all vLLM + CUDA dependencies. See docs/usage.md for details.

API

Once the server is running:

# Health check
curl localhost:8080/api/health

# Talk to the mentor
curl -X POST localhost:8080/api/office \
  -H "Content-Type: application/json" \
  -d '{"message": "How do I design a water filtration experiment?"}'

# Interactive API docs (open is the macOS command; on Linux use xdg-open, on Windows use start)
open http://localhost:8080/api/docs
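
The same calls can be made from Python. This is a minimal sketch using only the standard library; the endpoint paths and the request body come from the curl examples above, while the function names, the timeout value, and the choice to return the raw response text (the response schema is not described here) are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # address used in the curl examples above


def build_office_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /api/office without sending it."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/office",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_mentor(message: str, base_url: str = BASE_URL, timeout: float = 120.0) -> str:
    """Send a message to the mentor and return the raw response body as text.

    The response format is not documented in this README, so nothing is
    parsed here; inspect the text or the interactive API docs first.
    """
    request = build_office_request(message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, ask_mentor("How do I design a water filtration experiment?") sends the same request as the curl example.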

Configuration

Default config is bundled. Override in ~/.research-mentor/config.toml:

[llm]
backend = "vllm"

[llm.vllm]
# Switch models by changing active_model (profiles defined in bundled config):
#   "gemma3-12b-fp8" — Gemma3-12B, non-thinking (default)
#   "qwen3-8b-fp8"   — Qwen3-8B, thinking (<think> tags)
active_model = "gemma3-12b-fp8"

All options can also be set via CLI flags (research-mentor serve --help) or environment variables (RESEARCH_MENTOR_BACKEND=vllm).

See docs/configuration.md for the full configuration reference.

Security & Privacy

Research Mentor runs as a local server on your machine (localhost only). There is no cloud backend, no user accounts, and no telemetry.

What stays on your machine

  • Your conversations — stored in a local SQLite database (~/.research-mentor/)
  • Uploaded files — stored locally under ~/.research-mentor/artifacts/
  • PDF content — sent only to a local GROBID container for text extraction, never to external services
  • With the vllm backend — all LLM inference happens locally on your GPU. Nothing leaves your machine

What is sent to external services

Data | Where | Why
Paper titles, DOIs, author names | OpenAlex, Semantic Scholar, PubMed, Europe PMC | Literature search
Search queries | Brave Search or Tavily (if configured) | Web search tool
Researcher names | ORCID | Researcher lookup
Your email (if configured) | OpenAlex, Unpaywall | Polite API access (higher rate limits)
API keys | Respective services (over HTTPS) | Authentication

All external requests use HTTPS (encrypted in transit), but the service providers can see your queries. If you search for papers about a sensitive topic, the API providers (OpenAlex, Semantic Scholar, PubMed, Brave, etc.) will see that query and the DOIs/titles you look up. This is the same as using these services directly in a browser — but worth knowing if your research topic is sensitive.

With the claude_cli backend (default), your entire conversation, meaning every message you send and every response the mentor generates, is processed by Anthropic via the Claude Code CLI. This includes your research topic, hypotheses, experimental designs, and any personal details you share, and it is subject to Anthropic's privacy policy.

With the api backend, the same conversation content is sent to whichever API endpoint you configure (e.g., OpenAI, a hosted model provider). The provider can see everything discussed in the session.

With the vllm backend pointing to a local GPU, LLM inference runs entirely on your machine — no conversation content leaves it. This is the most private option. If you point the vLLM backend at a remote server (e.g., a shared GPU server in your organization), conversation content is sent to that server instead.

Whichever LLM backend you use, search tool queries (see above) are still sent to the external academic and web search services.

How the LLM interacts with external services

The LLM never executes code — its output is either validated through structured JSON parsing or rendered as plain text.
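
As a rough picture of what "validated through structured JSON parsing or rendered as plain text" can mean, consider the sketch below. The SearchRequest shape, the tool names, and the function name are hypothetical and are not taken from the project's code.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical structured output: a tool name plus a query string."""
    tool: str
    query: str


ALLOWED_TOOLS = {"academic_search", "web_search"}  # hypothetical tool names


def parse_llm_output(raw: str) -> Union[SearchRequest, str]:
    """Return a validated SearchRequest, or the raw text for plain-text display."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # not JSON: shown as text, never evaluated
    if (
        isinstance(data, dict)
        and set(data) == {"tool", "query"}
        and isinstance(data["tool"], str)
        and isinstance(data["query"], str)
        and data["tool"] in ALLOWED_TOOLS
    ):
        return SearchRequest(tool=data["tool"], query=data["query"])
    return raw  # JSON that fails validation is also only displayed
```

The point of the design is that model output only ever becomes data or display text; nothing in it is passed to an interpreter or a shell.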

However, the LLM does formulate search queries that are sent to external APIs. When you ask about a research topic, the system searches for relevant papers and web resources on your behalf. The search queries are derived from your conversation — for example, if you're researching water filtration, the system sends queries like "water filtration efficiency" to OpenAlex or Brave Search.

This means your research topic and related terms are sent to external search services as part of normal operation. No system secrets (API keys, configuration, file paths) are ever included in the LLM's context or in search queries. All tool calls are rate-limited and logged.

Every outgoing search query is checked by a two-layer safety reviewer before it leaves your machine:

  1. Pattern-based check (instant) — blocks queries containing emails, phone numbers, file paths, or credentials
  2. AI-based check — catches subtler leaks like a student's name combined with their school

Both layers are on by default. If a query is blocked, the search is skipped and the mentor continues without those results. Power-mode users can adjust these in Settings.
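
The pattern-based layer can be pictured with a sketch like the following. The regular expressions and names are illustrative assumptions, not the project's actual rules, and they are deliberately simple: a real filter would need tuning to avoid false positives (for example, URLs or long number sequences).

```python
import re

# Illustrative patterns only; the project's real rules are not shown in this README.
BLOCK_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"(?<!\d)(?:\+?\d[\s().-]?){9,14}\d(?!\d)"),
    "file_path": re.compile(r"(?:~|\.{1,2})?/[\w.-]+(?:/[\w.-]+)+|[A-Za-z]:\\[\w.\\-]+"),
    "credential": re.compile(r"(?i)\b(?:api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
}


def blocked_reasons(query: str) -> list:
    """Return the names of every pattern that matches the outgoing query."""
    return [name for name, pattern in BLOCK_PATTERNS.items() if pattern.search(query)]


def is_query_allowed(query: str) -> bool:
    """A query leaves the machine only if no pattern matches."""
    return not blocked_reasons(query)
```

A blocked query would simply be skipped, matching the behavior described above, while a query such as "water filtration efficiency" passes unchanged.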

How your data is protected

  • API keys are stored in individual files with owner-only permissions (chmod 600)
  • SQL injection protection — parameterized queries with column allowlists
  • File upload security — filename sanitization, file type allowlist, size limits
  • XML parsing — uses defusedxml to prevent entity expansion attacks
  • Download size caps — PDF downloads (100 MB) and data imports (500 MB decompressed) are capped to prevent memory exhaustion
  • Security headers — Content-Security-Policy, X-Frame-Options, X-Content-Type-Options on all responses
  • Automated security scanning — every commit is checked with bandit (static analysis), pip-audit (known CVEs), and pip-licenses (license compliance)
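
To show what "parameterized queries with column allowlists" refers to, here is a minimal sketch with Python's sqlite3 module. The table, column, and function names are hypothetical; only the technique is the one named in the list above.

```python
import sqlite3

# Hypothetical column names, for illustration only.
SORTABLE_COLUMNS = {"created_at", "title"}


def list_projects(conn: sqlite3.Connection, owner: str, order_by: str = "created_at") -> list:
    """Return project titles for one owner, sorted by an allowlisted column."""
    # Identifiers cannot be bound as parameters, so they are checked against an allowlist.
    if order_by not in SORTABLE_COLUMNS:
        raise ValueError(f"unsupported sort column: {order_by!r}")
    # Values always go through "?" placeholders, never through string formatting.
    sql = f"SELECT title FROM projects WHERE owner = ? ORDER BY {order_by}"
    return conn.execute(sql, (owner,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with two made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (title TEXT, owner TEXT, created_at TEXT)")
    conn.executemany(
        "INSERT INTO projects VALUES (?, ?, ?)",
        [("Water filters", "ana", "2024-01-02"), ("Soil pH", "ana", "2024-01-01")],
    )
    return conn
```

An injection attempt in the owner value is treated as an ordinary string and matches no rows, while an unexpected order_by value is rejected before any SQL is built.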

Not designed for network deployment

Research Mentor binds to localhost and has no authentication. Do not expose it to untrusted networks without adding your own authentication layer.

Contributing

This is an opinionated project. We don't accept unsolicited pull requests — please open an issue or reach out before contributing. See CONTRIBUTING.md for details.

Disclaimer

This is an AI-powered educational tool. Like all AI systems, it can produce inaccurate or misleading information. It provides research guidance but does not replace your own critical thinking and judgment. Always verify AI-generated content with authoritative sources. If you are a minor, please involve a parent or other trusted adult when anything seems questionable, unclear, or unsafe.

License

PolyForm Noncommercial 1.0.0 — free for personal and noncommercial use. Provided as-is, with no warranty or liability. Not designed for commercial use. If you want a similar solution for your organization, contact Authentic Research Partners.
