MCP server for reproducible, grounded, citation-verified research — sub-task pipelines, provenance sidecars, publication-grade visualisation, grounded reasoning, and self-tested dashboards.

Research OS

Turn your AI IDE into a rigorous research collaborator. Drop your data, talk in plain English, get publication-grade outputs with verified citations, full provenance, and audited quality gates — no hallucinated numbers or references leaking into the paper.

You touch inputs/. The AI touches workspace/ and synthesis/. Research OS keeps it honest: every figure has a caption + summary + provenance sidecar; every citation is verified online; every deliverable passes a completeness gate; every gate bypass is logged for your pre-submission audit.

Works with any MCP-speaking AI IDE: Claude Code, Claude Desktop, OpenCode, Antigravity, Cursor, VS Code (MCP), Windsurf, Continue, Aider. Research OS does NOT manage LLM provider keys — your IDE owns model access.


Install — one command

pip install "research-os[all] @ git+https://github.com/VibhavSetlur/Research-OS.git"

The research-os binary is global; install once and it serves every project you scaffold. → Per-IDE wiring & extras


Your first project — 60 seconds

mkdir my-project && cd my-project
research-os init                           # arrow-key wizard
# or: research-os init my-project --yes    # one-shot for CI/scripts

The wizard:

  • asks where the project lives and what it's called,
  • asks which AI IDEs you use (and drops MCP configs in the right places),
  • lets you paste a Slack thread / email / PI message, which is parsed and saved into inputs/context/ with provenance frontmatter,
  • lets you paste arXiv IDs · DOIs · PDF URLs, which are downloaded into inputs/literature/ (Unpaywall for DOIs),
  • offers to symlink existing data files into inputs/raw_data/.

Done in under a minute. → Full walkthrough
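To make the "paste arXiv IDs · DOIs · PDF URLs" step concrete, here is a minimal sketch of how pasted tokens could be sorted into those three kinds before download. The regular expressions and the function names `classify_reference` and `classify_pasted` are illustrative assumptions, not Research OS internals; the wizard's real parsing may differ.

```python
import re

# Hypothetical patterns for illustration only; not taken from Research OS.
# New-style arXiv IDs look like 2301.01234 or 2301.01234v2.
ARXIV_ID = re.compile(r"^(arxiv:)?\d{4}\.\d{4,5}(v\d+)?$", re.IGNORECASE)
# DOIs start with "10.<registrant>/", optionally behind a doi.org URL.
DOI = re.compile(
    r"^(doi:|https?://(dx\.)?doi\.org/)?10\.\d{4,9}/\S+$", re.IGNORECASE
)
HTTP_URL = re.compile(r"^https?://", re.IGNORECASE)


def classify_reference(text: str) -> str:
    """Label one pasted token as 'arxiv', 'doi', 'pdf_url', or 'unknown'."""
    token = text.strip()
    if ARXIV_ID.match(token):
        return "arxiv"
    if DOI.match(token):
        return "doi"
    # Ignore any query string when checking for a .pdf path.
    if HTTP_URL.match(token) and token.lower().split("?")[0].endswith(".pdf"):
        return "pdf_url"
    return "unknown"


def classify_pasted(block: str) -> dict[str, str]:
    """Classify every whitespace-separated token in a pasted block."""
    return {token: classify_reference(token) for token in block.split()}
```

Tokens labelled "unknown" are the ones a wizard would most plausibly ask about rather than silently drop.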


Day-to-day — drop files, talk

After the wizard, just open the project in your AI IDE. The MCP server auto-launches. Drop data into inputs/raw_data/, PDFs into inputs/literature/, notes into inputs/context/. Then in chat:

> fill out the intake
> what should I do next?
> run a baseline EDA
> compare logistic regression and gradient boosting
> draft the discussion section
> what's left before I can submit?

The AI picks the right workflow from 88 protocols, runs it through 143 MCP tools that enforce real quality gates, and asks you when it's uncertain. Open-ended or cross-disciplinary requests that the router cannot match to a protocol go to guidance/scope_clarification. → Common phrasings → outputs


Your project layout

my-project/
├── inputs/                  ← IMMUTABLE — you drop files here
│   ├── raw_data/                  CSV / parquet / FASTQ / ...
│   ├── literature/                PDFs of papers
│   ├── context/                   notes / drafts / PI messages
│   ├── intake.md                  short pointer; AI rewrites on autofill
│   └── researcher_config.yaml     AI behaviour (every field optional)
│
├── workspace/               ← ACTIVE — AI lives here
│   ├── methods.md · analysis.md · citations.md   (append-only memory)
│   ├── 01_baseline_eda/                          (numbered experiment)
│   │   ├── scripts/  (atomic, versioned _v1 → _v2)
│   │   ├── outputs/{figures,tables,reports}/
│   │   │       each figure ships .caption.md + .summary.md + .prov.json
│   │   ├── pipeline.yaml          (sub-task DAG, content-hash cached)
│   │   ├── conclusions.md
│   │   └── .versions/v<n>/        (snapshots from tool_step_iterate)
│   ├── logs/                      audit reports + override ledger
│   └── scratch/                   AI sandbox (gitignored)
│
├── synthesis/               ← FINAL — only when you ask for a deliverable
│   paper · abstract · poster + QR · dashboard · slides · handout · etc.
│
└── AGENTS.md · CLAUDE.md · .cursor/ · .claude/ · .windsurfrules · ...
                            per-IDE rules + MCP configs (auto-dropped)

synthesis/ and the inputs/ subfolders are created lazily: they materialise on first write, so a fresh project stays uncluttered.
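The layout above says each figure ships with `.caption.md`, `.summary.md`, and `.prov.json` sidecars. A minimal sketch of checking that rule by hand follows. It assumes sidecars sit next to the figure and share its stem (for example `fig1.png` and `fig1.caption.md`), and that figures are PNG, PDF, or SVG files; both are assumptions for illustration, and `missing_sidecars` is a hypothetical helper, not a Research OS tool.

```python
from pathlib import Path

# The three sidecar suffixes come from the project layout above.
SIDECAR_SUFFIXES = (".caption.md", ".summary.md", ".prov.json")
# Assumption: which file extensions count as figures.
FIGURE_EXTENSIONS = {".png", ".pdf", ".svg"}


def missing_sidecars(figures_dir: Path) -> dict[str, list[str]]:
    """Map each figure file name to the sidecar suffixes it lacks.

    Figures with all three sidecars are omitted from the result.
    """
    missing: dict[str, list[str]] = {}
    for fig in sorted(figures_dir.iterdir()):
        if not fig.is_file() or fig.suffix.lower() not in FIGURE_EXTENSIONS:
            continue
        # Assumed naming: <stem>.caption.md, <stem>.summary.md, <stem>.prov.json
        absent = [
            suffix
            for suffix in SIDECAR_SUFFIXES
            if not fig.with_name(fig.stem + suffix).exists()
        ]
        if absent:
            missing[fig.name] = absent
    return missing
```

An empty result for `workspace/01_baseline_eda/outputs/figures/` would mean every figure in that experiment carries its caption, summary, and provenance record under these assumptions.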


What it does

Seven capability groups, 88 protocols, 143 MCP tools.

| What you say | What you get |
| --- | --- |
| Plan + design | EDA + hypothesis generation · power analysis · evaluation design (split / CV / paired test) · hyperparameter sweep design · data ethics review (IRB / privacy / fairness) · preregistration |
| Run analyses | iterative experiments with provenance · head-to-head method comparison · data-quality audit · reproduction of published work · causal / Bayesian / time-series / ML / clinical / qualitative / mixed-methods pipelines · dead-end routing |
| Build figures | full viz workflow · multi-panel composition · figure narrative arc · colour-blind + WCAG accessibility audit · reviewer-style critique |
| Write | per-section drafting (methods · results · discussion · limitations · end-matter with CRediT) · title workshop · cover letter · pre-submission gate |
| Synthesise | IMRAD paper · abstract · poster + QR · talk slides (lab / conference / defense) · self-tested HTML dashboard · printable handout + QR · grant narrative · lay summary · PI progress update · null-findings companion |
| Read + understand | quick paper critique · multi-paper comparative review · methodological consultation (teach me X) · literature search with forward-citation walk · full PRISMA systematic review |
| Audit + ship | quality audit · reproducibility verification · code review · peer-review response · collaboration handoff (share-safe zip) · structured AI pushback when grounded evidence disagrees |

→ docs/USE_CASES.md: role × goal × output map
→ docs/PROTOCOLS.md: every protocol with trigger phrases and quality bars
→ docs/TOOLS.md: every MCP tool with example calls


Why use it — pain Research OS resolves

| Pain | How Research OS resolves it |
| --- | --- |
| AI hallucinates citations | Synthesis tools verify every citation online and drop any that fail. Per-section caps apply (3 abstract / 12 dashboard / 40 paper). |
| AI hallucinates numbers | tool_audit_claims traces every number in paper.md back to a workspace artefact and BLOCKS synthesis if any are ungrounded. |
| AI guesses methodology | tool_research_method mandates literature grounding before any commit. |
| AI writes 400-line one-shot scripts | A step whose outputs span figures + tables + reports without a pipeline.yaml is BLOCKED. This forces atomic, content-hash-cached sub-tasks. |
| "Iterate Figure 2's look" → ad-hoc edits | tool_step_iterate snapshots scripts + outputs + captions + conclusion as a coordinated .versions/v<n>/ BEFORE you edit; tool_audit_version_coherence flags any output whose .prov.json still points at an older script. |
| Pre-registered analyses drift | tool_preregister_freeze content-hashes the SAP; _diff surfaces every deviation. |
| Null findings → file drawer | synthesis_null_findings produces a publishable companion for refuted / underpowered / abandoned work. |
| Pre-submission anxiety | audit/pre_submission_checklist walks every check journals run, including a review of every gate bypass logged in workspace/logs/override_log.md. |
| Researcher enters mid-pipeline | guidance/mid_pipeline_entry classifies the entry into 7 archetypes and skips redundant intake. |
| 143 tools is too many to triage per turn | tool_route returns ~10-15 active tools per protocol; sys_active_tools(protocol_name) returns the same shortlist on demand. |
| Same project, different AI tomorrow | sys_session_handoff snapshots the session; tool_session_resume reconstructs intent in one call. |
| Long jobs on shared HPC | tool_task_run backgrounds them; tool_slurm_submit handles clusters. |

→ See docs/RESEARCHER_GUIDE.md § Power-user patterns for the full list.
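The table mentions content-hashing a statistical analysis plan (SAP) so later drift can be surfaced. The general technique can be sketched as below, using SHA-256 and a line diff from the standard library. The function names `freeze` and `deviations` are hypothetical and say nothing about how tool_preregister_freeze is actually implemented.

```python
import difflib
import hashlib


def freeze(text: str) -> str:
    """Return a SHA-256 content hash for the plan text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def deviations(frozen_text: str, frozen_hash: str, current_text: str) -> list[str]:
    """List changed lines between the frozen plan and the current plan.

    Removed lines are prefixed with '-', added lines with '+'.
    Raises ValueError if the frozen copy no longer matches its hash.
    """
    if freeze(frozen_text) != frozen_hash:
        raise ValueError("frozen copy does not match its recorded hash")
    if freeze(current_text) == frozen_hash:
        return []  # identical content, nothing to report
    diff = list(
        difflib.unified_diff(
            frozen_text.splitlines(),
            current_text.splitlines(),
            lineterm="",
            n=0,  # no context lines, only the changes
        )
    )
    # The first two entries are the '---' / '+++' file headers; skip them,
    # then keep only removed and added lines (dropping '@@' hunk markers).
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

Comparing hashes first keeps the common "nothing changed" case cheap; the diff is only computed when the hash shows the plan has moved.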


Tune AI behaviour — inputs/researcher_config.yaml

Every field is optional. Two knobs change how the AI behaves on edge cases:

interaction:
  # How quality-gate blockers are treated:
  #   enforce        → AI refuses bypass unless researcher explicitly
  #                    authorises AND supplies an override_rationale
  #                    (recorded to workspace/logs/override_log.md).
  #   allow_override → bypass on request; rationale logged if provided.
  #   warn_only      → blockers become warnings (sandbox use only).
  quality_gate_policy: enforce

  # What the AI does when your request is ambiguous:
  #   ask_when_uncertain → default; AI asks a one-line follow-up.
  #   take_best_default  → AI proceeds, surfaces the chosen default
  #                        for review.
  ambiguity_posture: ask_when_uncertain

Every bypass the AI takes is recorded — the pre-submission checklist surfaces them so nothing ships without your sign-off.
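As a rough illustration of how the two knobs could be interpreted, the sketch below resolves an already-parsed config (a plain dict, as a YAML parser would return) into effective settings and decides whether a gate bypass is permitted. The helper names are hypothetical, the `enforce` default for quality_gate_policy is an assumption (the text only states the default for ambiguity_posture), and the bypass logic is one reading of the comments in the YAML above, not the project's code.

```python
ALLOWED = {
    "quality_gate_policy": ("enforce", "allow_override", "warn_only"),
    "ambiguity_posture": ("ask_when_uncertain", "take_best_default"),
}
DEFAULTS = {
    # Assumption: the README shows this value but does not call it the default.
    "quality_gate_policy": "enforce",
    # Stated as the default in the config comments.
    "ambiguity_posture": "ask_when_uncertain",
}


def resolve_interaction(config: dict) -> dict:
    """Fill in defaults for the 'interaction' section and validate values."""
    section = (config or {}).get("interaction") or {}
    resolved = {}
    for key, allowed in ALLOWED.items():
        value = section.get(key, DEFAULTS[key])
        if value not in allowed:
            raise ValueError(f"{key}={value!r} is not one of {allowed}")
        resolved[key] = value
    return resolved


def may_bypass(policy: str, authorised: bool, rationale: str) -> bool:
    """Decide whether a quality-gate blocker may be bypassed.

    enforce        -> needs explicit authorisation AND a non-empty rationale
    allow_override -> needs only the researcher's request
    warn_only      -> blockers are warnings, so nothing is blocked
    """
    if policy == "enforce":
        return authorised and bool(rationale.strip())
    if policy == "allow_override":
        return authorised
    if policy == "warn_only":
        return True
    raise ValueError(f"unknown policy: {policy!r}")
```

Because every field is optional, an empty or missing config resolves to the defaults rather than raising.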


Documentation

| First time | Day-to-day | Reference |
| --- | --- | --- |
| START.md: install + first project + cheatsheet | RESEARCHER_GUIDE.md: full workflow walkthrough | PROTOCOLS.md: every protocol |
| SETUP.md: install + per-IDE wiring | USE_CASES.md: role × goal × output map | TOOLS.md: every MCP tool |
| FAQ.md: common questions | SHARING.md: share-safe zip + GitHub paths | PROTOCOL_DOCTRINE.md: for protocol authors |
AI_GUIDE.md — for the AI itself

Doc index → docs/README.md.


Verify your install

python scripts/preflight.py            # 13 wiring checks
pytest -q                              # 417 tests, ~9 s
ruff check src/ tests/ scripts/

Contributing + License

See CONTRIBUTING.md — covers adding a tool, adding or modifying a protocol (must follow the scaffold-not-script doctrine), and the test conventions. Issues + PRs welcome at https://github.com/VibhavSetlur/Research-OS/issues.

License: MIT.

Download files

Source distribution: research_os-1.1.0.tar.gz (680.2 kB)

Built distribution: research_os-1.1.0-py3-none-any.whl (683.1 kB, Python 3)
