From microscope to manuscript, in one repo. The AI lab for biological researchers.

vaultlab

"From microscope to manuscript, in one repo."

vaultlab is a research companion for biological scientists. Most AI lab tools take a research question and try to write the paper for you. vaultlab is different: it accompanies you through whatever you're actually doing today, whether that is searching literature, analyzing your CODEX run, drafting the methods section, building tomorrow's lab-meeting deck, or triaging your inbox for the manuscript-deadline email you've been avoiding. With full context of your work (your knowledge base, your Google Docs, your Outlook calendar), Claude Code becomes a useful colleague instead of a generic chatbot.

Open-source. Local-first. Claude-Code-native. MIT licensed.

🚧 Alpha software. vaultlab is under active development toward v0.1.0 (target: late May 2026). Expect rough edges. See docs/KNOWN_LIMITATIONS.md.

What it does

📄 Literature search & citation verification: PubMed, Semantic Scholar, CrossRef, bioRxiv, Springer, Elsevier, paperclip MCP
🧬 Wet-lab data analysis: CODEX, MALDI, Visium, scRNA-seq, H&E, flow cytometry
📊 Publication-quality figures with corpus-backed recipes (every recipe cites ≥3 published examples)
✍️ Manuscript drafting with NotebookLM-style evidence retrieval: every [N] shows the exact passage on hover
🎤 Slide decks built from research outputs: journal-club, thesis-committee, conference-talk modes
🧠 Knowledge base (Obsidian-native) that links it all, queryable via semantic search
📥 Email + calendar context: vaultlab reads your Outlook (Windows) or Gmail to know what's pressing
📝 Google Docs integration: your lab work log + Sheets data + Drive files become first-class context
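
The `[N]` evidence markers in the drafting feature can be pictured as indices into a list of retrieved passages. The sketch below is only an illustration of that idea: `Evidence`, `resolve_markers`, and the sample text are hypothetical and are not taken from vaultlab's code.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str   # hypothetical field: a file name or DOI
    passage: str  # the exact passage a reader would see for this marker


MARKER = re.compile(r"\[(\d+)\]")


def resolve_markers(draft: str, evidence: list[Evidence]) -> tuple[dict[int, Evidence], list[int]]:
    """Map each [N] in the draft to evidence[N-1]; collect markers with no passage."""
    resolved: dict[int, Evidence] = {}
    dangling: list[int] = []
    for match in MARKER.finditer(draft):
        n = int(match.group(1))
        if 1 <= n <= len(evidence):
            resolved[n] = evidence[n - 1]
        elif n not in dangling:
            dangling.append(n)
    return resolved, dangling


# Made-up toy data, for illustration only.
evidence = [Evidence("paper-a.pdf", "Cells were fixed in 4% PFA for 10 min.")]
draft = "Tissue was fixed as described [1] and imaged overnight [2]."
resolved, dangling = resolve_markers(draft, evidence)
print(resolved[1].passage)
print("markers without evidence:", dangling)
```

A marker that ends up in `dangling` is the kind of thing a drafting tool would presumably want to surface before the text reaches a manuscript.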

Companion mode, not autonomous mode

vaultlab is not an autonomous AI scientist. It does not generate experiment ideas in a vacuum, run robots, or submit papers without you. It assumes:

  • You have ideas; vaultlab amplifies them
  • You have context; vaultlab indexes it
  • You make the calls; vaultlab does the rote 60% so you can focus on the insightful 40%
  • You ship the paper; vaultlab drafts, verifies, formats, but the byline is yours alone

The "research companion" framing is intentional. AI-written papers, the kind many journals ban, are not our use case. "Here are 23 ways vaultlab made my week easier" is.

Install

git clone https://github.com/bobbyni819/vaultlab && cd vaultlab
pip install -e ".[all]"
vaultlab setup            # interactive: API keys, KB path, Obsidian, Google, Outlook

Or, if you only want a piece (citations, lit search, figures):

pip install vaultlab            # core
pip install "vaultlab[research,citations]"   # specific subpackages
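
After either install route, one way to confirm which version pip actually installed is the standard library's `importlib.metadata`. This is a generic Python check, not a vaultlab command; the helper name `installed_version` is our own.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("vaultlab")
print(f"vaultlab: {version or 'not installed'}")
```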

5-minute Hello World

vaultlab demo pbmc3k

In ~2 minutes on a laptop, this:

  1. Downloads the 3k PBMC dataset (50 MB)
  2. Runs QC + normalization + Leiden clustering
  3. Auto-annotates clusters via LLM (with hedged voice and quoted evidence)
  4. Renders 3 publication-quality figures
  5. Builds a 5-slide journal-club deck with speaker notes
  6. Auto-writes a KB summary note linking everything
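
Step 2 names QC and normalization without giving parameters. As a rough picture of what those steps usually mean for count data, here is a dependency-free sketch: drop cells with too few detected genes, scale each cell to a common total, then take log(1 + x). The thresholds (`min_genes=2`, `target_sum=1e4`) and the toy matrix are illustrative assumptions, and nothing here is claimed to match what the demo runs internally.

```python
import math


def qc_filter(counts, min_genes=2):
    """Keep cells (rows) that have at least `min_genes` genes with nonzero counts."""
    return [row for row in counts if sum(1 for c in row if c > 0) >= min_genes]


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x)."""
    out = []
    for row in counts:
        total = sum(row)
        scale = target_sum / total if total else 0.0
        out.append([math.log1p(c * scale) for c in row])
    return out


# Toy cells-by-genes count matrix (made up).
toy = [
    [5, 0, 5],  # two detected genes: passes QC
    [0, 0, 9],  # one detected gene: filtered out
    [1, 1, 2],  # three detected genes: passes QC
]
kept = qc_filter(toy)
normed = normalize_log1p(kept)
print(len(kept), "cells kept")
```

In a real single-cell workflow these steps would normally come from a dedicated library rather than hand-written loops; the point is only to show the arithmetic.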

Use cases (real ones, not benchmarks)

These are the workflows vaultlab handles end to end:

  • "I have a CODEX run. Get me to a labeled figure." Ingest TIFF → segment with Cellpose → cluster → LLM-annotate → publication-ready spatial overlay → caption draft → KB note.
  • "Draft me a Methods paragraph for the lung paper." Reads project KB → drafts → verifies every citation semantically → flags any citation it classifies as HALLUCINATED → produces a draft you edit, not write from scratch.
  • "Find papers using GPR55 in intestinal epithelium." Multi-source lit search (PubMed + bioRxiv + paperclip MCP) → smart query expansion → dedupe → re-rank → KB ingest of top 10 → citation-graph view.
  • "Build me a journal-club deck on Smith et al. 2024." /paper-to-slides 10.1038/... extracts figures from PDF → composes 12-slide deck → auto-drafts speaker notes → exports .pptx.
  • "What's on my calendar this week + which manuscripts are due?" Outlook supplies upcoming meetings, Gmail supplies journal deadlines, and the KB cross-references active manuscripts → integrated daily brief.
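
The "dedupe → re-rank" step in the literature-search workflow can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Hit` and `Merged` types, the relevance scores, and the ranking rule (more corroborating sources first, then best score) are all hypothetical, and the DOIs are placeholders. The source does not describe how vaultlab actually ranks results.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    doi: str
    title: str
    source: str   # e.g. "pubmed" or "biorxiv"
    score: float  # hypothetical relevance score in [0, 1]


@dataclass
class Merged:
    doi: str
    title: str
    sources: set = field(default_factory=set)
    best_score: float = 0.0


def normalize_doi(doi: str) -> str:
    """Lowercase and strip common prefixes so the same DOI compares equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def dedupe_and_rank(hits):
    """Merge hits that share a DOI, then sort by source count and best score."""
    merged = {}
    for h in hits:
        key = normalize_doi(h.doi)
        m = merged.setdefault(key, Merged(doi=key, title=h.title))
        m.sources.add(h.source)
        m.best_score = max(m.best_score, h.score)
    return sorted(
        merged.values(),
        key=lambda m: (len(m.sources), m.best_score),
        reverse=True,
    )


# Placeholder records, not real papers.
hits = [
    Hit("10.1000/example.1", "Paper A", "pubmed", 0.70),
    Hit("https://doi.org/10.1000/EXAMPLE.1", "Paper A", "biorxiv", 0.60),
    Hit("10.1000/example.2", "Paper B", "pubmed", 0.90),
]
ranked = dedupe_and_rank(hits)
for m in ranked:
    print(m.doi, sorted(m.sources), m.best_score)
```

Keying on a normalized DOI is the simplest dedupe strategy; preprints and their published versions carry different DOIs, so a real pipeline would likely need title or author matching as well.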

See docs/use-cases.md for more (post-v0.1).

Architecture philosophy

vaultlab is a capability layer for Claude Code, not a competing harness. Markdown is the user-facing interface; Python is the engine. Slash commands, role prompts, recipes, layouts, and skill definitions are all markdown files that Claude Code can read as soon as the repo is opened.
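
To make "markdown is the interface" concrete, here is a small sketch of how a Python engine could discover commands defined as markdown files. The flat directory layout, the one-file-per-command convention, and the `discover_commands` helper are assumptions made for illustration; the actual layout is specified in docs/architecture.md.

```python
import tempfile
from pathlib import Path


def discover_commands(root: Path) -> dict[str, str]:
    """Map '/<file-stem>' to the first non-empty line of each markdown file."""
    commands = {}
    for path in sorted(root.glob("*.md")):
        lines = [ln.strip() for ln in path.read_text(encoding="utf-8").splitlines()]
        # Use the first non-empty line as a summary, dropping leading '#' marks.
        summary = next((ln.lstrip("# ").strip() for ln in lines if ln), "")
        commands[f"/{path.stem}"] = summary
    return commands


# Build a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "paper-to-slides.md").write_text(
        "# Build a deck from a paper\n\nSteps...\n", encoding="utf-8"
    )
    (root / "notes.txt").write_text("ignored", encoding="utf-8")
    commands = discover_commands(root)

print(commands)
```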

See docs/architecture.md for the full spec.

The four core commitments

  1. Markdown is the interface; Python is the engine. Slash commands, role prompts, workflow descriptions are markdown.
  2. Anti-laziness on semantic reading. Every LLM call requires quoted evidence. No surface-skim.
  3. Result-oriented agentic loop. User says "draft methods" → vaultlab plans + verifies + refines internally → returns finished result.
  4. KB is the smartness. Every analysis writes to KB; every analysis reads from it. The LLM gets smarter project-by-project.
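
Commitment 2 ("every LLM call requires quoted evidence") implies a check that a quoted span really occurs in the source text. A minimal sketch of such a check follows; the function names, the whitespace-insensitive matching rule, and the sample text are our assumptions, not vaultlab's implementation.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wraps do not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def check_quotes(claims, source_text):
    """Return (claim, quote, found) for each claim's quoted evidence."""
    haystack = _normalize(source_text)
    results = []
    for claim, quote in claims:
        needle = _normalize(quote)
        # An empty quote never counts as evidence.
        found = bool(needle) and needle in haystack
        results.append((claim, quote, found))
    return results


# Made-up source passage and claims, for illustration only.
source = """Cluster 3 shows high expression of CD3E and CD3D,
consistent with a T-cell identity."""
claims = [
    ("Cluster 3 is likely T cells", "CD3E and CD3D, consistent with a T-cell identity"),
    ("Cluster 3 is likely B cells", "high expression of MS4A1"),
]
report = check_quotes(claims, source)
for claim, quote, found in report:
    print("SUPPORTED" if found else "UNSUPPORTED", "-", claim)
```

Exact substring matching only catches fabricated quotes. Whether a genuine quote actually supports the claim is a semantic question that this sketch does not attempt.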

What's unique vs PaperQA / scanpy / FutureHouse / scverse / Aider

| Capability | vaultlab | PaperQA2 | scanpy | FutureHouse | scverse | Aider |
|---|---|---|---|---|---|---|
| Wet-lab data analysis | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ |
| Literature + citation verification | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |
| NotebookLM-style evidence retrieval | ✓ | partial | ✗ | ✗ | ✗ | ✗ |
| Manuscript drafting | ✓ | ✗ | ✗ | partial | ✗ | ✗ |
| Slide deck output | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Calendar / inbox context | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Knowledge base (Obsidian) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Local-first | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| Companion mode (not autonomous) | ✓ | partial | n/a | ✗ | n/a | ✓ |
| Claude-Code-native skill bundle | ✓ | ✗ | ✗ | ✗ | ✗ | partial |

No tool does all of these. vaultlab's value is the combination: wet-lab analysis (scverse-grade) + literature verification (PaperQA-grade) + manuscript + slides + life-context (calendar/inbox/docs), wired through Claude Code.

If you only need one piece, those tools are great. If you want a research companion, vaultlab is the only OSS option.

See docs/comparison.md for the full positioning analysis.

Demos

| Demo | Dataset | Time |
|---|---|---|
| examples/pbmc3k | 3k PBMCs (scRNA-seq) | 2 min (Hello World) |
| examples/visium_brain | 10x mouse brain Visium | 30 min (spatial transcriptomics) |
| examples/codex_hubmap_tonsil | HuBMAP tonsil CODEX | 30 min (flagship spatial imaging) |

Documentation

Setup:

Reference:

Privacy & limits:

For contributors:

Citation

See CITATION.cff. Once v0.1.0 ships, the preferred citation is:

@software{ni_vaultlab_2026,
  author  = {Ni, Bobby Y.X.},
  title   = {vaultlab: A research companion for biological scientists},
  year    = {2026},
  url     = {https://github.com/bobbyni819/vaultlab},
  version = {0.1.0}
}

Privacy & compliance

vaultlab uses Anthropic's Claude API. Prompt content is sent to Anthropic. vaultlab is NOT HIPAA-compliant. Do NOT use with PHI/PII/IRB-restricted data. See docs/data-privacy.md.

When you opt into Google or Outlook integration, vaultlab also reads:

  • Google Docs / Sheets / Drive content you authorize
  • Gmail messages matching your search criteria
  • Outlook calendar events + email subjects/bodies

This data may end up in prompts sent to Anthropic. Do not enable Google/Outlook integration if your account contains PHI or institution-restricted data. Each integration has its own scopes you can audit; see docs/data-privacy.md.

By using vaultlab, you take full responsibility for compliance with your institutional, IRB, IACUC, and regulatory obligations.

Author

Bobby Y.X. Ni, Hickey Lab, Duke University Biomedical Engineering.

License

MIT: anyone can use, modify, and distribute vaultlab, including for commercial purposes.
