An MCP server wrapping just-prs — PGS Catalog search and polygenic risk score computation
Project description
just-prs-mcp: Polygenic Risk Scores for AI Agents
An MCP server that gives Claude Code, Cursor, Codex, Antigravity, and other AI agents access to the just-prs toolbox — 5,000+ published polygenic scoring models from the PGS Catalog, VCF normalization, PRS computation, population percentiles, absolute-risk estimation, cross-genome comparison, and quality assessment. Ask an agent in plain language; it calls the right tools and explains the results.
Source: github.com/dna-seq/just-prs-mcp. Built on uv + FastMCP.
Coding agents: start with AGENTS.md.
What Can You Ask an Agent To Do?
"Download Anton's public genome, normalize it, and compute a type 2 diabetes PRS."
"Search the PGS Catalog for breast cancer scores and show the best-performing ones."
"Score both Anton's and Livia's genomes for DVT and intelligence, then compare them."
"Compute PRS for this local VCF and explain the percentile and absolute risk."
"List all genomes I've already normalized and score the latest one for longevity."
What is a PRS?
Many traits and common diseases — type 2 diabetes, coronary artery disease, height, longevity — are polygenic: influenced by thousands of small genetic effects rather than one single gene. A Polygenic Risk Score (PRS) adds those effects together and places the result relative to a reference population. It is not a diagnosis, but it can help visualize inherited predisposition and, where enough evidence exists, translate a percentile into an absolute-risk estimate.
What is MCP?
The Model Context Protocol (MCP) is an open standard that lets AI assistants call external tools. This server exposes PRS computation as MCP tools so any compatible agent — Claude Code, Cursor, Codex, Antigravity, or others — can search the PGS Catalog, normalize genomes, compute scores, and interpret results, all from a natural-language conversation. No genomics expertise required to get started; the agent handles the workflow.
Contents
- What Can You Ask an Agent To Do?
- Quickstart
- Using with Claude, Cursor, Codex, Antigravity, or other agents
- Test Genomes (Quick Play)
- Tools
- Prompts and Resources
- Typical Agent Workflow
- Modes
- Configuration
- Methodology
- Research Use Only
- Deployment
- Project Layout
- License
Quickstart
Requires Python >= 3.13 and uv.
# Install
uv sync # deps (incl. dev)
uv sync --extra reference # + pgenlib for reference/pgen tools (Linux/WSL)
# Run
uv run just-prs-mcp stdio # stdio transport (for MCP clients)
uv run just-prs-mcp stdio --mode extended # expose the full tool surface
uv run just-prs-mcp http # HTTP transport (default :3011)
uv run fastmcp dev fastmcp.json # MCP Inspector (interactive UI)
# Test
uv run pytest # all in-memory, no network needed
uv run ruff check . # lint
uv run pyright # type-check
The server boots with no environment configured — no API keys, no cache directory, no database. Every setting is optional.
Using with Claude, Cursor, Codex, Antigravity, or other agents
Published package — no clone needed
The server is on PyPI, so any MCP
client can launch it with uvx — no clone or
install step needed.
Claude Code:
claude mcp add just-prs -- uvx just-prs-mcp@latest stdio # always newest
claude mcp add just-prs -- uvx just-prs-mcp@0.1.2 stdio # pinned (reproducible)
claude mcp list # → just-prs ... ✔ Connected
Cursor (.cursor/mcp.json in your project or user MCP config):
{
"mcpServers": {
"just-prs": {
"command": "uvx",
"args": ["just-prs-mcp@latest", "stdio"],
"env": { "PRS_MCP_MODE": "essentials" }
}
}
}
Codex (~/.codex/config.toml):
[mcp_servers.just-prs]
command = "uvx"
args = ["just-prs-mcp@latest", "stdio"]
Antigravity or another MCP-capable assistant: add the same server command
in its MCP settings — uvx just-prs-mcp@latest stdio.
Version pinning.
uvxcaches the first version it resolves for a bare name, souvx just-prs-mcpkeeps running that cached build. Use@latestto always fetch the newest, or@<version>to pin. A bare name is the worst of both — avoid it.
Use --mode extended (or PRS_MCP_MODE=extended) to expose the full tool
surface including batch downloads, HuggingFace upload, prevalence priors,
multi-method absolute risk, and reference-panel scoring.
From a clone (development)
The repo's .mcp.json launches the working tree via uv run just-prs-mcp stdio.
Codex equivalent:
[mcp_servers.just-prs]
command = "uv"
args = ["run", "just-prs-mcp", "stdio"]
Test Genomes (Quick Play)
Two public whole-genome sequencing (WGS) datasets from the just-dna-lite project are built in, so you (and your AI agent) can try PRS computation without needing your own genomic data:
| Sample | Zenodo | VCF | Size | License | Tool parameter |
|---|---|---|---|---|---|
| Anton Kulaga | 18370498 | antonkulaga.vcf |
~482 MB | CC0 (public domain) | sample="anton" |
| Livia Zaharia | 19487816 | SIMHIFQTILQ.hard-filtered.vcf.gz |
~349 MB | CC-BY-4.0 | sample="livia" |
Just ask your agent:
"Download Anton's sample genome, normalize it, and compute the PRS for type 2 diabetes."
Under the hood, the agent calls download_sample_genome → normalize_vcf →
compute_prs_by_trait → percentile → absolute_risk.
Tools
Essentials (always available)
| Tool | Description |
|---|---|
search_scores |
Search the PGS Catalog by free text |
score_info |
Cleaned metadata for one PGS ID |
best_performance |
Best evaluation metrics (OR / HR / AUROC / C-index) |
search_traits |
REST trait search with synonym retry |
trait_info |
Trait by EFO / MONDO ID + associated PGS IDs |
list_genomes |
Inventory of downloaded and normalized genomes in the cache |
download_sample_genome |
Fetch a public sample WGS VCF from Zenodo (background task) |
normalize_vcf |
VCF → genotype Parquet (background task) |
compute_prs |
Score one VCF against one PGS model |
compute_prs_batch |
Score one VCF against many PGS models (background task) |
compute_prs_by_trait |
Score all PGS models for a trait, auto-save result to disk (background task) |
percentile |
Population percentile (reference panel / theoretical / AUROC fallback) |
absolute_risk |
Absolute disease risk from a PRS z-score + population prevalence |
assess_quality |
Quality label + interpretation (pure logic, no I/O) |
compare_genomes |
Cross-genome comparison from saved compute_prs_by_trait results |
Extended (opt-in via --mode extended)
| Tool | Description |
|---|---|
normalize_array |
23andMe / AncestryDNA → Parquet (background task) |
download_scoring_file |
One harmonized scoring file from EBI FTP |
list_pgs_ids |
All PGS IDs on EBI FTP |
download_all_metadata |
All metadata sheets as Parquet (background task) |
bulk_download_scores |
Many/all scoring files (background task) |
prevalence_info |
Population prevalence priors for a score or trait |
absolute_risk_bundle |
Multi-method absolute-risk estimation |
push_catalog_to_hf |
Upload cleaned catalog to HuggingFace (needs token) |
download_reference_panel |
Fetch 1000G / HGDP+1kGP panel (background task) |
reference_score / reference_score_batch |
Score against a reference panel (needs pgenlib) |
pgen_read_pvar / pgen_read_psam / pgen_score |
PLINK2 binary ops (needs pgenlib) |
File paths: computation tools take local paths (VCF / normalized Parquet /
.pgendir) on the server's filesystem. Over stdio that's your machine. Reference / pgen tools need the optional nativepgenlib(Linux/WSL —uv sync --extra reference); without it they return a clear install hint.
Prompts and Resources
MCP prompts are reusable prompt templates that agents can invoke to structure their interpretation of results:
| Prompt | Description |
|---|---|
compute_prs_for_trait |
Step-by-step workflow: search → normalize → score → interpret |
interpret_prs_result |
Interpret a single PRS result (verdict, key numbers, context, actions) |
interpret_trait_results |
Interpret combined results across multiple models for one trait |
Resource: resource://prs/panels — lists available reference panels, supported
genome builds, and the active cache directory.
Typical Agent Workflow
A full PRS analysis through the MCP server follows this chain:
1. search_traits("venous thromboembolism") → find trait ID (EFO_0001645)
2. download_sample_genome(sample="anton") → download VCF
3. normalize_vcf(vcf_path) → VCF → Parquet
4. compute_prs_by_trait(trait_id, vcf_path) → score all PGS models, auto-save JSON
5. percentile(prs_score, pgs_id) → population percentile + z-score
6. absolute_risk(pgs_id, z_score) → lifetime probability + risk ratio
7. assess_quality(match_rate, auroc, percentile) → quality label
For cross-genome comparison, repeat steps 2–6 for each genome, then:
8. compare_genomes(result_paths=[...]) → ranked comparison across genomes
compute_prs_by_trait auto-saves each result as JSON in the cache directory and
returns the file path in result_path. Pass those paths to compare_genomes to
get per-trait rankings sorted by percentile (high → low, no directionality
judgment — the agent interprets whether high is good or bad for each trait),
percentile spread, model consistency, and the most divergent traits highlighted.
Modes
PRS_MCP_MODE (env) or --mode (CLI), default essentials:
| Mode | What's registered |
|---|---|
essentials |
Catalog lookup + core compute/analyze workflow + genome comparison. Small default tool list = less context pollution for the agent. |
extended |
Everything: batch downloads, HuggingFace upload, prevalence priors, multi-method absolute risk, reference-panel / pgen scoring. |
Configuration
All settings are optional — the server boots with sensible defaults.
See .env.example and
settings.py for the full list.
| Variable | Description |
|---|---|
PRS_MCP_MODE |
essentials (default) or extended |
PRS_MCP_CACHE_DIR |
Root for cached catalog data, scoring files, reference panels, and saved results. Defaults to just-prs's own (PRS_CACHE_DIR / platformdirs). |
PRS_MCP_DEFAULT_GENOME_BUILD |
Default genome build (GRCh38) |
PRS_MCP_DEFAULT_PANEL |
Default reference panel (1000g) |
PRS_MCP_DUCKDB_MEMORY_LIMIT |
DuckDB memory limit for batch scoring (e.g. 8GB) |
PRS_MCP_HF_TOKEN |
HuggingFace token for push_catalog_to_hf (also honors HF_TOKEN) |
PRS_MCP_TRANSPORT |
stdio / http / sse |
PRS_MCP_HOST / PRS_MCP_PORT |
Bind address for HTTP/SSE (default 0.0.0.0:3011) |
PRS_MCP_LOG_LEVEL |
Logging level (info by default) |
Methodology
Percentile estimation
Percentiles are computed by scoring the 1000 Genomes Project phase 3
reference panel (2,504 individuals, 5 superpopulations: AFR, AMR, EAS, EUR, SAS)
on GRCh38 harmonized scoring files from the PGS Catalog. Each individual's PRS
is computed as Σ(effect_weight × dosage) for matched variants, then percentiles
are derived per superpopulation. The user's VCF is scored with the same engine
and placed on this distribution.
Quality scoring
Each PGS model gets a synthetic quality score (0–100) based on four tiers:
- T1a: AUROC / C-index reported (strongest evidence)
- T1b: Beta only (0.95× penalty)
- T2: OR / HR only (0.90× penalty; converted via probit transform)
- T3: No performance metric (0.6× floor)
The score also factors cohort size (log-scaled), model coverage, and a harmonized-score penalty if coordinates were lifted over. Quality labels: High (≥70), Normal (≥50), Moderate (≥30), Low (<30).
Absolute risk
For disease traits, absolute_risk converts a PRS z-score into a concrete
lifetime probability and risk ratio vs the population average, using trait
prevalence and published effect-size data. A risk_ratio of 1.0 means
population-average risk; >1 means elevated; <1 means reduced. When prevalence
data is unavailable, the tool raises an error — the agent should disclose this
explicitly.
Interpreting results
The server's built-in instructions guide connected agents to:
- Present PRS as genetic predisposition, not a measurement of the trait itself.
- Always call
absolute_riskafterpercentilefor disease traits. - Respect trait directionality: higher percentile = more risk for disease traits (bad), more of the trait for positive traits (good), meaningless for neutral traits.
- Flag ancestry mismatches, low coverage, and model disagreement.
- Cite PGS IDs with links to the PGS Catalog.
For a thorough discussion of PRS interpretation, quality methodology, ancestry considerations, and common questions, see the just-prs documentation.
Research Use Only
PRS results from this server are for research and educational purposes only and do not constitute medical advice. Key caveats:
- PRS models are statistical proxies, not causal readouts. Most GWAS variants are tag SNPs in linkage disequilibrium with causal loci, not the causal variants themselves.
- Many published scores have limited validation, narrow ancestry representation, or modest predictive power. Being listed in the PGS Catalog does not mean a score is clinically ready.
- A high PRS shifts estimated risk relative to a reference population, but environment, lifestyle, age, sex, and clinical biomarkers often matter as much as or more than the common-variant signal.
- Low match rates (common with microarray-based consumer tests) mean the score used only a fragment of the model — noisier and less informative.
- Ancestry matters: scores trained in one population often lose accuracy in another due to differing LD patterns and allele frequencies.
A high PRS is not a diagnosis; a low PRS is not a guarantee.
Deployment
- Docker:
docker build -t just-prs-mcp . && docker run -p 3011:3011 just-prs-mcp(defaults to HTTP). - Smithery:
uv sync --extra smithery; entrypoint inpyproject.toml[tool.smithery]+smithery.yaml. - Declarative:
fastmcp.jsondrivesfastmcp run/fastmcp dev.
Project Layout
src/just_prs_mcp/
server.py build_server(), CLI, graceful shutdown, Smithery entrypoint
settings.py pydantic-settings (PRS_MCP_*), safe defaults
client.py shared PRSCatalog / REST-client construction + adapters
models.py Pydantic tool I/O models (+ reused just-prs models)
logging_setup.py stdlib logging → stderr
tools/
catalog.py essentials — PGS Catalog search and lookup
compute.py essentials — normalize, compute, analyze, compare
extended.py extended — batch downloads, HF upload, prevalence, multi-risk
reference.py extended — reference-panel / pgen scoring (pgenlib)
tests/ in-memory client tests (wiring + logic, no network)
License
MIT — see LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file just_prs_mcp-0.1.3.tar.gz.
File metadata
- Download URL: just_prs_mcp-0.1.3.tar.gz
- Upload date:
- Size: 45.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"26.04","id":"resolute","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0594f901e9b9077c116e95deec5d0af5cdc74f8b3b2e99cdddef23b99bd4df6c
|
|
| MD5 |
3e02e40097227e220b4561ffbc92b790
|
|
| BLAKE2b-256 |
f1c352c0dcd4031180eeb84009bce7ab27b0a6ee77a4bef9a752f62b7f959cdf
|
File details
Details for the file just_prs_mcp-0.1.3-py3-none-any.whl.
File metadata
- Download URL: just_prs_mcp-0.1.3-py3-none-any.whl
- Upload date:
- Size: 52.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"26.04","id":"resolute","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8183c7abfeb652845d54ff008c59a1a43c1bf2b79d3708a05a689cc56cadcb83
|
|
| MD5 |
010512914f044a4d03aef135cb2d1ff8
|
|
| BLAKE2b-256 |
058941ad84de44d4462e15cb49277d78fdc8596ed1ec1586b4c3847bcf9b0f73
|