MCP server for the Genomic Intelligence DNA-analysis API: promoter, splice, enhancer, chromatin, expression, annotation + a composite annotation→expression workflow, over MCP stdio.
gi-mcp
Status: ALPHA (0.1.0a10). Provided for development, testing, and internal use. The task set, outputs, and tool surface may change without notice, and availability and results are not guaranteed. Not for production systems or clinical/diagnostic decisions.
An MCP server for the Genomic Intelligence DNA-analysis API. It exposes the six inference tasks (promoter, splice, enhancer, chromatin, expression, annotation), a composite annotation→expression workflow, sequence acquisition (Ensembl + local FASTA), reference resources, and ready-made prompt workflows — all over MCP stdio.
It runs locally as a thin protocol translator: it owns no inference and stores no key, forwarding each request to your Genomic Intelligence backend under your API key.
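To make the "thin translator" idea concrete, here is a minimal sketch of how a request could be forwarded under your key. It is not the server's actual code: `build_backend_request` and the example path are hypothetical, and only `GI_API_KEY`, `GI_BASE_URL`, and the bearer scheme come from this README.

```python
import json
import os
import urllib.request

# Documented default for GI_BASE_URL (see the Environment table).
DEFAULT_BASE_URL = "https://api.genomicintelligence.ai"


def build_backend_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to the backend.

    Hypothetical helper: the real endpoint paths and payload shapes are
    defined by the backend's /v1 contract, not by this sketch.
    """
    # The key is read from the environment on every call and never persisted.
    api_key = os.environ.get("GI_API_KEY")
    if not api_key:
        raise RuntimeError("GI_API_KEY is not set")
    base_url = os.environ.get("GI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return urllib.request.Request(
        url=f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the point of the sketch is only that the key travels with each request and lives nowhere else.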
Install
gi-mcp runs with uv (uvx fetches and runs it
with no manual install or virtualenv):
uvx gi-mcp # recommended (once published to PyPI)
pip install gi-mcp # alternative
Install uv with curl -LsSf https://astral.sh/uv/install.sh | sh (macOS/Linux)
or see the uv docs. For local development from this
monorepo:
cd mcp-server
python3 -m venv .venv && .venv/bin/pip install -e ".[dev]"
Configure your MCP client
Request a partner key from https://genomicintelligence.ai/contact, then add a
server block. In every client the pattern is the same: run uvx gi-mcp with
GI_API_KEY in the environment.
Claude Desktop — edit claude_desktop_config.json (macOS:
~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\), then
fully quit and reopen:
{
  "mcpServers": {
    "genomic-intelligence": {
      "command": "uvx",
      "args": ["gi-mcp"],
      "env": { "GI_API_KEY": "gi_..." }
    }
  }
}
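If you would rather script the edit than hand-merge JSON, a small sketch like the following adds the server block while leaving other configured servers untouched. `add_gi_server` is a hypothetical helper, not part of gi-mcp; the block it writes is the one shown above.

```python
import json
from pathlib import Path


def add_gi_server(config_path: Path, api_key: str) -> dict:
    """Merge a genomic-intelligence entry into an MCP client config file."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        config = json.loads(text) if text.strip() else {}
    # Keep any servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["genomic-intelligence"] = {
        "command": "uvx",
        "args": ["gi-mcp"],
        "env": {"GI_API_KEY": api_key},
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS the call would look like `add_gi_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "gi_...")`; quit and reopen the client afterwards.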
Claude Code (CLI):
claude mcp add genomic-intelligence --env GI_API_KEY=gi_... -- uvx gi-mcp
Cursor / Windsurf / Zed use the same JSON shape as Claude Desktop in their
own MCP config files. Your own agent: point any MCP-capable client at the
stdio command uvx gi-mcp with GI_API_KEY set — it speaks standard MCP.
From a local checkout, point command at the venv entry point instead of
uvx:
{ "command": "/abs/path/to/mcp-server/.venv/bin/gi-mcp", "env": { "GI_API_KEY": "gi_..." } }
Notes:
- Verify: your client should list the genomic-intelligence server with 16 tools. Ask "List the available promoter models" → runs list_models.
- Your key, your quota: calls count against your partner key's rate / concurrency caps. The server stores no key and owns no inference.
- Updating / pinning: uvx --refresh gi-mcp picks up new releases; pin with "args": ["gi-mcp@0.1.0a10"].
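The tool-count check can also be done without a chat client. The sketch below speaks raw MCP over stdio (newline-delimited JSON-RPC: `initialize`, then `notifications/initialized`, then `tools/list`). It assumes the server accepts the protocol revision named in `PROTOCOL_VERSION` and returns all tools in one page; the client name is made up, and `GI_API_KEY` must already be set in the environment the child process inherits.

```python
import json
import subprocess

# One published MCP protocol revision; assumed acceptable to the server.
PROTOCOL_VERSION = "2024-11-05"

INITIALIZE = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {},
        "clientInfo": {"name": "gi-smoke-test", "version": "0.0.0"},  # hypothetical
    },
}
INITIALIZED = {"jsonrpc": "2.0", "method": "notifications/initialized"}
LIST_TOOLS = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}


def read_reply(lines, request_id):
    """Scan newline-delimited JSON-RPC messages for the reply to request_id."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        # Replies carry an id but no method; skip notifications and requests.
        if message.get("id") == request_id and "method" not in message:
            if "error" in message:
                raise RuntimeError(f"server error: {message['error']}")
            return message["result"]
    raise RuntimeError("server closed the stream before replying")


def list_tool_names(command):
    """Spawn an MCP stdio server and return the names from tools/list."""
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )

    def send(message):
        proc.stdin.write(json.dumps(message) + "\n")
        proc.stdin.flush()

    try:
        send(INITIALIZE)
        read_reply(proc.stdout, 1)  # wait for the handshake before going on
        send(INITIALIZED)
        send(LIST_TOOLS)
        result = read_reply(proc.stdout, 2)
        return [tool["name"] for tool in result["tools"]]
    finally:
        proc.terminate()
        proc.wait()
```

Calling `list_tool_names(["uvx", "gi-mcp"])` and taking `len()` of the result should match the count quoted above if the assumptions hold.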
The handle pattern (why large sequences don't bloat your context)
Genomic sequences are big — the expression model alone wants 9,198 bp, and promoter accepts up to 500,000. Round-tripping those through the LLM twice (once as a tool result, once as the next tool's argument) is wasteful and blows the context window.
So acquisition and prediction are split:
- An acquisition tool fetches/loads a sequence, stores it server-side, and returns a short handle (seq_ab12cd34) plus light metadata, never the bases.
- A prediction tool takes that sequence_ref. The server resolves the bases internally.
fetch_ensembl_sequence(gene="TP53")
→ { ref: "seq_ab12cd34", name: "TP53", length: 19148, preview: "CACC…GGTG" }
predict_promoter(sequence_ref="seq_ab12cd34")
→ { data: { regions: [...] }, meta: {...} } # 19 kb never hit the LLM
Small sequences can still be passed inline via sequence. Every predict tool
accepts sequence xor sequence_ref. The handle lives server-side, so a
remote client (e.g. ChatGPT) that opens a fresh session per tool call still
resolves the fetched sequence across the fetch→predict steps.
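A minimal sketch of the pattern, for readers who want to see the moving parts. This is an in-memory stand-in, not the server's storage layer: `SequenceStore` and its method names are hypothetical, and only the handle shape, the metadata fields, and the `sequence` xor `sequence_ref` rule come from the text above.

```python
import secrets


class SequenceStore:
    """In-memory stand-in for the server-side handle cache."""

    def __init__(self):
        self._sequences = {}

    def put(self, bases: str, name: str) -> dict:
        """Store the bases and return only a handle plus light metadata."""
        ref = f"seq_{secrets.token_hex(4)}"  # 8 hex chars, shaped like seq_ab12cd34
        self._sequences[ref] = bases
        preview = bases if len(bases) <= 8 else f"{bases[:4]}…{bases[-4:]}"
        return {"ref": ref, "name": name, "length": len(bases), "preview": preview}

    def resolve(self, sequence=None, sequence_ref=None) -> str:
        """Return the bases for a predict call: exactly one argument allowed."""
        if (sequence is None) == (sequence_ref is None):
            raise ValueError("pass exactly one of sequence or sequence_ref")
        if sequence is not None:
            return sequence  # small inline sequence, used as given
        try:
            return self._sequences[sequence_ref]
        except KeyError:
            raise ValueError(f"unknown sequence_ref: {sequence_ref}") from None
```

The acquisition side calls `put` and hands the returned dict to the model; the prediction side calls `resolve`, so the full sequence only ever moves between the store and the backend request.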
Try it
Copy-paste these natural-language prompts into a freshly-connected client to smoke-test the live connection end-to-end — one per inference task plus the composite workflow, ordered from simplest (no inference) to full fetch→predict chains.
| # | Task | Prompt | Expected behavior |
|---|---|---|---|
| 0 | warm-up (no inference) | "What Genomic Intelligence models are available for the expression task?" | Calls list_models; returns the model registry. No inference. |
| 1 | promoter | "Fetch human TP53 from Ensembl and scan it for promoter regions — show me the strongest hits." | Fetch → predict_promoter; ranked promoter windows. |
| 2 | expression | "Fetch HBB prepared for expression and predict its expression in K562 cells." | fetch_gene_for_expression → predict_expression. Cell-type-specific — omit the cell type and it returns invalid_input asking for it. |
| 3 | splice | "Fetch the human HBB gene sequence and predict its splice sites." | Fetch → predict_splice; donor/acceptor sites. |
| 4 | enhancer | "Fetch human GATA1 and predict enhancer regions." | Fetch → predict_enhancer. |
| 5 | chromatin | "Fetch human SOX2 and predict chromatin accessibility." | Fetch → predict_chromatin. |
| 6 | annotation | "Find the genes in chr8:127,680,000–127,800,000." | fetch_region → find_genes; transcript intervals (plus-strand). |
| 7 | composite | "Find the genes in chr8:127,680,000–127,800,000 and predict each one's expression in K562." | find_genes_and_predict_expression — annotation→expression in one call. |
No API key handy? "List the bundled demo sequences, then load the K562
expression one and predict its expression" runs load_demo_sequence → no
Ensembl fetch, no personal quota. The server ships one curated positive control
per task (including annotation_hbb_chr11, a plus-strand HBB locus the
gene-finder recovers centred on the TSS), so every task above is smoke-testable
this way.
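The region prompts in the table use a human-readable form (chr8:127,680,000–127,800,000) that has to become a chromosome and two integers before anything can be fetched. How fetch_region actually takes its arguments is not documented here, so treat the following as an illustrative parser only; `parse_region` is a hypothetical name, and the 1-based inclusive convention is an assumption borrowed from Ensembl-style coordinates.

```python
import re

# Accepts an optional "chr" prefix, thousands separators, and either a
# hyphen or an en dash between start and end.
_REGION = re.compile(r"^(?:chr)?([0-9A-Za-z_.]+):([\d,]+)\s*[-–]\s*([\d,]+)$")


def parse_region(text: str) -> tuple:
    """Parse 'chr8:127,680,000–127,800,000' into (chrom, start, end)."""
    match = _REGION.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    chrom, start_text, end_text = match.groups()
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {text!r}")
    return chrom, start, end


def region_length(start: int, end: int) -> int:
    """Length under the assumed 1-based, inclusive convention."""
    return end - start + 1
```

Checking the length up front is handy because each task has its own length bounds (see gi://docs/tasks).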
Going further — the server also works as an analytical assistant that chains tools and reasons over results:
- "Compare the regulatory landscape of HBB versus HBA1: which has stronger promoter signal and higher predicted expression in K562?"
- "Run promoter prediction on TP53 with every available model and give me a diff table of where they disagree." (list_models + multi-model fan-out)
- "Annotate chr8:127,680,000–127,800,000. If it's slow, hand me a job ID and I'll check back." (async find_genes → get_job)
- "Characterize GATA1: run every applicable task and assemble a one-page report for a wet-lab audience."
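The async prompt in the list above (find_genes, then get_job) boils down to a poll loop with a ceiling. Here is a hedged sketch of that loop: `wait_for_job`, the `fetch_status` callback, and the `"succeeded"` / `"failed"` state names are all hypothetical, while the 240 s and 2 s defaults mirror GI_ASYNC_TIMEOUT and GI_ASYNC_POLL_INTERVAL from the Environment section.

```python
import time

# Hypothetical terminal states; the real job schema may use other names.
TERMINAL_STATES = ("succeeded", "failed")


def wait_for_job(
    fetch_status,
    job_id,
    timeout_s=240.0,
    poll_interval_s=2.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll fetch_status(job_id) until the job finishes or the ceiling passes."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(job_id)
        if status["state"] in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout_s}s")
        sleep(poll_interval_s)
```

`sleep` and `clock` are injectable so the loop can be exercised in tests without real waiting; in normal use the defaults apply and `fetch_status` would wrap a get_job call.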
What's exposed
Tools
| Group | Tools |
|---|---|
| Acquisition | fetch_ensembl_sequence, fetch_region, fetch_gene_for_expression, load_local_fasta, store_inline_sequence, load_demo_sequence |
| Prediction (sync) | predict_promoter, predict_splice, predict_enhancer, predict_chromatin, predict_expression |
| Prediction (async) | find_genes |
| Workflow | find_genes_and_predict_expression |
| Jobs | get_job, list_jobs |
| Catalog | list_models |
Resources
| URI | Contents |
|---|---|
| gi://models | Full model catalog across all six tasks |
| gi://models/{model_id} | Bio spec for one model |
| gi://docs/tasks | What each task does, sync/async, length bounds |
| gi://openapi.json | Live /v1 OpenAPI contract |
| gi://sequences | Bundled demo references (one positive control per task); load one with load_demo_sequence |
| gi://cache/{ref} | Inspect a stored sequence handle |
| gi://jobs/recent | The caller's recent async jobs |
| gi://account | Configured backend + health + exposed tasks |
Prompts (slash-command workflows)
| Prompt | What it does |
|---|---|
| gi-promoter-screen <gene> | Fetch a gene → scan for promoters → report strong hits |
| gi-expression-screen <gene> | TSS window → predict expression → summarise across tissues |
Roots
load_local_fasta only reads files inside directories the user granted via MCP
roots. Set GI_FASTA_ALLOW_ANY=1 to bypass on hosts that don't implement roots.
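The gating rule can be pictured as a path-containment check. The sketch below is an illustration under stated assumptions, not the server's implementation: `is_fasta_allowed` is a hypothetical name, roots are taken as plain directory paths, and only the GI_FASTA_ALLOW_ANY=1 bypass comes from this README. It needs Python 3.9+ for `Path.is_relative_to`.

```python
import os
from pathlib import Path


def is_fasta_allowed(path, roots, env=None) -> bool:
    """Return True if path sits inside a granted root, or the bypass is set."""
    env = os.environ if env is None else env
    if env.get("GI_FASTA_ALLOW_ANY") == "1":
        return True  # explicit opt-out for hosts without roots support
    # Resolve first so ".." segments and symlinks cannot escape a root.
    target = Path(path).resolve()
    return any(target.is_relative_to(Path(root).resolve()) for root in roots)
```

Resolving both sides before comparing is the important design choice: a purely textual prefix check would accept something like `/granted/../elsewhere/file.fa`.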
Environment
| Var | Default | Purpose |
|---|---|---|
| GI_API_KEY | none (required) | Partner bearer key (gi_…) |
| GI_BASE_URL | https://api.genomicintelligence.ai | Override for staging / local backend |
| GI_ENSEMBL_URL | https://rest.ensembl.org | Ensembl REST base |
| GI_FASTA_ALLOW_ANY | unset | 1 to skip roots gating for local FASTA |
| GI_ASYNC_TIMEOUT | 240 | Ceiling (s) a wait=True slow task blocks while streaming progress before returning a timeout error (never a job_id) |
| GI_ASYNC_POLL_INTERVAL | 2 | Poll cadence (s) for the block-and-stream loop |
| GI_HTTP_TIMEOUT | 300 | Per-request read timeout (s) for /v1 calls |
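For scripts that wrap or test the server, it can help to read these variables in one place. The following is a sketch of such a loader, assuming the names and defaults from the table; `Settings` and `load_settings` are hypothetical and not part of gi-mcp's public surface.

```python
import os
from dataclasses import dataclass

# Defaults copied from the Environment table.
DEFAULT_BASE_URL = "https://api.genomicintelligence.ai"
DEFAULT_ENSEMBL_URL = "https://rest.ensembl.org"


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str
    ensembl_url: str
    fasta_allow_any: bool
    async_timeout_s: float
    async_poll_interval_s: float
    http_timeout_s: float


def load_settings(env=None) -> Settings:
    """Read GI_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("GI_API_KEY", "")
    if not api_key:
        raise RuntimeError("GI_API_KEY is required")
    return Settings(
        api_key=api_key,
        base_url=env.get("GI_BASE_URL", DEFAULT_BASE_URL),
        ensembl_url=env.get("GI_ENSEMBL_URL", DEFAULT_ENSEMBL_URL),
        fasta_allow_any=env.get("GI_FASTA_ALLOW_ANY") == "1",
        async_timeout_s=float(env.get("GI_ASYNC_TIMEOUT", "240")),
        async_poll_interval_s=float(env.get("GI_ASYNC_POLL_INTERVAL", "2")),
        http_timeout_s=float(env.get("GI_HTTP_TIMEOUT", "300")),
    )
```

Note that the default HTTP read timeout (300 s) is longer than the async ceiling (240 s), so with defaults a blocked wait=True call reports its own timeout before the transport gives up.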
Development & contributing
Setup, conventions, the architecture walkthrough, and the release process live
in CONTRIBUTING.md. Quick start:
.venv/bin/pytest # full suite (mocked; no network)
Release history is in CHANGELOG.md.