
MCP server for the Genomic Intelligence DNA-analysis API: promoter, splice, enhancer, chromatin, expression, annotation + a composite annotation→expression workflow, over MCP stdio.


gi-mcp

Status: ALPHA (0.1.0a11). Provided for development, testing, and internal use. The task set, outputs, and tool surface may change without notice, and availability and results are not guaranteed. Not for production systems or clinical/diagnostic decisions.

An MCP server for the Genomic Intelligence DNA-analysis API. It exposes the six inference tasks (promoter, splice, enhancer, chromatin, expression, annotation), a composite annotation→expression workflow, sequence acquisition (Ensembl + local FASTA), reference resources, and ready-made prompt workflows — all over MCP stdio.

It runs locally as a thin protocol translator: it owns no inference and stores no key, forwarding each request to your Genomic Intelligence backend under your API key.

Install

gi-mcp runs with uv (uvx fetches and runs it with no manual install or virtualenv):

uvx gi-mcp                # recommended (once published to PyPI)
pip install gi-mcp        # alternative

Install uv with curl -LsSf https://astral.sh/uv/install.sh | sh (macOS/Linux) or see the uv docs. For local development from this monorepo:

cd mcp-server
python3 -m venv .venv && .venv/bin/pip install -e ".[dev]"

Configure your MCP client

Request a partner key from https://genomicintelligence.ai/contact, then add a server block. In every client the pattern is the same: run uvx gi-mcp with GI_API_KEY in the environment.

Claude Desktop — edit claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\), then fully quit and reopen:

{
  "mcpServers": {
    "genomic-intelligence": {
      "command": "uvx",
      "args": ["gi-mcp"],
      "env": { "GI_API_KEY": "gi_..." }
    }
  }
}
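If the config file already lists other servers, the block has to be merged in rather than pasted over them. A minimal sketch of that merge follows; the helper names `add_server` and `update_config_file` are hypothetical and not part of gi-mcp, and the file path is whatever your client uses.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, api_key: str) -> dict:
    """Return a copy of `config` with the gi-mcp server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers[name] = {
        "command": "uvx",
        "args": ["gi-mcp"],
        "env": {"GI_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, api_key: str) -> None:
    """Read the JSON config at `path` (if present), merge the entry, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(existing, name, api_key)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_config_file(Path("claude_desktop_config.json"), "genomic-intelligence", "gi_...")` leaves unrelated keys and servers untouched.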

Claude Code (CLI):

claude mcp add genomic-intelligence --env GI_API_KEY=gi_... -- uvx gi-mcp

Cursor / Windsurf / Zed use the same JSON shape as Claude Desktop in their own MCP config files.

Your own agent: point any MCP-capable client at the stdio command uvx gi-mcp with GI_API_KEY set; it speaks standard MCP.
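For a hand-rolled agent, "standard MCP over stdio" means newline-delimited JSON-RPC 2.0 messages on the server's stdin and stdout. The sketch below uses only the Python standard library to spawn the server, complete the initialize handshake, and list tool names. It is an illustration rather than a recommended client: the protocol version string and client name are assumptions, error replies are not handled, and a real agent would normally use an MCP SDK.

```python
import json
import os
import subprocess
from typing import Optional

# Assumption: any protocol revision supported by both sides would work here.
PROTOCOL_VERSION = "2025-06-18"


def jsonrpc(method: str, params: Optional[dict] = None,
            msg_id: Optional[int] = None) -> str:
    """Serialize one JSON-RPC 2.0 message as a single line (stdio framing)."""
    msg: dict = {"jsonrpc": "2.0", "method": method}
    if msg_id is not None:  # requests carry an id; notifications do not
        msg["id"] = msg_id
    if params is not None:
        msg["params"] = params
    return json.dumps(msg) + "\n"


def read_reply(stream, msg_id: int) -> dict:
    """Read lines until the response with the given id arrives."""
    for raw in stream:
        if not raw.strip():
            continue
        msg = json.loads(raw)
        # Skip server-initiated notifications and requests.
        if msg.get("id") == msg_id and "method" not in msg:
            return msg
    raise RuntimeError("server closed stdout before replying")


def list_tool_names(api_key: str) -> list:
    """Spawn `uvx gi-mcp`, run the handshake, and return the tool names."""
    env = {**os.environ, "GI_API_KEY": api_key}
    proc = subprocess.Popen(
        ["uvx", "gi-mcp"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, env=env,
    )
    try:
        proc.stdin.write(jsonrpc("initialize", {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        }, msg_id=1))
        proc.stdin.flush()
        read_reply(proc.stdout, 1)  # wait for the initialize result
        proc.stdin.write(jsonrpc("notifications/initialized"))
        proc.stdin.write(jsonrpc("tools/list", {}, msg_id=2))
        proc.stdin.flush()
        reply = read_reply(proc.stdout, 2)
        return [tool["name"] for tool in reply["result"]["tools"]]
    finally:
        proc.terminate()
```

`list_tool_names("gi_...")` requires uv on the PATH and a valid key; nothing runs at import time.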

From a local checkout, point command at the venv entry point instead of uvx:

{ "command": "/abs/path/to/mcp-server/.venv/bin/gi-mcp", "env": { "GI_API_KEY": "gi_..." } }

Notes:

  • Verify: your client should list the genomic-intelligence server with 16 tools. Ask "List the available promoter models" → runs list_models.
  • Your key, your quota: calls count against your partner key's rate / concurrency caps. The server stores no key and owns no inference.
  • Updating / pinning: uvx --refresh gi-mcp picks up new releases; pin with "args": ["gi-mcp@0.1.0a11"].

The handle pattern (why large sequences don't bloat your context)

Genomic sequences are big: the expression model alone expects 9,198 bp, and promoter accepts up to 500,000 bp. Round-tripping those through the LLM twice (once as a tool result, once as the next tool's argument) wastes tokens and can overflow the context window.

So acquisition and prediction are split:

  1. An acquisition tool fetches/loads a sequence, stores it server-side, and returns a short handle (seq_ab12cd34) plus light metadata — never the bases.
  2. A prediction tool takes that sequence_ref. The server resolves the bases internally.

fetch_ensembl_sequence(gene="TP53")
  → { ref: "seq_ab12cd34", name: "TP53", length: 19148, preview: "CACC…GGTG" }

predict_promoter(sequence_ref="seq_ab12cd34")
  → { data: { regions: [...] }, meta: {...} }     # 19 kb never hit the LLM

Small sequences can still be passed inline via sequence. Every predict tool accepts sequence xor sequence_ref. The handle lives server-side, so a remote client (e.g. ChatGPT) that opens a fresh session per tool call still resolves the fetched sequence across the fetch→predict steps.
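The handle pattern can be illustrated with a small in-memory store. This is a sketch under stated assumptions, not the server's actual implementation: the names `store_sequence` and `resolve_input` are hypothetical, the handle is assumed to be `seq_` plus eight hex characters (matching the example above), and the preview is assumed to be the first and last four bases.

```python
import secrets

# Hypothetical in-process store; the real server's storage is not documented here.
_STORE: dict = {}


def store_sequence(bases: str, name: str = "") -> dict:
    """Keep the bases server-side and return only a handle plus light metadata."""
    ref = "seq_" + secrets.token_hex(4)  # 8 hex chars, e.g. seq_ab12cd34
    _STORE[ref] = bases
    preview = bases if len(bases) <= 8 else f"{bases[:4]}…{bases[-4:]}"
    return {"ref": ref, "name": name, "length": len(bases), "preview": preview}


def resolve_input(sequence=None, sequence_ref=None) -> str:
    """Enforce `sequence` xor `sequence_ref` and return the actual bases."""
    if (sequence is None) == (sequence_ref is None):
        raise ValueError("pass exactly one of sequence or sequence_ref")
    if sequence is not None:
        return sequence
    try:
        return _STORE[sequence_ref]
    except KeyError:
        raise ValueError(f"unknown sequence handle: {sequence_ref}") from None
```

Only the small metadata dict returned by `store_sequence` would travel through the model; a prediction tool would call `resolve_input` internally before forwarding the bases to the backend.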

Try it

Copy-paste these natural-language prompts into a freshly-connected client to smoke-test the live connection end-to-end — one per inference task plus the composite workflow, ordered from simplest (no inference) to full fetch→predict chains.

| # | Task | Prompt | Expected behavior |
|---|---|---|---|
| 0 | warm-up (no inference) | "What Genomic Intelligence models are available for the expression task?" | Calls list_models; returns the model registry. No inference. |
| 1 | promoter | "Fetch human TP53 from Ensembl and scan it for promoter regions — show me the strongest hits." | Fetch → predict_promoter; ranked promoter windows. |
| 2 | expression | "Fetch HBB prepared for expression and predict its expression in K562 cells." | fetch_gene_for_expression → predict_expression. Cell-type-specific: omit the cell type and it returns invalid_input asking for it. |
| 3 | splice | "Fetch the human HBB gene sequence and predict its splice sites." | Fetch → predict_splice; donor/acceptor sites. |
| 4 | enhancer | "Fetch human GATA1 and predict enhancer regions." | Fetch → predict_enhancer. |
| 5 | chromatin | "Fetch human SOX2 and predict chromatin accessibility." | Fetch → predict_chromatin. |
| 6 | annotation | "Find the genes in chr8:127,680,000–127,800,000." | fetch_region → find_genes; transcript intervals (plus-strand). |
| 7 | composite | "Find the genes in chr8:127,680,000–127,800,000 and predict each one's expression in K562." | find_genes_and_predict_expression: annotation→expression in one call. |

No API key handy? "List the bundled demo sequences, then load the K562 expression one and predict its expression" runs load_demo_sequence → no Ensembl fetch, no personal quota. The server ships one curated positive control per task (including annotation_hbb_chr11, a plus-strand HBB locus the gene-finder recovers centred on the TSS), so every task above is smoke-testable this way.

Going further — the server also works as an analytical assistant that chains tools and reasons over results:

  • "Compare the regulatory landscape of HBB versus HBA1 — which has stronger promoter signal and higher predicted expression in K562?"
  • "Run promoter prediction on TP53 with every available model and give me a diff table of where they disagree." (list_models + multi-model fan-out)
  • "Annotate chr8:127,680,000–127,800,000 — if it's slow, hand me a job ID and I'll check back." (async find_genes → get_job)
  • "Characterize GATA1: run every applicable task and assemble a one-page report for a wet-lab audience."

What's exposed

Tools

| Group | Tools |
|---|---|
| Acquisition | fetch_ensembl_sequence, fetch_region, fetch_gene_for_expression, load_local_fasta, store_inline_sequence, load_demo_sequence |
| Prediction (sync) | predict_promoter, predict_splice, predict_enhancer, predict_chromatin, predict_expression |
| Prediction (async) | find_genes |
| Workflow | find_genes_and_predict_expression |
| Jobs | get_job, list_jobs |
| Catalog | list_models |

Resources

| URI | Contents |
|---|---|
| gi://models | Full model catalog across all six tasks |
| gi://models/{model_id} | Bio spec for one model |
| gi://docs/tasks | What each task does, sync/async, length bounds |
| gi://openapi.json | Live /v1 OpenAPI contract |
| gi://sequences | Bundled demo references (one positive control per task); load one with load_demo_sequence |
| gi://cache/{ref} | Inspect a stored sequence handle |
| gi://jobs/recent | The caller's recent async jobs |
| gi://account | Configured backend + health + exposed tasks |

Prompts (slash-command workflows)

| Prompt | What it does |
|---|---|
| gi-promoter-screen <gene> | Fetch a gene → scan for promoters → report strong hits |
| gi-expression-screen <gene> | TSS window → predict expression → summarise across tissues |

Roots

load_local_fasta only reads files inside directories the user granted via MCP roots. Set GI_FASTA_ALLOW_ANY=1 to bypass on hosts that don't implement roots.
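The roots check amounts to a path-containment test. The sketch below shows one way such gating could work; it is an assumption about the general technique, not gi-mcp's code. The function name `is_allowed` is hypothetical, and roots are taken as plain directory paths for simplicity (MCP clients advertise them as file:// URIs).

```python
import os
from pathlib import Path


def is_allowed(path: str, roots: list) -> bool:
    """Return True if `path` lies inside one of the granted root directories."""
    if os.environ.get("GI_FASTA_ALLOW_ANY") == "1":
        return True  # explicit bypass for hosts without roots support
    target = Path(path).resolve()  # collapses ".." and follows symlinks
    for root in roots:
        root_path = Path(root).resolve()
        if target == root_path or root_path in target.parents:
            return True
    return False
```

Resolving before comparing matters: a path such as `/data/genomes/../secret.fa` starts with an allowed prefix as a string but resolves outside the root.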

Environment

| Var | Default | Purpose |
|---|---|---|
| GI_API_KEY | (required) | Partner bearer key (gi_…) |
| GI_BASE_URL | https://api.genomicintelligence.ai | Override for staging / local backend |
| GI_ENSEMBL_URL | https://rest.ensembl.org | Ensembl REST base |
| GI_FASTA_ALLOW_ANY | unset | Set to 1 to skip roots gating for local FASTA |
| GI_ASYNC_TIMEOUT | 240 | Ceiling (s) on how long a wait=True slow task blocks, streaming progress, before it returns a timeout error (never a job_id) |
| GI_ASYNC_POLL_INTERVAL | 2 | Poll cadence (s) for the block-and-stream loop |
| GI_HTTP_TIMEOUT | 300 | Per-request read timeout (s) for /v1 calls |
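To make the interplay of these variables concrete, the sketch below reads them with the documented defaults and shows how a block-and-poll loop could use the timeout ceiling and poll cadence. The names `Settings`, `load_settings`, and `wait_for` are hypothetical and the loop is an assumed shape, not the server's implementation.

```python
import os
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: str
    async_timeout: float
    poll_interval: float
    http_timeout: float


def load_settings(env=os.environ) -> Settings:
    """Read the GI_* variables, applying the documented defaults."""
    key = env.get("GI_API_KEY")
    if not key:
        raise RuntimeError("GI_API_KEY is required")
    return Settings(
        api_key=key,
        base_url=env.get("GI_BASE_URL", "https://api.genomicintelligence.ai"),
        async_timeout=float(env.get("GI_ASYNC_TIMEOUT", "240")),
        poll_interval=float(env.get("GI_ASYNC_POLL_INTERVAL", "2")),
        http_timeout=float(env.get("GI_HTTP_TIMEOUT", "300")),
    )


def wait_for(check, settings: Settings, clock=time.monotonic, sleep=time.sleep):
    """Poll `check()` until it returns a non-None result or the ceiling passes."""
    deadline = clock() + settings.async_timeout
    while True:
        result = check()  # e.g. a job-status lookup that returns None while pending
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"no result after {settings.async_timeout:g} s")
        sleep(settings.poll_interval)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.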

Development & contributing

Setup, conventions, the architecture walkthrough, and the release process live in CONTRIBUTING.md. Quick start:

.venv/bin/pytest                 # full suite (mocked; no network)

Release history is in CHANGELOG.md.
