
A Model Context Protocol (MCP) server for accessing GEO (Gene Expression Omnibus) data through NCBI E-Utils API


Gene Expression Omnibus (GEO) MCP

Gene Expression Omnibus (GEO) - A public functional genomics data repository


⚠️ Development Warning ⚠️

This project is currently in active development and is not yet production-ready.

  • The MCP stdio server is functional but may have bugs or incomplete features
  • The HTTP server is in early development phase and has not been properly tested
  • API endpoints and functionality may change without notice
  • Use at your own risk and report any issues you encounter

We recommend testing thoroughly in a development environment before using in production.

A Model Context Protocol (MCP) server for accessing GEO (Gene Expression Omnibus) data through the NCBI E-Utils API. It lets your LLM search GEO datasets, series, samples, platforms, and profiles. It has been tested with Claude Desktop; ChatGPT does not support this tool out of the box yet. Claude will call the tools automatically when they fit the context of the conversation.

Quick Install (pip)

Install from PyPI

pip install geo-bio-mcp

Install from source

# Clone the repo (if not already)
git clone <repo-url>
cd GEO_MCP
pip install -e .

Configuration

Run the command with --init to create a config file:

geo-bio-mcp --init

This creates a config file at ~/.geo-bio-mcp/config.json (the file is also auto-created on first run if it is missing) with the following contents:

{
    "base_url": "https://eutils.ncbi.nlm.nih.gov/entrez/eutils",
    "email": "your_email@example.com",
    "api_key": "YOUR_API_KEY"
}

The --init command also prints a configuration template for your Claude Desktop configuration file.

  • email is required by NCBI.
  • api_key is optional but recommended for higher rate limits (you can create one in your NCBI account settings).
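Before starting the server, it can help to confirm that the template placeholders have been replaced. The following is a minimal sketch, not part of the package: check_config and load_config are hypothetical helper names, and the checks only reflect the three keys shown in the template above.

```python
import json
from pathlib import Path

# Default location named in this README; pass another path if yours differs.
DEFAULT_CONFIG = Path.home() / ".geo-bio-mcp" / "config.json"

# Values that appear in the generated template and must be edited by hand.
PLACEHOLDERS = {"", "your_email@example.com", "YOUR_API_KEY"}


def check_config(config: dict) -> list[str]:
    """Return human-readable problems found in a config dict (hypothetical helper)."""
    problems = []
    email = config.get("email", "")
    if email in PLACEHOLDERS or "@" not in email:
        problems.append("email is missing or still the placeholder")
    if not config.get("base_url", "").startswith("https://"):
        problems.append("base_url should be an https:// URL")
    if config.get("api_key", "") in PLACEHOLDERS:
        problems.append("api_key is not set (optional, but rate limits are lower)")
    return problems


def load_config(path: Path = DEFAULT_CONFIG) -> dict:
    """Read the JSON config file from disk."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling check_config(load_config()) returns an empty list once the email and API key have been filled in; with the untouched template it reports the email and api_key entries.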

Running the Server

  • MCP stdio mode:
    geo-bio-mcp
    
  • HTTP mode:
    geo-bio-mcp --http --port 8001
    

Claude Desktop Integration

Common Issue: spawn geo-bio-mcp ENOENT

This error means Claude Desktop cannot find the geo-bio-mcp command. This is usually a PATH issue.

Solution

  1. Find the full path to the executable:

    which geo-bio-mcp
    

    Example output: /Users/youruser/miniforge3/bin/geo-bio-mcp

  2. Update your Claude config: Instead of just "geo-bio-mcp", use the full path:

    {
      "mcpServers": {
        "geo-mcp": {
          "command": "/Users/youruser/miniforge3/bin/geo-bio-mcp",
          "env": {
            "CONFIG_PATH": "/Users/youruser/.geo-bio-mcp/config.json"
          }
        }
      }
    }
    
  3. (Optional) Use a Conda Environment:

    • Activate your conda env and run which geo-bio-mcp to get the correct path.
    • Use that path in your Claude config as above.
  4. Restart Claude Desktop after updating the config.
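The path lookup in the steps above can also be scripted. This is a rough sketch under stated assumptions: claude_server_entry is a hypothetical helper, and the "geo-mcp" server name and CONFIG_PATH variable are copied from the example config above. It uses shutil.which from the standard library, which works on Windows as well as macOS and Linux.

```python
import json
import shutil


def claude_server_entry(command="geo-bio-mcp", config_path=None, which=shutil.which):
    """Build an mcpServers entry that uses the absolute path of the executable.

    `which` is injectable so the function can be exercised without the
    package being installed.
    """
    full_path = which(command)
    if full_path is None:
        raise FileNotFoundError(
            f"{command} is not on PATH; activate the environment where it is installed"
        )
    entry = {"command": full_path}
    if config_path:
        # CONFIG_PATH is the environment variable shown in the example config.
        entry["env"] = {"CONFIG_PATH": config_path}
    return {"mcpServers": {"geo-mcp": entry}}


def render(entry: dict) -> str:
    """Pretty-print the entry for pasting into the Claude Desktop config."""
    return json.dumps(entry, indent=2)
```

Run print(render(claude_server_entry())) from the same Python or conda environment in which geo-bio-mcp was installed, then merge the output into your Claude Desktop config by hand.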


Troubleshooting

  • If you see command not found: geo-bio-mcp, make sure you installed with the correct Python/conda environment and that its bin directory is in your PATH.
  • If the config file is missing, it will be auto-created on first run, or you can copy the template from geo_mcp_server/config_template.json.

Usage

MCP Server (stdio mode)

For use with MCP clients like Claude Desktop:

cd geo_mcp_server
python main.py --mode stdio

HTTP Server (localhost:8000)

For HTTP API access:

# Option 1: Using the main script
cd geo_mcp_server
python main.py --mode http --host localhost --port 8000

# Option 2: Using the convenience script
python run_server.py

HTTP API Endpoints

Once the HTTP server is running on localhost:8000, you can access:

  • Root: GET / - Server status
  • Health Check: GET /health - Health check endpoint
  • List Tools: GET /tools - List all available tools
  • Call Tool: POST /tools/call - Execute a tool with arguments
  • API Documentation: GET /docs - Interactive API documentation (Swagger UI)

Example HTTP API Usage

# List available tools
curl http://localhost:8000/tools

# Search GEO Profiles
curl -X POST http://localhost:8000/tools/call \
  -H "Content-Type: application/json" \
  -d '{
    "name": "search_geo_profiles",
    "arguments": {
      "term": "cancer",
      "retmax": 5
    }
  }'

# Search GEO Datasets
curl -X POST http://localhost:8000/tools/call \
  -H "Content-Type: application/json" \
  -d '{
    "name": "search_geo_datasets", 
    "arguments": {
      "term": "breast cancer",
      "retmax": 10
    }
  }'
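The same calls can be made from Python with only the standard library. This is a minimal sketch mirroring the curl examples above; build_tool_call and call_tool are hypothetical helper names, and the assumption that the server replies with a JSON body is not confirmed by this README.

```python
import json
import urllib.request


def build_tool_call(name, arguments, base_url="http://localhost:8000"):
    """Prepare a POST /tools/call request with the same body as the curl examples."""
    body = json.dumps({"name": name, "arguments": arguments}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tools/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(name, arguments, base_url="http://localhost:8000", timeout=30):
    """Send the request and decode the reply (assumed here to be JSON)."""
    request = build_tool_call(name, arguments, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the HTTP server running, call_tool("search_geo_datasets", {"term": "breast cancer", "retmax": 10}) sends the same request as the second curl example. Separating request construction from sending keeps the payload easy to inspect while the HTTP mode is still in early development.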

Available Tools

This MCP server provides access to all major GEO databases through the following tools:

Search Tools

  • search_geo_profiles - Search gene expression profiles across different biological contexts

    • Example searches: "cancer", "breast cancer", "p53", "apoptosis"
  • search_geo_datasets - Search curated gene expression datasets

    • Example searches: "diabetes", "Alzheimer's disease", "drug response", "tissue specific"
  • search_geo_series - Search original submitter-supplied gene expression series

    • Example searches: "GSE12345", "microarray", "RNA-seq", "time course"
  • search_geo_samples - Search individual gene expression samples

    • Example searches: "GSM123456", "human", "mouse", "tumor", "normal"
  • search_geo_platforms - Search microarray platform definitions

    • Example searches: "Affymetrix", "Illumina", "Agilent", "GPL96"

Search Parameters

Each tool accepts:

  • term (required): Search term or query string
  • retmax (optional, default: 20): Maximum number of results to return
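To illustrate what term and retmax mean at the NCBI level, the sketch below builds an E-utilities ESearch URL by hand. It is an assumption that the server's search tools map onto ESearch in this way; this README does not describe the internals. The esearch_url function is a hypothetical name, while "gds" and "geoprofiles" are the Entrez database names for GEO DataSets and GEO Profiles.

```python
from urllib.parse import urlencode

# Same base URL as in the config file template.
BASE_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20, email=None, api_key=None):
    """Build an ESearch query URL (illustrative only, not the server's own code)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    if email:
        params["email"] = email      # contact address sent along with the request
    if api_key:
        params["api_key"] = api_key  # raises the allowed request rate
    return f"{BASE_URL}/esearch.fcgi?{urlencode(params)}"


# Example: a dataset search capped at 10 results.
print(esearch_url("gds", "breast cancer", retmax=10))
```

Opening the printed URL in a browser shows the raw ESearch response for the query, which can be useful when comparing against what a tool returns.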

Example usage

For now, simply chat with Claude Desktop; it will call the tools automatically when they fit the context of the conversation.

User: find a small brain dataset

Assistant: I'll help you find a small brain dataset from the GEO database. Let me search for brain-related datasets with a maximum of one result.

Assistant (system): search_geo_datasets

Assistant: I found a brain-related dataset for you:

  • Dataset: GSE290848
  • Title: Maternal Stress Triggers Early-Life Eczema via Fetal Mast Cell Reprogramming [10x Genomics Multiome ATAC + Gene Expression]
  • Species: Mouse (Mus musculus)
  • Samples: 8
  • Data type: ATAC-seq & gene expression (10x Genomics Multiome)
  • Publication date: 21 Jun 2025

This small dataset explores how prenatal stress affects neuro-immune circuits and mast-cell development.

Would you like me to download this dataset or search for others?


User: download GSE290848

Assistant: I'll download the GSE290848 dataset for you.

Assistant (system): download_geo_data

Assistant: Download complete.

Downloaded files

  • GSE290848_family.soft.gz — series, platform & sample info
  • GSE290848_metadata.xml — detailed study metadata

Location: /Users/matthiasflo/Documents/2025/GEO_MCP/geo_mcp_server/downloads/gse/GSE290848
Total size: 0.01 MB

The dataset is ready for analysis. Let me know if you’d like to explore it further.
