MCP server for NCBI Datasets — search metadata and download genomic data
ncbi-datasets-mcp
NOTE: This project is not affiliated with NCBI or NCBI Datasets; it is a user-provided tool.
An MCP server that gives Claude access to NCBI Datasets v2 — discover what data NCBI Datasets offers, search genome assembly metadata, retrieve taxonomy records, and download data packages without leaving your conversation.
Tools
| Tool | Transport | Description |
|---|---|---|
| ensure_cli | n/a | Install the NCBI CLI tools (run once, or set NCBI_AUTO_INSTALL=true) |
| list_data_types | n/a | Describe what kinds of data NCBI Datasets provides; optional per-type detail |
| genome_summary_by_taxon | REST | Search genome assemblies by organism name or tax ID |
| genome_summary_by_accession | REST | Fetch assembly metadata for known accessions |
| genome_download_by_taxon | CLI | Download a genome package by taxon |
| genome_download_by_accession | CLI | Download a genome package by accession |
| rehydrate_genome_package | CLI | Fetch sequence files for a dehydrated package |
| dataformat_genome_tsv | CLI | Convert a genome JSONL data report to TSV |
| taxonomy_summary | REST | Get lineage, rank, and names for a taxon |
| taxonomy_download | CLI | Download a taxonomy package |
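To give a feel for what a REST-backed summary tool might send, here is a minimal sketch that builds a genome-by-taxon request for the public NCBI Datasets v2 API. The helper name `build_genome_summary_request` is hypothetical and not part of this package, and whether the server uses this exact endpoint is an assumption; the path, the `page_size` parameter, and the `api-key` header follow the public v2 API as commonly documented, so check the NCBI docs before relying on them.

```python
from urllib.parse import quote, urlencode

# Base URL of the public NCBI Datasets v2 REST API.
BASE_URL = "https://api.ncbi.nlm.nih.gov/datasets/v2"


def build_genome_summary_request(taxon, page_size=20, api_key=None):
    """Return (url, headers) for a genome dataset report lookup by taxon.

    Hypothetical helper for illustration; not taken from ncbi-datasets-mcp.
    """
    # The taxon may be a name ("Homo sapiens") or a tax ID (9606);
    # percent-encode it so it is safe inside the URL path.
    path = f"/genome/taxon/{quote(str(taxon), safe='')}/dataset_report"
    query = urlencode({"page_size": page_size})
    headers = {"accept": "application/json"}
    if api_key:
        # Assumption: the key is passed in an "api-key" request header.
        headers["api-key"] = api_key
    return f"{BASE_URL}{path}?{query}", headers


if __name__ == "__main__":
    url, headers = build_genome_summary_request("Homo sapiens", page_size=5)
    print(url)
    print(headers)
```

The sketch only constructs the request, so it can be inspected without touching the network; any HTTP client can then issue a GET with the returned URL and headers.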
Discovering available data
Not sure what NCBI Datasets offers? Ask "what kind of data can I get from
datasets?" and the server's list_data_types tool returns a readable catalog of
every data report type — genes, genome assemblies, genome sequences, taxonomy,
viruses, and more — along with which tools retrieve each one. Pass a specific
type (e.g. genome-assembly) for its full field list and schema documentation
link.
Installation
Option 1 — Desktop Extension (recommended for Claude Desktop users)
- Download ncbi-datasets.mcpb from the Releases page.
- Double-click the file and click Install in Claude Desktop.
- Optionally enter your NCBI API key and download directory.
The NCBI CLI tools are downloaded automatically on first use (NCBI_AUTO_INSTALL=true is set by default in the extension).
Option 2 — JSON config (Claude Desktop / Claude Code)
Add to claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "ncbi-datasets": {
      "command": "uvx",
      "args": ["ncbi-datasets-mcp"],
      "env": {
        "NCBI_API_KEY": "your_key_here",
        "NCBI_DOWNLOAD_DIR": "/path/to/downloads",
        "NCBI_AUTO_INSTALL": "true"
      }
    }
  }
}
Requires uv (curl -LsSf https://astral.sh/uv/install.sh | sh).
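If you already have other servers in claude_desktop_config.json, editing it by hand risks clobbering them. The following is a minimal sketch of a merge script, assuming the config is plain JSON with a top-level `mcpServers` object as shown above; `add_ncbi_server` is a hypothetical helper, not something this package ships.

```python
import json
from pathlib import Path


def add_ncbi_server(config_path, api_key=None, download_dir=None):
    """Merge an "ncbi-datasets" entry into a Claude Desktop config file.

    Hypothetical helper for illustration; returns the merged config dict.
    """
    path = Path(config_path)
    # Start from the existing file (if any) so other servers are preserved.
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    env = {"NCBI_AUTO_INSTALL": "true"}
    if api_key:
        env["NCBI_API_KEY"] = api_key
    if download_dir:
        env["NCBI_DOWNLOAD_DIR"] = str(download_dir)

    # Add or overwrite only the "ncbi-datasets" entry.
    config.setdefault("mcpServers", {})["ncbi-datasets"] = {
        "command": "uvx",
        "args": ["ncbi-datasets-mcp"],
        "env": env,
    }

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_ncbi_server("/path/to/claude_desktop_config.json", api_key="your_key_here")` would leave every other `mcpServers` entry untouched. Back up the file first, and restart Claude Desktop afterwards so it rereads the config.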
Configuration
| Variable | Default | Description |
|---|---|---|
| NCBI_API_KEY | (none) | NCBI API key; raises the rate limit to 10 req/s |
| NCBI_DOWNLOAD_DIR | ~/Downloads/ncbi_datasets | Default download location |
| NCBI_AUTO_INSTALL | false | Auto-install CLI tools on startup |
| NCBI_MAX_RESULTS | 20 | Cap for summary tool result counts |
| NCBI_REQUEST_TIMEOUT | 300 | Seconds before a download times out |
| NCBI_CLI_PATH | (auto) | Override path to datasets binary |
| NCBI_DATAFORMAT_PATH | (auto) | Override path to dataformat binary |
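To illustrate how these variables and defaults fit together, here is a rough sketch that reads them into a typed settings object. It uses only the standard library as a stand-in; the project itself lists pydantic-settings for this job, so the `Settings` class and `_as_bool` helper below are hypothetical and may not match the real config.py.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Treat common truthy spellings as True; everything else is False.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Illustrative settings holder mirroring the table above."""

    api_key: Optional[str]
    download_dir: Path
    auto_install: bool
    max_results: int
    request_timeout: int
    cli_path: Optional[str]
    dataformat_path: Optional[str]

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "Settings":
        # Fall back to the process environment when no mapping is given.
        env = os.environ if env is None else env
        return cls(
            api_key=env.get("NCBI_API_KEY") or None,
            download_dir=Path(
                env.get("NCBI_DOWNLOAD_DIR", "~/Downloads/ncbi_datasets")
            ).expanduser(),
            auto_install=_as_bool(env.get("NCBI_AUTO_INSTALL", "false")),
            max_results=int(env.get("NCBI_MAX_RESULTS", "20")),
            request_timeout=int(env.get("NCBI_REQUEST_TIMEOUT", "300")),
            # None means "auto": let the locator search for the binary.
            cli_path=env.get("NCBI_CLI_PATH") or None,
            dataformat_path=env.get("NCBI_DATAFORMAT_PATH") or None,
        )
```

Passing an explicit mapping, for example `Settings.from_env({"NCBI_MAX_RESULTS": "5"})`, makes the defaults easy to check without changing the real environment.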
Development
# Install with dev extras
pip install -e ".[dev]"
# Run unit tests
pytest
# Run all tests including live network calls
pytest -m integration
# Regenerate enums from the current NCBI OpenAPI spec
python scripts/gen_enums.py
# Run the server locally (stdio transport)
ncbi-datasets-mcp
Architecture
src/ncbi_datasets_mcp/
  server.py                 FastMCP app (tool registrations only)
  config.py                 Pydantic-settings env config
  cli/
    locator.py              Find datasets/dataformat (config → PATH → cache)
    installer.py            Download binaries from NCBI FTP
    runner.py               Async subprocess wrapper
  rest/
    client.py               httpx client for metadata/summary endpoints
  domains/
    _generated_enums.py     Vendored enums from OpenAPI spec
    common.py               Shared utilities (output dir, filename sanitising)
    genome.py               Genome CLI arg builders + response shaping
    taxonomy.py             Taxonomy CLI arg builders
  models/
    responses.py            Shared DownloadResult dataclass
Summary tools (no file I/O) → REST API.
Download and format-conversion tools → NCBI CLI binaries.
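The CLI side of this split can be pictured as three small pieces: locate the binary (config, then PATH, then cache), build an argument list, and run it as an async subprocess with a timeout. The sketch below is an approximation under stated assumptions, not the project's code: the function names and the `~/.cache/ncbi-datasets-mcp` cache directory are hypothetical, and the `--filename` and `--dehydrated` flags are those of the `datasets download genome accession` command as I understand it.

```python
import asyncio
import os
import shutil
from pathlib import Path
from typing import List, Optional, Tuple


def locate_binary(name: str, override: Optional[str] = None,
                  cache_dir: Optional[Path] = None) -> Optional[str]:
    """Resolve a binary in the order: explicit config -> PATH -> local cache."""
    if override and os.access(override, os.X_OK):
        return override
    found = shutil.which(name)
    if found:
        return found
    # Hypothetical cache location; the real project may store binaries elsewhere.
    if cache_dir is None:
        cache_dir = Path("~/.cache/ncbi-datasets-mcp").expanduser()
    cached = Path(cache_dir) / name
    if cached.is_file() and os.access(cached, os.X_OK):
        return str(cached)
    return None


def genome_download_args(binary: str, accession: str, out_zip: str,
                         dehydrated: bool = False) -> List[str]:
    """Build an argv list for `datasets download genome accession`."""
    args = [binary, "download", "genome", "accession", accession,
            "--filename", out_zip]
    if dehydrated:
        # Metadata only; sequence files are fetched later by rehydrating.
        args.append("--dehydrated")
    return args


async def run_cli(args: List[str], timeout: float = 300) -> Tuple[int, str, str]:
    """Run a command without blocking the event loop; kill it on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, out.decode(), err.decode()
```

Passing arguments as a list rather than a shell string avoids quoting problems with organism names and file paths, which is one reason to keep argument building separate from execution.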
Cite
If you use NCBI Datasets in your research, please cite:
NCBI Datasets. National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov/datasets/
License
MIT