Retrieve information about an AnnData object via MCP
anndata-mcp
Allows retrieval and lazy access to information from AnnData objects via MCP (Model Context Protocol)
Features
This MCP server provides lazy, memory-efficient access to AnnData files for biomedical analysis agents. Key features include:
- Lazy Loading: Only loads requested data into memory using anndata.experimental.read_lazy
- Comprehensive Exploration: Get summaries, metadata, unique values, and statistics
- Flexible Data Access: Retrieve specific slices of data matrices, embeddings, and metadata
- Dataset2D Support: Handles both pandas DataFrames and AnnData's Dataset2D objects
- Sparse Matrix Support: Efficiently handles sparse expression matrices
Tools Available
Summary Tools
- get_anndata_summary: Get a high-level overview of the AnnData structure
Exploration Tools
- get_attribute_info: Get detailed info about specific attributes
- get_dataframe_info: Get detailed info about dataframe-like attributes (obs, var)
- get_unique_values: Get unique values from a column
- get_column_stats: Get statistics for a column
- get_value_counts: Count occurrences of each unique value (e.g., cells per cluster)
- get_grouped_stats: Calculate statistics grouped by a categorical column
- list_available_keys: List keys in mapping attributes (obsm, varm, layers, etc.)
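To make the value-count and grouped-statistics tools concrete, here is a minimal pure-Python sketch of what those two operations compute for a categorical column. It is not the server's implementation, and the `cluster` and `n_genes` lists are made-up sample values used only for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical per-cell metadata: a cluster label and a gene count for each cell.
cluster = ["T", "B", "T", "NK", "B", "T"]
n_genes = [900, 1200, 1100, 800, 1000, 1000]

# Value counts: number of cells per cluster.
value_counts = Counter(cluster)

# Grouped statistics: mean gene count per cluster.
groups = defaultdict(list)
for label, genes in zip(cluster, n_genes):
    groups[label].append(genes)
grouped_mean = {label: mean(values) for label, values in groups.items()}

print(dict(value_counts))
print(grouped_mean)
```

The same idea applies to any categorical column in obs or var; the server's tools presumably return these results in their own response format.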
Data Access Tools
- get_obs_data: Get cell/observation metadata slices
- get_var_data: Get gene/variable metadata slices
- get_X_data: Get expression matrix slices
- get_layer_data: Get data from specific layers
- get_obsm_data: Get multi-dimensional annotations (embeddings like PCA, UMAP)
- get_varm_data: Get variable annotations
- get_uns_data: Get unstructured data
Installation
You need to have Python 3.13 or newer installed on your system. If you don't have Python installed, we recommend installing uv.
Option 1: Use with uvx (Recommended)
uvx anndata_mcp
Option 2: Install via pip
pip install anndata_mcp
Option 3: Install for development
git clone https://github.com/dschaub95/anndata-mcp.git
cd anndata-mcp
# Create virtual environment with Python 3.13
uv venv --python 3.13
# Install dependencies
uv sync
Option 4: Install with examples
To run the examples, install with the examples extra:
pip install "anndata_mcp[examples]"
# or with uv:
uv sync --extra examples
Configuration
Add the following to your MCP client configuration (e.g., Claude Desktop or Continue):
{
  "mcpServers": {
    "anndata-mcp": {
      "command": "uvx",
      "args": ["anndata_mcp"],
      "env": {
        "UV_PYTHON": "3.13"
      }
    }
  }
}
Or if installed via pip:
{
  "mcpServers": {
    "anndata-mcp": {
      "command": "python",
      "args": ["-m", "anndata_mcp"],
      "env": {}
    }
  }
}
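If your client already has a configuration file with other servers, you may prefer to merge the entry programmatically rather than paste it by hand. The sketch below builds either of the two entries shown above and adds it to an existing configuration dictionary. The helper names `anndata_mcp_entry` and `add_server` and the `other-server` entry are hypothetical and exist only for this example.

```python
import json


def anndata_mcp_entry(use_uvx: bool = True) -> dict:
    """Return the server entry for the uvx variant or the pip-installed variant."""
    if use_uvx:
        return {"command": "uvx", "args": ["anndata_mcp"], "env": {"UV_PYTHON": "3.13"}}
    return {"command": "python", "args": ["-m", "anndata_mcp"], "env": {}}


def add_server(config: dict, use_uvx: bool = True) -> dict:
    """Return a copy of `config` with the anndata-mcp entry added; other servers are kept."""
    servers = dict(config.get("mcpServers", {}))
    servers["anndata-mcp"] = anndata_mcp_entry(use_uvx)
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
merged = add_server(existing, use_uvx=False)
print(json.dumps(merged, indent=2))
```

Where the configuration file lives depends on the client, so the sketch deliberately stops at producing the JSON text.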
Usage
For LLM Agents
Once configured, LLM agents can use the tools to explore AnnData files:
Agent: "What's in the pbmc3k.h5ad file?"
→ Calls: get_anndata_summary("pbmc3k.h5ad")
Agent: "Show me unique cell types"
→ Calls: get_unique_values("pbmc3k.h5ad", "obs", "cell_type")
Agent: "Get UMAP coordinates for the first 100 cells"
→ Calls: get_obsm_data("pbmc3k.h5ad", "X_umap", row_slice="0:100")
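Under the hood, an MCP client turns each of these calls into a JSON-RPC 2.0 request with the method `tools/call`. The sketch below shows roughly what such a request could look like for the UMAP example. The tool name and `row_slice` come from the example above; the argument names `path` and `key` are assumptions, since the actual parameter names are defined by the server's tool schema.

```python
import json
from itertools import count

# Simple counter so that each request gets a distinct JSON-RPC id.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)


# "path" and "key" are hypothetical argument names used for illustration.
request = tool_call_request(
    "get_obsm_data",
    {"path": "pbmc3k.h5ad", "key": "X_umap", "row_slice": "0:100"},
)
print(request)
```

In practice the MCP client library builds and sends these messages for you; the sketch is only meant to show what travels over the wire.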
Direct Python Usage
For testing or direct use in Python:
from anndata_mcp.tools import get_anndata_summary
# Call the tool directly
summary = get_anndata_summary("path/to/data.h5ad")
print(f"Dataset has {summary.n_obs} cells and {summary.n_vars} genes")
Examples
See the examples/ directory for detailed usage examples:
example_script.py: Comprehensive Python script showing all tools, including:
- Dataset exploration and metadata inspection
- Cluster analysis with value counts and grouped statistics
- Expression matrix access and embedding retrieval
- Complete analysis workflows
To run the examples:
# Install with examples extra (includes scanpy for data download)
uv sync --extra examples
# Run the example script (downloads pbmc3k sample data automatically)
uv run examples/example_script.py
The example script demonstrates:
- Dataset summary and structure exploration
- Cell and gene metadata inspection
- Cluster information and statistics
- NEW: Value counts (cells per cluster)
- NEW: Grouped statistics (average genes per cluster)
- Expression matrix sampling
- UMAP/PCA embedding retrieval
- Complete analysis workflows
Use Cases
This MCP server is designed for biomedical analysis agents that need to:
- Explore large single-cell datasets without loading everything into memory
- Query specific metadata (cell types, gene names, QC metrics)
- Extract embeddings for visualization (UMAP, PCA, t-SNE)
- Access expression data for specific genes or cells
- Work with multiple data layers (raw counts, normalized, scaled)
- Handle datasets larger than RAM through lazy loading
Technical Details
Lazy Reading
The server uses anndata.experimental.read_lazy to open files without loading data into memory. Data is only loaded when specifically requested through tools.
Dataset2D Support
AnnData's lazy reading often returns Dataset2D objects instead of pandas DataFrames. This server handles both transparently.
Sparse Matrix Handling
Expression matrices are often sparse. The server can return sparse data in sparse format or automatically densify small slices for easier consumption.
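The densify-or-keep-sparse decision can be illustrated with a small pure-Python sketch. It is not the server's code: the dictionary-of-coordinates representation, the function name `maybe_densify`, and the `max_dense_cells` threshold are all assumptions chosen to keep the example free of dependencies.

```python
def maybe_densify(entries, shape, max_dense_cells=10_000):
    """Return a dense nested list for small slices, otherwise keep the sparse entries.

    `entries` maps (row, col) to a non-zero value; every other cell is zero.
    """
    n_rows, n_cols = shape
    if n_rows * n_cols > max_dense_cells:
        # Too large to expand: hand back the sparse representation unchanged.
        return {"format": "sparse", "shape": shape, "entries": entries}
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for (row, col), value in entries.items():
        dense[row][col] = value
    return {"format": "dense", "shape": shape, "data": dense}


small = maybe_densify({(0, 1): 2.0, (1, 0): 5.0}, shape=(2, 2))
large = maybe_densify({(0, 1): 2.0}, shape=(1_000, 1_000))
print(small["format"], small["data"])
print(large["format"])
```

Deciding on the number of cells in the slice, not the number of non-zero entries, keeps the dense result bounded in size no matter how sparse the slice is.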
Slicing Syntax
Use string-based slicing: "0:10", "100:200", ":50" for rows and columns.
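As an illustration of how such strings map onto ordinary Python slices, here is a minimal sketch. The helper `parse_slice` is hypothetical and only handles the `start:stop` forms listed above; the server's own parser may accept more.

```python
def parse_slice(text: str) -> slice:
    """Convert a string such as "0:10" or ":50" into a Python slice object."""
    parts = text.strip().split(":")
    if len(parts) != 2:
        raise ValueError(f"expected 'start:stop', got {text!r}")
    # An empty side means "open-ended", which Python represents as None.
    start, stop = (int(part) if part.strip() else None for part in parts)
    return slice(start, stop)


rows = list(range(300))
print(rows[parse_slice("0:10")])
print(len(rows[parse_slice(":50")]))
print(rows[parse_slice("100:200")][0])
```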
Getting started
Please refer to the documentation, in particular, the API documentation.
You can also find the project on BioContextAI, the community hub for biomedical MCP servers: anndata-mcp on BioContextAI.
Contact
If you found a bug, please use the issue tracker.
Citation
t.b.a