Community MCP server for the CZ CELLxGENE Discover Census single-cell atlas. Ontology-aware, provenance-tracked, unaffiliated with CZI.
cxg-census-mcp
An MCP server that lets LLM agents query the CZ CELLxGENE Discover Census single-cell atlas without lying about it — ontology-aware filters, cost caps, full provenance + attribution on every response. Drop it into Cursor / Claude Desktop / Claude Code and ask questions like "compare immune cell composition of healthy vs COVID-19 human lung" in plain English.
Independent / unaffiliated. Not affiliated with, endorsed by, or sponsored by the Chan Zuckerberg Initiative (CZI), EMBL-EBI, the U.S. Census Bureau, or anyone else. "CELLxGENE" is a CZI mark; references here are descriptive (nominative) use only.
No warranty. MIT-licensed source, "as is". Research/exploration tool — not a clinical or diagnostic instrument. Always verify results before publication. See LICENSE for the full trademark and content attribution notice, and SECURITY.md for the threat model and known-issues policy.
Alpha (v0.1.1). See CHANGELOG.md.
Demos
Healthy vs COVID-19 lung, side-by-side. Two parallel queries: the
disease_multi_value_v7 schema-drift rewrite kicks in for the COVID
cohort, and attribution from both sets of contributing datasets surfaces
in the same chat turn.
https://github.com/user-attachments/assets/c836f225-5075-4643-87aa-70d311bc5fd2
Cell-type composition of human lung in one query. Free-text "lung"
resolved to UBERON:0002048, routed through tissue_general, every CURIE
labeled, all in a single Tier-0 call.
https://github.com/user-attachments/assets/b0e10ca7-e46b-4e5f-ae63-11949d328c4d
(Videos render on GitHub. On PyPI they appear as bare URLs — head to the GitHub README to watch.)
More prompts in docs/example-questions.md.
Architecture at a glance
              ┌──────────────────────────────────────────────┐
MCP client    │  tools/      thin MCP wrappers, no logic     │
(Claude,  ─►  │     │                                        │
 Cursor,      │     ▼                                        │
 Code, …)     │  planner/    FilterSpec → QueryPlan,         │
              │     │        cost estimate, tier routing     │
              │     ▼                                        │
              │  ontology/   OLS4 + hint overlay,            │
              │     │        CL/UBERON/MONDO expansion       │
              │     ▼                                        │
              │  execution/  Tier 0  facet counts            │
              │     │        Tier 1  chunked obs scan        │
              │     │        Tier 2  expression aggregate    │
              │     │        Tier 9  refuse → snippet        │
              │     ▼                                        │
              │  clients/    OLS4 (HTTPS) + Census/SOMA      │
              │                                              │
              │  caches/     OLS, facet, plan, filter LRU    │
              │  models/     Response envelope w/            │
              │              attribution + provenance        │
              └──────────────────────────────────────────────┘
                                     │
                                     ▼
                        ┌────────────────────────┐
                        │ EBI OLS4 (ontology)    │
                        │ CZ CELLxGENE Census    │
                        │ (CC BY 4.0 data)       │
                        └────────────────────────┘
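One way to read the tier routing in the diagram is sketched below. This is not the project's planner: the `PlanSketch` fields, the `route` function, and the `max_cells` cap are all hypothetical names and values chosen for illustration. Only the tier numbers and their labels come from the diagram.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    FACET_COUNTS = 0          # Tier 0: facet counts
    CHUNKED_OBS_SCAN = 1      # Tier 1: chunked obs scan
    EXPRESSION_AGGREGATE = 2  # Tier 2: expression aggregate
    REFUSE_WITH_SNIPPET = 9   # Tier 9: refuse and hand back a snippet


@dataclass(frozen=True)
class PlanSketch:
    """Hypothetical stand-in for a QueryPlan; field names are assumptions."""
    needs_expression: bool
    answerable_from_facets: bool
    estimated_cells: int


def route(plan: PlanSketch, max_cells: int = 1_000_000) -> Tier:
    """Pick an execution tier. max_cells is illustrative, not a server default."""
    # Precomputed facet counts are assumed cheap regardless of cohort size.
    if plan.answerable_from_facets and not plan.needs_expression:
        return Tier.FACET_COUNTS
    # Anything that must touch rows is refused once it exceeds the cap.
    if plan.estimated_cells > max_cells:
        return Tier.REFUSE_WITH_SNIPPET
    if plan.needs_expression:
        return Tier.EXPRESSION_AGGREGATE
    return Tier.CHUNKED_OBS_SCAN
```

Under these assumptions, an over-budget query is not silently truncated; it falls through to Tier 9, which matches the "refuse → snippet" label in the diagram.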
Full architecture notes: docs/architecture.md.
Tool reference: docs/tool-reference.md.
Example questions: docs/example-questions.md.
Install
From PyPI (recommended):
uv tool install "cxg-census-mcp[census]"
cxg-census-mcp # speaks MCP over stdio
Or with pip:
pip install "cxg-census-mcp[census]"
Without the [census] extra you get mock mode (deterministic fixtures) —
handy for offline demos and verifying your MCP client config without pulling
tiledbsoma's ~1 GB of native deps.
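To see which of the two modes an environment is likely to end up in, you can check whether tiledbsoma is importable. The snippet below is a minimal sketch, not part of the package: the helper name is made up, and it assumes the server decides on mock mode based on whether the optional dependency is present.

```python
import importlib.util


def optional_dependency_installed(module_name: str = "tiledbsoma") -> bool:
    """Return True when the given top-level module can be imported."""
    # find_spec returns None for a missing top-level module instead of raising.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if optional_dependency_installed():
        print("tiledbsoma found: the [census] extra appears to be installed")
    else:
        print("tiledbsoma missing: expect mock mode (deterministic fixtures)")
```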
From source (for development):
git clone https://github.com/MaxMLang/cxg-census-mcp
cd cxg-census-mcp
uv sync --extra dev --extra census
uv run cxg-census-mcp
MCP client config
Cursor (~/.cursor/mcp.json) and Claude Desktop
(~/Library/Application Support/Claude/claude_desktop_config.json on macOS)
both expect the same shape. Cleanest is uvx once installed from PyPI:
{
"mcpServers": {
"cxg-census": {
"command": "/absolute/path/to/uvx",
"args": ["--from", "cxg-census-mcp[census]", "cxg-census-mcp"]
}
}
}
Use the absolute path to uvx (which uvx from your shell). MCP clients spawn the server in a non-interactive subprocess that doesn't source your shell rc, so a bare "uvx" will fail with No such file or directory.
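If you would rather generate the config than hand-edit the path, a small script can resolve uvx and print the JSON shape shown above. This is a convenience sketch only; `build_mcp_config` is a hypothetical helper and the package does not ship it.

```python
import json
import shutil


def build_mcp_config(uvx_path: str) -> dict:
    """Return the mcpServers entry with an explicit path to uvx."""
    return {
        "mcpServers": {
            "cxg-census": {
                "command": uvx_path,
                "args": ["--from", "cxg-census-mcp[census]", "cxg-census-mcp"],
            }
        }
    }


if __name__ == "__main__":
    # shutil.which searches PATH the same way the shell's `which` does.
    uvx = shutil.which("uvx")
    if uvx is None:
        raise SystemExit("uvx not found on PATH; install uv first")
    print(json.dumps(build_mcp_config(uvx), indent=2))
```

Run it from the same interactive shell where uvx works, then paste the output into the client's config file.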
If you cloned from source instead, point at the checkout:
{
"mcpServers": {
"cxg-census": {
"command": "/absolute/path/to/uv",
"args": ["--directory", "/path/to/cxg-census-mcp", "run", "cxg-census-mcp"]
}
}
}
Claude Code:
claude mcp add cxg-census -- /absolute/path/to/uvx --from "cxg-census-mcp[census]" cxg-census-mcp
Quit + relaunch your client (⌘Q on macOS — closing the window isn't enough) and the server should show up in the MCP panel with 13 tools.
Tools (13 total)
Workflow: census_summary, get_census_versions, count_cells,
list_datasets, gene_coverage, aggregate_expression, preview_obs,
export_snippet, get_server_limits.
Inspection: resolve_term, expand_term, term_definition,
list_available_values.
Plus MCP resources (markdown docs at cxg-census-mcp://docs/{slug}),
prompts (census_workflow, disambiguation), and cooperative
progress / cancellation notifications. Details in
docs/tool-reference.md.
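MCP clients normally handle the wire protocol for you, but it can help to see what a tool call looks like. MCP uses JSON-RPC 2.0, with `tools/list` and `tools/call` as the relevant methods. The sketch below only builds the request messages; it assumes `get_census_versions` accepts an empty argument object (check docs/tool-reference.md for the real parameters), and a real session also needs the `initialize` handshake first.

```python
import itertools
import json
from typing import Optional

# Monotonic request ids, as JSON-RPC requires each request to carry one.
_ids = itertools.count(1)


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request for the given MCP method."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool by name; the empty arguments object is an assumption.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_census_versions", "arguments": {}},
)
```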
Configuration
All env vars use the CXG_CENSUS_MCP_ prefix. Most useful:
| Variable | Default | Purpose |
|---|---|---|
| CXG_CENSUS_MCP_CENSUS_VERSION | stable | Census release to pin |
| CXG_CENSUS_MCP_CACHE_DIR | platformdirs default | Disk cache root |
| CXG_CENSUS_MCP_MOCK_MODE | 0 | If 1, never opens a real Census handle |
| CXG_CENSUS_MCP_LOG_LEVEL | WARNING | stdlib log level |
Full list and validation: src/cxg_census_mcp/config.py.
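As a rough illustration of how the four variables above and their defaults fit together, here is a minimal sketch of reading them from the environment. It is not the code in config.py: `SettingsSketch` and `load_settings` are hypothetical names, and the real module may validate values differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "CXG_CENSUS_MCP_"


@dataclass(frozen=True)
class SettingsSketch:
    """Hypothetical container for the four documented variables."""
    census_version: str
    cache_dir: Optional[str]  # None stands for "use the platformdirs default"
    mock_mode: bool
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> SettingsSketch:
    """Read prefixed variables from env, falling back to documented defaults."""
    return SettingsSketch(
        census_version=env.get(PREFIX + "CENSUS_VERSION", "stable"),
        cache_dir=env.get(PREFIX + "CACHE_DIR"),
        mock_mode=env.get(PREFIX + "MOCK_MODE", "0") == "1",
        log_level=env.get(PREFIX + "LOG_LEVEL", "WARNING"),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in a test without touching the process environment.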
Development & operations
Quick loop:
make install-all # uv sync --extra dev --extra census
make lint typecheck test # ruff + mypy + pytest (mock mode)
make cov # tests + coverage HTML in ./htmlcov
make audit # pip-audit on locked production deps
Operational tasks (cache pre-warm, schema diff, container build, metrics
dump, plan-cache vacuum, weekly hint/facet refresh) live in the Makefile
and are documented in docs/operational-playbook.md.
Documentation index
| Topic | Where |
|---|---|
| System architecture | docs/architecture.md |
| Tool reference | docs/tool-reference.md |
| Example agent questions | docs/example-questions.md |
| Ontology resolution | docs/ontology-resolution.md |
| Schema-drift handling | docs/schema-drift-format.md |
| Census version pinning | docs/version-pinning.md |
| Progress / cancellation | docs/progress-and-cancellation.md |
| Error model | docs/error-model.md |
| Known limitations | docs/limitations.md |
| Ops runbook | docs/operational-playbook.md |
| Changelog | CHANGELOG.md |
License & attribution
Source code: MIT. The MIT license covers only the code in this repository, not the upstream data, ontologies, or third-party trademarks.
- Data. Tool responses are derived (filtered/aggregated) from the
CZ CELLxGENE Discover Census, distributed by the Chan Zuckerberg
Initiative under CC BY 4.0. Every response carries an attribution field;
downstream users must preserve attribution and indicate that changes were made.
- Ontologies are fetched via EBI Ontology Lookup Service (OLS4) from CL, UBERON, MONDO, EFO, HANCESTRO, and others; each carries its own license.
- Trademarks ("CELLxGENE", "Cursor", "Claude", "Anthropic", "Model Context Protocol", …) belong to their respective owners. Use here is descriptive only and does not imply affiliation.
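For downstream code that post-processes tool responses, the attribution requirement above could be honored along these lines. This is a sketch under assumptions: only the `attribution` field name comes from this README, while the wrapper function and the `result` and `changes` keys are hypothetical.

```python
from typing import Any, Mapping


def derive_with_attribution(
    response: Mapping[str, Any], derived: Any, change_note: str
) -> dict:
    """Wrap a derived result, carrying the upstream attribution forward."""
    if "attribution" not in response:
        raise KeyError("response has no 'attribution' field to preserve")
    return {
        "result": derived,
        # Copied unchanged from the server response.
        "attribution": response["attribution"],
        # States that the data was modified, as CC BY 4.0 asks.
        "changes": change_note,
    }
```

A caller would pass the raw tool response, its own derived table or number, and a short note such as "filtered to adult donors and re-aggregated".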
This project is a client of the CZ CELLxGENE Discover Census; it does not host, mirror, or redistribute Census data.
Full notice in LICENSE.