EdgeMap
Edge-centric heritability mapping via spatial cell–cell communication
EdgeMap decomposes trait heritability into cell-intrinsic (node) and cell–cell communication (edge) components using spatial transcriptomics and GWAS summary statistics. The core idea is simple: genetic effects may localize not only to cells themselves, but also to the molecular interfaces between neighboring cells.
Existing methods such as S-LDSC, scDRS, and gsMap map genetic risk to individual cells. EdgeMap tests the complementary hypothesis that heritability can also concentrate in spatially structured intercellular signaling.
How it works
- Spatial communication: Build a Gaussian-weighted spatial neighbor graph (`k=6`) and compute ligand–receptor (LR) communication intensity per cell using mass-action kinetics with a bottleneck model for multi-subunit complexes.
- Node and edge scores: Quantify where expression is spatially concentrated (node) and where communication is spatially concentrated (edge).
- SNP annotation: Map gene-level scores to SNP-level LD scores using gsMap's pre-computed SNP–gene weight matrix.
- S-LDSC regression: Regress GWAS chi-squared statistics on baseline + node + edge annotations to estimate node and edge heritability enrichment.
- Per-pair ranking: If the aggregate edge signal is significant, run conditional S-LDSC for individual LR pairs against baseline + node to rank the channels driving the signal.
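The first step above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not EdgeMap's implementation: the function names (`gaussian_knn`, `complex_level`, `communication`), the bandwidth `sigma`, the weight normalization, the use of `min` as the bottleneck for multi-subunit complexes, and the sender/receiver direction are all hypothetical choices made for the example.

```python
import math


def gaussian_knn(coords, k=6, sigma=1.0):
    """Map each cell index to its k nearest neighbours with normalized Gaussian weights."""
    graph = {}
    for i, (xi, yi) in enumerate(coords):
        nearest = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )[:k]
        raw = [(j, math.exp(-(d * d) / (2.0 * sigma * sigma))) for d, j in nearest]
        total = sum(w for _, w in raw) or 1.0  # guard against all-zero weights
        graph[i] = [(j, w / total) for j, w in raw]
    return graph


def complex_level(expr, subunits, cell):
    """Bottleneck assumption: a complex is limited by its least-expressed subunit."""
    return min(expr[gene][cell] for gene in subunits)


def communication(expr, graph, ligand, receptor):
    """Mass-action intensity per cell: weighted neighbour ligand times own receptor."""
    intensity = []
    for cell, neighbours in graph.items():
        received = sum(w * complex_level(expr, ligand, j) for j, w in neighbours)
        intensity.append(received * complex_level(expr, receptor, cell))
    return intensity


# Toy data: three cells on a line, one ligand gene, a two-subunit receptor.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
expr = {"L": [0.0, 2.0, 0.0], "R1": [3.0, 0.0, 1.0], "R2": [1.0, 5.0, 4.0]}
graph = gaussian_knn(coords, k=1)
intensity = communication(expr, graph, ligand=["L"], receptor=["R1", "R2"])
print(intensity)
```

In this toy case the middle cell receives no signal because one receptor subunit is absent there, which is the behaviour a bottleneck model is meant to capture.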
Runtime is typically tens of seconds to a few minutes per trait–tissue pair, depending on tissue size, the number of active LR pairs, disk I/O, and hardware.
Installation
git clone https://github.com/cafferychen777/EdgeMap.git
cd EdgeMap
pip install -e .
This installs the core Python dependencies automatically, including numpy, pandas, pyarrow, scipy, anndata, scanpy, and scikit-learn. Requires Python >= 3.10.
Input preparation
1. Spatial transcriptomics data
Provide an AnnData object with a gene expression matrix and spatial coordinates in .obsm["spatial"].
From 10x Space Ranger output:
import scanpy as sc
adata = sc.read_visium("/path/to/spaceranger/outs")
From other platforms (Slide-seq, MERFISH, STARmap, etc.): create an AnnData object with expression in adata.X and coordinates in adata.obsm["spatial"] (shape n_cells x 2).
Requirements:
- Raw counts by default: EdgeMap normalizes and log-transforms the data unless `--preprocessed` is set.
- Gene filtering is always applied first: genes expressed in fewer than 10 cells are removed before the normalization check. `--preprocessed` skips normalization and log1p, but not this filtering step.
- Human gene symbols: the bundled LIANA Consensus database uses human symbols. For non-human data, convert genes to human orthologs first.
- For CLI usage, save the AnnData object to `.h5ad` first: `adata.write("my_tissue.h5ad")`
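The order of operations described in these requirements (filter first, then normalize and log-transform unless the data is already preprocessed) can be sketched as follows. The helper names and the `target_sum=1e4` scaling are assumptions for illustration only; EdgeMap's actual normalization settings are not stated here.

```python
import math


def filter_genes(counts, genes, min_cells=10):
    """Drop genes detected (count > 0) in fewer than `min_cells` cells.

    `counts` is a list of per-cell rows; `genes` names the columns.
    """
    keep = [
        g for g in range(len(genes))
        if sum(1 for row in counts if row[g] > 0) >= min_cells
    ]
    return [[row[g] for g in keep] for row in counts], [genes[g] for g in keep]


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x)."""
    out = []
    for row in counts:
        total = sum(row)
        scale = target_sum / total if total > 0 else 0.0
        out.append([math.log1p(v * scale) for v in row])
    return out


def prepare(counts, genes, preprocessed=False, min_cells=10):
    """Gene filtering always runs; normalization runs only for raw counts."""
    counts, genes = filter_genes(counts, genes, min_cells)
    if not preprocessed:
        counts = normalize_log1p(counts)
    return counts, genes


# Toy matrix: 3 cells x 2 genes; gene "B" is detected in only one cell.
counts = [[5, 0], [3, 0], [2, 1]]
genes = ["A", "B"]
prepared, kept = prepare(counts, genes, min_cells=2)
print(kept)
```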
2. GWAS summary statistics
Provide a tab-separated file with columns SNP, Z, and N — the standard output of ldsc munge_sumstats:
python munge_sumstats.py \
    --sumstats raw_gwas.txt \
    --out munged_trait \
    --merge-alleles w_hm3.snplist
The output munged_trait.sumstats.gz can be passed directly to EdgeMap.
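To sanity-check a munged file before running EdgeMap, a small reader along these lines may help. It is a sketch using only the standard library: the function names are hypothetical, the rsIDs in the demo are placeholders, and skipping rows with an empty `Z` or `N` is a defensive choice rather than documented EdgeMap behaviour.

```python
import csv
import gzip
import io


def read_sumstats(handle):
    """Parse tab-separated rows with SNP, Z and N columns and add chi2 = Z**2."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        if not rec["Z"] or not rec["N"]:
            continue  # skip SNPs without a test statistic or sample size
        z = float(rec["Z"])
        rows.append({"SNP": rec["SNP"], "Z": z, "N": float(rec["N"]), "chi2": z * z})
    return rows


def read_sumstats_gz(path):
    """Same as read_sumstats, but for a gzip-compressed file on disk."""
    with gzip.open(path, "rt", newline="") as handle:
        return read_sumstats(handle)


# In-memory demo with made-up rows (extra columns such as A1/A2 are ignored).
demo = io.StringIO(
    "SNP\tA1\tA2\tZ\tN\n"
    "rs1\tA\tG\t2.0\t1000\n"
    "rs2\tC\tT\t\t\n"
    "rs3\tG\tA\t-1.5\t1200\n"
)
rows = read_sumstats(demo)
print(len(rows), [r["chi2"] for r in rows])
```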
3. gsMap resource directory
EdgeMap requires the pre-computed LD resources from gsMap:
wget https://yanglab.westlake.edu.cn/data/gsMap/gsMap_resource.tar.gz
tar -xzf gsMap_resource.tar.gz
Expected structure after extraction:
gsMap_resource/
├── quick_mode/
│   ├── baseline/
│   │   ├── baseline.{1..22}.l2.ldscore.feather
│   │   └── baseline.{1..22}.l2.M_5_50
│   └── snp_gene_weight_matrix.h5ad
└── LDSC_resource/
    └── weights_hm3_no_hla/
        └── weights.{1..22}.l2.ldscore.gz
Resource resolution order:
1. `--resource-dir` (CLI) or `resource_dir=` (Python)
2. `EDGEMAP_RESOURCE_DIR`
3. Auto-detection at `data/gsMap_resource` relative to the installed package or source tree
For reproducibility and clarity, passing --resource-dir explicitly is recommended.
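The resolution order above amounts to a first-match lookup. The sketch below is an illustration, not EdgeMap's code: `resolve_resource_dir` and its parameters are hypothetical, and falling through to the next candidate when a path does not exist is an assumption.

```python
import os
import tempfile
from pathlib import Path


def resolve_resource_dir(explicit=None, env=None, package_root=None):
    """Return the first existing candidate directory, or raise FileNotFoundError."""
    env = os.environ if env is None else env
    candidates = []
    if explicit:
        candidates.append(Path(explicit))                      # 1. explicit argument
    if env.get("EDGEMAP_RESOURCE_DIR"):
        candidates.append(Path(env["EDGEMAP_RESOURCE_DIR"]))   # 2. environment
    if package_root is not None:
        candidates.append(Path(package_root) / "data" / "gsMap_resource")  # 3. auto
    for path in candidates:
        if path.is_dir():
            return path
    raise FileNotFoundError("gsMap resource directory not found")


# Demo in a temporary tree; a plain dict stands in for the process environment.
with tempfile.TemporaryDirectory() as root:
    auto = Path(root) / "data" / "gsMap_resource"
    auto.mkdir(parents=True)
    other = Path(root) / "elsewhere"
    other.mkdir()
    found_auto = resolve_resource_dir(env={}, package_root=root)
    found_env = resolve_resource_dir(
        env={"EDGEMAP_RESOURCE_DIR": str(other)}, package_root=root
    )
    found_explicit = resolve_resource_dir(
        explicit=str(auto), env={"EDGEMAP_RESOURCE_DIR": str(other)}
    )
```

Passing the environment as a mapping keeps the lookup easy to test without touching `os.environ`.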
Usage
Command line
edgemap \
    --st my_tissue.h5ad \
    --gwas munged_trait.sumstats.gz \
    --gwas-label "Systolic blood pressure" \
    --output results/sbp_heart \
    --resource-dir /path/to/gsMap_resource
Python API
import scanpy as sc
import edgemap
adata = sc.read_visium("/path/to/spaceranger/outs")
edgemap.run(edgemap.PipelineConfig(
    gwas_sumstats="munged_trait.sumstats.gz",
    gwas_label="Systolic blood pressure",
    output_dir="results/sbp_heart",
    resource_dir="/path/to/gsMap_resource",
), adata=adata)
# After the run, scores and run metadata can be read from the AnnData object:
adata.var["node_score"]
adata.var["edge_score"]
adata.uns["edgemap"]
For file-based workflows, pass st_h5ad instead:
results = edgemap.run(edgemap.PipelineConfig(
    st_h5ad="my_tissue.h5ad",
    gwas_sumstats="munged_trait.sumstats.gz",
    gwas_label="Systolic blood pressure",
    output_dir="results/sbp_heart",
    resource_dir="/path/to/gsMap_resource",
))
Parameters
| CLI | Python | Default | Description |
|---|---|---|---|
| `--st` | `st_h5ad` | (required) | Path to the spatial transcriptomics `.h5ad` file |
| `--gwas` | `gwas_sumstats` | (required) | Path to munged GWAS summary statistics |
| `--gwas-label` | `gwas_label` | (required) | Human-readable trait label |
| `--output` | `output_dir` | `results` | Output directory |
| `--resource-dir` | `resource_dir` | auto-detect | gsMap resource directory |
| `--k-spatial` | `spatial.k_spatial` | 6 | Number of spatial neighbors |
| `--dis-thr` | `spatial.dis_thr` | 3000 | Distance threshold in the same units as `.obsm["spatial"]` |
| `--n-blocks` | `regression.n_blocks` | 200 | Jackknife blocks for standard errors |
| `--gene-chunk-size` | `score.gene_chunk_size` | auto | Genes per node-score chunk; useful for memory control on large datasets |
| `--preprocessed` | `spatial.preprocessed` | off | Skip normalization/log1p when the input is already preprocessed |
| (none) | `spatial.min_cells_per_gene` | 10 | Minimum number of cells required for a gene to be retained before scoring |
Output
All files are written to --output (output_dir in Python).
results.json
Primary summary output. The schema is concise but not minimal; the fields below are the main ones you will usually inspect.
| Field | Meaning |
|---|---|
| `gwas_label` | Trait label used for the run |
| `st_data` | Input ST source (`.h5ad` path or `AnnData (in-memory)`) |
| `params.k_spatial`, `params.dis_thr` | Spatial graph settings |
| `params.gene_chunk_size_requested`, `params.gene_chunk_size_resolved` | Requested and effective node-score chunk size |
| `n_genes` | Number of genes retained after preprocessing |
| `n_lr_pairs_active` | Number of active LR pairs in this dataset |
| `node_edge_spearman` | Spearman correlation between node and edge scores |
| `annotation_diagnostics` | Gene/SNP mapping diagnostics for the annotation-building step |
| `regression.ell_node` | Node heritability enrichment: `tau`, `se`, `z`, `p_twosided`, `p_onesided` |
| `regression.ell_edge` | Edge heritability enrichment: `tau`, `se`, `z`, `p_twosided`, `p_onesided` |
| `regression.intercept` | S-LDSC intercept |
| `regression.n_snps`, `regression.N_bar`, `regression.M_total` | Regression metadata |
| `edge_significant` | `true` if aggregate edge `p_onesided` < 0.05 |
| `n_pairs_tested` | Number of LR pairs ranked in conditional S-LDSC (present only when generated) |
| `total_time_s` | End-to-end runtime |
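As a rough guide to how the regression fields relate to each other, the sketch below recomputes `z` and the two p-values from `tau` and `se` under a normal approximation. This is an assumption-laden illustration: the upper-tail direction of the one-sided test, the nested layout of the dictionary, and the numeric values (made up for the example, not real results) are not taken from EdgeMap itself.

```python
import math


def z_and_p(tau, se):
    """Return z = tau / se with normal-approximation one- and two-sided p-values."""
    z = tau / se
    p_onesided = 0.5 * math.erfc(z / math.sqrt(2.0))   # P(Z >= z), upper tail
    p_twosided = math.erfc(abs(z) / math.sqrt(2.0))    # P(|Z| >= |z|)
    return z, p_onesided, p_twosided


# Illustrative stand-in for a parsed results.json; values are invented.
results = {"regression": {"ell_edge": {"tau": 2.0e-8, "se": 1.0e-8}}}

edge = results["regression"]["ell_edge"]
z, p_one, p_two = z_and_p(edge["tau"], edge["se"])
edge_significant = p_one < 0.05
print(round(z, 3), round(p_one, 4), round(p_two, 4), edge_significant)
```

For a real run, the same dictionary would come from `json.load` on the `results.json` file in the output directory.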
Interpretation: a significant edge tau means trait-associated variants are enriched near genes whose spatial communication patterns are concentrated, beyond what cell-intrinsic expression specificity explains.
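For orientation, the `tau` coefficients can be read against the standard stratified LD score regression model, written here with the baseline, node, and edge annotations. This is the textbook S-LDSC form, given as a sketch; EdgeMap's exact parameterization may differ. Here $\chi^2_j$ is the GWAS statistic of SNP $j$, $N$ the sample size, $a$ a confounding term absorbed by the intercept, and $\ell(j, c)$ the LD score of SNP $j$ with respect to annotation $c$.

```latex
\mathbb{E}\left[\chi^2_j\right]
  = 1 + N a
  + N \left(
      \sum_{c \in \text{baseline}} \tau_c \, \ell(j, c)
      + \tau_{\text{node}} \, \ell_{\text{node}}(j)
      + \tau_{\text{edge}} \, \ell_{\text{edge}}(j)
    \right)
```

Under this model, a positive $\tau_{\text{edge}}$ indicates that the edge annotation explains additional signal conditional on the baseline and node annotations.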
per_pair_sldsc.csv
Generated only when the aggregate edge signal is significant. Each row is one LR pair tested conditionally against baseline + node.
| Column | Meaning |
|---|---|
| `pair` | LR pair label (for example `VEGFA-FLT1`) |
| `tau` | Pair-specific heritability coefficient |
| `se` | Block-jackknife standard error |
| `z` | Ranking score (`tau / se`) |
Use z for ranking, not for calibrated significance testing. Per-pair annotations are extremely sparse, so the normal approximation for z is not reliable here; formal per-pair significance requires empirical calibration.
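Ranking pairs by `z` takes only a few lines with the standard library. The sketch below assumes a comma-separated file with the columns listed above; `rank_pairs` is a hypothetical helper, and the pair labels and numbers in the demo are placeholders, not results.

```python
import csv
import io


def rank_pairs(handle, top=10):
    """Return the `top` (pair, z) tuples sorted by descending z."""
    rows = list(csv.DictReader(handle))
    for row in rows:
        row["z"] = float(row["z"])
    rows.sort(key=lambda r: r["z"], reverse=True)
    return [(r["pair"], r["z"]) for r in rows[:top]]


# In-memory demo with invented rows; for a real run, open per_pair_sldsc.csv.
demo = io.StringIO(
    "pair,tau,se,z\n"
    "LIG1-REC1,1e-9,2e-9,0.5\n"
    "LIG2-REC2,6e-9,2e-9,3.0\n"
    "LIG3-REC3,-2e-9,2e-9,-1.0\n"
)
ranking = rank_pairs(demo, top=2)
print(ranking)
```

The output is an ordering only; as noted above, the `z` values should not be converted to p-values without empirical calibration.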
lr_pair_stats.json
Communication diagnostics for all active LR pairs.
| Field | Meaning |
|---|---|
| `mean_comm` | Mean communication intensity across cells |
| `n_active_cells` | Number of cells with nonzero communication |
| `pair_score` | Spatial specificity score for that LR pair |
Repository scope
This public repository intentionally contains only the EdgeMap Python package. Large resources, local analyses, manuscript assets, and figure-generation workflows are not part of the tracked public package tree.
Troubleshooting
| Error | Fix |
|---|---|
| `h5ad must contain .obsm['spatial']` | Ensure spatial coordinates are present in the AnnData object |
| `Expression values look pre-processed` | Provide raw counts, or set `--preprocessed` |
| `gsMap resource directory not found` | Set `EDGEMAP_RESOURCE_DIR` or pass `--resource-dir` |
| No `per_pair_sldsc.csv` in output | Expected when the aggregate edge signal is not significant |
Citation
If you use EdgeMap, please cite:
Yang C, Zhang X, Chen J. Intercellular communication is a heritable dimension of human tissue architecture. bioRxiv. 2026. doi: 10.64898/2026.03.29.715138.
License
MIT