No-filtering, attribute-back analysis of high-dimensional omics: keep every feature, read it with a configurable backbone over multiple arrangements, and attribute the result to named features ('gene chords').
DeepMapper
No-filtering, attribute-back analysis of high-dimensional omics. DeepMapper keeps every feature (no highly-variable-gene selection, no dimension reduction), reads the full matrix with a configurable backbone over several feature arrangements, and attributes each result back to named features. This surfaces distributed gene chords: sets of genes that separate a cell state only in combination, and that standard pipelines discard during feature filtering. An optional de-novo step recovers transcripts absent from the reference annotation.
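To make the "only together" idea concrete, here is a toy illustration. It is not DeepMapper code: the XOR construction and the names `gene_a`, `gene_b`, and `mean_diff` are assumptions made for the example. Two binary genes define the state through their combination, so each gene alone shows no difference between states, while the pair read together separates them completely.

```python
import itertools

# Hypothetical toy data: every on/off combination of two genes, repeated.
# The state is 1 exactly when one gene is on and the other is off (XOR).
cells = list(itertools.product([0, 1], repeat=2)) * 25
states = [a ^ b for a, b in cells]

def mean_diff(values, labels):
    """Difference in mean value between state 1 and state 0."""
    on = [v for v, s in zip(values, labels) if s == 1]
    off = [v for v, s in zip(values, labels) if s == 0]
    return sum(on) / len(on) - sum(off) / len(off)

gene_a = [a for a, _ in cells]
gene_b = [b for _, b in cells]
pair = [a ^ b for a, b in cells]  # the two genes read in combination

# Each gene alone has the same mean in both states, so a per-gene
# filter would see nothing worth keeping.
print(mean_diff(gene_a, states))  # 0.0
print(mean_diff(gene_b, states))  # 0.0

# The combination differs maximally between the states.
print(mean_diff(pair, states))    # 1.0
```

A per-gene filter applied to this data would drop both genes, which is the situation the no-filtering design is meant to avoid.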
Install
pip install pydeepmapper # core engine: run(X, y) + linear/mlp/cnn + attribution
pip install "pydeepmapper[io]" # + 10x / h5ad loaders (scanpy, anndata)
pip install "pydeepmapper[backbones]" # + ResNet / ViT / timm backbones
pip install "pydeepmapper[denovo]" # + de-novo ingestion (biopython; external tools below)
pip install "pydeepmapper[all]" # everything
From a clone:
git clone https://github.com/tansel/deepmapper.git
cd deepmapper
pip install -e ".[dev]"
python -m pytest
Quickstart
import numpy as np
from pydeepmapper.config import DeepMapperConfig, BackboneSpec
from pydeepmapper.runner import run
X = ... # (n_cells, n_genes) expression matrix, UNFILTERED (no HVG, no PCA)
y = ... # integer class labels, shape (n_cells,)
genes = [...] # feature names, len == X.shape[1]
cfg = DeepMapperConfig(n_passes=3, backbone=BackboneSpec(kind="cnn_small"))
findings = run(X, y, cfg, feature_names=genes)
for name, freq, importance in findings.ranking(20):
    print(name, round(freq, 3), round(importance, 4)) # the gene chord, ranked
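For a quick smoke test, the `X`, `y`, and `genes` placeholders can be filled with synthetic data. This is a minimal sketch, assuming NumPy is installed. The sizes, the Poisson counts, and the gene names are made up, and whether `run` expects raw counts or normalised values is not stated here.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded for repeatability
n_cells, n_genes = 200, 500             # arbitrary toy sizes (assumption)

# Synthetic count-like matrix: one row per cell, one column per gene.
X = rng.poisson(lam=1.0, size=(n_cells, n_genes)).astype(np.float32)
y = rng.integers(0, 2, size=n_cells)    # two made-up classes, integer labels
genes = [f"gene_{i}" for i in range(n_genes)]  # placeholder feature names

# The shapes must agree with the comments in the Quickstart.
assert X.shape == (n_cells, n_genes)
assert y.shape == (n_cells,)
assert len(genes) == X.shape[1]
```

These arrays can then be passed to `run(X, y, cfg, feature_names=genes)` as in the Quickstart. The labels are random, so the resulting ranking only confirms that the pipeline runs end to end.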
Swap the backbone with BackboneSpec(kind=...): linear, mlp, cnn_small
(default), cnn, resnet18, vit_cct, timm:<name>, or conv_vae. For a fast,
exact linear check:
from pydeepmapper import linear_baseline
clf, scaler, ranking = linear_baseline.fit(X, y, feature_names=genes)
How it works
- Keep every feature. No HVG, no PCA.
- Lay the feature vector out as a small pseudo-image, one pixel per feature, under a seeded permutation.
- Train the backbone, attribute per pixel, project back to per-feature importance.
- Repeat over N arrangements and accumulate into a ranking plus stability statistics.
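The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: the square zero-padded layout, the helper names (`layout`, `to_image`, `project_back`, `accumulate`), and the stand-in attribution (absolute pixel value) are all hypothetical. In DeepMapper the per-pixel attribution comes from a trained backbone, and the stability statistics are omitted here.

```python
import math
import random

def layout(n_features, seed):
    """Seeded permutation: pixel position p holds feature perm[p]."""
    perm = list(range(n_features))
    random.Random(seed).shuffle(perm)
    side = math.isqrt(n_features - 1) + 1  # smallest square that fits all features
    return perm, side

def to_image(x, perm, side):
    """Place each feature at its permuted pixel; pad unused pixels with zeros."""
    flat = [x[f] for f in perm] + [0.0] * (side * side - len(perm))
    return [flat[r * side:(r + 1) * side] for r in range(side)]

def project_back(pixel_scores, perm):
    """Map a per-pixel score image back to per-feature scores (padding dropped)."""
    flat = [v for row in pixel_scores for v in row]
    scores = [0.0] * len(perm)
    for p, f in enumerate(perm):
        scores[f] = flat[p]
    return scores

def accumulate(x, n_passes, attribute):
    """Average per-feature scores over several seeded arrangements."""
    total = [0.0] * len(x)
    for seed in range(n_passes):
        perm, side = layout(len(x), seed)
        image = to_image(x, perm, side)
        per_feature = project_back(attribute(image), perm)
        total = [t + s for t, s in zip(total, per_feature)]
    return [t / n_passes for t in total]

def toy_attr(image):
    """Stand-in attribution (assumption): score each pixel by its absolute value."""
    return [[abs(v) for v in row] for row in image]

x = [0.0, 3.0, -1.0, 0.5, 2.0]  # one cell, five features
scores = accumulate(x, n_passes=3, attribute=toy_attr)
ranking = sorted(range(len(x)), key=lambda f: -scores[f])
print(ranking)  # [1, 4, 2, 3, 0]
```

Because the permutation is inverted in `project_back`, the scores land on the same named feature in every pass, whatever pixel that feature occupied. That is what lets results from different arrangements be accumulated into one ranking.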
Documentation
The full docs site is at https://tansel.github.io/deepmapper/ (built with MkDocs Material; the API reference is generated from the docstrings). In the repo:
- User manual is the full guide: configuration, data loading, evaluation, attribution, early stopping, de-novo recovery, and the package layout.
- Reproducibility map maps every figure to its script and dataset.
- Data sources lists the dataset accessions.
- bench/ holds the analysis scripts.
Build the docs locally with pip install -e ".[docs]" then mkdocs serve.
Using DeepMapper with Claude
The repo ships a Claude Code skill at
.claude/skills/deepmapper/SKILL.md. Open the repo in Claude Code and ask, for
example, "analyse this h5ad with DeepMapper and tell me which genes separate the
states". Claude loads your data, runs the linear baseline and the pipeline, and reads
back the ranked gene chord. See docs/claude-skill.md.
Reproducing the analyses
pip install -e ".[all]"
bash scripts/download_data.sh
python bench/ribosomal_validation.py # Fig 1, and so on (see docs/REPRODUCIBILITY.md)
Citation
If you use DeepMapper, please cite the peer-reviewed article (see
CITATION.cff):
Ersavas T., Smith M.A., Mattick J.S. (2024). Novel applications of Convolutional Neural Networks in the age of Transformers. Scientific Reports 14. https://doi.org/10.1038/s41598-024-60709-z
To cite this software release specifically, use the archived version on Zenodo (the concept DOI always resolves to the latest release): https://doi.org/10.5281/zenodo.20967454
License
Licensed under the Apache License, Version 2.0; see LICENSE and
NOTICE. Apache-2.0 is permissive (commercial use, modification, and
redistribution are allowed) and includes an explicit patent grant from contributors covering their contributions.