Accurate and fast cell marker gene identification with COSG
Overview
COSG is a cosine similarity-based method for more accurate and scalable marker gene identification.
COSG is a general method for cell marker gene identification across different data modalities, e.g., scRNA-seq, scATAC-seq, and spatially resolved transcriptome data.
Marker genes or genomic regions identified by COSG are more indicative and have greater cell-type specificity.
COSG is ultrafast for large-scale datasets and is capable of identifying marker genes for one million cells in less than two minutes.
The method and benchmarking results are described in Dai et al. (2022).
Additionally, the R version of COSG is available here.
Note I: we released our Python toolkit, PIASO, in which some methods were built upon COSG.
Note II: we have also recently released PIASOmarkerDB for beta testing.
Note III: COSG is also available for online analysis via the Galaxy platform.
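To make "cosine similarity-based" more concrete, here is a minimal sketch of one way to compare each gene's expression vector with an idealized indicator vector for each cell group (1 for cells in the group, 0 elsewhere). The function name `cosine_to_group_indicators` and the toy data are hypothetical and written for illustration only. This is not COSG's implementation, and it leaves out the full scoring described in Dai et al. (2022), including the `mu` parameter.

```python
import numpy as np

def cosine_to_group_indicators(X, labels):
    """Cosine similarity between each gene and each group's indicator vector.

    X      : (n_cells, n_genes) non-negative expression matrix
    labels : length-n_cells sequence of group labels
    Returns (groups, sims), where sims has shape (n_genes, n_groups).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    # One indicator column per group: 1 for cells in the group, 0 otherwise.
    indicators = (labels[:, None] == groups[None, :]).astype(float)
    # Dot product of every gene vector with every indicator vector.
    dots = X.T @ indicators
    gene_norms = np.linalg.norm(X, axis=0)
    indicator_norms = np.linalg.norm(indicators, axis=0)
    denom = np.outer(gene_norms, indicator_norms)
    # Genes with no expression at all get similarity 0 instead of NaN.
    sims = np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)
    return groups, sims

# Toy data (made up): 4 cells, 2 per group, and 3 genes.
X = np.array([
    [5.0, 1.0, 0.0],
    [4.0, 1.0, 0.0],
    [0.0, 1.0, 3.0],
    [0.0, 1.0, 2.0],
])
labels = ["A", "A", "B", "B"]
groups, sims = cosine_to_group_indicators(X, labels)
print(groups)
print(sims.round(3))
```

In this toy example the first gene is expressed only in group A and the third only in group B, so each has a high similarity to its own group's indicator and zero similarity to the other. The second gene is expressed uniformly, so its similarity is identical for both groups and it would not be a useful marker.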
Documentation
Installation
Stable version (PyPI):
pip install cosg
Stable version (bioconda):
conda install -c conda-forge -c bioconda cosg
Development version:
pip install git+https://github.com/genecell/COSG.git
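After installing, it can be handy to confirm which version ended up in the active environment, for example to check it against the release notes. The sketch below uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of COSG.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Prints the installed COSG version, or None if the package is not installed.
print(installed_version("cosg"))
```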
Release notes
Release v1.0.3 (March 11, 2025)
Fixed an incompatibility between the multiple index columns of adata.uns['cosg']['COSG'] and the adata.write function
Enhanced the plotMarkerDendrogram function with several new capabilities:
Implemented support for customized cell type-gene pairs
Added color control for nodes and edges
Added cell type filtering functionality
Integrated support for curved edges in visualization
Release v1.0.2 (March 5, 2025)
Added plotMarkerDotplot and plotMarkerDendrogram for enhanced marker gene visualization.
Introduced support for batch_key to compute cosine similarities separately across different batches.
Enabled calculation of normalized COSG scores for comparing gene expression specificity across cell types or datasets.
Resolved a SciPy version deprecation issue related to .A attribute usage.
Fixed a DataFrame manipulation warning.
Added verbosity control, allowing users to adjust log output levels.
Release v1.0.1 (June 15, 2021)
First release on PyPI.
Example
Run COSG:
import cosg

# adata is assumed to be an AnnData object whose adata.obs contains the 'CellTypes' column
n_genes = 30
groupby = 'CellTypes'
cosg.cosg(
    adata,
    key_added='cosg',
    # use_raw=False, layer='log1p',  ## e.g., if you want to use the log1p layer in adata
    mu=100,
    expressed_pct=0.1,
    remove_lowly_expressed=True,
    n_genes_user=n_genes,
    groupby=groupby
)
Draw the dot plot:
cosg.plotMarkerDotplot(
    adata,
    groupby=groupby,
    top_n_genes=3,
    key_cosg='cosg',
    use_rep='X_pca',  ## Change use_rep to the cell embeddings key you'd like to use
    swap_axes=False,
    standard_scale='var',
    cmap='Spectral_r',
    # save='test.pdf'
)
Output the marker list as pandas dataframe:
import pandas as pd

marker_gene = pd.DataFrame(adata.uns['cosg']['names'])
marker_gene.head()
You could also check the COSG scores:
marker_gene_scores=pd.DataFrame(adata.uns['cosg']['scores'])
marker_gene_scores.head()
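The names and scores tables are both wide, with one column per group and one row per rank, so for downstream use it can help to combine them into a single long table or a per-group gene list. The sketch below assumes the two tables share the same columns in the same order. The small DataFrames stand in for `marker_gene` and `marker_gene_scores`, the group labels and scores are made up, and `to_long_table` is a hypothetical helper, not a COSG function.

```python
import pandas as pd

# Hypothetical stand-ins for the two tables: one column per group, one row per rank.
# Group labels and score values are invented for illustration.
marker_gene = pd.DataFrame({"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]})
marker_gene_scores = pd.DataFrame({"T cell": [0.9, 0.8], "B cell": [0.95, 0.7]})

def to_long_table(names, scores):
    """Stack the wide name and score tables into one row per (group, rank)."""
    long_names = names.melt(var_name="group", value_name="gene")
    long_scores = scores.melt(var_name="group", value_name="score")
    out = long_names.copy()
    # Both melts walk the columns in the same order, so rows line up positionally.
    out["score"] = long_scores["score"].to_numpy()
    # Rank within each group, starting at 1, following the original row order.
    out["rank"] = out.groupby("group").cumcount() + 1
    return out[["group", "rank", "gene", "score"]]

long_table = to_long_table(marker_gene, marker_gene_scores)

# Dictionary mapping each group to its ordered list of marker genes.
top_markers = long_table.groupby("group", sort=False)["gene"].apply(list).to_dict()
print(long_table)
print(top_markers)
```

The long table is convenient for filtering by score or exporting to CSV, while the dictionary form is convenient for passing gene lists to plotting functions.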
Question
For questions about the code and tutorial, please contact Min Dai, dai@broadinstitute.org.
Citation
If COSG is useful for your research, please consider citing Dai et al. (2022).