GeneClust: cofunctional grouping-based feature gene selection for unsupervised scRNA-seq clustering

GeneClust is a computational feature selection method for scRNA-seq cell clustering. It groups genes into clusters, then evaluates and selects genes from those clusters with the aim of maximizing relevance, minimizing redundancy, and preserving complementarity.

Dependencies

  • numpy>=1.21.5
  • pandas>=1.4.2
  • anndata>=0.8.0
  • setuptools>=59.5.0
  • loguru>=0.6.0
  • sklearn>=0.0
  • scikit-learn>=1.1.1
  • scanpy>=1.9.1
  • scipy>=1.7.3
  • leidenalg>=0.8.9
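
To check an existing environment against the minimum versions listed above, a small script along the following lines may help. It is only a sketch: the helper names (`parse_version`, `installed_version`, `check_dependencies`) are made up for this example and are not part of GeneClust, the version comparison is a simplified numeric one, and the `sklearn>=0.0` entry is skipped because it carries no meaningful minimum.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUM_VERSIONS = {
    "numpy": "1.21.5",
    "pandas": "1.4.2",
    "anndata": "0.8.0",
    "setuptools": "59.5.0",
    "loguru": "0.6.0",
    "scikit-learn": "1.1.1",
    "scanpy": "1.9.1",
    "scipy": "1.7.3",
    "leidenalg": "0.8.9",
}


def parse_version(text):
    """Turn '1.21.5' into (1, 21, 5); stop at the first part with no leading digits."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        found = installed_version(name)
        if found is None:
            report[name] = "missing"
        elif parse_version(found) >= parse_version(minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(f"{name}: {status}")
```

The numeric comparison ignores pre-release suffixes such as `rc1`, so treat a borderline "ok" with some caution.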

Installation

  1. PyPI

You can install the package directly from PyPI with pip.

  2. GitHub

Alternatively, you can download the package from GitHub and install it locally:

git clone https://github.com/ToryDeng/scGeneClust.git
cd scGeneClust/
python3 setup.py install --user

Two Versions of GeneClust

Version         Usage scenarios
GeneClust-ps    1. Number of cells is small (e.g., several thousand)
                2. Cell clustering performance is more important
GeneClust-fast  1. Number of cells is large (e.g., over 50,000)
                2. Computational efficiency is more important
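
The guidance above can be turned into a simple rule of thumb. The helper below is hypothetical (it is not part of the package), and the 50,000-cell cutoff for in-between dataset sizes is an assumption taken from the "large" example in the table; adjust it to your own needs.

```python
def suggest_version(n_cells, prioritize_speed=False, large_cutoff=50_000):
    """Suggest a value for the `version` argument of scGeneClust.

    Returns 'fast' when speed matters most or the dataset is large,
    and 'ps' otherwise. The cutoff is an assumption, not a package default.
    """
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    if prioritize_speed or n_cells > large_cutoff:
        return "fast"
    return "ps"


if __name__ == "__main__":
    print(suggest_version(3_000))    # a few thousand cells
    print(suggest_version(120_000))  # well above the cutoff
```

The returned string matches the `version='fast'` and `version='ps'` values used in the example code.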

Example Code

from scGeneClust.utils import load_PBMC3k
from scGeneClust import scGeneClust

# load the PBMC3k dataset
raw_adata = load_PBMC3k()
# GeneClust-fast
selected_genes = scGeneClust(raw_adata, version='fast', n_gene_clusters=200, random_stat=2022, verbosity=2)
# GeneClust-ps
selected_genes = scGeneClust(raw_adata, version='ps', n_cell_clusters=7, scale=1000, top_percent_relevance=5, random_stat=2022, verbosity=2)

GeneClust expects raw (unnormalized) counts as input. It returns a NumPy ndarray of the selected features, which can be used for downstream cell clustering.
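
As an illustration of the two points above, the sketch below checks that a matrix looks like raw counts and then restricts it to a set of selected genes. It uses plain NumPy arrays as stand-ins for an AnnData object, assumes the selected features are gene names, and the helper names (`is_raw_counts`, `subset_to_selected`) are invented for this example rather than taken from the package.

```python
import numpy as np


def is_raw_counts(matrix):
    """Return True if every entry is a non-negative whole number."""
    values = np.asarray(matrix)
    return bool(np.all(values >= 0) and np.all(values == np.floor(values)))


def subset_to_selected(matrix, gene_names, selected_genes):
    """Keep only the columns (genes) whose names appear in selected_genes."""
    names = np.asarray(gene_names)
    keep = np.isin(names, np.asarray(selected_genes))
    return np.asarray(matrix)[:, keep], names[keep]


# Toy stand-in data: 3 cells x 4 genes of raw counts.
counts = np.array([[0, 3, 1, 0],
                   [2, 0, 0, 5],
                   [1, 1, 4, 0]])
genes = np.array(["CD3D", "MS4A1", "LYZ", "NKG7"])
selected = np.array(["LYZ", "CD3D"])  # placeholder for GeneClust's output

if is_raw_counts(counts):
    reduced, kept = subset_to_selected(counts, genes, selected)
    print(kept, reduced.shape)
```

With an AnnData object, the same subsetting is usually written as `adata[:, selected_genes]` before running the clustering workflow of your choice.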

Download files

Source Distribution

GeneClust-0.0.1.tar.gz (23.2 kB)

Built Distribution

GeneClust-0.0.1-py3-none-any.whl (26.5 kB)

File details

Details for the file GeneClust-0.0.1.tar.gz.

File metadata

  • Download URL: GeneClust-0.0.1.tar.gz
  • Upload date:
  • Size: 23.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for GeneClust-0.0.1.tar.gz
Algorithm Hash digest
SHA256 19755d635c699c86c16b8b86b0c9633ad589ba0f0389a377d767ec498aad1fcd
MD5 c28f7f999786e1848ac50709de925f6c
BLAKE2b-256 9a988225b4f191735ae093cab6c1d78408938c77f21c5d3e36734385b986a71d

File details

Details for the file GeneClust-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: GeneClust-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 26.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for GeneClust-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 236afee59c0bd635bae336fe3033085102283d359872d85b3e8b9cc1e7827b5c
MD5 89a6a9bb9b727183ae45934d61944c2a
BLAKE2b-256 bf045e9786d8a914f2131825f91914020bb39cdb3a0c5d9196d70fa88df07f3d
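
A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib` module. This is a minimal sketch: the function names are illustrative, and the local path you pass in is whatever location you saved the file to.

```python
import hashlib

# SHA256 digests copied from the "File hashes" tables above.
EXPECTED_SHA256 = {
    "GeneClust-0.0.1.tar.gz":
        "19755d635c699c86c16b8b86b0c9633ad589ba0f0389a377d767ec498aad1fcd",
    "GeneClust-0.0.1-py3-none-any.whl":
        "236afee59c0bd635bae336fe3033085102283d359872d85b3e8b9cc1e7827b5c",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Compare a local file against the published digest for `filename`."""
    return sha256_of_file(path) == EXPECTED_SHA256[filename]
```

For example, `matches_published_hash("GeneClust-0.0.1.tar.gz", "GeneClust-0.0.1.tar.gz")` should return True for an intact download saved in the current directory.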
