
An eXplainable Cell-specific machine learning method to predict clinical Phenotypes using single-cell multi-omics

Project description

Documentation: https://pycellphenox.readthedocs.io/

Getting Started...

Here, we introduce CellPhenoX, an eXplainable machine learning method to identify cell-specific phenotypes that influence clinical outcomes for single-cell data. CellPhenoX integrates robust classification models, explainable AI techniques, and a statistical covariate framework to generate interpretable, cell-specific scores that uncover cell populations associated with a clinical phenotype of interest.

Figure 1. CellPhenoX takes cell neighborhood co-abundance embeddings, Xi, across samples and a clinical variable Y as inputs. By applying an adapted SHAP framework for classification models, CellPhenoX generates Interpretable Scores that quantify the contribution of each feature Xi, along with covariates and interaction term Xi, to the prediction of a clinically relevant phenotype Y. The results are visualized at the single-cell level, showing Interpretable Scores in low-dimensional space, correlated cell type annotations, and associated marker genes.

You can install pyCellPhenoX from PyPI:

pip install pyCellPhenoX

You can also install it from GitHub:

# clone the repository, then install pyCellPhenoX from the local checkout
git clone git@github.com:fanzhanglab/pyCellPhenoX.git
cd pyCellPhenoX
pip install .

Dependencies / Requirements

When using pyCellPhenoX, please make sure your environment meets the following version requirements:

python = "^3.9"
pandas = "^2.2.3"
numpy = "^1.26"
xgboost = "^2.1.1"
numba = ">=0.54"
scikit-learn = "^1.5.2"
matplotlib = "^3.9.2"
statsmodels = "^0.14.3"
fasttreeshap = "0.1.6"
shap = "^0.45"
met-brewer = "^1.0.2"
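To compare an existing environment against the list above, the installed versions can be read with the Python standard library. This is a minimal sketch and not part of pyCellPhenoX: the helper name `installed_versions` is hypothetical, and it only reports versions; it does not enforce the constraints.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the requirements list above.
REQUIRED = [
    "pandas", "numpy", "xgboost", "numba", "scikit-learn",
    "matplotlib", "statsmodels", "fasttreeshap", "shap", "met-brewer",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any package reported as "not installed", or at a version outside the ranges above, should be installed or upgraded before running pyCellPhenoX.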

Tutorials

Please see the Command-line Reference for details. Additionally, please see the Vignettes on the documentation page.

API

pyCellPhenoX has four major functions that are part of the main object:

  1. split_data() - Split the data into training, testing, and validation sets
  2. model_train_shap_values() - Train the model using a nested cross-validation strategy and generate SHAP values for each fold/CV repeat
  3. get_shap_values() - Aggregate SHAP values for each sample
  4. get_intepretable_score() - Calculate the interpretable score based on SHAP values.
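The last two steps above can be illustrated with a small, library-independent sketch. It does not call pyCellPhenoX and is not its implementation: the helper names `aggregate_shap` and `interpretable_score` are hypothetical, the SHAP values are toy numbers, and defining the score as the sum of a sample's averaged SHAP values is an assumption made for illustration. See the package documentation for the actual definition.

```python
from collections import defaultdict
from statistics import mean


def aggregate_shap(records):
    """Average SHAP vectors per sample across folds/repeats.

    records: iterable of (sample_id, shap_vector) pairs; a sample may
    appear once per CV repeat. Returns {sample_id: averaged_vector}.
    """
    by_sample = defaultdict(list)
    for sample_id, vector in records:
        by_sample[sample_id].append(vector)
    return {
        sample_id: [mean(column) for column in zip(*vectors)]
        for sample_id, vectors in by_sample.items()
    }


def interpretable_score(aggregated):
    """Assumed score: the sum of a sample's averaged SHAP values."""
    return {sample_id: sum(vector) for sample_id, vector in aggregated.items()}


# Toy SHAP values for two cells, each scored in two CV repeats.
records = [
    ("cell_a", [0.2, -0.1]),
    ("cell_b", [-0.3, 0.1]),
    ("cell_a", [0.4, 0.1]),
    ("cell_b", [-0.1, 0.1]),
]
aggregated = aggregate_shap(records)
scores = interpretable_score(aggregated)
```

Averaging before summing keeps one value per feature for each cell, so per-feature contributions remain available alongside the single per-cell score.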

Additional major functions associated with pyCellPhenoX are:

  1. marker_discovery() - Identify markers correlated with the discriminatory power of the Interpretable Score.
  2. nonNegativeMatrixFactorization() - Perform non-negative matrix factorization (NMF)
  3. preprocessing() - Prepare the data to be in the correct format for CellPhenoX
  4. principleComponentAnalysis() - Perform principal component analysis (PCA)
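For readers unfamiliar with the dimensionality reduction step, the following sketch shows what a PCA embedding computes, using NumPy only. It is not the package's principleComponentAnalysis() function, whose arguments are described in the documentation; the helper name `pca_embed` and the random toy matrix are assumptions for illustration.

```python
import numpy as np


def pca_embed(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    embedding = centered @ Vt[:n_components].T
    explained = (S ** 2) / (S ** 2).sum()  # variance ratio per component
    return embedding, explained[:n_components]


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))  # toy matrix: 50 cells x 5 features
embedding, explained = pca_embed(X, n_components=2)
```

The resulting low-dimensional matrix plays the role of the latent features Xi that a classifier would take as input.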

Each function has unique arguments; see our documentation for more information.

License

Distributed under the terms of the MIT license, pyCellPhenoX is free and open source software.

Code of Conduct

For more information, please see the Code of Conduct or the Code of Conduct Documentation.

Contributing

For more information, please see Contributing or the Contributing Documentation.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Citation

If you have used pyCellPhenoX in your project, please use the citation below:

Young, J., Inamo, J., Caterer, Z., Krishna, R., Zhang, F. CellPhenoX: An eXplainable Cell-specific machine learning method to predict clinical Phenotypes using single-cell multi-omics, bioRxiv 2025.01.24.634132; doi: https://doi.org/10.1101/2025.01.24.634132

Contact

Please contact fanzhanglab@gmail.com with further questions or potential collaborative opportunities!
