MESA
Multimodal Epigenetic Sequencing Analysis (MESA) is a flexible and sensitive method of capturing and integrating multimodal epigenetic information of cfDNA using a single experimental assay.
@ Modified by: Chaorong Chen
@ Modified time: 2023-02-11 02:27:49
For the original MESA paper, please refer to this tutorial: https://rpubs.com/LiYumei/926228.
Dependencies
- Python >=3.6
- deepTools
- bedtools
- DANPOS2
- BSMAP
- UCSC tools
- Python packages:
  - pandas
  - numpy
  - scikit-learn == 0.24.2
  - joblib
  - itertools (part of the Python standard library; no installation needed)
  - boruta_py
  - deep-forest
Installation
Clone the repository with git:
git clone https://github.com/ChaorongC/MESA
cd MESA
Or download the repository with wget:
wget https://github.com/ChaorongC/MESA/archive/refs/heads/main.zip
unzip main.zip
cd MESA-main
Usage
The Python script MESA.py in the root directory is the main program for MESA.
The function MESA_single() in MESA.py analyzes a single type of feature. The function MESA_integration() combines the results from different types of features and returns the multimodal prediction result.
Example
Check the Jupyter notebook demo.ipynb for a tutorial on how to run MESA.
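The following sketch is not taken from demo.ipynb. It only illustrates the input layout described under Parameters: features as rows, samples as columns, and one label per sample. The sample and feature names and the random values are placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical identifiers, used only for illustration.
sample_ids = [f"sample_{i}" for i in range(6)]
feature_ids = [f"region_{j}" for j in range(4)]

# X follows the documented layout: shape (n_features, n_samples).
X = pd.DataFrame(
    rng.random((len(feature_ids), len(sample_ids))),
    index=feature_ids,
    columns=sample_ids,
)

# y holds one label per sample, in the same order as the columns of X
# (0 = normal/negative, 1 = cancer/positive).
y = np.array([0, 0, 0, 1, 1, 1])

# scikit-learn estimators expect (n_samples, n_features), so any code that
# calls them directly on this layout has to transpose first.
X_sklearn = X.T

print(X.shape, X_sklearn.shape)
```

Keeping sample identifiers as column labels makes it easier to check that y and every feature matrix list the samples in the same order.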
Parameters
MESA_single(X,
y,
estimator,
classifiers=[],
cv=5,
random_state=0,
min_feature=10,
n_jobs=-1,
scoring='roc_auc',
boruta_top_n_feature=1000)
X : dataframe of shape (n_features, n_samples)
Input samples. A matrix containing features as rows with samples as columns.
y : array-like of shape (n_samples,)
Target values/labels/stages. Usually, we use 0 and 1 for 'normal/negative' and 'cancer/positive' samples.
estimator : estimator object/model implementing 'fit'
The object used to fit the data. A model that is used to evaluate feature subsets in each iteration of sequential backward selection.
classifiers : list of estimator objects/models implementing 'fit' and 'predict_proba'
The models used for the final evaluation. Each is trained on the final selected feature subset and then evaluated on the test set.
cv : int, cross-validation generator or an iterable, default=5
(Adopted from sklearn.model_selection.cross_val_score.) Determines the cross-validation splitting strategy. Possible inputs for cv are: None, to use the default 5-fold cross-validation; an int, to specify the number of folds in a (Stratified)KFold; a CV splitter; or an iterable yielding (train, test) splits as arrays of indices.
random_state : int, RandomState instance or None, default=0
Controls the pseudo random number generation for shuffling the data.
min_feature : int, default=10
The minimum feature subset size that sequential backward selection (SBS) should consider.
n_jobs : int, default=-1
Number of jobs to run in parallel. When evaluating a new feature to add or remove, the cross-validation procedure is parallel over the folds. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
scoring : str or callable, default='roc_auc'
For the SBS process, a str (see the scikit-learn model evaluation documentation) or a scorer callable object/function with signature scorer(estimator, X, y) that returns a single value. Compatible with sklearn.model_selection.cross_val_score.
boruta_top_n_feature : int, default=1000
Number of features that the Boruta step passes on to SBS. Features are first ranked by Boruta, and the top-ranked ones are then passed to SBS for further selection.
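To make the roles of estimator, cv, scoring and min_feature concrete, here is a minimal sketch of sequential backward selection built on cross_val_score. It is an illustration under stated assumptions, not the code in MESA.py: the helper backward_selection is hypothetical, the Boruta ranking step is omitted, and the data are synthetic.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score


def backward_selection(X, y, estimator, cv=5, min_feature=10,
                       scoring="roc_auc", n_jobs=None):
    """Greedy backward selection (hypothetical helper, not MESA's code).

    X has shape (n_features, n_samples). Returns the best-scoring feature
    subset seen during the search and its mean cross-validated score.
    """
    data = X.T  # scikit-learn expects samples as rows
    current = list(data.columns)

    def cv_score(features):
        return cross_val_score(estimator, data[features], y, cv=cv,
                               scoring=scoring, n_jobs=n_jobs).mean()

    best_subset, best_score = current, cv_score(current)
    while len(current) > min_feature:
        # Score every subset obtained by dropping exactly one feature.
        trials = [[f for f in current if f != dropped] for dropped in current]
        scores = [cv_score(trial) for trial in trials]
        top = max(range(len(trials)), key=scores.__getitem__)
        current = trials[top]
        # On ties, prefer the smaller subset.
        if scores[top] >= best_score:
            best_subset, best_score = current, scores[top]
    return best_subset, best_score


# Synthetic data, arranged as features x samples.
values, y = make_classification(n_samples=60, n_features=8,
                                n_informative=3, random_state=0)
X = pd.DataFrame(values.T, index=[f"feature_{i}" for i in range(8)])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
selected, score = backward_selection(X, y, LogisticRegression(max_iter=1000),
                                     cv=cv, min_feature=3)
print(len(selected), round(score, 3))
```

Each pass costs one cross-validation run per remaining feature, which is one reason to shrink the candidate set before running SBS on large feature matrices.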
MESA_integration(X_list,
y,
feature_selected,
classifiers)
X_list : list of dataframes of shape (n_features, n_samples)
Input samples, one dataframe per type of feature. Each dataframe is a matrix containing features as rows with samples as columns.
y : array-like of shape (n_samples,)
Target values/labels/stages. Usually, we use 0 and 1 for 'normal/negative' and 'cancer/positive' samples.
feature_selected : list of tuples, of length n_samples
Features selected in each leave-one-out (LOO) iteration, in the same order as X_list.
classifiers : list of estimator objects/models implementing 'fit' and 'predict_proba'
The models used for the final evaluation on the test set.
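The sketch below shows one way per-modality predictions could be combined under leave-one-out. It rests on assumptions the text above does not confirm: that feature_selected[i] holds one collection of feature names per modality for LOO iteration i, and that modalities are combined by averaging the predicted probabilities. The helper integrate_modalities is hypothetical and is not MESA_integration().

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut


def integrate_modalities(X_list, y, feature_selected, classifier):
    """Average per-modality LOO probabilities (assumed combination rule)."""
    y = np.asarray(y)
    probs = np.zeros(len(y))
    for i, (train, test) in enumerate(LeaveOneOut().split(y)):
        per_modality = []
        # Assumed layout: feature_selected[i] has one entry per modality.
        for X, feats in zip(X_list, feature_selected[i]):
            data = X.loc[list(feats)].T  # samples as rows
            model = clone(classifier).fit(data.iloc[train], y[train])
            # Probability of class 1 for the single held-out sample.
            per_modality.append(model.predict_proba(data.iloc[test])[0, 1])
        probs[test] = np.mean(per_modality)
    return probs


# Synthetic data: one label vector, ten features split into two "modalities".
values, y = make_classification(n_samples=20, n_features=10,
                                n_informative=4, random_state=0)
names = [f"feature_{i}" for i in range(10)]
full = pd.DataFrame(values.T, index=names)
X_list = [full.iloc[:5], full.iloc[5:]]

# Placeholder selection: the same three features per modality in every
# LOO iteration. In practice this would come from feature selection.
feature_selected = [(tuple(names[:3]), tuple(names[5:8]))] * len(y)

probs = integrate_modalities(X_list, y, feature_selected,
                             LogisticRegression(max_iter=1000))
print(probs.round(2))
```

Averaging probabilities is only one possible late-integration rule; a second-stage model trained on the per-modality probabilities is another.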
Authors
- Yumei Li (yumei.li@uci.edu)
- JianFeng Xu (Jianfeng@heliohealth.com)
- Chaorong Chen (chaoronc@uci.edu)
- Wei Li (wei.li@uci.edu)