Multimodal Epigenetic Sequencing Analysis (MESA) is a flexible and sensitive method of capturing and integrating multimodal epigenetic information of cfDNA using a single experimental assay.

Maintainers

crchen

Multimodal Epigenetic Sequencing Analysis (MESA)

MESA is a Python package for sample-level multimodal cfDNA biomarker modeling. It provides a scikit-learn-style API for preprocessing, feature selection, optional redundancy pruning, modality-specific model fitting, stacked multimodal prediction, and cross-validation.

The package supports both classification and regression.

[Figure: MESA pipeline overview]

Installation

pip install mesa-cfdna

For local development:

pip install -e .
pytest -q tests
python scripts/run_smoke_checks.py

What MESA Does

Handles missing-value filtering and imputation
Applies variance filtering and univariate feature selection
Optionally prunes redundant correlated features after the first selector
Uses Boruta for secondary feature selection
Trains single-modality predictors and stacked multimodal models
Evaluates models with built-in cross-validation helpers
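
The first two steps in the list above can be illustrated with a minimal sketch in plain Python. It is not MESA's implementation: the function name, the median imputation strategy, and the use of None to mark missing values are all assumptions made for illustration.

```python
from statistics import median, pvariance


def filter_and_impute(columns, max_missing=0.2, min_variance=0.0):
    """Drop sparse features, impute the rest, then drop constant ones.

    `columns` maps a feature name to its per-sample values; None marks a
    missing measurement. All names here are hypothetical, not MESA's API.
    """
    kept = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        n_missing = len(values) - len(observed)
        # Skip features whose missing fraction exceeds the allowed limit.
        if not observed or n_missing > max_missing * len(values):
            continue
        # Median imputation is an assumption; the source does not name a strategy.
        fill = median(observed)
        imputed = [fill if v is None else v for v in values]
        # Variance filtering runs after imputation.
        if pvariance(imputed) > min_variance:
            kept[name] = imputed
    return kept


features = {
    "cpg_a": [0.1, 0.9, None, 0.4, 0.7],    # 1 of 5 missing: kept and imputed
    "cpg_b": [None, None, 0.5, 0.6, None],  # 3 of 5 missing: dropped
    "cpg_c": [0.3, 0.3, 0.3, 0.3, 0.3],     # zero variance: dropped
}
clean = filter_and_impute(features)
print(sorted(clean))
```

In this sketch the missing-value threshold plays the role of the missing parameter and the variance cutoff the role of variance_threshold, both described under Key Parameters.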

Core API

MESA_modality: single-modality pipeline
MESA: multimodal stacking ensemble
MESA_CV: cross-validation wrapper

Default task-aware estimators:

classification: RandomForestClassifier
regression: RandomForestRegressor

predict_proba() and transform_predict_proba() are available only in classification mode.
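
The source does not say how MESA signals that a probability method is unavailable in regression mode. The sketch below uses a hypothetical stand-in class (TaskAwareModel, not part of MESA) to show one way calling code can stay task-agnostic: branch on the task you configured rather than relying on a particular exception.

```python
class TaskAwareModel:
    """Hypothetical stand-in for a task-aware estimator; not MESA's code."""

    def __init__(self, task="classification"):
        if task not in ("classification", "regression"):
            raise ValueError(f"unknown task: {task!r}")
        self.task = task

    def predict(self, X):
        # Placeholder prediction: one value per row, valid for both tasks.
        return [0 for _ in X]

    def predict_proba(self, X):
        # The exception type is an assumption made for this sketch.
        if self.task != "classification":
            raise AttributeError("predict_proba requires task='classification'")
        # Placeholder: uniform probabilities over two classes.
        return [[0.5, 0.5] for _ in X]


def safe_scores(model, X):
    """Return class probabilities when available, else point predictions."""
    if getattr(model, "task", None) == "classification":
        return model.predict_proba(X)
    return model.predict(X)


print(safe_scores(TaskAwareModel("classification"), [[1.0], [2.0]]))
print(safe_scores(TaskAwareModel("regression"), [[1.0], [2.0]]))
```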

Pipeline Figures

Overview figure: compact pipeline summary for README or slides
Detailed method figure: expanded schematic with task-aware branches

Regenerate both figures with:

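# The conda path and environment name below are specific to the author's machine; substitute your own.
source /data/homezvol0/chaoronc/miniconda3/etc/profile.d/conda.sh
conda activate py313
python scripts/generate_pipeline_figures.py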

Quick Start

Classification

from mesa import MESA_modality, MESA, MESA_CV

modality_1 = MESA_modality(
    top_n=50,
    missing=0.2,
    normalization=True,
    redundancy_pruning="score",
    redundancy_threshold=0.95,
    random_state=42,
)

modality_2 = MESA_modality(
    top_n=80,
    missing=0.1,
    redundancy_pruning="model",
    redundancy_threshold=0.95,
    random_state=42,
)

modality_1.fit(X1_train, y_train)
proba_1 = modality_1.transform_predict_proba(X1_test)

mesa = MESA([modality_1, modality_2], random_state=42)
mesa.fit([X1_train, X2_train], y_train)
ensemble_proba = mesa.predict_proba([X1_test, X2_test])

cv_eval = MESA_CV(MESA_modality(top_n=50, random_state=42))
cv_eval.fit(X1_train, y_train)
auc = cv_eval.get_performance()

Regression

from mesa import MESA_modality, MESA, MESA_CV

reg_modality_1 = MESA_modality(
    task="regression",
    top_n=50,
    redundancy_pruning="score",
    redundancy_threshold=0.95,
    random_state=42,
)

reg_modality_2 = MESA_modality(
    task="regression",
    top_n=80,
    random_state=42,
)

reg_modality_1.fit(X1_train, y_train_continuous)
pred_1 = reg_modality_1.transform_predict(X1_test)

reg_mesa = MESA(
    [reg_modality_1, reg_modality_2],
    task="regression",
    random_state=42,
)
reg_mesa.fit([X1_train, X2_train], y_train_continuous)
ensemble_pred = reg_mesa.predict([X1_test, X2_test])

cv_eval = MESA_CV(
    MESA_modality(task="regression", top_n=50, random_state=42),
    task="regression",
)
cv_eval.fit(X1_train, y_train_continuous)
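r2 = cv_eval.get_performance()
# The metric name follows scikit-learn's "neg_" naming; under that convention
# the returned value is the negated RMSE (closer to 0 is better).
rmse = cv_eval.get_performance("neg_root_mean_squared_error")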

Redundancy Pruning

MESA can prune correlated CpG-like features after the first univariate selector and before Boruta.

redundancy_pruning="score": keep the best feature in each correlated block using task-aware univariate ranking
redundancy_pruning="model": keep the best feature in each correlated block using model-based cross-validated ranking
redundancy_threshold: absolute correlation threshold used to define redundant blocks
redundancy_method: correlation method, e.g. "pearson"

This step is useful when neighboring or highly correlated features carry redundant signal and would otherwise crowd out other informative loci.
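
The idea of keeping the best feature in each correlated block can be sketched with a greedy pass in plain Python. This is an illustration under stated assumptions, not MESA's algorithm: the source does not describe how blocks are formed, and the function names, the greedy strategy, and the example scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat constant features as uncorrelated
    return cov / sqrt(var_x * var_y)


def prune_redundant(columns, scores, threshold=0.95):
    """Visit features from best to worst score and keep a feature only if its
    absolute correlation with every already-kept feature is below threshold."""
    kept = []
    for name in sorted(columns, key=lambda k: scores[k], reverse=True):
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


columns = {
    "cpg_1": [0.10, 0.20, 0.30, 0.40, 0.50],
    "cpg_2": [0.11, 0.21, 0.29, 0.41, 0.52],  # near-duplicate of cpg_1
    "cpg_3": [0.90, 0.10, 0.70, 0.20, 0.60],  # different signal
}
# Hypothetical ranking scores, standing in for a univariate statistic.
scores = {"cpg_1": 3.2, "cpg_2": 4.1, "cpg_3": 1.5}
print(prune_redundant(columns, scores))  # ['cpg_2', 'cpg_3']
```

Here cpg_1 is dropped because it is almost perfectly correlated with the higher-scoring cpg_2, while cpg_3 survives despite its low score because it carries a different signal.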

Key Parameters

For MESA_modality:

task: "classification" or "regression"
top_n: number of Boruta-selected features to keep
missing: allowed missing fraction per feature
variance_threshold: minimum variance after imputation
normalization: whether to apply Normalizer()
selector: integer or sklearn-compatible univariate selector
predictor: final estimator
boruta_estimator: estimator used inside Boruta

For MESA_CV:

classification default metric: ROC AUC
regression default metric: R²
supported regression metrics: r2, neg_mean_squared_error, neg_root_mean_squared_error, pearson, spearman
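
For reference, the sketch below computes three of these metrics by hand in plain Python. It illustrates their standard definitions, not how MESA_CV calculates them. In scikit-learn, metrics prefixed with neg_ are negated errors so that larger is always better; that MESA_CV follows the same sign convention is an assumption here.

```python
from math import sqrt


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def neg_rmse(y_true, y_pred):
    """Negated root mean squared error (higher is better, 0 is perfect)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return -sqrt(mse)


def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(r2_score(y_true, y_pred))
print(neg_rmse(y_true, y_pred))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

To report a conventional RMSE from a negated score, flip the sign of the returned value.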

Validation Assets

demo.ipynb: original example notebook
pruning_validation_demo.ipynb: pruning-focused synthetic validation
regression_validation_demo.ipynb: regression validation notebook
scripts/run_smoke_checks.py: notebook-free smoke test runner

Development Notes

Use pandas DataFrame inputs when possible so selected feature indices can be mapped back to columns cleanly.
For biological interpretation, validate any pruning or selector change on a subset before large runs; these changes can alter feature rankings and downstream performance.
Human contributor guidance lives in CONTRIBUTING.md.

Citation

If you use MESA in research, cite:

Li, Y., Xu, J., Chen, C. et al. Multimodal epigenetic sequencing analysis (MESA) of cell-free DNA for non-invasive colorectal cancer detection. Genome Medicine 16, 9 (2024). https://doi.org/10.1186/s13073-023-01280-6

License

This repository is distributed under the terms in LICENSE.

Release history

0.7.1 (this version): Mar 11, 2026
0.6.0: Jun 6, 2025
0.5.0: May 24, 2025
0.2.0: May 23, 2025
0.1.2: Mar 19, 2025
0.1.1: Mar 19, 2025
0.1.0: Mar 19, 2025

