py-mfuzz
Pure-Python port of the Bioconductor package Mfuzz — soft clustering of time-series gene-expression data by fuzzy c-means (Futschik & Carlisle, J. Bioinform. Comput. Biol. 2005; Kumar & Futschik, Bioinformation 2007).
pymfuzz reproduces the full computational and visualisation API of
Mfuzz with no R dependency — only numpy, scipy, pandas,
matplotlib and anndata. The fuzzy c-means core is a faithful numpy
port of e1071's cmeans C routine, the same algorithm R's mfuzz()
wraps.
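For readers unfamiliar with the algorithm, the following is a minimal textbook sketch of fuzzy c-means in numpy. It is not pymfuzz's implementation and makes no claim to match e1071's initialisation, stopping rule or distance options; the function name `fuzzy_cmeans` and its `init` parameter are hypothetical and exist only for this illustration.

```python
import numpy as np

def fuzzy_cmeans(x, c, m=2.0, n_iter=100, tol=1e-6, seed=0, init=None):
    """Textbook fuzzy c-means on the rows of x (genes x timepoints).

    Hypothetical helper for illustration only, not the pymfuzz API.
    Returns (centres, u) where u[j, i] is the membership of row j in cluster i.
    """
    x = np.asarray(x, dtype=float)
    if init is None:
        # pick c distinct rows at random as the initial centres
        rng = np.random.default_rng(seed)
        centres = x[rng.choice(x.shape[0], size=c, replace=False)].copy()
    else:
        centres = np.asarray(init, dtype=float).copy()

    prev_obj = np.inf
    for _ in range(n_iter):
        # squared Euclidean distance from every row to every centre: shape (n, c)
        d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # membership update: u proportional to d^(-2/(m-1)), each row sums to 1
        w = d2 ** (-1.0 / (m - 1.0))
        u = w / w.sum(axis=1, keepdims=True)
        # centre update: mean of the rows weighted by u^m
        um = u ** m
        centres = (um.T @ x) / um.sum(axis=0)[:, None]
        # weighted sum of squares, using distances to the previous centres
        obj = float((um * d2).sum())
        if abs(prev_obj - obj) < tol:
            break
        prev_obj = obj
    return centres, u
```

As `m` approaches 1 the exponent `-1/(m-1)` grows large in magnitude and memberships approach 0 or 1, which is why a small fuzzifier gives a nearly hard partition.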
Why
Mfuzz operates on a Bioconductor ExpressionSet. pymfuzz instead
accepts a plain genes × timepoints numpy.ndarray,
pandas.DataFrame or anndata.AnnData, and returns numpy / pandas /
dataclasses — drop-in for Python single-cell / bulk pipelines.
Install
```
pip install pymfuzz
```
From source:
```
pip install -e .
```
Quick start
```python
import pymfuzz as mf

# 1. load a genes x timepoints time-course (Mfuzz's data(yeast))
data = mf.load_yeast()

# 2. preprocessing
data = mf.filter_NA(data, thres=0.25)  # drop genes with many NAs
data = mf.fill_NA(data, mode="knn")    # impute remaining NAs
data = mf.standardise(data)            # per-gene z-score

# 3. estimate the fuzzifier and cluster
m = mf.mestimate(data)                 # Schwammle & Jensen (2010)
cl = mf.mfuzz(data, c=16, m=m, random_state=0)

# 4. extract core genes and plot
cores = mf.acore(data, cl, min_acore=0.5)
fig = mf.mfuzz_plot(data, cl, mfrow=(4, 4))
```
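To make steps 2 and 3 more concrete, here is a small numpy sketch of what a per-gene z-score and the fuzzifier estimate compute. It assumes the sample standard deviation (`ddof=1`, as R's `sd` uses) and the closed-form estimate published by Schwämmle and Jensen (2010); the helper names `zscore_rows` and `fuzzifier_estimate` are hypothetical and are not part of pymfuzz.

```python
import numpy as np

def zscore_rows(x):
    """Per-gene z-score: centre and scale each row, ignoring NaNs.

    Hypothetical helper; assumes the sample standard deviation (ddof=1).
    """
    x = np.asarray(x, dtype=float)
    mean = np.nanmean(x, axis=1, keepdims=True)
    sd = np.nanstd(x, axis=1, ddof=1, keepdims=True)
    return (x - mean) / sd

def fuzzifier_estimate(n_genes, n_timepoints):
    """Closed-form fuzzifier estimate, assuming the Schwammle & Jensen formula.

    Depends only on the matrix shape: N rows (genes) and D columns (timepoints).
    """
    n, d = float(n_genes), float(n_timepoints)
    return (1.0
            + (1418.0 / n + 22.05) * d ** -2
            + (12.33 / n + 0.243) * d ** (-0.0406 * np.log(n) - 0.1134))

# Illustrative shape only: 3000 genes x 17 timepoints is an assumed size,
# not the actual dimensions of the filtered yeast matrix.
m_example = fuzzifier_estimate(3000, 17)
```

Because the estimate depends only on the matrix dimensions, it can be computed before or after standardisation with the same result.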
API
| Group | Functions |
|---|---|
| Data structures | ExpressionMatrix, as_expression_matrix, FClust, KMeansResult, AcoreCluster, PartcoefResult |
| Preprocessing | standardise, standardise2, filter_NA, fill_NA, filter_std |
| Clustering | mestimate, mfuzz, cmeans |
| Diagnostics | acore, Dmin, cselection, partcoef, overlap |
| Hard clustering | kmeans2 |
| Plotting | mfuzz_plot, mfuzz_plot2, kmeans2_plot, overlap_plot |
| Datasets | load_yeast, make_synthetic_timecourse |
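As a rough illustration of two of the diagnostics listed above, the sketch below operates directly on a membership matrix `u` (genes × clusters, rows summing to 1). It uses the textbook definitions: a gene belongs to a cluster core if that cluster is its top assignment and the membership exceeds the threshold, and the partition coefficient is the mean of squared memberships per gene. The function names are hypothetical, and pymfuzz's own `acore` and `partcoef` take different arguments and return richer objects.

```python
import numpy as np

def core_members(u, min_acore=0.5):
    """For each cluster, row indices whose top membership exceeds min_acore.

    Hypothetical helper operating on a bare membership matrix.
    """
    u = np.asarray(u, dtype=float)
    labels = u.argmax(axis=1)   # hard assignment: cluster with highest membership
    top = u.max(axis=1)         # membership value in that cluster
    return [np.flatnonzero((labels == k) & (top > min_acore))
            for k in range(u.shape[1])]

def partition_coefficient(u):
    """Textbook partition coefficient F = sum(u**2) / n, with lower bound 1/c.

    F equals 1 for a hard partition and 1/c when all memberships are uniform.
    """
    u = np.asarray(u, dtype=float)
    n, c = u.shape
    return float((u ** 2).sum() / n), 1.0 / c
```

Raising `min_acore` shrinks every core, which is a simple way to trade cluster size against how unambiguous the retained genes are.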
Mapping to the R package
| Mfuzz (R) | pymfuzz (Python) |
|---|---|
| standardise / standardise2 | standardise / standardise2 |
| mestimate | mestimate |
| mfuzz (e1071::cmeans) | mfuzz / cmeans |
| acore | acore |
| Dmin, cselection, partcoef | Dmin, cselection, partcoef |
| filter.NA, fill.NA, filter.std | filter_NA, fill_NA, filter_std |
| overlap, overlap.plot | overlap, overlap_plot |
| mfuzz.plot, mfuzz.plot2 | mfuzz_plot, mfuzz_plot2 |
| kmeans2, kmeans2.plot | kmeans2, kmeans2_plot |
R parity
Validated against Mfuzz 2.66.0 / e1071 1.7.17 on the bundled
yeast cell-cycle time-course (data(yeast)):
| Routine | Agreement vs R |
|---|---|
| standardise | bit-exact (rel-diff ≈ 1e-15) |
| mestimate | bit-exact (rel-diff ≈ 1e-15) |
| fill_NA(knn) | bit-exact (max abs diff ≈ 1e-15) |
| mfuzz | membership Pearson r = 1.0, centres r = 1.0, hard-assignment ARI ≈ 0.99 |
| Dmin | curve Pearson r = 1.0 |
standardise, mestimate and fill_NA are deterministic and match R to
machine precision. Fuzzy c-means uses random initialisation, so a
bit-exact match across RNGs is not expected; instead clustering
agreement is asserted (Hungarian-matched membership correlation and
Adjusted Rand Index). Because the yeast fuzzifier (m ≈ 1.15) is close
to 1, the fuzzy c-means partition is nearly hard and convergence is slow,
so both the R and Python runs take the best of several converged restarts
to reach the same optimum.
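Cluster labels are arbitrary, so two runs must be aligned before their memberships can be compared. The sketch below shows one way to compute a matched membership correlation: build the cross-correlation matrix between the columns of two membership matrices, then solve the assignment problem that maximises total correlation. It is an assumption about how such a check could look, not the code in the test suite; the function name is hypothetical, and the brute-force fallback is only there so the example runs without scipy.

```python
import itertools
import numpy as np

def matched_membership_correlation(u_a, u_b):
    """Align clusters of u_b to u_a and return (mapping, mean Pearson r).

    Hypothetical helper. mapping[i] is the column of u_b matched to column i of u_a.
    """
    u_a = np.asarray(u_a, dtype=float)
    u_b = np.asarray(u_b, dtype=float)
    c = u_a.shape[1]
    # corr[i, j] = Pearson r between column i of u_a and column j of u_b
    corr = np.corrcoef(u_a.T, u_b.T)[:c, c:]
    try:
        from scipy.optimize import linear_sum_assignment
        # the solver minimises cost, so negate to maximise total correlation
        rows, cols = linear_sum_assignment(-corr)
    except ImportError:
        # brute-force search over permutations; only sensible for small c
        best = max(itertools.permutations(range(c)),
                   key=lambda p: sum(corr[i, p[i]] for i in range(c)))
        rows, cols = np.arange(c), np.array(best)
    return cols, float(corr[rows, cols].mean())
```

The same column mapping can be reused to relabel hard assignments before computing an Adjusted Rand Index, although ARI itself is invariant to label permutation.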
Run the parity tests (needs the CMAP R environment):
```
python -m pytest tests/ -q
```
License
GPL-2 — the same license as the original Bioconductor Mfuzz package.
See LICENSE.
Citation
If you use pymfuzz, please cite the original Mfuzz papers:
- L. Kumar, M. Futschik (2007). Mfuzz: a software package for soft clustering of microarray data. Bioinformation 2(1):5–7.
- M. Futschik, B. Carlisle (2005). Noise-robust soft clustering of gene expression time-course data. J. Bioinform. Comput. Biol. 3(4):965–988.