Pure-Python port of Bioconductor lipidr — lipidomics data analysis, normalization, differential analysis and lipid-set enrichment (LSEA).
py-lipidr
Pure-Python port of the Bioconductor lipidr lipidomics analysis toolkit (Mohamed, Molendijk & Hill, J. Proteome Res. 2020, 19(7):2890-2897).
pylipidr is a standalone, dependency-light implementation of lipidr's
computational core: data import, lipid-name annotation, QC,
normalization, differential analysis and Lipid Set Enrichment Analysis
(LSEA). It does not require R.
| PyPI / import name | pylipidr |
| License | MIT (same as upstream lipidr) |
| Upstream | Bioconductor lipidr 2.20.0 |
Why this port reuses two existing engines
- Lipid-name parsing -> `pygoslin`, the reference Goslin lipid-name grammar. lipidr's regex-based name parser is replaced by `pygoslin`, which is more robust and standards-based.
- Moderated-t differential analysis -> `python-limma` (`pylimma`). lipidr's `de_analysis` calls limma under the hood in R; the Python port calls the published pure-Python limma port instead of reimplementing it.
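For orientation, the moderated t-statistic computed by limma's empirical Bayes model (summarized here from the limma literature, not from pylipidr's source) shrinks each lipid's residual variance $s_g^2$ toward a prior value $s_0^2$ estimated from all lipids:

$$
\tilde{s}_g^2 = \frac{d_0 s_0^2 + d_g s_g^2}{d_0 + d_g}, \qquad
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}}
$$

Here $\hat{\beta}_g$ is the estimated contrast (the logFC) for lipid $g$, $v_g$ is its unscaled variance factor from the design, $d_g$ is the residual degrees of freedom and $d_0$ is the prior degrees of freedom. Under the null hypothesis, $\tilde{t}_g$ follows a t-distribution with $d_0 + d_g$ degrees of freedom.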
Install
pip install pylipidr # once published
# or, from a checkout:
pip install -e .
Dependencies: numpy, scipy, pandas, anndata, pygoslin,
python-limma.
Quick start
import pylipidr as lp
# 1. read a Skyline CSV export -> a LipidomicsExperiment (AnnData-backed)
exp = lp.read_skyline("A1_data.csv")
# 2. attach clinical / sample metadata
exp = lp.add_sample_annotation(exp, "clin.csv")
# 3. collapse multiple transitions per lipid
exp = lp.summarize_transitions(exp, method="max")
# 4. QC + normalization
exp = lp.filter_by_cv(exp, cv_cutoff=20.0)
exp = lp.normalize_pqn(exp, measure="Area") # log2 + PQN
# 5. moderated-t differential analysis (limma)
de = lp.de_analysis(exp, "HighFat - Normal", group_col="group")
hits = lp.significant_molecules(de, p_cutoff=0.05, logfc_cutoff=1.0)
# 6. Lipid Set Enrichment Analysis
enr = lp.lsea(de, rank_by="logFC")
sets = lp.significant_lipidsets(enr, p_cutoff=0.05)
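
Step 4 above applies probabilistic quotient normalization (PQN). The following is a minimal standard-library sketch of the general PQN technique, not pylipidr's actual implementation: the function name `pqn_normalize`, the choice of a per-lipid median as the reference profile, and taking log2 after normalization are all assumptions made for illustration.

```python
import math
from statistics import median


def pqn_normalize(matrix):
    """PQN sketch. `matrix` is a list of samples, each a list of positive
    intensities in the same lipid order (hypothetical input layout)."""
    n_lipids = len(matrix[0])
    # Reference profile: per-lipid median across all samples (an assumption).
    reference = [median(sample[j] for sample in matrix) for j in range(n_lipids)]
    normalized = []
    for sample in matrix:
        # Quotient of every lipid against the reference profile.
        quotients = [x / ref for x, ref in zip(sample, reference) if ref > 0]
        # The median quotient is taken as the sample's dilution factor.
        dilution = median(quotients)
        normalized.append([x / dilution for x in sample])
    return normalized


samples = [
    [100.0, 200.0, 400.0],
    [200.0, 400.0, 800.0],  # same profile, twice as concentrated
    [50.0, 100.0, 200.0],   # same profile, half as concentrated
]
normalized = pqn_normalize(samples)
# Log-transform afterwards, mirroring the "log2 + PQN" comment above.
log2_normalized = [[math.log2(x) for x in row] for row in normalized]
```

In this toy input the three samples differ only by dilution, so PQN maps them onto the same profile.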
The LipidomicsExperiment
LipidomicsExperiment wraps an anndata.AnnData (samples x lipids):
- `.adata.var` -- per-lipid annotations (`Class`, `Category`, `total_cl`, `total_cs`, `chains`, `istd`, ...).
- `.adata.obs` -- per-sample clinical data.
- `.adata.X` / `.adata.layers` -- one or more intensity measures.
- processing-state flags `is_logged` / `is_normalized` / `is_summarized` are stored in `.adata.uns` and toggled with `set_logged` etc.
What is ported
| lipidr (R) | pylipidr | notes |
|---|---|---|
| `LipidomicsExperiment`, `as_lipidomics_experiment` | `LipidomicsExperiment`, `as_lipidomics_experiment` | AnnData-backed |
| `read_skyline` | `read_skyline` | Skyline CSV export(s) |
| `read_mwTab` | `read_mwtab` | Metabolomics Workbench mwTab |
| `read_mw_datamatrix` | `read_mw_datamatrix` | MW data matrix TSV |
| `annotate_lipids` | `annotate_lipids`, `annotate_experiment` | pygoslin-backed |
| `non_parsed_molecules`, `remove_non_parsed_molecules`, `update_molecule_names` | same names | |
| `filter_by_cv` | `filter_by_cv` | CV filter |
| `impute_na` | `impute_na` | knn / min / minDet / minProb / zero |
| `summarize_transitions` | `summarize_transitions` | max / average |
| `normalize_pqn` | `normalize_pqn` | probabilistic quotient normalization |
| `normalize_istd` | `normalize_istd` | per-class internal-standard normalization |
| `de_design`, `de_analysis` | `de_design`, `de_analysis` | moderated-t via pylimma |
| `significant_molecules` | `significant_molecules` | |
| `top_lipids` | `top_lipids` | ranks DE result (see note below) |
| `gen_lipidsets` | `gen_lipidsets` | by class / chain length / unsaturation |
| `lsea` | `lsea` | preranked GSEA (fgsea-style) |
| `significant_lipidsets` | `significant_lipidsets` | |
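
To make the "CV filter" entry concrete, here is a minimal sketch of a coefficient-of-variation filter using only the standard library. It illustrates the general idea and is not pylipidr's code: the helper names, the dict-based input, the use of QC replicates, and the strict `<` comparison against the cutoff are assumptions.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation as a percentage: 100 * sd / mean."""
    return 100.0 * stdev(values) / mean(values)


def filter_by_cv_sketch(intensities, cv_cutoff=20.0):
    """Keep lipids whose CV across replicates is below `cv_cutoff` (percent).

    `intensities` maps a lipid name to its raw (non-logged) intensities
    across replicate injections; this layout is hypothetical.
    """
    return {
        lipid: values
        for lipid, values in intensities.items()
        if cv_percent(values) < cv_cutoff
    }


qc = {
    "PC 34:1": [100.0, 105.0, 95.0],  # tight replicates, CV = 5 %
    "TG 52:2": [100.0, 180.0, 40.0],  # noisy replicates, CV well above 20 %
}
kept = filter_by_cv_sketch(qc, cv_cutoff=20.0)
```

Only the reproducibly measured lipid survives; the cutoff of 20.0 matches the value used in the quick start.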
What is NOT ported (deferred to v0.2)
These are deliberately out of scope for v0.1 and are documented here as deferred:
- `mva` -- PCA / PLS-DA / OPLS-DA multivariate analysis. omicverse already provides multivariate tooling; lipidr's `top_lipids` normally operates on `mva` loadings, so the v0.1 `top_lipids` instead ranks the `de_analysis` result.
- All `plot_*` functions -- `plot_samples`, `plot_molecules`, `plot_lipidclass`, `plot_chain_distribution`, `plot_results_volcano`, `plot_enrichment`, `plot_trend`, `plot_heatmap`, etc.
- `use_interactive_graphics` -- interactive plotly toggling.
- `fetch_mw_study` / `list_mw_studies` -- network helpers for the Metabolomics Workbench REST API.
R-parity
pylipidr is validated against Bioconductor lipidr 2.20.0 on lipidr's
own bundled Skyline example dataset (extdata/A1_data.csv + clin.csv),
so both languages analyse identical input. Numbers from
examples/benchmark.py:
| step | metric | result |
|---|---|---|
| `annotate_lipids` | lipid-class agreement | 0.99 |
| `normalize_pqn` | Pearson r of normalized values | 1.000 |
| `normalize_istd` | Pearson r of normalized values | 0.997 |
| `de_analysis` | Pearson r of logFC | 1.000 |
| `de_analysis` | Pearson r of p-values | 1.000 |
| `lsea` | Pearson r of enrichment scores | 0.95 |
| `lsea` | Pearson r of p-values | 0.91 |
lsea agrees within target tolerance; small differences arise because R
lipidr's fgsea uses an adaptive multilevel permutation scheme while
pylipidr uses a fixed gene-permutation null. The significantly
enriched lipid sets agree.
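
The idea of a preranked enrichment score with a fixed permutation null can be sketched as follows. This is an illustrative standard-library version, not pylipidr's `lsea`: the function names, the weight exponent of 1, the two-sided comparison on absolute scores, and the permutation count are all assumptions.

```python
import random


def enrichment_score(ranked_ids, scores, lipid_set, weight=1.0):
    """Weighted running-sum (Kolmogorov-Smirnov-like) enrichment score."""
    in_set = [lipid in lipid_set for lipid in ranked_ids]
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_ids) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up on set members, step down on everything else.
        running += w / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def lsea_sketch(stats, lipid_set, n_perm=1000, seed=0):
    """Return (ES, permutation p-value) for one lipid set.

    `stats` maps lipid name -> ranking statistic (e.g. logFC).
    """
    ranked = sorted(stats, key=stats.get, reverse=True)
    scores = [stats[lipid] for lipid in ranked]
    observed = enrichment_score(ranked, scores, lipid_set)
    rng = random.Random(seed)
    k = len(lipid_set & set(ranked))
    extreme = 0
    for _ in range(n_perm):
        # Fixed null: random sets of the same size, a fixed number of times.
        null_set = set(rng.sample(ranked, k))
        if abs(enrichment_score(ranked, scores, null_set)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


stats = {"TG 52:2": 3.0, "TG 54:3": 2.5, "TG 50:1": 2.0, "PC 34:1": 0.4,
         "PC 36:2": 0.2, "PE 38:4": -0.1, "SM 34:1": -0.3, "LPC 18:0": -0.5,
         "Cer 42:1": -0.8, "PI 38:4": -1.0}
es, p = lsea_sketch(stats, {"TG 52:2", "TG 54:3", "TG 50:1"})
```

Because the number of permutations is fixed, the smallest attainable p-value is `1 / (n_perm + 1)`; an adaptive scheme such as the one in fgsea can resolve smaller p-values, which is one reason p-value correlations are lower than enrichment-score correlations.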
Run the parity suite (skips gracefully if R is unavailable):
pytest tests/ -v
- `tests/test_smoke.py` -- 18 algorithmic tests, no R needed.
- `tests/test_r_parity.py` -- 8 tests vs Bioconductor lipidr.
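
One common way to make R-dependent tests skip gracefully is to check for an `Rscript` executable before running them. The sketch below uses the standard-library `unittest` (which pytest also collects) so that it stays self-contained; it is a hypothetical pattern, and the helper `r_available` and the test class are not taken from the pylipidr test suite.

```python
import shutil
import unittest


def r_available():
    """Return True when an `Rscript` executable is found on PATH."""
    return shutil.which("Rscript") is not None


@unittest.skipUnless(r_available(), "Rscript not found; skipping R parity checks")
class ParitySketch(unittest.TestCase):
    def test_r_is_callable(self):
        # A real parity test would run the R pipeline here and compare
        # its numbers with the Python results.
        self.assertTrue(r_available())
```

On a machine without R the whole class is reported as skipped rather than failed.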
Benchmark
python examples/benchmark.py --runs 2
On the bundled example the full Python pipeline runs roughly 8x
faster than the equivalent R pipeline (mostly by skipping Rscript /
Bioconductor startup). See examples/compare_R_vs_Python.ipynb.
Citation
If you use pylipidr, please cite the original lipidr paper:
Mohamed A, Molendijk J, Hill MM. lipidr: A Software Tool for Data Mining and Analysis of Lipidomics Datasets. J. Proteome Res. 2020, 19(7):2890-2897. doi:10.1021/acs.jproteome.0c00082
and, for the reused engines, the Goslin (Kopczynski et al., Anal. Chem. 2020) and limma (Ritchie et al., Nucleic Acids Res. 2015) papers.
License
MIT -- the same license as upstream lipidr. See LICENSE.