Pure-Python port of Olink Proteomics' R OlinkAnalyze — NPX I/O, bridge normalization, and per-protein differential expression for Olink proteomics.

pyolinkanalyze

A pure-Python port of R OlinkAnalyze (Olink Proteomics AB) — 100 % coverage of the OlinkAnalyze 3.8.2 public API: NPX I/O (CSV / TSV / Excel), bridge / subset / N-way normalization, per-protein differential expression (t-test, Wilcoxon, LMM, ANOVA, Kruskal-Wallis / Friedman, ordinal regression, plus post-hoc contrasts), limit-of-detection handling, plate randomization, plate-layout / distribution plots, pathway enrichment, and a full set of matplotlib plots.

No rpy2, no R install. Welch t-test via scipy.stats.ttest_ind(equal_var=False), Mann-Whitney via scipy.stats.mannwhitneyu(use_continuity=True), LMM via statsmodels.regression.mixed_linear_model.MixedLM, type-III ANOVA via statsmodels + sum-to-zero contrasts, ordinal regression via statsmodels.miscmodels.ordinal_model.OrderedModel.
Tidy long-format pandas.DataFrame interface — the same NPX schema Olink ships in their Explore / Target CSVs.
R-parity tests against OlinkAnalyze 3.8.2: Pearson r > 0.99 (often 1.0) on per-protein test statistics and p-values for t-test, Wilcoxon, ANOVA and Kruskal-Wallis, and r > 0.95 for LMM (see the R-parity table).
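To make the long-format interface concrete, here is a small pandas sketch of what such a table looks like. It is only an illustration: the column names follow the usual Olink NPX export, the second OlinkID and all values are made up, and the `below_lod` column is a hypothetical name, not one the package is known to produce.

```python
import pandas as pd

# Toy long-format NPX table: one row per (sample, assay) pair.
# Column names follow the usual Olink NPX export; IDs and values are illustrative.
npx = pd.DataFrame(
    {
        "SampleID": ["S1", "S1", "S2", "S2", "S3", "S3"],
        "OlinkID": ["OID00012", "OID99999"] * 3,  # second ID is made up
        "Assay": ["IL6", "IL8"] * 3,
        "UniProt": ["P05231", "P10145"] * 3,
        "Panel": ["Inflammation"] * 6,
        "PlateID": ["Plate1"] * 6,
        "QC_Warning": ["Pass"] * 6,
        "LOD": [0.8, 1.1] * 3,
        "NPX": [3.2, 5.1, 4.0, 4.8, 0.5, 5.3],
        "Treatment": ["Treated", "Treated", "Untreated",
                      "Untreated", "Treated", "Treated"],
    }
)

# Wide matrix (samples x assays), the shape PCA or heatmaps usually need.
wide = npx.pivot(index="SampleID", columns="OlinkID", values="NPX")

# Flag measurements that fall below the assay's limit of detection.
# "below_lod" is a hypothetical column name used only in this sketch.
npx["below_lod"] = npx["NPX"] < npx["LOD"]
```

Keeping the data long means per-protein tests reduce to a `groupby("OlinkID")`, while the pivot gives the wide matrix when one is needed.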

This is a standalone mirror of the canonical implementation that lives in omicverse. All algorithmic work is developed upstream in omicverse and synced here.

Install

pip install pyolinkanalyze

Dependencies: numpy, scipy, pandas, statsmodels. Plotting needs matplotlib + scikit-learn (pip install pyolinkanalyze[plotting]); olink_umap_plot optionally uses umap-learn (pip install pyolinkanalyze[umap]) and falls back to PCA otherwise.

Quick-start

import pyolinkanalyze as pa

# Load Olink long-format NPX CSV (auto-detects ; vs , separators)
npx = pa.read_npx_csv("study_NPX_2024.csv")

# Differential expression: two-group Welch t-test per protein
res = pa.olink_ttest(npx, variable="Treatment")
res.head()
# OlinkID  Assay     UniProt  term            estimate  statistic  p.value   Adjusted_pval
# OID00012 IL6       P05231   group1 - group0    1.84    5.12      1.2e-5    8.6e-4
# ...

# Non-parametric alternative
res_w = pa.olink_wilcox(npx, variable="Treatment")

# Linear mixed-effects: NPX ~ Treatment + (1|Subject), per protein
res_lmm = pa.olink_lmer(npx, variable="Treatment", random="Subject")

# Bridge normalization across two batches (4 overlapping samples)
df_ref = pa.read_npx_csv("batch_A.csv")
df_target = pa.read_npx_csv("batch_B.csv")
joined = pa.olink_normalization(
    df_ref, df_target,
    overlapping_samples_df1=["B01", "B02", "B03", "B04"],
    overlapping_samples_df2=["B01", "B02", "B03", "B04"],
)

More tests (v0.2):

# Multi-group ANOVA + Tukey post-hoc
res_av = pa.olink_anova(npx, variable="Group")
res_ph = pa.olink_anova_posthoc(npx, variable="Group", effect="Group")

# Non-parametric (Kruskal-Wallis) + Dunn post-hoc
res_kw = pa.olink_one_non_parametric(npx, variable="Group")
res_dunn = pa.olink_one_non_parametric_posthoc(npx, variable="Group")

# Ordinal regression
res_ord = pa.olink_ordinal_regression(npx, variable="Group")

# Limit of detection (negative-control estimate) + below-LOD flags
npx_lod = pa.olink_lod(npx, lod_method="NCLOD")

# Pick optimal bridging samples
bridges = pa.olink_bridge_selector(npx, sample_missing_freq=0.1, n=8)

# Randomize a sample manifest across plates
# (manifest: a pandas DataFrame of samples, loaded beforehand; not defined in this snippet)
plated = pa.olink_plate_randomizer(manifest, subject_col="Subject", seed=0)

# Pathway enrichment on a DE result
gene_sets = pa.read_gmt("hallmark.gmt")
enr = pa.olink_pathway_enrichment(res, gene_sets, method="gsea")
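The plate randomization call above takes a subject column so that all samples from one subject can stay on the same plate. A rough sketch of that idea is shown below. It is an assumption-laden illustration, not the package's algorithm: the function name `randomize_plates`, the `plate_size=88` default, and the `plate` / `well` output columns are all hypothetical.

```python
import numpy as np
import pandas as pd


def randomize_plates(manifest, subject_col, plate_size=88, seed=0):
    """Shuffle subjects, then fill plates without splitting a subject."""
    rng = np.random.default_rng(seed)
    counts = manifest[subject_col].value_counts()
    order = rng.permutation(manifest[subject_col].unique())

    plate_of = {}
    plate, used = 1, 0
    for subj in order:
        n = int(counts[subj])
        if n > plate_size:
            raise ValueError(f"subject {subj!r} does not fit on one plate")
        if used + n > plate_size:  # start a new plate when this one is full
            plate, used = plate + 1, 0
        plate_of[subj] = plate
        used += n

    out = manifest.copy()
    out["plate"] = out[subject_col].map(plate_of)
    # Shuffle rows, then group by plate so well order within a plate is random.
    out = (out.sample(frac=1.0, random_state=seed)
              .sort_values("plate", kind="stable")
              .reset_index(drop=True))
    out["well"] = out.groupby("plate").cumcount() + 1
    return out


# Hypothetical manifest: 30 subjects with 4 samples each.
manifest = pd.DataFrame({
    "SampleID": [f"S{i:03d}" for i in range(120)],
    "Subject": [f"P{i // 4:02d}" for i in range(120)],
})
plated = randomize_plates(manifest, "Subject", plate_size=88, seed=0)
```

Fixing the seed makes the layout reproducible, which matters when the plate map has to be shared with the lab.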

Plotting helpers:

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(12, 4))
pa.olink_volcano_plot(res, ax=axes[0])
pa.olink_qc_plot(npx, ax=axes[1])

# v0.2 plots
pa.olink_pca_plot(npx, color_by="Treatment")
pa.olink_heatmap_plot(npx)
pa.olink_boxplot(npx, "Treatment", olinkids=["OID00012"])
pa.olink_pathway_heatmap(enr)

# v0.2.1 — plate QC plots + general NPX reader
plated = pa.olink_plate_randomizer(manifest, seed=0)
pa.olink_display_plate_distributions(plated, fill_color="Treatment")
pa.olink_display_plate_layout(plated, color_by="Treatment")
npx = pa.read_npx("study_NPX_2024.xlsx")   # dispatches CSV / TSV / Excel

API coverage (v0.2.1)

100 % of the R OlinkAnalyze 3.8.2 public API is ported. The only names not mapped to Python functions are %>% (the R pipe) and manifest / npx_data1 / npx_data2 (bundled example datasets) — these are not functions.

I/O & normalization

Python	R counterpart
`read_npx`	`read_NPX` (dispatches CSV / TSV / Excel) ✅
`read_npx_csv`	`read_NPX` (long-format CSV path) ✅
`read_npx_excel`	`read_NPX` (`.xlsx` / `.xls` Olink export) ✅
`olink_normalization`	`olink_normalization` (bridge, difference-of-medians)
`olink_normalization_reference_medians`	`olink_normalization(reference_medians=…)`
`olink_normalization_bridge`	`olink_normalization_bridge` (paired median-of-diffs)
`olink_normalization_subset`	`olink_normalization_subset`
`olink_normalization_n`	`olink_normalization_n` (N-way chain / tree)
`olink_bridge_selector`	`olink_bridgeselector`
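The two bridge flavours named in the table (difference of medians versus median of paired differences) can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the package's code: the helper names are hypothetical, the toy data covers a single assay, and shifting the target batch onto the reference scale is an assumed direction.

```python
import pandas as pd


def bridge_adjustment(df_ref, df_target, bridge_ids):
    """Per-assay median of paired (reference - target) differences."""
    ref = df_ref[df_ref["SampleID"].isin(bridge_ids)]
    tgt = df_target[df_target["SampleID"].isin(bridge_ids)]
    paired = ref.merge(tgt, on=["SampleID", "OlinkID"], suffixes=("_ref", "_tgt"))
    paired["diff"] = paired["NPX_ref"] - paired["NPX_tgt"]
    return paired.groupby("OlinkID")["diff"].median().rename("Adj_factor")


def apply_adjustment(df_target, adj):
    """Shift every target NPX value by its assay's adjustment factor."""
    out = df_target.copy()
    out["NPX"] = out["NPX"] + out["OlinkID"].map(adj)
    return out


bridges = ["B01", "B02", "B03", "B04"]
df_ref = pd.DataFrame({"SampleID": bridges,
                       "OlinkID": "OID00012",
                       "NPX": [5.0, 6.0, 7.0, 8.0]})
df_target = pd.DataFrame({"SampleID": bridges + ["T05"],
                          "OlinkID": "OID00012",
                          "NPX": [4.5, 5.4, 6.5, 7.6, 3.0]})

# Median of paired differences over the bridge samples.
adj = bridge_adjustment(df_ref, df_target, bridges)

# Difference of per-batch medians over the same bridge samples, for comparison.
tgt_bridge = df_target[df_target["SampleID"].isin(bridges)]
diff_of_medians = (df_ref.groupby("OlinkID")["NPX"].median()
                   - tgt_bridge.groupby("OlinkID")["NPX"].median())

normalized = apply_adjustment(df_target, adj)
```

On this toy data the two quantities are close but not identical, which is why it is worth knowing which variant a given function uses.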

Statistical tests & post-hoc

Python	R counterpart
`olink_ttest`	`olink_ttest` (paired support)
`olink_wilcox`	`olink_wilcox`
`olink_lmer`	`olink_lmer`
`olink_lmer_posthoc`	`olink_lmer_posthoc` (Wald pairwise contrasts)
`olink_anova`	`olink_anova` (type-III, `contr.sum`)
`olink_anova_posthoc`	`olink_anova_posthoc` (Tukey HSD)
`olink_one_non_parametric`	`olink_one_non_parametric` (Kruskal / Friedman)
`olink_one_non_parametric_posthoc`	`olink_one_non_parametric_posthoc` (Dunn / paired Wilcoxon)
`olink_ordinal_regression`	`olink_ordinalRegression`
`olink_ordinal_regression_posthoc`	`olink_ordinalRegression_posthoc`
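As a hedged illustration of the "type-III, `contr.sum`" entry, the sketch below fits one hypothetical protein with statsmodels using sum-to-zero contrasts and requests a type-III table. The design, the `Site` covariate and the data are invented for the example; how the package assembles its own formula is not shown here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)

# Deliberately unbalanced 3 x 2 design for one hypothetical protein.
cells = {("A", "X"): 12, ("A", "Y"): 8, ("B", "X"): 7,
         ("B", "Y"): 9, ("C", "X"): 5, ("C", "Y"): 6}
rows = [(g, s) for (g, s), n in cells.items() for _ in range(n)]
df = pd.DataFrame(rows, columns=["Group", "Site"])
df["NPX"] = 5.0 + 1.0 * (df["Group"] == "C") + rng.normal(0.0, 1.0, size=len(df))

# Sum-to-zero coding (the analogue of R's contr.sum) keeps type-III tests of
# main effects interpretable when an interaction term is in the model.
fit = smf.ols("NPX ~ C(Group, Sum) * C(Site, Sum)", data=df).fit()
table = anova_lm(fit, typ=3)
```

With the default treatment coding, type-III main-effect tests in a model with interactions would test a different hypothesis, so the contrast choice is part of matching R.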

LOD, study design & pathway

Python	R counterpart
`olink_lod`	`olink_lod` (`NCLOD` / `FixedLOD`)
`olink_plate_randomizer`	`olink_plate_randomizer`
`olink_pathway_enrichment`	`olink_pathway_enrichment` (self-contained GSEA / ORA)
`read_gmt`	(helper — load gene sets)
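To show the idea behind over-representation analysis (ORA) on gene sets loaded from a GMT file, here is a small self-contained sketch. It assumes the standard GMT layout (set name, description, then genes, tab-separated) and a plain hypergeometric test; `parse_gmt` and `ora` are hypothetical helpers, and the package's own enrichment may differ in details such as the background definition or multiple-testing adjustment.

```python
import io
from scipy.stats import hypergeom

# Two made-up gene sets in GMT layout: name <TAB> description <TAB> genes...
GMT_TEXT = ("SET_A\tdescription\tIL6\tIL8\tTNF\n"
            "SET_B\tdescription\tCCL2\tCXCL10\n")


def parse_gmt(handle):
    """Parse GMT lines into {set name: set of gene symbols}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        gene_sets[fields[0]] = {g for g in fields[2:] if g}
    return gene_sets


def ora(significant, universe, gene_sets):
    """One-sided hypergeometric p-value per gene set."""
    universe = set(universe)
    significant = set(significant) & universe
    M, N = len(universe), len(significant)
    pvals = {}
    for name, genes in gene_sets.items():
        genes = genes & universe
        n = len(genes)                   # set members that were measured
        k = len(genes & significant)     # set members that are significant
        # P(X >= k) for X ~ Hypergeom(M, n, N)
        pvals[name] = float(hypergeom.sf(k - 1, M, n, N)) if n else float("nan")
    return pvals


gene_sets = parse_gmt(io.StringIO(GMT_TEXT))
universe = {"IL6", "IL8", "TNF", "CCL2", "CXCL10", "IL10", "IL1B", "VEGFA"}
pvals = ora({"IL6", "IL8", "TNF"}, universe, gene_sets)
```

Restricting each gene set to the measured universe matters for targeted panels, where only a fraction of any pathway is assayed.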

Plotting (matplotlib)

Python	R counterpart
`olink_volcano_plot`	`olink_volcano_plot`
`olink_qc_plot`	`olink_qc_plot`
`olink_boxplot`	`olink_boxplot`
`olink_dist_plot`	`olink_dist_plot`
`olink_pca_plot`	`olink_pca_plot` (`sklearn.decomposition.PCA`)
`olink_umap_plot`	`olink_umap_plot` (`umap-learn`, PCA fallback)
`olink_heatmap_plot`	`olink_heatmap_plot`
`olink_lmer_plot`	`olink_lmer_plot`
`olink_pathway_heatmap`	`olink_pathway_heatmap`
`olink_pathway_visualization`	`olink_pathway_visualization`
`olink_display_plate_distributions`	`olink_displayPlateDistributions` ✅
`olink_display_plate_layout`	`olink_displayPlateLayout` ✅
`olink_pal`, `set_plot_theme`, `olink_color_discrete`, `olink_fill_discrete`, `olink_color_gradient`, `olink_fill_gradient`	same names

Not Python functions

R name	Reason
`%>%`	R magrittr pipe — a language operator, not a function to port
`manifest`, `npx_data1`, `npx_data2`	bundled example datasets, not functions

Every other function in R OlinkAnalyze 3.8.2 has a Python counterpart in the tables above.

R-parity

tests/test_r_parity.py (auto-skipped if OlinkAnalyze isn't installed in the CMAP R env) compares against OlinkAnalyze 3.8.2:

Quantity	Result
`olink_ttest` `estimate` (mean diff)	`atol=1e-8`
`olink_ttest` `statistic` / `p.value`	Pearson r > 0.99
`olink_wilcox` `statistic` / `p.value`	Pearson |r| > 0.99 (sign depends on group ordering)
`olink_lmer` F-vs-t² / `p.value`	Pearson r > 0.95
`olink_anova` F-statistic / `p.value`	Pearson r = 1.0000 (50 proteins)
`olink_one_non_parametric` Kruskal stat / `p.value`	Pearson r = 1.0000 (50 proteins)
`olink_bridge_selector` selected sample set	100 % overlap with R
`olink_lod` below-LOD flags	> 95 % agreement

Benchmark

200 proteins × 32 samples, 2 groups:

python examples/benchmark.py --runs 2

Typical Python pipeline wall-time:

Function	Python (ms)
`olink_ttest`	~400
`olink_wilcox`	~255

(LMM timing is not listed: its wall-time is dominated by statsmodels' per-protein fit, and n_jobs parallelism is flagged as a v0.2 item.)

Notes on the algorithm match

t-test: Welch unequal-variance with the Satterthwaite DF formula. scipy.stats.ttest_ind(equal_var=False) matches R t.test(var.equal=FALSE) exactly.
Wilcoxon: Asymptotic Mann-Whitney U with continuity correction (scipy.stats.mannwhitneyu(use_continuity=True, method='asymptotic')) matches R wilcox.test(exact=FALSE, correct=TRUE). Both R's W and scipy's statistic are the U of whichever group is passed first, and the two U values sum to n1·n2, so if the group ordering differs between the two tools the statistics correlate at Pearson r ≈ -1 instead of +1. The two-sided p-values are unaffected.
LMM: fitted with statsmodels' MixedLM. Note that MixedLM.fit itself defaults to REML (reml=True), as does lme4::lmer (REML=TRUE); pass reml=False to fit by ML and match lme4::lmer(REML=FALSE). Under REML, fixed-effect coefficients agree at ~1e-5.
BH adjustment: scipy.stats.false_discovery_control(method='bh') matches stats::p.adjust(method='BH') exactly.

Reproducing R results exactly

# Requires OlinkAnalyze in the CMAP R env
pytest tests/test_r_parity.py -v

Relationship to omicverse

Developed upstream in omicverse:

Canonical implementation: omicverse.protein.tl.de(adata, method='ttest', platform='olink')
Standalone mirror (this repo): same code, same API, minus the omicverse packaging.

Citation

If you use this package, please cite the upstream OlinkAnalyze package:

Olink Proteomics AB. OlinkAnalyze: Facilitate Analysis of Proteomic Data from Olink. R package version 5.0.0. https://cran.r-project.org/package=OlinkAnalyze

…and acknowledge omicverse / this repo for the Python port.

License

AGPL-3.0 — matches the upstream CRAN package.

Version 0.2.1, released May 20, 2026.
