Pure-Python port of the R/CRAN package dowser - B-cell receptor phylogenetics: lineage trees, measurable evolution, migration/differentiation tests (Immcantation).
py-dowser
Pure-Python port of the R/CRAN package dowser — the B-cell receptor phylogenetics toolkit of the Immcantation framework (Hoehn, Pybus & Kleinstein, PLoS Comput. Biol. 2022).
pydowser builds and analyses B-cell lineage trees from clonal AIRR data: lineage-tree inference, measurable-evolution tests, and migration / differentiation / class-switching tests — all in pure Python (numpy / scipy / pandas / matplotlib + biopython). No rpy2, no external phylogenetics binaries (IgPhyML / RAxML / PHYLIP): the tree building and ancestral reconstruction are re-implemented from scratch.
```python
import pydowser as dw

airr = dw.load_example_airr()                          # dowser's ExampleAirr
clones = dw.formatClones(airr, traits=["sample_id"])   # -> per-clone AirrClone objects
trees = dw.getTrees(clones, build="pratchet")          # maximum-parsimony lineage trees
dw.plotTrees(trees, trait="sample_id")                 # matplotlib lineage figures
```
Installation
```shell
pip install pydowser
```
Depends on the already-ported pyalakazam (sequence masking, duplicate collapsing, gene-call parsing) and pyshazam (IMGT region boundaries).
What is ported
| dowser function | pydowser | notes |
|---|---|---|
| formatClones, makeAirrClone | formatClones, makeAirrClone | gap/end masking, duplicate collapse, trait handling, IMGT regions |
| getTrees (build="pratchet") | getTrees, buildPratchet | maximum parsimony: faithful, parsimony score exact |
| getTrees (build="pml") | getTrees, buildPML | maximum likelihood: tractable JC approximation (see caveats) |
| findSwitches | findSwitches, countSwitches | parsimony trait reconstruction (see caveats) |
| testPS, testSP, testSC | testPS, testSP, testSC | PS / SP / SC migration & differentiation tests |
| correlationTest | correlationTest, runCorrelationTest | root-to-tip date-randomisation test |
| getBootstraps | getBootstraps | non-parametric bootstrap node support |
| collapseNodes, scaleBranches, resolvePolytomies | same names | tree editing |
| getDivergence, getPathLengths, getNodeSeq, getSeq | same names | tree metrics & sequence retrieval |
| calcRF, rerootTree, readFasta | same names | Robinson-Foulds, re-rooting, FASTA I/O |
| plotTrees, getPalette | plotTrees, plotTree, getPalette | matplotlib lineage figures |
The parsimony engine (fitch_score, sankoff_score, pratchet, acctran, ancestral_pars, fitch_states) is exposed directly.
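To make the "exact" part of the parsimony engine concrete, here is a minimal standalone sketch of Fitch scoring on a rooted binary tree. It is not pydowser's fitch_score: the names toy_site_score and toy_alignment_score and the dict-based tree layout are hypothetical, chosen only for illustration, and ambiguous characters are not handled.

```python
def toy_site_score(children, root, tip_states):
    """Minimum number of state changes for one site (Fitch bottom-up pass).

    children   : dict mapping each internal node to its (left, right) children
    tip_states : dict mapping each tip name to a set of observed states
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if node not in children:              # tips carry their observed state
            return set(tip_states[node])
        left, right = (state_set(c) for c in children[node])
        shared = left & right
        if shared:                            # children agree: no change needed
            return shared
        changes += 1                          # children disagree: one change
        return left | right

    state_set(root)
    return changes


def toy_alignment_score(children, root, alignment):
    """Sum the per-site Fitch scores over an alignment of equal-length strings."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        toy_site_score(children, root, {tip: {seq[i]} for tip, seq in alignment.items()})
        for i in range(n_sites)
    )


# Toy tree ((A,B),(C,D)) with a four-site alignment.
children = {"root": ("n1", "n2"), "n1": ("A", "B"), "n2": ("C", "D")}
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"}
score = toy_alignment_score(children, "root", alignment)
```

For this toy alignment the score is 3: one change at the first site and two at the last. The score for a fixed topology is computed exactly; only the search over topologies is heuristic.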
R-parity
Validated against dowser 2.4.1 on its bundled ExampleAirr dataset (tests/test_r_parity.py, skipped automatically when R is unavailable):
- formatClones: per-clone sequence counts and germline lengths match R exactly.
- getTrees / parsimony: the maximum-parsimony score matches R exactly for the great majority of clones; for a small fraction of large, hard clones the heuristic ratchet search ends at most one step above the global optimum (topology may differ for ambiguous clones; Robinson-Foulds distance asserted small).
- getDivergence: root-to-tip divergences match R closely.
- correlationTest: the observed correlation and slope match R; the permutation p-value matches within Monte-Carlo tolerance.
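The "Monte-Carlo tolerance" on the permutation p-value follows from how a date-randomisation test works: the observed statistic is deterministic, while the null distribution comes from random shuffles. The sketch below illustrates this with plain NumPy. It is a simplified illustration, not pydowser's correlationTest: the function name toy_date_randomisation_test is hypothetical, and it shuffles dates uniformly across tips, which is an assumption and may be less structured than the real test.

```python
import numpy as np


def toy_date_randomisation_test(dates, divergence, n_perm=1000, seed=0):
    """Correlate sampling date with root-to-tip divergence and permute the dates."""
    rng = np.random.default_rng(seed)
    dates = np.asarray(dates, dtype=float)
    divergence = np.asarray(divergence, dtype=float)

    # Observed statistics: deterministic, so they can be compared exactly.
    r_obs = np.corrcoef(dates, divergence)[0, 1]
    slope = np.polyfit(dates, divergence, 1)[0]

    # Null distribution: correlation after shuffling dates across tips.
    null = np.array([
        np.corrcoef(rng.permutation(dates), divergence)[0, 1]
        for _ in range(n_perm)
    ])

    # One-sided p-value; the +1 terms keep it strictly above zero.
    p_value = (np.sum(null >= r_obs) + 1) / (n_perm + 1)
    return r_obs, slope, p_value


# Made-up example: three time points, divergence increasing with time.
dates = [0, 0, 7, 7, 14, 14]
divergence = [0.010, 0.012, 0.020, 0.022, 0.030, 0.033]
r_obs, slope, p_value = toy_date_randomisation_test(dates, divergence)
```

Rerunning with a different seed changes p_value slightly but leaves r_obs and slope untouched, which is why only the p-value is compared within a tolerance.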
Phylogenetics caveats (read this)
dowser delegates tree building to external binaries (IgPhyML, RAxML, PHYLIP dnapars/dnaml). pydowser re-implements the algorithms in pure Python, with these honest limitations:
- Maximum parsimony (build="pratchet") is the R-parity reference. The Fitch parsimony score is exact; the parsimony-ratchet topology search (NNI + SPR + bootstrap-reweighting perturbations) is a heuristic. For clones with many equally parsimonious topologies the recovered tree may differ from R's, and very large clones may occasionally land one step above the global optimum.
- Maximum likelihood (build="pml") is a tractable approximation: Felsenstein pruning under the Jukes-Cantor model with NNI search and golden-section branch optimisation. It is not a bit-for-bit reproduction of phangorn::optim.pml's GTR+gamma optimiser.
- findSwitches reconstructs discrete trait states by maximum parsimony (Fitch down/up pass) rather than IgPhyML's likelihood model. The PS / SP / SC statistics and their permutation nulls are reproduced faithfully; only the likelihood-weighted tie-breaking of ambiguous states differs.
- Trees are stored exactly as in ape (1-based edge matrix, tip.label, Nnode, nodes), so results interoperate cleanly with the rest of the workflow.
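As a rough illustration of two of the points above, the sketch below evaluates a Jukes-Cantor likelihood by Felsenstein pruning on a tree laid out in the ape convention (tips numbered 1..n, root n+1, one (parent, child) row per edge). It is a minimal sketch under stated assumptions, not pydowser's buildPML: the dict keys, the edge_length field and the function names are hypothetical, and no topology search or branch optimisation is attempted.

```python
import numpy as np

BASES = "ACGT"


def jc_transition(t):
    """4x4 Jukes-Cantor transition matrix for a branch of length t (subs/site)."""
    e = np.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return np.full((4, 4), diff) + np.eye(4) * (same - diff)


def jc_log_likelihood(tree, alignment):
    """Log-likelihood of an ungapped alignment under JC via Felsenstein pruning."""
    n_tip = len(tree["tip_label"])

    # Group edges by parent so each internal node knows its children.
    children = {}
    for (parent, child), t in zip(tree["edge"], tree["edge_length"]):
        children.setdefault(int(parent), []).append((int(child), float(t)))

    def partial(node):
        """Array (n_sites, 4): P(data below node | state at node)."""
        if node <= n_tip:                          # tip: indicator of observed base
            seq = alignment[tree["tip_label"][node - 1]]
            out = np.zeros((len(seq), 4))
            out[np.arange(len(seq)), [BASES.index(b) for b in seq]] = 1.0
            return out
        out = 1.0
        for child, t in children[node]:            # multiply over child branches
            out = out * (partial(child) @ jc_transition(t).T)
        return out

    root = n_tip + 1                               # ape convention: root is n_tip + 1
    site_lik = partial(root) @ np.full(4, 0.25)    # uniform base frequencies at root
    return float(np.log(site_lik).sum())


# Toy tree ((A,B),C): tips 1-3, root 4, internal node 5.
tree = {
    "edge": [(4, 5), (5, 1), (5, 2), (4, 3)],
    "edge_length": [0.10, 0.05, 0.05, 0.20],
    "tip_label": ["A", "B", "C"],
    "Nnode": 2,
}
loglik = jc_log_likelihood(tree, {"A": "ACGT", "B": "ACGA", "C": "TCGA"})
```

A full maximum-likelihood build would wrap a function like this in a search that proposes NNI rearrangements and tunes each branch length with a one-dimensional optimiser.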
Reference
Hoehn KB, Pybus OG, Kleinstein SH (2022). Phylogenetic analysis of migration, differentiation, and class switching in B cells. PLoS Computational Biology. https://doi.org/10.1371/journal.pcbi.1009885
License
GNU Affero General Public License v3 — the same license as the upstream dowser package. Original dowser © the dowser authors (Kleinstein lab, Immcantation framework).