
Pure-Python port of the R/CRAN package dowser - B-cell receptor phylogenetics: lineage trees, measurable evolution, migration/differentiation tests (Immcantation).


py-dowser

Pure-Python port of the R/CRAN package dowser — the B-cell receptor phylogenetics toolkit of the Immcantation framework (Hoehn, Pybus & Kleinstein, PLoS Comput. Biol. 2022).

pydowser builds and analyses B-cell lineage trees from clonal AIRR data: lineage-tree inference, measurable-evolution tests, and migration / differentiation / class-switching tests — all in pure Python (numpy / scipy / pandas / matplotlib + biopython). No rpy2, no external phylogenetics binaries (IgPhyML / RAxML / PHYLIP): the tree building and ancestral reconstruction are re-implemented from scratch.

import pydowser as dw

airr   = dw.load_example_airr()                       # dowser's ExampleAirr
clones = dw.formatClones(airr, traits=["sample_id"])   # -> per-clone AirrClone objects
trees  = dw.getTrees(clones, build="pratchet")         # maximum-parsimony lineage trees
dw.plotTrees(trees, trait="sample_id")                 # matplotlib lineage figures

Installation

pip install pydowser

Depends on the already-ported pyalakazam (sequence masking, duplicate collapsing, gene-call parsing) and pyshazam (IMGT region boundaries).

What is ported

| dowser function | pydowser | notes |
| --- | --- | --- |
| formatClones, makeAirrClone | formatClones, makeAirrClone | gap/end masking, duplicate collapse, trait handling, IMGT regions |
| getTrees (build="pratchet") | getTrees, buildPratchet | maximum parsimony; faithful, parsimony score exact |
| getTrees (build="pml") | getTrees, buildPML | maximum likelihood; tractable JC approximation (see caveats) |
| findSwitches | findSwitches, countSwitches | parsimony trait reconstruction (see caveats) |
| testPS, testSP, testSC | testPS, testSP, testSC | PS / SP / SC migration & differentiation tests |
| correlationTest | correlationTest, runCorrelationTest | root-to-tip date-randomisation test |
| getBootstraps | getBootstraps | non-parametric bootstrap node support |
| collapseNodes, scaleBranches, resolvePolytomies | same names | tree editing |
| getDivergence, getPathLengths, getNodeSeq, getSeq | same names | tree metrics & sequence retrieval |
| calcRF, rerootTree, readFasta | same names | Robinson-Foulds, re-rooting, FASTA I/O |
| plotTrees, getPalette | plotTrees, plotTree, getPalette | matplotlib lineage figures |
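
calcRF reports the Robinson-Foulds distance between two trees. The idea can be sketched as follows. This is a minimal illustration, not pydowser's implementation: the nested-tuple tree encoding and the names `clades` and `rf_distance_sketch` are assumptions made for this example, and it compares rooted clades, whereas a library routine may compare unrooted bipartitions and so return different values.

```python
def clades(tree, out=None):
    """Collect the tip set below every internal node of a nested-tuple tree.

    A tip is a string; an internal node is a tuple of child subtrees.
    Returns (tips_below_this_node, set_of_all_clades_found_so_far).
    """
    if out is None:
        out = set()
    if isinstance(tree, str):          # tip: contributes no clade of its own
        return frozenset([tree]), out
    tips = frozenset()
    for child in tree:
        child_tips, _ = clades(child, out)
        tips |= child_tips
    out.add(tips)                      # record the clade defined by this node
    return tips, out


def rf_distance_sketch(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    _, a = clades(tree_a)
    _, b = clades(tree_b)
    return len(a ^ b)


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
same = rf_distance_sketch(t1, t1)      # identical topologies share every clade
diff = rf_distance_sketch(t1, t2)      # {a,b} and {a,c} are each unique to one tree
```

A distance of zero means the two topologies define the same set of clades; each clade found in only one tree adds one to the distance.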

The parsimony engine (fitch_score, sankoff_score, pratchet, acctran, ancestral_pars, fitch_states) is exposed directly.
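
To make concrete what a Fitch parsimony score measures, here is a minimal sketch. It does not call pydowser: the nested-tuple tree encoding and the names `fitch_site_score` and `fitch_score_sketch` are assumptions made for illustration, and the signature of the library's own `fitch_score` may differ.

```python
from typing import Union

Tree = Union[str, tuple]  # a tip is its name; an internal node is a (left, right) tuple


def fitch_site_score(tree: Tree, states: dict) -> tuple:
    """Return (candidate_state_set, changes) for one alignment column."""
    if isinstance(tree, str):                      # tip: observed state, no changes
        return {states[tree]}, 0
    left_set, left_cost = fitch_site_score(tree[0], states)
    right_set, right_cost = fitch_site_score(tree[1], states)
    common = left_set & right_set
    if common:                                     # children agree: keep the intersection
        return common, left_cost + right_cost
    # children disagree: take the union and count one change
    return left_set | right_set, left_cost + right_cost + 1


def fitch_score_sketch(tree: Tree, seqs: dict) -> int:
    """Sum the per-site Fitch scores over all columns of an alignment."""
    length = len(next(iter(seqs.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in seqs.items()}
        total += fitch_site_score(tree, column)[1]
    return total


tree = (("germline", "seq1"), ("seq2", "seq3"))
seqs = {"germline": "ACGT", "seq1": "ACGA", "seq2": "ATGA", "seq3": "ATGA"}
score = fitch_score_sketch(tree, seqs)  # one change at column 2, one at column 4
```

The score depends only on the topology and the tip sequences, which is why it can be compared exactly between implementations even when the topology search itself is heuristic.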

R-parity

Validated against dowser 2.4.1 on its bundled ExampleAirr dataset (tests/test_r_parity.py, skipped automatically when R is unavailable):

  • formatClones — per-clone sequence counts and germline lengths match R exactly.
  • getTrees / parsimony — the maximum-parsimony score matches R exactly for the great majority of clones; for a small fraction of large, hard clones the heuristic ratchet search ends at most one step above the global optimum (topology may differ for ambiguous clones — Robinson-Foulds distance asserted small).
  • getDivergence — root-to-tip divergences match R closely.
  • correlationTest — the observed correlation and slope match R; the permutation p-value matches within Monte-Carlo tolerance.
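
The "Monte-Carlo tolerance" on the correlationTest p-value comes from the random permutations themselves. A simplified date-randomisation test can be sketched as below. This is an illustration under stated assumptions rather than the library's procedure: the function name `date_randomisation_sketch`, the uniform shuffling of sampling times among tips, the one-sided comparison, and the add-one correction are all choices made for this example.

```python
import numpy as np


def date_randomisation_sketch(divergence, time, n_perm=1000, seed=0):
    """One-sided permutation test for a positive divergence/time correlation."""
    divergence = np.asarray(divergence, dtype=float)
    time = np.asarray(time, dtype=float)
    rng = np.random.default_rng(seed)

    observed = np.corrcoef(time, divergence)[0, 1]   # Pearson correlation
    slope = np.polyfit(time, divergence, 1)[0]       # least-squares slope

    # Null distribution: shuffle sampling times among tips and recompute.
    null = np.array([
        np.corrcoef(rng.permutation(time), divergence)[0, 1]
        for _ in range(n_perm)
    ])
    # Add-one correction (an assumption of this sketch) keeps p above zero.
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, slope, p_value


# Toy data: three time points, divergence rising with time.
time = [0, 0, 0, 7, 7, 7, 14, 14, 14]
divergence = [0.010, 0.012, 0.011, 0.020, 0.022, 0.019, 0.031, 0.029, 0.033]
corr, slope, p = date_randomisation_sketch(divergence, time)
```

The observed correlation and slope are deterministic, so they can match another implementation closely; the p-value varies with the seed and the number of permutations.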

Phylogenetics caveats (read this)

dowser delegates tree building to the R package phangorn (build="pratchet", build="pml") or to external binaries (IgPhyML, RAxML, PHYLIP dnapars/dnaml). pydowser re-implements the algorithms in pure Python, with the following limitations:

  1. Maximum parsimony (build="pratchet") is the R-parity reference. The Fitch parsimony score is exact; the parsimony-ratchet topology search (NNI + SPR + bootstrap-reweighting perturbations) is a heuristic — for clones with many equally-parsimonious topologies the recovered tree may differ from R's, and very large clones may occasionally land one step above the global optimum.
  2. Maximum likelihood (build="pml") is a tractable approximation: Felsenstein pruning under the Jukes-Cantor model with NNI search and golden-section branch optimisation. It is not a bit-for-bit reproduction of phangorn::optim.pml's GTR+gamma optimiser.
  3. findSwitches reconstructs discrete trait states by maximum parsimony (Fitch down/up pass) rather than IgPhyML's likelihood model. The PS / SP / SC statistics and their permutation nulls are reproduced faithfully; only the likelihood-weighted tie-breaking of ambiguous states differs.
  4. Trees are stored exactly as in ape (1-based edge matrix, tip.label, Nnode, nodes), so results interoperate cleanly with the rest of the workflow.
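
The ingredients named in item 2 (Felsenstein pruning under Jukes-Cantor, plus golden-section branch optimisation) can be sketched on a two-tip tree. This is a minimal illustration, not pydowser's code: the tree encoding and the names `jc_matrix`, `partial`, `log_likelihood` and `golden_section_max` are assumptions made for this example, and the NNI topology search is omitted.

```python
import math
import numpy as np

INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def jc_matrix(t):
    """Jukes-Cantor transition probabilities for branch length t (subs/site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return np.full((4, 4), diff) + np.eye(4) * (same - diff)


def partial(node, site):
    """Conditional likelihood vector at a node for one site (pruning step).

    A tip is a sequence string; an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        vec = np.zeros(4)
        vec[INDEX[node[site]]] = 1.0
        return vec
    vec = np.ones(4)
    for child, length in node:
        vec *= jc_matrix(length) @ partial(child, site)
    return vec


def log_likelihood(root, n_sites):
    """Sum of per-site log-likelihoods with uniform base frequencies."""
    return sum(math.log(0.25 * partial(root, s).sum()) for s in range(n_sites))


def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
    return (a + b) / 2.0


# Two sequences differing at 1 of 4 sites; optimise the total path length t.
def pair_ll(t):
    return log_likelihood([("ACGT", t / 2), ("ACGA", t / 2)], 4)


t_hat = golden_section_max(pair_ll, 1e-6, 5.0)
jc_distance = -0.75 * math.log(1.0 - 4.0 * 0.25 / 3.0)  # closed-form JC distance
```

For two sequences the optimised path length should agree with the closed-form Jukes-Cantor distance, which gives a quick sanity check on the pruning and the line search.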

Reference

Hoehn KB, Pybus OG, Kleinstein SH (2022). Phylogenetic analysis of migration, differentiation, and class switching in B cells. PLoS Computational Biology. https://doi.org/10.1371/journal.pcbi.1009885

License

GNU Affero General Public License v3 — the same license as the upstream dowser package. Original dowser © the dowser authors (Kleinstein lab, Immcantation framework).
