hstrat enables phylogenetic inference on distributed digital evolution populations

Install

python3 -m pip install hstrat

Features

hstrat enables robust, efficient extraction of evolutionary history from evolutionary simulations where centralized, direct phylogenetic tracking is not feasible: namely, large-scale, decentralized parallel/distributed evolutionary simulations in which agents' evolutionary lineages migrate among many cooperating processors over the course of simulation.

hstrat can

  • accurately estimate time since the most recent common ancestor (MRCA) of two or more digital agents, even for uneven branch lengths
  • reconstruct phylogenetic trees for entire populations of evolving digital agents
  • serialize genome annotations to/from text and binary formats
  • provide low-footprint genome annotations (e.g., as small as 64 bits each in practical configurations)
  • be directly configured to satisfy memory use limits and/or inference accuracy requirements

hstrat operates just as well in single-processor simulation, but direct phylogenetic tracking using a tool like phylotrackpy should usually be preferred in such cases due to its capability for perfect record-keeping given centralized global simulation observability.

Example Usage

This code briefly demonstrates:

  1. initialization of a population of HereditaryStratigraphicColumn objects,
  2. generation-to-generation transmission of HereditaryStratigraphicColumn objects with simple synchronous turnover, and then
  3. reconstruction of phylogenetic history from the final population of HereditaryStratigraphicColumn objects.

from random import choice as rchoice
import alifedata_phyloinformatics_convert as apc
from hstrat import hstrat; print(f"{hstrat.__version__=}")  # when last run?
from hstrat._auxiliary_lib import seed_random; seed_random(1)  # reproducibility

# initialize a small population of hstrat instrumentation
# (in full simulations, each column would be attached to an individual genome)
population = [hstrat.HereditaryStratigraphicColumn() for __ in range(5)]

# evolve population for 40 generations under drift
for _generation in range(40):
    population = [rchoice(population).CloneDescendant() for __ in population]

# reconstruct estimate of phylogenetic history
alifestd_df = hstrat.build_tree(population, version_pin=hstrat.__version__)
tree_ascii = apc.RosettaTree(alifestd_df).as_dendropy.as_ascii_plot(width=20)
print(tree_ascii)

hstrat.__version__='1.8.8'
              /--- 1
          /---+
       /--+   \--- 3
       |  |
   /---+  \------- 2
   |   |
+--+   \---------- 0
   |
   \-------------- 4

In actual usage, each hstrat column would be bundled with the underlying genetic material of interest in the simulation: entire genomes or, in systems with sexual recombination, individual genes. The hstrat columns are designed to operate as a neutral genetic annotation, enhancing observability of the simulation without affecting its outcome.
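
As a rough illustration of that bundling, the sketch below pairs a payload with a lineage annotation inside one genome object. The names `Genome`, `payload`, `create_offspring`, and `DepthCounter` are hypothetical and not part of hstrat. The only interface assumed of the annotation is a `CloneDescendant()` method, as used in the example above. `DepthCounter` is a stand-in that keeps the sketch runnable without hstrat installed. In a real simulation, one would presumably pass `hstrat.HereditaryStratigraphicColumn()` instead.

```python
import random


class Genome:
    """Hypothetical genome bundling a payload with a lineage annotation."""

    def __init__(self, payload, annotation):
        self.payload = payload        # genetic material of interest
        self.annotation = annotation  # neutral: never consulted by selection

    def create_offspring(self, rng):
        """Mutate the payload and advance the annotation by one generation."""
        mutated = self.payload + rng.gauss(0.0, 1.0)
        return Genome(mutated, self.annotation.CloneDescendant())


class DepthCounter:
    """Stand-in annotation (assumption) that only counts generations elapsed."""

    def __init__(self, depth=0):
        self.depth = depth

    def CloneDescendant(self):
        return DepthCounter(self.depth + 1)


rng = random.Random(1)
population = [Genome(0.0, DepthCounter()) for __ in range(5)]

# synchronous turnover under drift, mirroring the loop in the example above
for _generation in range(40):
    population = [
        rng.choice(population).create_offspring(rng) for __ in population
    ]

print({genome.annotation.depth for genome in population})  # {40}
```

Because the annotation is only copied and extended, never read when choosing parents, it cannot influence which lineages survive.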

How it Works

To enable phylogenetic inference over fully distributed evolutionary simulation, hereditary stratigraphy adopts a paradigm akin to phylogenetic work in natural history/biology. In these fields, phylogenetic history is inferred through comparisons among genetic material of extant organisms; in broad terms, phylogenetic relatedness is established through the extent of genetic similarity between organisms. Phylogenetic tracking through hstrat is similarly achieved through analysis of similarity/dissimilarity among genetic material sampled over populations of interest.

Unlike natural genetic material, which varies through random mutation, the genetic material used by hstrat is structured through hereditary stratigraphy. This methodology, described fully in our documentation, provides strong guarantees on phylogenetic inferential power, minimizes memory footprint, and allows efficient reconstruction procedures.

See here for more detail on underlying hereditary stratigraphy methodology.
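
The comparison-based idea can be sketched with a toy model. This is an illustrative simplification, not hstrat's implementation: `ToyColumn`, `clone_descendant`, and `last_common_generation` are hypothetical names, and the 64-bit random tag appended each generation is an assumption made for the sketch. Each column inherits its parent's tags and appends a fresh one, so two columns share tags exactly up to the generation of their most recent common ancestor.

```python
import random


class ToyColumn:
    """Toy stand-in for a lineage annotation; illustrative only."""

    def __init__(self, rng, tags=None):
        self._rng = rng
        # one random 64-bit tag is appended per generation elapsed
        self.tags = list(tags) if tags is not None else [rng.getrandbits(64)]

    def clone_descendant(self):
        """Copy the inherited tags, then append a fresh random tag."""
        return ToyColumn(self._rng, self.tags + [self._rng.getrandbits(64)])


def last_common_generation(first, second):
    """Return the last generation at which both columns' tags match.

    Returns None if even the founding tags differ, which suggests the
    two columns share no common ancestor.
    """
    last_match = None
    for generation, (a, b) in enumerate(zip(first.tags, second.tags)):
        if a != b:
            break
        last_match = generation
    return last_match


rng = random.Random(1)
founder = ToyColumn(rng)                                   # generation 0
ancestor = founder.clone_descendant().clone_descendant()   # generation 2
left = ancestor.clone_descendant().clone_descendant()      # generation 4
right = ancestor.clone_descendant()                        # generation 3
unrelated = ToyColumn(rng)                                 # separate founder

print(last_common_generation(left, right))
print(last_common_generation(left, unrelated))
```

This toy keeps every tag, so its memory use grows with each generation. Limiting memory use, as hstrat is designed to allow, would mean discarding some tags, in which case the comparison yields a range of possible MRCA generations rather than an exact value.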

Getting Started

Refer to our documentation for a quickstart guide and an annotated end-to-end usage example.

The examples/ folder provides extensive usage examples, including

  • incorporation of hstrat annotations into a custom genome class,
  • automatic stratum retention policy parameterization,
  • pairwise and population-level phylogenetic inference, and
  • phylogenetic tree reconstruction.

In the documentation, interested users can find an explanation of how the hereditary stratigraphy methodology implemented by hstrat works "under the hood," information on project-specific hstrat configuration, and a full API listing for the hstrat package.

Citing

If hstrat software or hereditary stratigraphy methodology contributes to a scholarly work, please cite it according to the references provided here. We would love to list your project using hstrat in our documentation; see more here.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
