Basic history DAG implementation

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: GNU General Public License (GPL)
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

HistoryDAG

This package provides an implementation of the history DAG, an object that compactly represents a collection of internally labeled trees that share a common set of leaf labels.

Getting Started

HistoryDAG is on PyPI! Install with pip install historydag.

Alternatively, clone the repo and run pip install -e historydag.

The most common input for DAG construction is collections of ete3 phylogenetic trees with full sequences at internal nodes stored in the sequence attribute. There is sample data like this in sample_data/. For example:

import historydag as hdag
import pickle

with open('sample_data/toy_trees.p', 'rb') as fh:
    ete_trees = pickle.load(fh)

# Build the DAG, specifying to only use the `sequence` attribute for node
# labels (in general, one could use other attributes as well).
dag = hdag.history_dag_from_etes(ete_trees, ['sequence'])
dag.count_histories()  # 1041

# "Complete" the DAG, adding all allowable edges.
dag.make_complete()
dag.count_histories()  # 3431531

# Show counts of trees with various parsimony scores.
dag.hamming_parsimony_count()

# "Trim" the DAG to make it only display minimum-weight trees.
dag.trim_optimal_weight()
# With default args, same as hamming_parsimony_count
dag.weight_count()  # Counter({75: 45983})

# "Collapse" the DAG, contracting zero-weight edges.
dag.convert_to_collapsed()

dag.weight_count()  # Counter({75: 1208})
dag.count_topologies()  # 1054 unique topologies, ignoring internal labels

# To count parsimony score and the number of unique nodes in each tree jointly:
node_count_funcs = hdag.utils.AddFuncDict(
    {
        "start_func": lambda n: 0,
        "edge_weight_func": lambda n1, n2: n1.label != n2.label,
        "accum_func": sum,
    },
    name="NodeCount",
)
dag.weight_count(**(node_count_funcs + hdag.utils.hamming_distance_countfuncs))
# Counter({(50, 75): 444, (51, 75): 328, (49, 75): 270, (52, 75): 94, (48, 75): 68, (53, 75): 4})

# To trim to only the trees with 48 unique node labels:
dag.trim_optimal_weight(**node_count_funcs, optimal_func=min)

# Sample a tree from the dag and make it an ete tree.
t = dag.sample().to_ete()

# The history DAG also supports indexing and iterating:
t = dag[0].to_ete()
trees = [tree for tree in dag]

# Another method for fetching all trees in the dag is provided, but the order
# will not match index order:
scrambled_trees = list(dag.get_histories())

# Union is implemented as DAG merging, including with sequences of DAGs.
newdag = dag[0] | dag[1]
newdag = dag[0] | (dag[i] for i in range(3, 5))
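
The parsimony scores counted above are sums of Hamming distances between parent and child sequences over the edges of a tree. The following is a minimal, library-independent sketch of that quantity on a toy tree. ToyNode, hamming_distance, and parsimony_score are hypothetical names introduced for illustration and are not part of the historydag API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyNode:
    """Hypothetical stand-in for a tree node labeled by a sequence."""
    sequence: str
    children: List["ToyNode"] = field(default_factory=list)


def hamming_distance(s1: str, s2: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s1, s2))


def parsimony_score(node: ToyNode) -> int:
    """Sum of Hamming distances over every parent-child edge below `node`."""
    return sum(
        hamming_distance(node.sequence, child.sequence) + parsimony_score(child)
        for child in node.children
    )


# Toy tree: root AAA with children AAT (itself parent of AAT and GAT) and CAA.
toy_tree = ToyNode(
    "AAA",
    [
        ToyNode("AAT", [ToyNode("AAT"), ToyNode("GAT")]),
        ToyNode("CAA"),
    ],
)
toy_score = parsimony_score(toy_tree)  # edges contribute 1 + 0 + 1 + 1
```

A zero-weight edge such as AAT to AAT is the kind of edge that collapsing contracts.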

Highlights

History DAGs can be created with top-level functions like
- from_newick
- from_ete
- history_dag_from_newicks
- history_dag_from_etes
Trees can be extracted from the history DAG with methods like
- HistoryDag.get_histories
- HistoryDag.sample
- HistoryDag.to_ete
- HistoryDag.to_newick and HistoryDag.to_newicks
Simple history DAGs can be inspected with HistoryDag.to_graphviz
The DAG can be trimmed according to arbitrary tree weight functions. Use HistoryDag.trim_optimal_weight.
Sparse ambiguous labels can be disambiguated efficiently, although the approach does not scale well. Use HistoryDag.explode_nodes followed by HistoryDag.trim_optimal_weight.
Weights of trees in the DAG can be counted according to arbitrary weight functions using HistoryDag.weight_count. The class utils.AddFuncDict is provided to manage these function arguments, and it implements addition so that different weights can be counted jointly. The same functions can be used in trimming.
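
To illustrate what counting weights jointly means, here is a rough sketch in plain Python, assuming a weight is described by the three functions start_func, edge_weight_func, and accum_func used in the example above. The helpers combine and tree_weight, and the nested-tuple tree format, are hypothetical and written only for this illustration. This is not how utils.AddFuncDict is implemented.

```python
# Each weight is described by three functions, mirroring the keyword names
# start_func, edge_weight_func, and accum_func. Labels here are plain strings.
node_count = {
    "start_func": lambda n: 0,
    "edge_weight_func": lambda n1, n2: int(n1 != n2),
    "accum_func": sum,
}
hamming = {
    "start_func": lambda n: 0,
    "edge_weight_func": lambda n1, n2: sum(a != b for a, b in zip(n1, n2)),
    "accum_func": sum,
}


def combine(*funcs):
    """Combine weight descriptions so that the resulting weight is a tuple."""
    return {
        "start_func": lambda n: tuple(f["start_func"](n) for f in funcs),
        "edge_weight_func": lambda n1, n2: tuple(
            f["edge_weight_func"](n1, n2) for f in funcs
        ),
        # Accumulate each tuple position with its own accum_func.
        "accum_func": lambda weights: tuple(
            f["accum_func"](column) for f, column in zip(funcs, zip(*weights))
        ),
    }


def tree_weight(tree, start_func, edge_weight_func, accum_func):
    """Weight of a (label, children) tree under one weight description."""
    label, children = tree
    if not children:
        return start_func(label)
    return accum_func(
        [
            accum_func(
                [
                    edge_weight_func(label, child[0]),
                    tree_weight(child, start_func, edge_weight_func, accum_func),
                ]
            )
            for child in children
        ]
    )


toy_tree = ("AAA", [("AAT", [("AAT", []), ("GCT", [])]), ("CAA", [])])
joint = tree_weight(toy_tree, **combine(node_count, hamming))
```

Here joint is a pair whose first entry counts edges with differing labels and whose second entry is the Hamming parsimony score, analogous to the tuple keys in the Counter shown in Getting Started.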

Important Details

In order to create a history DAG from a collection of trees, each tree should meet the following criteria:

- No unifurcations, including at the root node. Every node that is not a leaf must have at least two children.
- The label attributes used to construct history DAG labels must be unique at the leaves, because history DAG nodes that represent leaves must be labeled uniquely.
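
The two criteria above can be checked before building a DAG. The following is a minimal sketch on a toy tree type. ToyNode, find_unifurcations, and duplicate_leaf_labels are hypothetical helpers written for this illustration and are not provided by historydag or ete3.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyNode:
    """Hypothetical stand-in for a labeled tree node."""
    label: str
    children: List["ToyNode"] = field(default_factory=list)


def find_unifurcations(root: ToyNode) -> List[str]:
    """Return the labels of all nodes that have exactly one child."""
    bad, stack = [], [root]
    while stack:
        node = stack.pop()
        if len(node.children) == 1:
            bad.append(node.label)
        stack.extend(node.children)
    return bad


def duplicate_leaf_labels(root: ToyNode) -> List[str]:
    """Return, in sorted order, leaf labels that occur more than once."""
    counts, stack = Counter(), [root]
    while stack:
        node = stack.pop()
        if not node.children:
            counts[node.label] += 1
        stack.extend(node.children)
    return sorted(label for label, n in counts.items() if n > 1)


good_tree = ToyNode("r", [ToyNode("a"), ToyNode("b")])
# Node "x" has a single child, and the leaf label "a" appears twice.
bad_tree = ToyNode("r", [ToyNode("x", [ToyNode("a")]), ToyNode("a")])
```

A tree is suitable under these two criteria when both helpers return an empty list.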

Documentation

Docs are available at https://matsengrp.github.io/historydag.

To build the docs locally, install the requirements from requirements.txt, then run make docs to build the Sphinx documentation. The output is at docs/_build/html/index.html.

Release history

- 1.3.1 (this version): Mar 16, 2026
- 1.3.0: May 3, 2024
- 1.2.0: Jan 12, 2024
- 1.1.0: Jun 8, 2023
- 1.0.1: Jul 6, 2022
- 1.0.0: May 24, 2022
- 0.1.3: Apr 1, 2022
- 0.1.2: Feb 3, 2022
- 0.1: Jan 31, 2022

Download files

Source Distribution

historydag-1.3.1.tar.gz (140.5 kB)

Uploaded Mar 16, 2026 Source

Built Distribution

historydag-1.3.1-py3-none-any.whl (106.5 kB)

Uploaded Mar 16, 2026 Python 3

File details

Details for the file historydag-1.3.1.tar.gz.

File metadata

Download URL: historydag-1.3.1.tar.gz
Upload date: Mar 16, 2026
Size: 140.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for historydag-1.3.1.tar.gz
Algorithm	Hash digest
SHA256	`f93c652756a7a39383d3a8df1e47c3ec5b820cea8e7c0caea6cd4a39484d895e`
MD5	`230d7bf498358450b6f445f1e253ac7e`
BLAKE2b-256	`4f0373291c4881f7f02a66f47f19034d6a8a22580bc4946b4b9403583383b8cd`
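
A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using only the Python standard library. The helper names sha256_of_file and matches_published_hash are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("historydag-1.3.1.tar.gz", "f93c652756a7a39383d3a8df1e47c3ec5b820cea8e7c0caea6cd4a39484d895e") should return True for an intact copy of the source distribution.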

File details

Details for the file historydag-1.3.1-py3-none-any.whl.

File metadata

Download URL: historydag-1.3.1-py3-none-any.whl
Upload date: Mar 16, 2026
Size: 106.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for historydag-1.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fef834ca11d41c3b190b3d2f8eb71187facd0d57c2ecf305114d652b433c5d05`
MD5	`7bdf763a25aa756b464c03664f5ae664`
BLAKE2b-256	`4ecf131a5c26af7a3b2a4951814653e1a382e761337faa4ccb1e1909ec7e0590`
