Generate phylogenetic datasets with minimal setup effort

Phylogenie is a Python package for simulating phylogenetic datasets, such as trees and multiple sequence alignments (MSAs), with minimal setup effort. Simply specify the distributions from which your parameters should be sampled, and Phylogenie will handle the rest!

✨ Features

Phylogenie comes packed with useful features, including:

  • Simulate tree and multiple sequence alignment (MSA) datasets from parameter distributions 🌳🧬
    Define distributions over your parameters and sample a different combination of parameters for each dataset sample.

  • Automatic metadata management 🗂️
    Phylogenie stores each parameter combination sampled during dataset generation in a .csv file.

  • Generalizable configurations 🔄
    Easily apply the same configuration across multiple dataset splits (e.g., train, validation, test).

  • Multiprocessing support ⚙️💻
    Simply specify the number of cores to use, and Phylogenie handles multiprocessing automatically.

  • Pre-implemented parameterizations 🎯
    Includes canonical, fossilized birth-death, epidemiological, birth-death with exposed-infectious (BDEI), and contact-tracing (CT) parameterizations, and more.

  • Skyline parameter support 🪜
    Support for piece-wise constant parameters.

  • Arithmetic operations on parameters 🧮
    Perform flexible arithmetic operations between parameters directly within the config file.

  • Support for common phylogenetic simulation tools 🛠️
    Compatible backends include ReMASTER, TreeSimulator, and AliSim.

  • Modular and extendible architecture 🧩
    Easily add new simulation backends as needed.
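
To make three of the features above concrete (sampling one parameter combination per dataset sample, recording each combination in a .csv file, and piece-wise constant "skyline" parameters), here is a minimal standalone sketch. It does not use Phylogenie's API or config format; the parameter names, ranges, and all function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import random
from bisect import bisect_right

# Hypothetical parameter distributions (names and ranges are illustrative only).
DISTRIBUTIONS = {
    "birth_rate": lambda rng: rng.uniform(0.5, 2.0),
    "death_rate": lambda rng: rng.uniform(0.1, 0.5),
}


def sample_parameters(n_samples, seed=0):
    """Draw one independent parameter combination per dataset sample."""
    rng = random.Random(seed)
    return [
        {"sample_id": i, **{name: draw(rng) for name, draw in DISTRIBUTIONS.items()}}
        for i in range(n_samples)
    ]


def write_metadata(rows, handle):
    """Record every sampled combination as one CSV row."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)


def skyline_value(values, change_times, t):
    """Evaluate a piece-wise constant parameter at time t.

    `values` has one more entry than `change_times`; values[0] applies before
    the first change time, values[i] from change_times[i - 1] onwards.
    """
    return values[bisect_right(change_times, t)]


rows = sample_parameters(3)
buffer = io.StringIO()
write_metadata(rows, buffer)
print(buffer.getvalue())  # header line plus one row per sample
print(skyline_value([1.5, 0.8, 0.3], [1.0, 2.0], 1.2))  # 0.8
```

In Phylogenie itself these steps are driven by the configuration file rather than by hand-written code; the sketch only shows the underlying idea.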

📦 Installation

Phylogenie requires Python 3.10 to be installed on your system. There are several ways to install Python and manage different Python versions; one popular option is pyenv.

Once you have Python set up, you can install Phylogenie directly from PyPI:

pip install phylogenie

Or install from source:

git clone https://github.com/gabriele-marino/phylogenie.git
cd phylogenie
pip install .

🛠 Backend dependencies

Phylogenie works with the following simulation backends:

  • TreeSimulator
    A Python package for simulating phylogenetic trees. It is automatically installed with Phylogenie, so you can use it right away.

  • ReMASTER
    A BEAST2 package designed for tree simulation. To use ReMASTER as a backend, you need to install it separately.

  • AliSim
    A tool for simulating multiple sequence alignments (MSAs). It is distributed with IQ-TREE and also requires separate installation if you wish to use it as a backend.
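
Before choosing a backend, it can help to check which ones are reachable from your environment. The sketch below is not part of Phylogenie. It assumes that TreeSimulator is importable as `treesimulator`, that ReMASTER is run through a BEAST2 launcher named `beast`, and that AliSim is reached through an IQ-TREE executable named `iqtree2`. These names are assumptions and may differ on your system, so adjust them as needed.

```python
import importlib.util
import shutil


def module_available(name):
    """Return True if a top-level Python module can be found for import."""
    return importlib.util.find_spec(name) is not None


def executable_available(name):
    """Return True if an executable with this name is on the PATH."""
    return shutil.which(name) is not None


def backend_report(
    modules=("treesimulator",),          # assumed import name for TreeSimulator
    executables=("beast", "iqtree2"),    # assumed launcher names for BEAST2 and IQ-TREE
):
    """Map each assumed backend entry point to whether it was found."""
    report = {f"module:{m}": module_available(m) for m in modules}
    report.update({f"executable:{e}": executable_available(e) for e in executables})
    return report


for entry, found in backend_report().items():
    print(f"{entry}: {'found' if found else 'missing'}")
```

A "missing" entry only means the assumed name was not found. ReMASTER in particular is a package installed inside BEAST2, so finding the `beast` launcher does not confirm that ReMASTER itself is installed.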

🚀 Quick Start

Once you have installed Phylogenie, check out the examples folder.
It includes a collection of thoroughly commented configuration files, organized as a step-by-step tutorial. These examples will help you understand how to use Phylogenie in practice and can be easily adapted to fit your own workflow.

For a quick start, pick your favorite config file and run Phylogenie with:

phylogenie examples/<config_file>.yaml

This command will create the output dataset in the folder specified inside the configuration file, including data directories and metadata files for each dataset split defined in the config.

Tip: Can’t choose just one config file? You can run them all at once by pointing Phylogenie to the folder! Just use: phylogenie examples. In this mode, Phylogenie automatically finds all .yaml files in the folder you specified and runs each of them.
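
If you want to drive these runs from a script (for example, to log each config separately), the folder mode can be approximated by hand. This is a sketch of the idea, not Phylogenie's implementation. The helper names are hypothetical, and it assumes that only .yaml files directly inside the folder should be run, which may not match how Phylogenie searches the folder.

```python
import subprocess
from pathlib import Path


def find_configs(folder):
    """Return the .yaml files directly inside `folder`, sorted by name."""
    return sorted(Path(folder).glob("*.yaml"))


def build_commands(folder):
    """Build one `phylogenie <config_file>` command per configuration file."""
    return [["phylogenie", str(path)] for path in find_configs(folder)]


def run_all(folder, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in build_commands(folder):
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first config whose run fails.
            subprocess.run(command, check=True)
```

Calling `run_all("examples")` only prints the commands; pass `dry_run=False` to actually execute them once Phylogenie is installed.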

📖 Documentation

  • The examples folder contains many ready-to-use, extensively commented configuration files that serve as a step-by-step tutorial to guide you through using Phylogenie. You can explore them to learn how it works or adapt them directly to your own workflows.
  • A complete user guide and API reference are under development. In the meantime, feel free to reach out if you have any questions about integrating Phylogenie into your workflows.

📄 License

This project is licensed under the MIT License.

📫 Contact

For questions, bug reports, or feature requests, please consider opening an issue on GitHub or contacting me directly.

If you need help with the configuration files, feel free to reach out; I am always happy to assist!
