OHDLF-pro: a pipeline designed to filter and address orthologous gene heterogeneity, duplication, and loss

OHDLF-pro

Introduction

OHDLF-pro is the successor to the OHDLF pipeline, designed for high-performance phylogenomic analysis. Building on OrthoFinder outputs, it automates ortholog screening, sequence alignment, and phylogenetic tree construction, with improved speed and accuracy over the original pipeline.

Compared to the standard OHDLF, the Pro version introduces critical upgrades:

  1. ⚡ Multi-threading Support: New parallel computing architecture (via the -t parameter) significantly accelerates BLAST filtering, alignment, and tree building steps.

  2. ✨ Visual Progress Tracking: Integrated dynamic animations (progress bars) provide real-time feedback on current tasks and estimated completion time.

  3. 🛡️ Robust Coalescence (Type 2): Now integrates DISCO (Decomposition Into Single-COpy gene trees) into the Coalescence workflow. DISCO handles multi-copy gene families by decomposing complex gene trees into single-copy gene trees, which yields more statistically reliable species trees than simple filtering.
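The effect of a thread count like -t can be pictured as mapping one task per orthogroup over a bounded worker pool. The sketch below is only an illustration of that idea, not OHDLF-pro's actual implementation: `run_parallel` and `fake_align` are hypothetical names, and the stand-in worker does no real alignment.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(items, worker, threads=1):
    """Apply `worker` to every item with at most `threads` workers.

    Results are returned in the same order as the input items.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(worker, items))


def fake_align(orthogroup_id):
    # Hypothetical stand-in for launching an external tool on one orthogroup
    return f"{orthogroup_id}.aln"


results = run_parallel(["OG0000001", "OG0000002", "OG0000003"], fake_align, threads=2)
print(results)
```

Threads suit this pattern when each task mostly waits on an external program; CPU-bound Python work would need processes instead.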

Dependencies

  • Biopython
  • tqdm
  • treeswift

External Bioinformatics Tools (conda recommended)

conda install -c bioconda iqtree aster blast raxml mafft

Users can download the OHDLF.yaml file to configure the environment directly.
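Before running the pipeline it can help to confirm that the external tools are reachable on PATH. The check below is a minimal sketch: the executable names in `ASSUMED_TOOLS` are assumptions (for example, IQ-TREE may be installed as `iqtree` or `iqtree2`, depending on the version), so adjust them to match your environment.

```python
import shutil

# Assumed executable names; the real names depend on the installed versions.
ASSUMED_TOOLS = ["iqtree", "astral", "blastp", "raxmlHPC", "mafft"]


def missing_tools(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(ASSUMED_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All assumed tools were found on PATH.")
```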

Install

pip install OHDLF-pro

Quick Start

Concatenation:
OHDLF-pro -l 0.05 -d 6 -s 97 -p 1 -t 8
Coalescence:
OHDLF-pro -l 0 -d 6 -s 97 -p 2 -t 8
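When the pipeline is driven from a script, the same command lines can be assembled programmatically. This is a small sketch using only the flags shown above; `build_command` is a hypothetical helper, and the default of one thread is an assumption rather than documented behaviour.

```python
import shlex


def build_command(loss, duplication, process_type, similarity=97, threads=1):
    """Build the OHDLF-pro argument list from the documented flags."""
    if process_type not in (1, 2):
        raise ValueError("process_type must be 1 (Concatenation) or 2 (Coalescence)")
    return [
        "OHDLF-pro",
        "-l", str(loss),
        "-d", str(duplication),
        "-s", str(similarity),
        "-p", str(process_type),
        "-t", str(threads),
    ]


concat_cmd = build_command(0.05, 6, process_type=1, threads=8)
print(shlex.join(concat_cmd))
```

The resulting list can be passed to `subprocess.run(concat_cmd, cwd=results_dir, check=True)`, where `results_dir` is your OrthoFinder results directory.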

Usage

OHDLF-pro.py -l [LOSS] -d [DUP] -s [SIM] -p [TYPE] -t [THREADS]


Commands:
  -l / --loss          Maximum allowed missing rate for a gene. This option is required.
  -d / --duplication   Maximum allowed duplication number for a gene. This option is required.
  -s / --similarity    Similarity threshold for a gene. Defaults to 97 if not set.
  -p / --process_type  Analysis type: 1 for Concatenation, 2 for Coalescence.
  -t / --threads       Number of threads to use.
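One way to read the loss and duplication thresholds is as a per-orthogroup filter. The sketch below rests on an assumed interpretation that is not confirmed by this documentation: the missing rate is taken as the fraction of species with no copy in the orthogroup, and the duplication number as the largest copy count for any single species. `passes_filter` is a hypothetical name, and the pipeline's real criteria may differ.

```python
def passes_filter(copies_per_species, max_loss, max_dup):
    """Decide whether one orthogroup passes the assumed thresholds.

    copies_per_species maps every species in the analysis to its
    gene-copy count in this orthogroup (0 means the gene is missing).
    """
    if not copies_per_species:
        raise ValueError("at least one species is required")
    counts = list(copies_per_species.values())
    # Assumed definition: share of species with no copy at all
    missing_rate = sum(1 for n in counts if n == 0) / len(counts)
    # Assumed definition: highest copy number in any one species
    most_copies = max(counts)
    return missing_rate <= max_loss and most_copies <= max_dup


orthogroup = {"sp1": 1, "sp2": 2, "sp3": 0, "sp4": 1}
print(passes_filter(orthogroup, max_loss=0.25, max_dup=6))
print(passes_filter(orthogroup, max_loss=0, max_dup=6))
```

Under this reading, setting -l to 0 (as in the Coalescence quick-start command) would keep only orthogroups present in every species.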

Input

Run the pipeline from inside the OrthoFinder output directory named 'Results_XXX' (use 'cd' to enter it first). The software depends on two subdirectories there: 'Orthogroup_Sequences' and 'Orthogroups'.
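A quick pre-flight check can confirm that the working directory has the expected layout. This is a minimal sketch that relies only on the two directory names stated above; `missing_input_dirs` is a hypothetical helper, not part of OHDLF-pro.

```python
from pathlib import Path

# The two subdirectories the pipeline is documented to depend on
REQUIRED_DIRS = ("Orthogroup_Sequences", "Orthogroups")


def missing_input_dirs(results_dir="."):
    """Return the required subdirectories that are absent from results_dir."""
    base = Path(results_dir)
    return [name for name in REQUIRED_DIRS if not (base / name).is_dir()]


missing = missing_input_dirs(".")
if missing:
    print("Missing in current directory:", ", ".join(missing))
else:
    print("Input directories found.")
```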

Output

Type 1 (Concatenation)

  • final_OrthologsAlign_GDL.phy: The concatenated alignment (phy file).

  • RAxML_bestTree.OHDLF_tree: The final Maximum Likelihood tree.
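A simple sanity check on the concatenated alignment is to read its dimensions. The sketch below assumes the file starts with the usual PHYLIP header line, which gives the number of sequences followed by the number of sites; `read_phylip_dimensions` is a hypothetical helper that uses only the standard library.

```python
def read_phylip_dimensions(path):
    """Return (number_of_sequences, number_of_sites) from a PHYLIP header."""
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # first non-blank line is the header
                break
        else:
            raise ValueError("file is empty")
    if len(fields) < 2:
        raise ValueError("header must contain two integers")
    return int(fields[0]), int(fields[1])
```

For example, `read_phylip_dimensions("final_OrthologsAlign_GDL.phy")` would return the taxon count and alignment length once a Type 1 run has finished.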

Type 2 (Coalescence with DISCO)

  • GDL_Orthologue_Sequences_iqtree: Individual gene trees inferred by IQ-TREE.

  • GDL_Orthologue_Sequences_DISCO: Decomposed gene trees processed by DISCO (Multi-copy -> Single-copy).

  • all_disco.trees: The combined input file for ASTRAL.

  • OHDLF_DISCO_ASTRAL.nwk: The final, robust species tree.
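The combined ASTRAL input is a plain text file with one Newick tree per line. The sketch below shows how per-gene tree files could be merged into such a file; it is an illustration rather than the pipeline's own code, `combine_newick_files` is a hypothetical name, and the file pattern is an assumption because the naming of the decomposed tree files is not documented here.

```python
from pathlib import Path


def combine_newick_files(tree_dir, out_path, pattern="*"):
    """Write every non-empty line from matching tree files into out_path.

    Returns the number of trees written. `pattern` is an assumed glob;
    adjust it to the real file names in the tree directory.
    """
    out_path = Path(out_path)
    count = 0
    with out_path.open("w") as out:
        for tree_file in sorted(Path(tree_dir).glob(pattern)):
            # Skip subdirectories and the output file itself
            if not tree_file.is_file() or tree_file.resolve() == out_path.resolve():
                continue
            for line in tree_file.read_text().splitlines():
                line = line.strip()
                if line:
                    out.write(line + "\n")
                    count += 1
    return count
```

Sorting the input files keeps the output deterministic, which makes repeated runs easier to compare.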

Download files

Source Distribution

ohdlf_pro-1.0.0.tar.gz (16.9 kB view details)

Uploaded Source

Built Distribution

ohdlf_pro-1.0.0-py3-none-any.whl (18.1 kB view details)

Uploaded Python 3

File details

Details for the file ohdlf_pro-1.0.0.tar.gz.

File metadata

  • Download URL: ohdlf_pro-1.0.0.tar.gz
  • Upload date:
  • Size: 16.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.7.16

File hashes

Hashes for ohdlf_pro-1.0.0.tar.gz
Algorithm Hash digest
SHA256 89dc135f60666b0564b25a07bcc27670d0a12f9d09f91d121d09209a9fa564ba
MD5 8448b41e2395af350af36308856f0cca
BLAKE2b-256 021f655c5867aa567a679c60abfbca5009fab60045df912535a5c6ab806a9c46

File details

Details for the file ohdlf_pro-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: ohdlf_pro-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.7.16

File hashes

Hashes for ohdlf_pro-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ce5964fe7a0388ab142612c668070682b092af6c2a5bc372d35623bd672e8272
MD5 e2c1835be364e8aac273e6fd93d9f647
BLAKE2b-256 bfaab0d76a8673783192bbca3f067d2be15eda33d36bfceb9ed877be9bbc60ae
