OHDLF-pro: a pipeline designed to filter and address orthologous gene heterogeneity, duplication, and loss

OHDLF-pro

Introduction

OHDLF-pro is the successor to the OHDLF pipeline, designed for high-performance phylogenomic analysis. Building on OrthoFinder output, it automates orthogroup screening, alignment, and phylogenetic tree construction, with improved speed and accuracy.

Compared to the standard OHDLF, the Pro version introduces critical upgrades:

  1. ⚡ Multi-threading Support: New parallel computing architecture (via the -t parameter) significantly accelerates BLAST filtering, alignment, and tree building steps.

  2. ✨ Visual Progress Tracking: Integrated dynamic animations (progress bars) provide real-time feedback on current tasks and estimated completion time.

  3. 🛡️ Robust Coalescence (Type 2): Now integrates DISCO (Decomposition Into Single-COpy gene trees) into the Coalescence workflow. DISCO decomposes multi-copy gene family trees into single-copy trees, so multi-copy gene families can be used rather than discarded. This is intended to yield more statistically reliable species trees than simple filtering.

Dependencies

  • Biopython
  • tqdm
  • treeswift

External Bioinformatics Tools (conda recommended)

conda install -c bioconda iqtree aster blast raxml mafft

Users can download the OHDLF.yaml file to directly configure the environment.

Install

pip install OHDLF-pro

Quick Start

Concatenation:
OHDLF-pro -l 0.05 -d 6 -s 97 -p 1 -t 8
Coalescence:
OHDLF-pro -l 0 -d 6 -s 97 -p 2 -t 8
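The two runs above can also be scripted from Python. The following is a minimal sketch, not part of OHDLF-pro itself: the helper names `build_ohdlf_command` and `run_ohdlf` and the `results_dir` argument are hypothetical, and it assumes the `OHDLF-pro` executable is on your PATH. Only the flags shown in the Quick Start are used.

```python
import subprocess


def build_ohdlf_command(loss, dup, sim=97, process_type=1, threads=1):
    """Build the argument list for one OHDLF-pro run (hypothetical helper)."""
    if process_type not in (1, 2):
        raise ValueError("process_type must be 1 (Concatenation) or 2 (Coalescence)")
    return [
        "OHDLF-pro",
        "-l", str(loss),
        "-d", str(dup),
        "-s", str(sim),
        "-p", str(process_type),
        "-t", str(threads),
    ]


def run_ohdlf(results_dir, **kwargs):
    """Run OHDLF-pro inside an OrthoFinder results directory.

    Assumes the OHDLF-pro executable is installed and on PATH.
    """
    cmd = build_ohdlf_command(**kwargs)
    # cwd= makes the tool run from within the OrthoFinder results directory.
    return subprocess.run(cmd, cwd=results_dir, check=True)
```

For example, `build_ohdlf_command(0.05, 6, 97, 1, 8)` reproduces the Concatenation command line above as a list of arguments.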

Usage

OHDLF-pro.py -l [LOSS] -d [DUP] -s [SIM] -p [TYPE] -t [THREADS]


Options:
  -l / --loss          Maximum allowed missing rate per gene. This option is required.
  -d / --duplication   Maximum allowed number of duplications per gene. This option is required.
  -s / --similarity    Similarity threshold for genes. If not set, the program uses '97' by default.
  -p / --process_type  Process type: 1 for Concatenation, 2 for Coalescence.
  -t / --threads       Number of threads to use.
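For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the list above. It is an illustration, not the actual OHDLF-pro source: the value types (`float`, `int`) and the defaults for `-p` and `-t` are assumptions, since the documentation states a default only for `-s`.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented options (not the real source)."""
    parser = argparse.ArgumentParser(prog="OHDLF-pro")
    parser.add_argument("-l", "--loss", type=float, required=True,
                        help="maximum allowed missing rate per gene (required)")
    parser.add_argument("-d", "--duplication", type=int, required=True,
                        help="maximum allowed number of duplications per gene (required)")
    parser.add_argument("-s", "--similarity", type=float, default=97,
                        help="similarity threshold (documented default: 97)")
    # Assumption: the default process type is not documented; 1 is used here.
    parser.add_argument("-p", "--process_type", type=int, choices=(1, 2), default=1,
                        help="1 for Concatenation, 2 for Coalescence")
    # Assumption: the default thread count is not documented; 1 is used here.
    parser.add_argument("-t", "--threads", type=int, default=1,
                        help="number of threads to use")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```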

Input

Before running, 'cd' into the OrthoFinder output directory named 'Results_XXX'. The software depends on two of its subdirectories: 'Orthogroup_Sequences' and 'Orthogroups'.
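A quick pre-flight check can confirm that you are in the right directory before starting a long run. The sketch below is not part of OHDLF-pro; the function name `missing_input_dirs` is hypothetical, and it only checks for the two directory names listed above.

```python
from pathlib import Path

# The two OrthoFinder subdirectories the documentation says are required.
REQUIRED_DIRS = ("Orthogroup_Sequences", "Orthogroups")


def missing_input_dirs(results_dir="."):
    """Return the required subdirectories that are absent from results_dir."""
    root = Path(results_dir)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_input_dirs()
    if missing:
        print("Missing required directories:", ", ".join(missing))
    else:
        print("Input directories found.")
```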

Output

Type 1 (Concatenation)

  • final_OrthologsAlign_GDL.phy: The concatenated alignment (phy file).

  • RAxML_bestTree.OHDLF_tree: The final Maximum Likelihood tree.

Type 2 (Coalescence with DISCO)

  • GDL_Orthologue_Sequences_iqtree: Individual gene trees inferred by IQ-TREE.

  • GDL_Orthologue_Sequences_DISCO: Decomposed gene trees processed by DISCO (Multi-copy -> Single-copy).

  • all_disco.trees: The combined input file for ASTRAL.

  • OHDLF_DISCO_ASTRAL.nwk: The final, robust species tree.
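ASTRAL reads its gene trees from a single file with one Newick tree per line, which is the role all_disco.trees plays here. The sketch below shows one way such a file could be assembled from a directory of per-gene tree files. It is an illustration only, not the pipeline's own code: the function name `combine_newick_files` and the `pattern` argument are hypothetical, and the file naming inside the DISCO output directory is not documented, so the default pattern matches every file.

```python
from pathlib import Path


def combine_newick_files(tree_dir, out_file, pattern="*"):
    """Write every non-empty line from the matched tree files into out_file.

    Returns the number of trees written. Assumes one Newick tree per line.
    """
    out_path = Path(out_file).resolve()
    count = 0
    with open(out_path, "w") as out:
        for path in sorted(Path(tree_dir).glob(pattern)):
            # Skip subdirectories and the output file itself.
            if not path.is_file() or path.resolve() == out_path:
                continue
            for line in path.read_text().splitlines():
                line = line.strip()
                if line:
                    out.write(line + "\n")
                    count += 1
    return count
```

Sorting the input paths keeps the combined file reproducible between runs. Tree order does not change the input's content, but a stable order makes outputs easier to compare.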
