Generate isolate-specific genome masks for Mycobacterium tuberculosis

mtbmasker

mtbmasker is a Python command-line tool that generates conservative, isolate-specific genome masks for Mycobacterium tuberculosis (MTB) genomes. The masks exclude problematic genomic regions (e.g., PE/PPE genes, IS elements, and other repetitive loci) from downstream variant calling and phylogenomic analyses.


✨ Features

  • Generates a genome mask per isolate by aligning a predefined set of repetitive genes against the isolate genome with BLASTn.
  • Supports custom isolate genome files and gene query sets.
  • Automatically formats coordinates to BED, sorts, and merges overlapping masked regions.
  • Outputs high-quality, isolate-specific .bed files for genome masking.
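
The coordinate-handling steps in the list above can be sketched in plain Python. This is an illustration only, not mtbmasker's own code: it assumes BLAST's default tabular output (-outfmt 6) with the isolate genome as the subject, and the function names blast_hits_to_bed and merge_intervals are hypothetical. The merge step mimics what a sort followed by a BEDTools merge would produce.

```python
from typing import Iterable, List, Tuple

# (contig, start, end) using the BED convention: 0-based start, exclusive end
Interval = Tuple[str, int, int]


def blast_hits_to_bed(lines: Iterable[str]) -> List[Interval]:
    """Convert BLAST tabular hits (assumed default -outfmt 6) to BED intervals.

    Columns 9 and 10 (sstart, send) are 1-based and inclusive; for hits on the
    minus strand sstart is larger than send, so the two are reordered.
    """
    intervals: List[Interval] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject = fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        low, high = min(sstart, send), max(sstart, send)
        intervals.append((subject, low - 1, high))
    return intervals


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Sort intervals and merge those that overlap or touch on the same contig."""
    merged: List[Interval] = []
    for contig, start, end in sorted(intervals):
        if merged and merged[-1][0] == contig and start <= merged[-1][2]:
            previous = merged[-1]
            merged[-1] = (contig, previous[1], max(previous[2], end))
        else:
            merged.append((contig, start, end))
    return merged
```

Two hits that overlap on the same contig, one of them on the minus strand, collapse into a single masked region once both functions have been applied.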

🧬 Use case

This tool was originally developed for comparative genomics and transmission studies of Mycobacterium tuberculosis complex (MTBC) isolates, including M. africanum. Because each mask is derived from the isolate's own genome rather than from a single reference, differences between lineages are respected during masking.


🔧 Installation

From GitHub (development version):

pip install git+https://github.com/EtienneNtumba/mtbmasker.git

🚀 Usage

mtbmasker mask input_list.tsv --query-fasta data/genes_to_mask.fasta

Arguments:

  • input_list.tsv: A tab-separated file with one isolate ID per line (without the .fasta extension). Each ID must correspond to an ID.fasta file present in the working directory.
  • --query-fasta: FASTA file of problematic/repetitive genes to be masked (default: data/genes_to_mask.fasta).

📁 Example

input_list.tsv:

ARR1960.LR.Asm
QC-9.LR.Asm
N1177.LR.Asm

Each listed isolate must have a corresponding FASTA file in the current directory: ARR1960.LR.Asm.fasta, QC-9.LR.Asm.fasta, and N1177.LR.Asm.fasta for the list above.
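
Before launching a run, it can help to confirm that every listed isolate has its FASTA file in place. The following is a minimal sketch of such a pre-flight check, not part of mtbmasker itself; the helper names read_isolate_ids and find_missing_fastas are hypothetical, and the only convention assumed is the ID.fasta naming described above.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def read_isolate_ids(list_path: PathLike) -> List[str]:
    """Read isolate IDs from the first tab-separated column, skipping blank lines."""
    ids: List[str] = []
    for line in Path(list_path).read_text().splitlines():
        line = line.strip()
        if line:
            ids.append(line.split("\t")[0])
    return ids


def find_missing_fastas(ids: List[str], workdir: PathLike = ".") -> List[str]:
    """Return the IDs that have no matching <ID>.fasta file in workdir."""
    return [i for i in ids if not (Path(workdir) / f"{i}.fasta").is_file()]
```

Calling find_missing_fastas(read_isolate_ids("input_list.tsv")) returns an empty list when all inputs are present.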


🔬 Requirements

  • Python ≥ 3.8
  • BLAST+
  • BEDTools
  • Typer

BLAST+ and BEDTools must both be installed and available on your $PATH, for example through an activated conda environment.
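
A quick way to verify the external dependencies is to look them up on the PATH from Python. This sketch assumes the executables are named blastn and bedtools, which are the usual names for BLAST+ and BEDTools installs; the helper missing_tools is hypothetical and not provided by mtbmasker.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("blastn", "bedtools")) -> List[str]:
    """Return the names of the executables that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools found.")
```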


📄 Output

For each isolate, the following file is generated:

<isolate>_conservitive_AF19-like_masking_file.bed

This BED file contains sorted and merged coordinates of masked regions.
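
As an illustration of how such a mask might be consumed downstream, the sketch below loads BED lines and tests whether a 1-based position (as used in VCF files) falls inside a masked region. It relies on the intervals being sorted and non-overlapping, as stated above; the function names load_mask, masked_length, and is_masked are hypothetical and not part of mtbmasker.

```python
from bisect import bisect_right
from typing import Dict, Iterable, List, Tuple

Mask = Dict[str, List[Tuple[int, int]]]


def load_mask(lines: Iterable[str]) -> Mask:
    """Parse BED lines into per-contig lists of (start, end), sorted by start."""
    mask: Mask = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        contig, start, end = line.rstrip("\n").split("\t")[:3]
        mask.setdefault(contig, []).append((int(start), int(end)))
    for intervals in mask.values():
        intervals.sort()
    return mask


def masked_length(mask: Mask) -> int:
    """Total number of masked bases, assuming intervals do not overlap."""
    return sum(end - start for ivs in mask.values() for start, end in ivs)


def is_masked(mask: Mask, contig: str, pos_1based: int) -> bool:
    """Check a 1-based position against 0-based, half-open BED intervals."""
    intervals = mask.get(contig, [])
    pos = pos_1based - 1
    # Index of the last interval whose start is <= pos, or -1 if there is none
    idx = bisect_right(intervals, (pos, float("inf"))) - 1
    return idx >= 0 and intervals[idx][0] <= pos < intervals[idx][1]
```

To use it, pass an open file handle for the generated .bed file to load_mask, then call is_masked for each variant position to decide whether that variant should be excluded.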


📚 Citation

If you use this tool in your research, please cite:

Ntumba, E., Whiley, D., et al. (2025). [Manuscript Title], Journal Name, DOI: xxx


📂 License

This tool is licensed under the MIT License.


👥 Authors

  • Etienne Ntumba Kabongo — Université de Montréal / McGill University
  • Dan Whiley — Nottingham University

Download files

Source Distribution

mtbmasker-0.1.0.tar.gz (3.8 kB)

Uploaded Source

Built Distribution

mtbmasker-0.1.0-py3-none-any.whl (4.7 kB)

Uploaded Python 3
