
An advanced version of TelomereHunter for Python 3, with new features


TelomereHunter2


TelomereHunter2 is a Python-based tool for estimating telomere content and analyzing telomeric variant repeats (TVRs) from genome sequencing data. It supports BAM/CRAM files, flexible telomere repeat and reference genome inputs, and provides outputs for bulk and single-cell genome sequencing data.


Release Notes

See RELEASE_NOTES.md for the latest changes and version history.


New Features

  • Fast, container-friendly Python 3 implementation
  • Parallelization and algorithmic optimizations for a substantial speedup
  • Supports BAM/CRAM, custom telomeric repeats, and now also non-human genomes
  • Static and interactive HTML reports (Plotly)
  • Docker and Apptainer/Singularity containers
  • Single cell sequencing support (e.g. scATAC-seq; barcode splitting and per-cell analysis)
  • Robust input handling and exception management
  • Fast mode for quick overview of unmapped reads

Installation

Classic setup:

pip install telomerehunter2

From source:

# With pip:
git clone https://github.com/ferdinand-popp/telomerehunter2.git
cd telomerehunter2
python -m venv venv
source venv/bin/activate
pip install -e . --no-cache-dir

# With uv:
git clone https://github.com/ferdinand-popp/telomerehunter2.git
cd telomerehunter2
uv pip install -e . --no-cache-dir

Container usage:
See Container Usage for Docker/Apptainer instructions.

Operating systems:
Currently tested on Linux and macOS. Windows support via WSL2 and Docker has not been fully tested (work in progress; check GitHub Issues).

Usage: Bulk vs. Single-Cell Analysis

Bulk Analysis

telomerehunter2 -ibt TUMOR_FILE -ibc CONTROL_FILE -o OUTPUT_DIRECTORY -p ID_OF_SAMPLE -b BANDING_FILE [options]
  • Single sample:
    telomerehunter2 -ibt sample.bam -o results/ -p SampleID -b telomerehunter2/cytoband_files/hg19_cytoBand.txt
  • Tumor vs Control:
    telomerehunter2 -ibt sample.bam -ibc control.bam -o results/ -p PairID -b telomerehunter2/cytoband_files/hg19_cytoBand.txt
  • Custom repeats/species:
    telomerehunter2 ... --repeats TTTAGGG TTAAGGG --repeatsContext TTAAGGG
  • Fast mode (quick overview of unmapped reads, producing an overview summary):
    telomerehunter2 -ibt sample.bam -o results/ -p SampleID --fast_mode
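When many samples need to be processed, the invocations above can be assembled programmatically. The following is a minimal sketch, not part of TelomereHunter2 itself: the helper name build_bulk_command and its parameters are hypothetical, and it only uses flags shown above (-ibt, -ibc, -o, -p, -b, --fast_mode).

```python
import shlex
from typing import List, Optional


def build_bulk_command(
    tumor_bam: str,
    output_dir: str,
    sample_id: str,
    banding_file: Optional[str] = None,
    control_bam: Optional[str] = None,
    fast_mode: bool = False,
) -> List[str]:
    """Assemble a telomerehunter2 argument list (hypothetical helper)."""
    cmd = ["telomerehunter2", "-ibt", tumor_bam, "-o", output_dir, "-p", sample_id]
    if control_bam is not None:
        # Tumor vs. control run
        cmd += ["-ibc", control_bam]
    if banding_file is not None:
        cmd += ["-b", banding_file]
    if fast_mode:
        cmd.append("--fast_mode")
    return cmd


cmd = build_bulk_command(
    "sample.bam",
    "results/",
    "PairID",
    banding_file="telomerehunter2/cytoband_files/hg19_cytoBand.txt",
    control_bam="control.bam",
)
# Print the command as a shell-safe string for inspection.
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems with file paths that contain spaces.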

Single-Cell Sequencing Analysis

TelomereHunter2 now supports direct single-cell BAM analysis (with CB barcode tag). Simply run:

telomerehunter2_sc -ibt sample.bam -o results/ -p SampleID -b telomerehunter2/cytoband_files/cytoband.txt --min-reads-per-barcode 10000

This performs barcode-aware telomere analysis and writes per-cell results to a summary file. Set the minimum number of reads per barcode with --min-reads-per-barcode. To rerun postprocessing with a different --min-reads-per-barcode threshold, run the command again with --noFiltering, which skips the expensive step of filtering all reads down to telomeric reads. If the reads carry a barcode tag other than CB, set the correct one with --barcodeTag. For more information on correcting for chromatin state in scATAC-seq data, see Engel et al. (2024).
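To illustrate what a minimum-reads-per-barcode threshold does, here is a minimal sketch that counts reads per barcode and keeps only barcodes at or above the threshold. It is not the tool's implementation: the function names are hypothetical, the barcodes are toy values, and whether the real threshold is inclusive is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_reads_per_barcode(barcodes: Iterable[Optional[str]]) -> Counter:
    """Count reads per barcode, ignoring reads without a barcode."""
    return Counter(bc for bc in barcodes if bc)


def barcodes_passing(counts: Counter, min_reads: int) -> Dict[str, int]:
    """Keep barcodes with at least min_reads reads (inclusive: an assumption)."""
    return {bc: n for bc, n in counts.items() if n >= min_reads}


# Toy input: one barcode value per read, None for reads lacking the tag.
# With pysam, such values could be collected per read via
# read.get_tag("CB") after checking read.has_tag("CB").
toy_barcodes = ["AAAC-1", "AAAC-1", "AAAC-1", "TTTG-1", None]

counts = count_reads_per_barcode(toy_barcodes)
kept = barcodes_passing(counts, min_reads=2)
print(kept)
```

In a real run the threshold would be much larger, such as the 10000 used in the command above.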

See tests/test_telomerehunter2_sc.py for example usage and validation.

Usage: Full List of Options

telomerehunter2 --help

Input & Output

Input:

  • BAM/CRAM files (aligned reads, <-ibt> for tumor, <-ibc> for control)
  • Cytoband file (tab-delimited, e.g. telomerehunter2/cytoband_files/hg19_cytoBand.txt, <-b>)
  • Identifier for sample/pair (<-p>)
  • Optional: custom telomeric repeats

Output:

  • summary.tsv, TVR_top_contexts.tsv, singletons.tsv
  • Plots (plots/ directory, PNG/HTML)
  • Logs (run status/errors)
  • For sc-seq: in addition to the complete bulk run, per-cell results are written to sc_summary.tsv, and barcode_counts.tsv lists read counts per barcode
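The summary file can be loaded for downstream analysis with the standard library. This is a minimal sketch that assumes the file is tab-separated with a header row, as the .tsv extension suggests; the in-memory example content is illustrative, and the location of summary.tsv inside the output directory is not specified here.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_summary(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a tab-separated summary file into one dict per row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Illustrative stand-in for a real file; with a real run, use
# open(path_to_summary, newline="") instead of io.StringIO.
example = "PID\tsample\ttel_content\nTEST_PATIENT\ttumor\t1.8\n"

rows = read_summary(io.StringIO(example))
for row in rows:
    # All values are read as strings; convert numeric columns explicitly.
    print(row["PID"], row["sample"], float(row["tel_content"]))
```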

Explanation of summary.tsv

| Column | Value example | Description |
|---|---|---|
| PID | TEST_PATIENT | Sample name |
| sample | tumor | Sample classification (tumor (single), control, log2(t/c)) |
| tel_content | 1.8 | Intratelomeric reads / reads in GC correction range * 1e6 |
| total_reads | 120 | Number of reads in the input file |
| read_lengths | 25,36,42,54 | Unique lengths of reads |
| repeat_threshold_set | 6 per 100 bp | Telomeric repeat threshold set |
| repeat_threshold_used | 4 | Repeat threshold applied based on avg. read length |
| intratelomeric_reads | 4 | Filtered telomeric reads in unmapped reads |
| junctionspanning_reads | 0 | Filtered telomeric reads spanning junctions into first/last band |
| subtelomeric_reads | 6 | Filtered telomeric reads in subtelomeric regions (first/last band) |
| intrachromosomal_reads | 0 | Filtered telomeric reads in intrachromosomal regions |
| tel_read_count | 10 | Total telomeric reads identified |
| gc_bins_for_correction | 48-52 | GC content range used for normalization of reads |
| total_reads_with_tel_gc | 8 | Total reads within GC bin for normalization |
| TCAGGG_arbitrary_context_norm_by_intratel_reads | 1.5 | Telomeric variant repeat count normalized by intratelomeric reads |
| ... | ... | ... |
| TCAGGG_singletons_norm_by_all_reads | 0.0 | Singleton (TVR flanked by canonicals) count normalized by all reads |
| ... | ... | ... |
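The tel_content definition can be written out as a short calculation. The sketch below is illustrative only and not the tool's code: the function names are hypothetical, the inclusive 48-52 % bounds are an assumption, and the read counts are made-up numbers chosen to be easy to follow.

```python
def gc_percent(seq: str) -> float:
    """Return the GC content of a sequence in percent."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def in_gc_range(seq: str, low: float = 48.0, high: float = 52.0) -> bool:
    """Check whether a read falls in the GC correction range (bounds assumed inclusive)."""
    return low <= gc_percent(seq) <= high


def tel_content(intratelomeric_reads: int, reads_in_gc_range: int) -> float:
    """Intratelomeric reads per million reads in the GC correction range."""
    if reads_in_gc_range <= 0:
        raise ValueError("reads_in_gc_range must be positive")
    return intratelomeric_reads / reads_in_gc_range * 1e6


# A canonical telomeric repeat has 3 of 6 bases G or C.
print(gc_percent("TTAGGG"))
print(in_gc_range("TTAGGG" * 4))

# Made-up counts: 180 intratelomeric reads, 100 million reads in the GC range.
print(round(tel_content(180, 100_000_000), 3))
```

Normalizing by reads with telomere-like GC content rather than by all reads is what the gc_bins_for_correction and total_reads_with_tel_gc columns record.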

Dependencies

  • Python >=3.6
  • pysam, numpy, pandas, plotly, PyPDF2
  • For static image export: kaleido (requires chrome/chromium)
  • Docker/Apptainer (optional)

Install all dependencies:

pip install -r requirements.txt

Container Usage

Docker (recommended):

Build locally:

docker build -t telomerehunter2 .
docker run --rm -it -v /data:/data telomerehunter2 telomerehunter2 -ibt /data/sample.bam -o /data/results -p SampleID -b /data/hg19_cytoBand.txt

Pull from Docker Hub:

docker pull fpopp22/telomerehunter2

Run from Docker Hub:

docker run --rm -it -v /data:/data fpopp22/telomerehunter2 telomerehunter2 -ibt /data/sample.bam -o /data/results -p SampleID -b /data/hg19_cytoBand.txt

Apptainer/Singularity:

Build locally:

apptainer build telomerehunter2.sif Apptainer_TH2.def
# Bind-mount your data directory if it is not available in the container (e.g. --bind /data:/data)
apptainer run telomerehunter2.sif telomerehunter2 -ibt /data/sample.bam -o /data/results -p SampleID -b /data/hg19_cytoBand.txt

Pull from Docker Hub (as Apptainer image):

apptainer pull docker://fpopp22/telomerehunter2:latest
apptainer run telomerehunter2_latest.sif telomerehunter2 ...

Troubleshooting

  • Memory errors: Use more RAM or limit cores used with -c flag.
  • Missing dependencies: Check requirements.txt.
  • Banding file missing: Provide a reference genome banding file with -b; otherwise the analysis runs without assigning reads to subtelomeres.
  • Plotting: Try disabling plots with --plotNone. A plotting-only mode is also available; see telomerehunter2 --help for the flag.
  • Minor changes compared to TH1: TVR normalization per 100 bp is skipped, detection of GXXGGG TVRs is improved, read lengths are estimated from the first 1000 reads, and TRPM was added.

For help: GitHub Issues or our FAQ.

Documentation & Resources

Citation

If you use TelomereHunter2, please cite:

  • Feuerbach, L., et al. "TelomereHunter – in silico estimation of telomere content and composition from cancer genomes." BMC Bioinformatics 20, 272 (2019). https://doi.org/10.1186/s12859-019-2851-0
  • Application Note for TH2 (in preparation).

Contributing

Fork, branch, and submit pull requests. Please add tests and follow code style. For major changes, open an issue first. Before submitting, please install the tox package and run the following checks:

  1. Run Unit Tests and Style Checks:
    tox
    

License

GNU General Public License v3.0. See LICENSE.

Contact

Acknowledgements

Developed by Ferdinand Popp, Lina Sieverling, Philip Ginsbach, and Lars Feuerbach. Supported by the German Cancer Research Center (DKFZ), Division of Applied Bioinformatics.


Copyright 2025 Ferdinand Popp, Lina Sieverling, Philip Ginsbach, Lars Feuerbach
