Python module to manipulate minimap2's CS tag
cstag
cstag is a Python library for manipulating and handling minimap2's CS tags.
🌟 Features
cstag.call(): Generate a CS tag
cstag.shorten(): Convert a CS tag from its long to short format
cstag.lengthen(): Convert a CS tag from its short to long format
cstag.consensus(): Create a consensus CS tag from multiple CS tags
cstag.mask(): Mask low-quality bases within a CS tag
cstag.split(): Break down a CS tag into its constituent parts
cstag.revcomp(): Convert a CS tag to its reverse complement
cstag.to_sequence(): Reconstruct a reference subsequence from the alignment
cstag.to_vcf(): Generate a VCF representation
cstag.to_html(): Generate an HTML representation
cstag.to_pdf(): Produce a PDF file
For comprehensive documentation, please visit our docs.
To add CS tags to SAM/BAM files, check out cstag-cli.
🛠 Installation
Using PyPI:
pip install cstag
Using Bioconda:
conda install -c bioconda cstag
💡 Usage
Generating CS Tags
import cstag
cigar = "8M2D4M2I3N1M"
md = "2A5^AG7"
seq = "ACGTACGTACGTACG"
print(cstag.call(cigar, md, seq))
# :2*ag:5-ag:4+ac~nn3nn:1
print(cstag.call(cigar, md, seq, long=True))
# =AC*ag=TACGT-ag=ACGT+ac~nn3nn=G
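The three inputs must agree with each other: the CIGAR string determines how many read bases are consumed, and that count should equal the length of seq. The sketch below checks this for the example above. It uses only the standard library; query_length_from_cigar is a hypothetical helper for illustration, not part of the cstag API.

```python
import re

# Hypothetical helper for illustration; not part of the cstag API.
def query_length_from_cigar(cigar: str) -> int:
    """Return the number of read bases a CIGAR string consumes."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    # M, I, S, = and X consume read bases; D, N, H and P do not.
    return sum(int(length) for length, op in ops if op in "MIS=X")

cigar = "8M2D4M2I3N1M"
seq = "ACGTACGTACGTACG"
print(query_length_from_cigar(cigar) == len(seq))
# True
```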
Shortening or Lengthening CS Tags
import cstag
# Convert a CS tag from long to short
cs_tag = "=ACGT*ag=CGT"
print(cstag.shorten(cs_tag))
# :4*ag:3
# Convert a CS tag from short to long
cs_tag = ":4*ag:3"
cigar = "8M"
seq = "ACGTACGT"
print(cstag.lengthen(cs_tag, cigar, seq))
# =ACGT*ag=CGT
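The short format keeps only the length of each matched run, which is why lengthening needs the CIGAR string and the read sequence to recover the matched bases, while shortening needs nothing but the tag. As a rough illustration of the long-to-short direction, here is a standard-library sketch; shorten_sketch is a hypothetical name and this is not necessarily how cstag.shorten is implemented.

```python
import re

# Hypothetical helper for illustration; not the cstag implementation.
def shorten_sketch(cs_tag_long: str) -> str:
    """Replace every '=BASES' run with ':<number of bases>'."""
    return re.sub(
        r"=([ACGTN]+)",
        lambda match: f":{len(match.group(1))}",
        cs_tag_long,
    )

print(shorten_sketch("=ACGT*ag=CGT"))
# :4*ag:3
```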
Creating a Consensus
import cstag
cs_tags = ["=ACGT", "=AC*gt=T", "=C*gt=T", "=C*gt=T", "=ACT+ccc=T"]
positions = [1, 1, 2, 2, 1]
print(cstag.consensus(cs_tags, positions))
# =AC*gt=T
Masking Low-Quality Bases
import cstag
cs_tag = "=ACGT*ac+gg-cc=T"
cigar = "5M2I2D1M"
qual = "AA!!!!AA"
phred_threshold = 10
print(cstag.mask(cs_tag, cigar, qual, phred_threshold))
# =ACNN*an+ng-cc=T
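To see why those particular bases are masked, it helps to decode the quality string. The sketch below assumes the usual Phred+33 encoding of SAM/FASTQ quality strings and assumes that a base is masked when its score is below the threshold; the exact comparison used by cstag.mask is not confirmed here. phred_scores is a hypothetical helper.

```python
# Hypothetical helper for illustration; assumes Phred+33 encoding.
def phred_scores(qual: str, offset: int = 33) -> list[int]:
    """Convert an ASCII quality string to a list of Phred scores."""
    return [ord(char) - offset for char in qual]

qual = "AA!!!!AA"
phred_threshold = 10

scores = phred_scores(qual)
print(scores)
# [32, 32, 0, 0, 0, 0, 32, 32]

# True marks read bases that fall below the threshold.
print([score < phred_threshold for score in scores])
# [False, False, True, True, True, True, False, False]
```

The four low-quality positions line up with the G and T of the first match, the query base of the substitution, and the first inserted base, which are the bases shown as N or n in the masked tag.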
Splitting a CS Tag
import cstag
cs_tag = "=ACGT*ac+gg-cc=T"
print(cstag.split(cs_tag))
# ['=ACGT', '*ac', '+gg', '-cc', '=T']
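Each element of the result is one CS operation: a match (=), a substitution (*), an insertion (+), or a deletion (-). A minimal standard-library sketch of the same tokenization, assuming the operator grammar described in the minimap2 manual, might look like this; split_sketch is a hypothetical name and not the library's code.

```python
import re

# One alternative per CS operation: long match, short match,
# substitution, insertion, deletion, and intron (splice).
CS_OPERATION = re.compile(
    r"(=[ACGTN]+|:[0-9]+|\*[acgtn][acgtn]|\+[acgtn]+|-[acgtn]+"
    r"|~[acgtn]{2}[0-9]+[acgtn]{2})"
)

# Hypothetical helper for illustration; not the cstag implementation.
def split_sketch(cs_tag: str) -> list[str]:
    """Split a CS tag into its individual operations."""
    return CS_OPERATION.findall(cs_tag)

print(split_sketch("=ACGT*ac+gg-cc=T"))
# ['=ACGT', '*ac', '+gg', '-cc', '=T']
```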
Reverse Complement of a CS Tag
import cstag
cs_tag = "=ACGT*ac+gg-cc=T"
print(cstag.revcomp(cs_tag))
# =A-gg+cc*tg=ACGT
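The output above can be reproduced by reversing the order of the operations and complementing the bases inside each one. The sketch below is a simplified standard-library illustration that handles only long-format matches, substitutions, insertions, and deletions (no : or ~ operations); revcomp_sketch is a hypothetical name, and the real cstag.revcomp may work differently.

```python
import re

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Hypothetical helper for illustration; not the cstag implementation.
def revcomp_sketch(cs_tag: str) -> str:
    """Reverse-complement a long-format CS tag (=, *, +, - only)."""
    operations = re.findall(r"=[ACGTN]+|\*[acgtn]{2}|[+-][acgtn]+", cs_tag)
    result = []
    for operation in reversed(operations):
        symbol, bases = operation[0], operation[1:]
        complemented = bases.translate(COMPLEMENT)
        if symbol == "*":
            # Keep the reference-then-query order of a substitution.
            result.append(symbol + complemented)
        else:
            # Runs of bases are reversed as well as complemented.
            result.append(symbol + complemented[::-1])
    return "".join(result)

print(revcomp_sketch("=ACGT*ac+gg-cc=T"))
# =A-gg+cc*tg=ACGT
```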
Reconstructing the Reference Subsequence
import cstag
cs_tag = "=AC*gt=T-gg=C+tt=A"
print(cstag.to_sequence(cs_tag))
# ACTTCTTA
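Reading the example output against the tag: matched bases are copied, a substitution contributes its second base, inserted bases are included, and deleted bases are left out. The following standard-library sketch applies exactly those rules to a long-format tag and reproduces the output above. to_sequence_sketch is a hypothetical name, and the rules are inferred from this one example rather than taken from the cstag source.

```python
import re

# Hypothetical helper for illustration; not the cstag implementation.
def to_sequence_sketch(cs_tag: str) -> str:
    """Rebuild a sequence from a long-format CS tag (=, *, +, - only)."""
    operations = re.findall(r"=[ACGTN]+|\*[acgtn]{2}|[+-][acgtn]+", cs_tag)
    sequence = []
    for operation in operations:
        symbol, bases = operation[0], operation[1:]
        if symbol == "=":
            sequence.append(bases)             # matched bases
        elif symbol == "*":
            sequence.append(bases[1].upper())  # second base of the pair
        elif symbol == "+":
            sequence.append(bases.upper())     # inserted bases
        # "-" operations (deletions) contribute nothing.
    return "".join(sequence)

print(to_sequence_sketch("=AC*gt=T-gg=C+tt=A"))
# ACTTCTTA
```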
Generating a VCF Report
import cstag
cs_tag = "=AC*gt=T-gg=C+tt=A"
chrom = "chr1"
pos = 1
print(cstag.to_vcf(cs_tag, chrom, pos))
"""
##fileformat=VCFv4.2
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 3 . G T . . .
chr1 4 . TGG T . . .
chr1 5 . C CTT . . .
"""
Passing multiple CS tags, with their chromosomes and start positions, enables reporting of the variant allele frequency (VAF).
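VAF is the alt-allele depth divided by the total depth (AD/DP). In the chr1 record of the example that follows, three reads cover position 4: one carries the reference G and two carry the G-to-T substitution, giving 2/3. The arithmetic can be sketched as below, assuming DP = RD + AD and rounding to three decimals as the example output suggests; vaf is a hypothetical helper, not a cstag function.

```python
# Hypothetical helper for illustration; assumes DP = RD + AD.
def vaf(ref_depth: int, alt_depth: int) -> float:
    """Return the variant allele frequency rounded to three decimals."""
    total_depth = ref_depth + alt_depth
    return round(alt_depth / total_depth, 3)

print(vaf(ref_depth=1, alt_depth=2))
# 0.667
print(vaf(ref_depth=0, alt_depth=1))
# 1.0
```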
import cstag
cs_tags = ["=ACGT", "=AC*gt=T", "=C*gt=T", "=ACGT", "=AC*gt=T"]
chroms = ["chr1", "chr1", "chr1", "chr2", "chr2"]
positions = [2, 2, 3, 10, 100]
print(cstag.to_vcf(cs_tags, chroms, positions))
"""
##fileformat=VCFv4.2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=RD,Number=1,Type=Integer,Description="Depth of Ref allele">
##INFO=<ID=AD,Number=1,Type=Integer,Description="Depth of Alt allele">
##INFO=<ID=VAF,Number=1,Type=Float,Description="Variant allele frequency (AD/DP)">
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 4 . G T . . DP=3;RD=1;AD=2;VAF=0.667
chr2 102 . G T . . DP=1;RD=0;AD=1;VAF=1.0
"""
Generating an HTML Report
import cstag
from pathlib import Path
cs_tag = "=AC+ggg=T-acgt*at~gt10ag=GNNN"
description = "Example"
cs_tag_html = cstag.to_html(cs_tag, description)
Path("report.html").write_text(cs_tag_html)
# Output "report.html"
You can visualize the mutations indicated by the CS tag by opening the generated report.html file in a browser.
Generating a PDF Report
import cstag
cs_tag = "=AC+ggg=T-acgt*at~gt10ag=GNNN"
description = "Example"
path_out = "report.pdf"
cstag.to_pdf(cs_tag, description, path_out)
# Output "report.pdf"
cstag.to_pdf produces the same visualization as cstag.to_html, saved as a PDF file.
📣 Feedback and Support
For questions, bug reports, or other forms of feedback, I'd love to hear from you!
Please use GitHub Issues for all reporting purposes.
🤝 Code of Conduct
Please note that this project is released with a Contributor Code of Conduct.
By participating in this project you agree to abide by its terms.