
Fast detection of recombinant reads in BAMs


Readcomb - fast detection of recombinant reads in BAMs

What is it?

Readcomb is a set of command line scripts and a Python module for filtering and classifying reads in BAM files based on their phase change properties.

Installation

pip install readcomb

Dependencies

  • cyvcf2 - Fast retrieval and filtering of VCF records (a Cython wrapper around htslib)
  • pysam - Interface for reading SAM and BAM files; provides SAM and BAM objects
  • pandas - Support for data tables
  • tqdm - Provides updating progress bars for command line programs
  • samtools
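
A quick way to confirm that these dependencies are visible before running anything is sketched below. It assumes the import names match the package names listed above and that samtools is expected on the PATH as a command line program; the helper name `missing_dependencies` is made up for this example.

```python
import importlib.util
import shutil

# Import names assumed to match the package names listed above.
PYTHON_DEPS = ["cyvcf2", "pysam", "pandas", "tqdm"]
# samtools is a command line program, so it is looked up on PATH instead.
EXTERNAL_TOOLS = ["samtools"]


def missing_dependencies(modules=PYTHON_DEPS, tools=EXTERNAL_TOOLS):
    """Return the names of modules and tools that cannot be found."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [t for t in tools if shutil.which(t) is None]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the names resolve; it says nothing about whether the installed versions are ones readcomb supports.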

Usage:

bamprep

Command line preprocessing script for BAM files


Optional parameters:

vcfprep

Command line preprocessing script for VCF files

readcomb-vcfprep --vcf [vcf_filepath] --out [output_filepath]

Optional arguments

  • --snps_only, Keep only SNPs
  • --indels_only, Keep only indels
  • --no_hets, Remove heterozygote calls
  • --min_GQ [quality], Minimum genotype quality at both sites (default is 30)
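
When vcfprep is driven from a pipeline rather than typed by hand, the call can be assembled in Python. The sketch below uses only the flags documented above; the helper names `vcfprep_command` and `run_vcfprep` and the file names are hypothetical, and actually running the command requires readcomb to be installed.

```python
import subprocess


def vcfprep_command(vcf, out, snps_only=False, indels_only=False,
                    no_hets=False, min_gq=None):
    """Build the argument list for readcomb-vcfprep (hypothetical helper)."""
    cmd = ["readcomb-vcfprep", "--vcf", vcf, "--out", out]
    if snps_only:
        cmd.append("--snps_only")
    if indels_only:
        cmd.append("--indels_only")
    if no_hets:
        cmd.append("--no_hets")
    if min_gq is not None:
        # Omitting the flag leaves the tool's documented default (30) in place.
        cmd += ["--min_GQ", str(min_gq)]
    return cmd


def run_vcfprep(*args, **kwargs):
    """Run readcomb-vcfprep and raise if it exits with an error."""
    return subprocess.run(vcfprep_command(*args, **kwargs), check=True)


# File names are placeholders.
cmd = vcfprep_command("parents.vcf.gz", "parents.prepped.vcf.gz",
                      snps_only=True, no_hets=True, min_gq=30)
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file paths that contain spaces.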

filter

Command line multiprocessing script for identifying BAM reads with phase changes

readcomb-filter --bam [bam_filepath] --vcf [vcf_filepath]

Optional arguments:

  • -p, --processes [processes], Number of processes available for filter (default is 4)
  • -m, --mode [phase_change|no_match], Filtering mode (default phase_change)
  • -l, --log [log_filepath], Filename for log metric output
  • -o, --out [output_filepath], File to write filtered output to (default recomb_diagnosis)
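
What counts as a phase change is not spelled out here, so the following is only a toy illustration of the general idea and not readcomb's algorithm. It assumes each variant site covered by a read pair has already been labelled with the parental haplotype the read matches ("1" or "2"), or "N" where it matches neither. The function names and labels are invented for this sketch, and the link to the two documented modes is a guess based on their names alone.

```python
def phase_switches(site_labels):
    """Count switches between parental haplotypes along a read pair.

    Sites labelled 'N' (no match to either parent) are ignored.
    """
    informative = [s for s in site_labels if s in ("1", "2")]
    return sum(a != b for a, b in zip(informative, informative[1:]))


def has_phase_change(site_labels):
    """True if the read pair matches both parents at different sites."""
    return phase_switches(site_labels) > 0


def has_no_match(site_labels):
    """True if any site matches neither parent."""
    return "N" in site_labels


# Toy inputs: one character per variant site, in read order.
for labels in ("1111", "1112222", "1121111", "11N22"):
    print(labels, phase_switches(labels), has_no_match(labels))
```

In this toy model a single switch looks like a crossover, while two switches close together (as in "1121111") resemble the short tracts that the classification step reports as gene conversion.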

classification

Python module for detailed classification of sequences containing phase changes

from readcomb.classification import rc

# generate list of bam read pairs
pairs = rc.pairs_creation(bam_filepath, vcf_filepath)

# call each of the pairs to analyse and classify them
# map is lazy in Python 3, so consume it with list() to actually run the calls
list(map(lambda pair: pair.call(), pairs))

# or use a for loop
for pair in pairs:
    pair.call()

# get classification of first read pair
pairs[0].classify
# > gene_conversion
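
Once every pair has been called, a common next step is to summarise the classifications. The sketch below assumes, as in the snippet above, that each pair object has a `call()` method and a `classify` attribute holding a hashable label afterwards. `StubPair` is a hypothetical stand-in so the example runs without readcomb or any input files, and `placeholder_label` is not a real readcomb classification.

```python
from collections import Counter


def tally_classifications(pairs):
    """Call every pair, then count how often each classification occurs."""
    for pair in pairs:
        pair.call()
    return Counter(pair.classify for pair in pairs)


class StubPair:
    """Hypothetical stand-in for a readcomb pair object (demonstration only)."""

    def __init__(self, label):
        self._label = label
        self.classify = None  # filled in by call(), mirroring the usage above

    def call(self):
        self.classify = self._label


demo = [StubPair("gene_conversion"), StubPair("gene_conversion"),
        StubPair("placeholder_label")]
counts = tally_classifications(demo)
print(counts.most_common())
```

With real data, the list returned by `rc.pairs_creation(bam_filepath, vcf_filepath)` would be passed in place of `demo`.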

License

GNU General Public License v3 or later (GPLv3+)

Development

Currently in alpha

Source code

Development repo
