
Working with paired reads in BAM format


888888b. 88888888888
888 "88b 888
888 .88P 888
8888888K. 8888b. 88888b.d88b. 888 888 888 88888b. .d88b. 888d888
888 "Y88b "88b 888 "888 "88b 888 888 888 888 "88b d8P Y8b 888P"
888 888 .d888888 888 888 888 888 888 888 888 888 88888888 888
888 d88P 888 888 888 888 888 888 Y88b 888 888 d88P Y8b. 888
8888888P" "Y888888 888 888 888 888 "Y88888 88888P" "Y8888 888
888 888
Y8b d88P 888
"Y88P" 888


# Overview

BamTyper works out the orientation and insert size of the paired reads in a BAM file.
It can also estimate the relative orientation of, and the gap between, pairs of contigs in the BAM file, which is useful for scaffolding.

# Installation

Use pip:

```sh
$ pip install BamTyper
```
# Usage
## As a library:
### Get the type of the reads:

    #!/usr/bin/env python
    from bamtyper.utilities import BamParser as BTBP
    # example filenames; replace with your own BAM files
    bamFiles = ["bamfile1.bam", "bamfile2.bam"]
    BP = BTBP()
    bam_types = BP.getTypes(bamFiles)

    Where:
        bamFiles - a list of BAM filenames

    Returns:
        bam_types - a dict containing information about the insert size and
                    relative orientation of reads in each BAM file
                    { bam1 : (type, ins, stdev), ... }
        Where:
            type - orientation type:
                   0 : OUT  : <--- --->
                   1 : SAME : ---> --->
                   2 : IN   : ---> <---
            ins - estimated insert size (of original DNA fragment)
            stdev - standard deviation of insert size
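
As a minimal sketch of how the returned dict might be consumed, the snippet below maps the numeric orientation codes listed above to their names and formats one summary line per BAM file. The helper `describe_types` and the hand-written sample dict are hypothetical and not part of bamtyper; the snippet does not call the library, so it runs without any BAM files.

```python
# Orientation codes as documented for the dict returned by getTypes()
ORIENTATIONS = {0: "OUT", 1: "SAME", 2: "IN"}


def describe_types(bam_types):
    """Return one human-readable line per BAM file (hypothetical helper)."""
    lines = []
    for bam in sorted(bam_types):
        otype, ins, stdev = bam_types[bam]
        # Fall back to UNKNOWN for any code not listed in the documentation
        name = ORIENTATIONS.get(otype, "UNKNOWN")
        lines.append("{}: orientation={} insert={:.0f} stdev={:.0f}".format(
            bam, name, ins, stdev))
    return lines


# Hand-written stand-in for the dict returned by BP.getTypes(bamFiles)
example_types = {"bamfile": (2, 301, 29)}
for line in describe_types(example_types):
    print(line)
```
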

### Get linking pairs:

    #!/usr/bin/env python
    from bamtyper.utilities import BamParser as BTBP
    # example filenames; replace with your own BAM files
    bamFiles = ["bamfile1.bam", "bamfile2.bam"]
    BP = BTBP()
    (links, ref_lengths, coverages) = BP.getLinks(bamFiles, doCoverage=True)

    Where:
        bamFiles - a list of BAM filenames

    Returns:
        links - a dict containing information about links between two contigs
                {c1: (c2, num_links, link_type, gap), ... }
        Where:
            num_links - number of paired reads confirming the link
            link_type - relative orientation of the two contigs (Start and End)
                        SS, SE, ES, EE or ERROR
            gap - estimated gap between the two contigs

        coverages - a dict containing the FRAGMENT coverage of each contig in the BAM file(s)
                    {c1 : (cov1, cov2, ..., covN), ... }

        ref_lengths - a dict containing the lengths of all contigs
                      {c1 : len, ... }

    Notes:
        bamtyper will automatically work out the orientation and insert size of the
        reads in each BAM file and base its estimates of link_type and gap on this
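
The following sketch shows one way the `links` dict could be filtered before scaffolding. It rests on assumptions: the description above shows a single `(c2, num_links, link_type, gap)` record per contig, while the command-line output lists several links for one contig, so the code accepts either a single record or a list of records. The helpers `iter_links` and `well_supported`, the `min_links` threshold and the sample data are hypothetical and not part of bamtyper.

```python
def iter_links(links):
    """Yield (contig, partner, num_links, link_type, gap) for every link.

    Assumption: each value is either one 4-item record or a list of them.
    """
    for contig in sorted(links):
        value = links[contig]
        # A single record starts with the partner contig's name (a string)
        if len(value) > 0 and isinstance(value[0], str):
            records = [value]
        else:
            records = value
        for partner, num_links, link_type, gap in records:
            yield (contig, partner, num_links, link_type, gap)


def well_supported(links, min_links=3):
    """Keep links confirmed by at least min_links pairs, dropping ERROR links."""
    return [rec for rec in iter_links(links)
            if rec[2] >= min_links and rec[3] != "ERROR"]


# Hand-written stand-in for the dict returned by BP.getLinks(bamFiles)
example_links = {
    "contig1": ("contig2", 4, "EE", 45),
    "contig2": [("contig3", 3, "SS", 58), ("contig1", 4, "EE", 45)],
    "contig3": ("contig2", 3, "SS", 58),
}
for rec in well_supported(example_links, min_links=4):
    print(rec)
```
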

## On the command line:

bamtyper type - Parse BAM files and determine read type

    $ bamtyper type bamfile.bam

    Determining OT for BAM 'bamfile'
    Orientation: IN Insert: 301, Stdev: 29

bamtyper links - Parse BAM files and get linking reads

Usage 1:

    $ bamtyper links bamfile.bam

    1.
    contig2 , [ contig1 , 39 , SE , 69 ]
    contig1 , [ contig2 , 39 , ES , 69 ]

    implies a layout which looks like:

    ---1--> 69bp ---2-->

    2.
    contig3 , [ contig2 , 3 , SS , 58 ]
    contig2 , [ contig3 , 3 , SS , 58 ] , [ contig1 , 4 , EE , 45 ]
    contig1 , [ contig2 , 4 , EE , 45 ]

    implies a layout which looks like:

    ---1--> 45bp <--2--- 58bp ---3-->
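
The layouts above can be reproduced with a small sketch. It assumes a reading of `link_type` inferred from the two examples rather than from bamtyper's source: the first letter names the end (Start or End) of the listed contig that takes part in the link, and the second letter names the end of its partner. The helpers `arrow` and `draw_link` are hypothetical and not part of bamtyper.

```python
def arrow(label, forward):
    """Draw a contig as an arrow pointing from its Start to its End."""
    return "---{}-->".format(label) if forward else "<--{}---".format(label)


def draw_link(first, second, link_type, gap):
    """Draw two linked contigs, the first on the left, separated by the gap.

    Assumed meaning of link_type: first letter is the linked end of the
    first contig, second letter is the linked end of the second contig.
    """
    if link_type not in ("SS", "SE", "ES", "EE"):
        raise ValueError("cannot draw link of type {!r}".format(link_type))
    # The left contig runs forward if it is joined at its End
    first_forward = link_type[0] == "E"
    # The right contig runs forward if it is joined at its Start
    second_forward = link_type[1] == "S"
    return "{} {}bp {}".format(
        arrow(first, first_forward), gap, arrow(second, second_forward))


print(draw_link("1", "2", "ES", 69))
print(draw_link("1", "2", "EE", 45))
print(draw_link("2", "3", "SS", 58))
```

Under this reading, an `SE` link from contig2 to contig1 draws as `<--2--- 69bp <--1---`, which is the same layout as the `ES` link from contig1 to contig2 viewed from the opposite strand.
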

Usage 2: report FRAGMENT coverage too!

    $ bamtyper links bamfile1.bam bamfile2.bam -c

    contig3 , [ contig2 , 3 , SS , 58 ]
    contig2 , [ contig3 , 3 , SS , 58 ] , [ contig1 , 4 , EE , 45 ]
    contig1 , [ contig2 , 4 , EE , 45 ]
    contig3 0.6206 0.5234
    contig2 0.6558 0.0123
    contig1 0.6523 0.5634

Where:

    contig3 0.6206 0.5234

reports 0.6206 fragments per base in bamfile1 and 0.5234 in bamfile2.
If these were 100bp reads then this would imply
coverages of 62x and 52x respectively.
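
The conversion described above is a single multiplication, sketched here for clarity. The helper `approx_fold_coverage` is hypothetical and not part of bamtyper, and the 100bp read length is the illustrative figure used in the text.

```python
def approx_fold_coverage(fragments_per_base, read_length):
    """Convert FRAGMENT coverage into approximate fold coverage."""
    return fragments_per_base * read_length


# Values reported for contig3 in bamfile1 and bamfile2
for frags in (0.6206, 0.5234):
    fold = approx_fold_coverage(frags, 100)
    print("{:.4f} fragments per base -> about {:.0f}x".format(frags, fold))
```
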

# Administration

For the project home page, information on the source tree, documentation, issues and how to contribute, see http://github.com/minillinim/BamTyper

This software is currently unpublished.

Copyright © 2012 Michael Imelfort. See LICENSE.txt for further details.
