Calculates coverage metrics from a GATK3 DepthOfCoverage file and a bed file

Project description


Given i) a tabix-indexed per-base 'depth of coverage' file (similar to that generated by GATK3) and ii) a bed file, CoverageCalculatorPy will generate four text reports:

  • .coverage file containing the mean depth of coverage across each interval in the bedfile, and the percentage of bases which meet a given depth (default is 100x) across each interval.
  • .totalcoverage file containing the same metrics as above, summarised over all intervals in the given bedfile. Summaries of additional subsets of the input bedfile can be included using --groups (see below).
  • .gaps file containing the intervals which do not meet the given depth of coverage threshold.
  • .missing file containing the intervals which do not have a corresponding coordinate in the 'depth of coverage' file and therefore cannot be evaluated.
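The metrics described above can be sketched in a few lines of Python. This is only an illustration, not CoverageCalculatorPy's own code: the function names `interval_metrics` and `find_gaps` are hypothetical, "meets the depth" is assumed to mean depth >= threshold, and the bed interval is assumed to be 0-based and half-open while the depth file uses 1-based coordinates.

```python
from typing import Dict, List, Optional, Tuple


def interval_metrics(
    depths: Dict[int, int], start: int, end: int, threshold: int = 100
) -> Tuple[Optional[float], Optional[float], List[int]]:
    """Mean depth, percentage of bases >= threshold, and missing positions.

    `depths` maps a 1-based coordinate to its depth (hypothetical structure).
    The bed interval [start, end) is assumed to cover positions start+1..end.
    """
    positions = range(start + 1, end + 1)
    covered = [depths[p] for p in positions if p in depths]
    missing = [p for p in positions if p not in depths]
    if not covered:
        # Nothing to evaluate: the whole interval would be reported as missing.
        return None, None, missing
    mean_depth = sum(covered) / len(covered)
    pct_at_threshold = 100.0 * sum(d >= threshold for d in covered) / len(covered)
    return mean_depth, pct_at_threshold, missing


def find_gaps(
    depths: Dict[int, int], start: int, end: int, threshold: int = 100
) -> List[Tuple[int, int]]:
    """Runs of consecutive positions with depth below the threshold (1-based, inclusive)."""
    gaps = []
    gap_start = None
    for p in range(start + 1, end + 1):
        below = p in depths and depths[p] < threshold
        if below and gap_start is None:
            gap_start = p
        elif not below and gap_start is not None:
            gaps.append((gap_start, p - 1))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, end))
    return gaps


example = {1: 120, 2: 80, 3: 90, 4: 150}
mean_depth, pct, missing = interval_metrics(example, 0, 4)
gaps = find_gaps(example, 0, 4)
```

In this toy interval two of the four bases reach 100x, so the percentage is 50 and the two adjacent low-depth bases form a single gap.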

Input Arguments

path to tabix indexed depth-of-coverage file

path to bedfile. Chromosomes must not be prefixed with 'chr'

depth threshold for percentage horizontal coverage calculation (default: 100)

output name to prefix on text reports (default: output)

directory to save output files to (default: current)

path to groupfile (see below)

Tabix indexing a GATK3 DepthOfCoverage file

The 'depth of coverage' file must be tabix indexed. The first three columns of the depth file must be chromosome, coordinate and depth, in that order. A file generated by GATK3 can be reformatted, compressed and indexed as follows:

sed 's/:/\t/g' <GATK depthOfCoverage file> | grep -v 'Locus' | sort -k1,1 -k2,2n | bgzip > <filename.gz>

(on macOS)
sed "s/:/$(printf '\t')/g" <GATK depthOfCoverage file>  |  grep -v 'Locus' | sort -k1,1 -k2,2n | bgzip > <filename.gz>

tabix -b 2 -e 2 -s 1 <filename.gz> 
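For readers who want to see what the sed, grep and sort steps do to the data, the reformatting can be approximated in Python. This is a sketch under assumptions: the function name `reformat_depth_lines` is hypothetical, the input is assumed to have a 'Locus' header and loci written as chromosome:position, and Python's string ordering may differ from the locale-dependent ordering of `sort -k1,1`. The output still needs to be compressed with bgzip and indexed with tabix as shown above.

```python
from typing import Iterable, List


def reformat_depth_lines(lines: Iterable[str]) -> List[str]:
    """Split 'chrom:pos' into two tab-separated columns, drop header lines, and sort."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        # Mirrors grep -v 'Locus': skip any line that mentions the header word.
        if not line or "Locus" in line:
            continue
        # Mirrors sed 's/:/\t/g': every colon becomes a tab.
        rows.append(line.replace(":", "\t").split("\t"))
    # Mirrors sort -k1,1 -k2,2n: chromosome as text, then coordinate as a number.
    rows.sort(key=lambda fields: (fields[0], int(fields[1])))
    return ["\t".join(fields) for fields in rows]


raw = [
    "Locus\tTotal_Depth\n",
    "2:5\t30\n",
    "1:10\t20\n",
    "1:2\t15\n",
]
reformatted = reformat_depth_lines(raw)
```

Sorting the coordinate numerically matters here: a plain text sort would place position 10 before position 2, which tabix would reject as unsorted.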

Adding a groupfile

The groupfile is a way of generating combined metrics across a number of intervals (e.g. combined across all exons in a gene). These metrics will appear in the .totalcoverage file. The groupfile must have a header (this will be included in the output), consist of a single column, and contain the same number of rows as the bedfile it will be analysed with.
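The idea of combining metrics across intervals can be illustrated as follows. This is a hedged sketch, not the tool's implementation: it assumes each groupfile row holds a label for the bed interval on the same row, and the function name `summarise_groups` and the input structures are hypothetical. How CoverageCalculatorPy encodes group membership should be checked against the package itself.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def summarise_groups(
    labels: Sequence[str],
    interval_depths: Sequence[Sequence[int]],
    threshold: int = 100,
) -> Dict[str, Tuple[float, float]]:
    """Pool per-base depths by group label, then compute mean depth and % >= threshold."""
    pooled = defaultdict(list)
    for label, depths in zip(labels, interval_depths):
        pooled[label].extend(depths)
    summary = {}
    for label, depths in pooled.items():
        if not depths:
            continue  # a group with no evaluated bases has no metrics
        mean_depth = sum(depths) / len(depths)
        pct = 100.0 * sum(d >= threshold for d in depths) / len(depths)
        summary[label] = (mean_depth, pct)
    return summary


# Three bed intervals: two exons labelled GENE1 and one labelled GENE2 (made-up labels).
labels = ["GENE1", "GENE1", "GENE2"]
interval_depths = [[100, 200], [50, 50], [150]]
group_summary = summarise_groups(labels, interval_depths)
```

Pooling bases before averaging weights each interval by its length, so a short exon does not count as much as a long one.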

Download files

Files for CoverageCalculatorPy, version 1.0.1
  • CoverageCalculatorPy-1.0.1-py3.7.egg (5.6 kB), file type: Egg, Python version: 3.7
  • CoverageCalculatorPy-1.0.1-py3-none-any.whl (5.9 kB), file type: Wheel, Python version: py3
