
A package for filtering candidate mutations for spontaneous mutation rate estimates.

Installation

CaMu.py can easily be installed using pip. Just run the command:

pip install camu

pip automatically installs the required versions of the Python packages that CaMu.py depends on.

Additionally, samtools >= 0.1.19 and gatk >= 4.1.2 have to be installed and available on the PATH.
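Before running CaMu.py, it can help to confirm that both external tools are reachable. The following is a minimal sketch, not part of CaMu.py; the helper name missing_tools is hypothetical. It only checks that the executables are found on the PATH and does not verify the version requirements.

```python
import shutil


def missing_tools(tools=("samtools", "gatk")):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("samtools and gatk were found on the PATH.")
```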

Usage

After installing CaMu.py, the main module camu can be run via

python3 -m camu <additional parameter>

Passing -h or --help as the additional parameter opens the help page.

The help page lists all additional parameters that have to be passed in order to run camu.

For the overall script, it is necessary to provide, via -i, a text file that gives the paths to all samples' VCF files in column 1 and the paths to the corresponding BAM files in column 2.

Additionally, the path to the control BAM file has to be provided via -c.

Finally, the path to the reference genome has to be given with -r.
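As an illustration of the three inputs described above, the sketch below builds the two-column sample table and assembles the call with -i, -c and -r. All file names are placeholders, the helper names are hypothetical, and the tab as column separator is an assumption, because the text does not state which separator camu expects.

```python
import shlex


def format_sample_table(samples):
    """Format (vcf_path, bam_path) pairs as one line per sample.

    Column 1 holds the VCF path and column 2 the BAM path.
    The tab separator is an assumption, not documented behaviour.
    """
    return "".join(f"{vcf}\t{bam}\n" for vcf, bam in samples)


def build_command(sample_table, control_bam, reference):
    """Assemble the camu call with the -i, -c and -r options."""
    return ["python3", "-m", "camu",
            "-i", sample_table, "-c", control_bam, "-r", reference]


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    samples = [("sample1.vcf", "sample1.bam"), ("sample2.vcf", "sample2.bam")]
    print(format_sample_table(samples), end="")
    print(shlex.join(build_command("samples.txt", "control.bam", "reference.fasta")))
```

In practice the formatted table would be written to the file passed via -i, and the command list could be handed to subprocess.run.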

If you want to run any script separately, you can call the script using

python3 -m <scriptname>

where <scriptname> has to be replaced by one of the 5 modules listed below (preprocessing.py, filterDupAndLinked.py, etc.).

Filtering false-positive candidate mutations to accelerate DNM-counting for direct µ estimates

For direct estimation of the spontaneous mutation rate µ, it is necessary to calculate the rate of spontaneous de novo mutations (DNM) occurring per site per generation. Consequently, counting DNM is essential for estimating µ.
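To make "per site per generation" concrete, the sketch below divides a DNM count by the number of analysed sites times the number of generations. This is a simplified illustration and not code from CaMu.py: the function name and the example numbers are hypothetical, and adjustments such as ploidy or pooling over several lines are deliberately left out.

```python
def mutation_rate(n_dnm, n_sites, n_generations):
    """Return the number of DNM per site per generation.

    n_dnm:         number of curated de novo mutations
    n_sites:       number of analysed (callable) sites
    n_generations: number of generations between control and sample
    """
    if n_sites <= 0 or n_generations <= 0:
        raise ValueError("n_sites and n_generations must be positive")
    return n_dnm / (n_sites * n_generations)


if __name__ == "__main__":
    # Invented numbers: 5 DNM, 100 million sites, 50 generations.
    print(mutation_rate(5, 100_000_000, 50))
```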

The rough approach is:

  • Sequencing of samples and control --> FASTQ files
  • Assembly of sequencing results --> BAM files
  • Filtering steps
  • Variant calling
  • Extraction of variants occurring in samples but not in control --> candidate mutations
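The last step of the list, keeping variants that occur in a sample but not in the control, amounts to a set difference. The sketch below illustrates the idea under the assumption that each variant is represented as a (chromosome, position, reference allele, alternative allele) tuple; the function name is hypothetical, and this is not how CaMu.py itself is necessarily implemented.

```python
def candidate_mutations(sample_variants, control_variants):
    """Return the sample variants that are absent from the control.

    Each variant is a (chrom, pos, ref, alt) tuple. The order of the
    sample variants is preserved.
    """
    control = set(control_variants)
    return [variant for variant in sample_variants if variant not in control]


if __name__ == "__main__":
    # Invented variants, for illustration only.
    sample = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")]
    control = [("chr1", 250, "C", "T")]
    print(candidate_mutations(sample, control))
```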

The resulting list of candidate mutations (CM) currently has to be manually curated using a genome browser like IGV.

Unfortunately, approximately 90 % of these CM are not true DNM; they turn out to be false positives.

CaMu.py aims to accelerate the whole procedure of DNM counting by filtering out the vast majority of false-positive CM and by preparing the remaining CM for fast manual curation with IGV.

CaMu.py consists of 5 main Python modules:

  1. preprocessing.py
  2. filterDupAndLinked.py
  3. detectFIO.py
  4. snapshotIGV.py
  5. IGVsessions.py

preprocessing.py starts with an input text file containing paths to all samples' VCF files in column 1 and paths to all corresponding BAM files in column 2. These VCF and BAM files are read and further processed in order to find variants that are possible DNM, the candidate mutations (CM).

The rough approach is the following:

[Figure: Preprocessing pipeline]

The following scripts within CaMu.py further filter out those CM that are fully linked to other mutations, those that are only included due to reads that are most probably PCR duplicates, and those variants that occur in other samples or several times in the control sample's BAM file.
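Two of these filters can be sketched in a few lines. The code below is an illustration only: the text does not describe the exact criteria CaMu.py applies, so the function names and thresholds are assumptions. The first function drops candidates that appear in more than one sample; the second checks whether any supporting read is not marked as a duplicate, using the SAM flag bit 0x400 (PCR or optical duplicate).

```python
from collections import Counter

DUPLICATE_FLAG = 0x400  # SAM flag bit for PCR or optical duplicates


def drop_shared_candidates(candidates_by_sample):
    """Keep only candidates that occur in exactly one sample."""
    counts = Counter(
        variant
        for variants in candidates_by_sample.values()
        for variant in set(variants)
    )
    return {
        sample: [variant for variant in variants if counts[variant] == 1]
        for sample, variants in candidates_by_sample.items()
    }


def has_non_duplicate_support(read_flags):
    """Return True if at least one supporting read is not a marked duplicate."""
    return any(not flag & DUPLICATE_FLAG for flag in read_flags)


if __name__ == "__main__":
    # Invented data, for illustration only.
    shared = ("chr2", 500, "G", "A")
    candidates = {
        "sample1": [("chr1", 100, "A", "G"), shared],
        "sample2": [shared, ("chr3", 42, "T", "C")],
    }
    print(drop_shared_candidates(candidates))
    print(has_non_duplicate_support([1024 + 99, 1024 + 147]))
```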

Finally, IGVsessions.py and snapshotIGV.py create IGV sessions and IGV snapshots for all remaining CM to further simplify their manual curation.

Here is an overview:

[Figure: CaMu overview]
