WGS-based CYP2D6 caller

Cyrius: WGS-based CYP2D6 genotyper

Cyrius is a tool to genotype CYP2D6 from a whole-genome sequencing (WGS) BAM file. It uses a novel method to overcome the high sequence similarity between CYP2D6 and its pseudogene paralog CYP2D7, which allows it to accurately detect all star alleles, particularly those that contain structural variants. Please refer to our paper for details about the method.

Cyrius has been integrated into the Illumina DRAGEN Bio-IT Platform since v3.7.

Running the program

This Python3 program can be run as follows:

python3 star_caller.py --manifest MANIFEST_FILE \
                       --genome [19/37/38] \
                       --prefix OUTPUT_FILE_PREFIX \
                       --outDir OUTPUT_DIRECTORY \
                       --threads NUMBER_THREADS

The manifest is a text file in which each line lists the absolute path to one input BAM/CRAM file. For CRAM input, we suggest also providing the path to the reference FASTA file with --reference in the command.
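The manifest and command described above can be prepared from Python. The following is a minimal sketch, not part of Cyrius itself: the helper names write_manifest and build_command are hypothetical, and the sketch assumes star_caller.py is in the current working directory. It only uses the command-line options listed in this section.

```python
from pathlib import Path


def write_manifest(alignment_files, manifest_path):
    """Write one absolute BAM/CRAM path per line and return the paths written."""
    # Cyrius expects absolute paths, so resolve each input first.
    absolute_paths = [str(Path(f).resolve()) for f in alignment_files]
    Path(manifest_path).write_text("\n".join(absolute_paths) + "\n")
    return absolute_paths


def build_command(manifest, genome, prefix, out_dir, threads=1, reference=None):
    """Build the argument list for star_caller.py (hypothetical helper)."""
    command = [
        "python3", "star_caller.py",
        "--manifest", str(manifest),
        "--genome", str(genome),      # 19, 37 or 38
        "--prefix", str(prefix),
        "--outDir", str(out_dir),
        "--threads", str(threads),
    ]
    # --reference is only suggested for CRAM input.
    if reference is not None:
        command += ["--reference", str(reference)]
    return command
```

The returned list can be passed to subprocess.run(command, check=True) to launch the program.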

Interpreting the output

The program produces a .tsv file in the directory specified by --outDir.
The fields are explained below:

Fields in tsv    Explanation
Sample           Sample name
Genotype         Genotype call
Filter           Filters on the genotype call

A genotype of "None" indicates a no-call.
There are currently four possible values for the Filter column:
- PASS: a passing, confident call.
- More_than_one_possible_genotype: In rare cases, Cyrius reports two possible genotypes that it cannot distinguish. These are different sets of star alleles that result in the same set of variants, which cannot be phased with short reads, e.g. *1/*46 and *43/*45. The two possible genotypes are reported together, separated by a semicolon.
- Not_assigned_to_haplotypes: In a very small portion of samples with more than two copies of CYP2D6, Cyrius calls a set of star alleles that can be assigned to haplotypes in more than one way. Cyrius reports the star alleles joined by underscores. For example, *1_*2_*68 is reported when the actual genotype could be *1+*68/*2, *2+*68/*1 or *1+*2/*68.
- LowQ_high_CN: In rare cases, at high copy number (>=6 copies of CYP2D6), Cyrius uses a less strict approximation when calling copy numbers to account for higher noise in depth, so the genotype call may have lower confidence than usual.
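The Filter values above determine how the Genotype field should be read. The following is a minimal sketch of that interpretation, under two assumptions: the .tsv file starts with a header row naming the Sample, Genotype and Filter columns, and every split of the reported alleles into two haplotypes is listed without applying any biological constraint. The function names are hypothetical.

```python
import csv
import io
from itertools import combinations


def haplotype_assignments(alleles):
    """List every split of the alleles into two non-empty haplotypes."""
    n = len(alleles)
    seen, results = set(), []
    for size in range(n - 1, 0, -1):
        for chosen in combinations(range(n), size):
            rest = [i for i in range(n) if i not in chosen]
            hap_a = "+".join(alleles[i] for i in chosen)
            hap_b = "+".join(alleles[i] for i in rest)
            key = frozenset((hap_a, hap_b))  # a/b and b/a are the same genotype
            if key not in seen:
                seen.add(key)
                results.append(f"{hap_a}/{hap_b}")
    return results


def candidate_genotypes(genotype, filter_value):
    """Expand one Genotype cell into the genotypes it may represent."""
    if genotype in ("", "None"):
        return []  # "None" indicates a no-call
    if filter_value == "More_than_one_possible_genotype":
        return genotype.split(";")
    if filter_value == "Not_assigned_to_haplotypes":
        return haplotype_assignments(genotype.split("_"))
    return [genotype]  # PASS and LowQ_high_CN carry a single genotype


def read_calls(tsv_text):
    """Map each sample name to its list of candidate genotypes."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["Sample"]: candidate_genotypes(row["Genotype"], row["Filter"])
        for row in reader
    }
```

For the example above, haplotype_assignments(["*1", "*2", "*68"]) yields *1+*2/*68, *1+*68/*2 and *2+*68/*1.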

A .json file is also produced that contains more information about each sample.

Fields in json       Explanation
Coverage_MAD         Median absolute deviation of depth, measure of sample quality
Median_depth         Sample median depth
Total_CN             Total copy number of CYP2D6+CYP2D7
Total_CN_raw         Raw normalized depth of CYP2D6+CYP2D7
Spacer_CN            Copy number of CYP2D7 spacer region
Spacer_CN_raw        Raw normalized depth of CYP2D7 spacer region
Variants_called      Targeted variants called in CYP2D6
CNV_group            An identifier for the sample's CNV/fusion status
Variant_raw_count    Supporting reads for each variant
Raw_star_allele      Raw star allele call
d67_snp_call         CYP2D6 copy number call at CYP2D6/7 differentiating sites
d67_snp_raw          Raw CYP2D6 copy number at CYP2D6/7 differentiating sites
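The quality fields in the .json file can be screened programmatically. The following is a minimal sketch that flags samples whose Median_depth falls below a threshold. It assumes the file holds one object keyed by sample name, with each value containing the fields listed above; that layout, the function name low_depth_samples and the default threshold of 30 are assumptions for illustration, so check them against your own output.

```python
import json


def low_depth_samples(json_text, min_depth=30.0):
    """Return the sorted names of samples whose Median_depth is below min_depth."""
    # Assumed layout: {"sample_name": {"Median_depth": ..., ...}, ...}
    records = json.loads(json_text)
    flagged = []
    for sample, record in records.items():
        depth = record.get("Median_depth")
        # A missing depth value is flagged too, since it cannot be checked.
        if depth is None or float(depth) < min_depth:
            flagged.append(sample)
    return sorted(flagged)
```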

Troubleshooting

Common causes for Cyrius to produce no-calls are:
- Low sequencing depth. We suggest a sequencing depth of 30x, which is the standard practice recommended for clinical genome sequencing.
- The depth of the CYP2D6/CYP2D7 region is much lower than the rest of the genome, most likely because reads are aligned to alternative contigs. If your reference genome includes alternative contigs, we suggest alt-aware alignment so that alignments to the primary assembly take precedence over alternative contigs.
- The majority of reads in the CYP2D6/CYP2D7 region have a mapping quality of zero. This is probably due to post-processing tools like bwa-postalt that modify the mapQ in the BAM. We recommend using the BAM file from before such post-processing steps as input to Cyrius.
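The mapping-quality cause above can be checked before running Cyrius. The following is a minimal sketch that computes the fraction of alignments with a mapping quality of zero from SAM-format text lines, where MAPQ is the fifth tab-separated column. The function name mapq_zero_fraction is hypothetical, and the sketch expects the caller to supply reads from the CYP2D6/CYP2D7 region.

```python
def mapq_zero_fraction(sam_lines):
    """Return the fraction of SAM records with MAPQ 0, or None if there are no records."""
    total = 0
    zero = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete alignment record
        total += 1
        if int(fields[4]) == 0:  # column 5 of a SAM record is MAPQ
            zero += 1
    return zero / total if total else None
```

One way to obtain the lines is to pipe the output of samtools view for the BAM file and the region of interest into a script that passes sys.stdin to this function. A value close to 1 suggests that the mapQ values were altered by post-processing.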


