
Encrypted filtering of germline calls from a somatic VCF.

Project description

GermlineFilter is a Python program used in the ICGC-TCGA DREAM Mutation Calling Challenge as part of its security measures for real data. It takes as input a germline calls file, as generated by GATK, that has been preprocessed and encrypted, together with a somatic SNV VCF file, and it returns the number of germline calls found in the somatic VCF.
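
To make the input and output concrete, here is a minimal sketch of the underlying comparison in plain, unencrypted form. It is not the GermlineFilter implementation: the helper names vcf_calls and count_germline_calls are hypothetical, the toy VCF records are made up for illustration, and it assumes that a call is identified by its chromosome, position, reference allele and alternate allele.

```python
import io


def vcf_calls(handle):
    """Yield (chrom, pos, ref, alt) for every record in a VCF text stream."""
    for line in handle:
        # Skip blank lines and VCF header lines, which start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        yield chrom, int(pos), ref, alt


def count_germline_calls(germline_handle, somatic_handle):
    """Count the somatic records that also appear in the germline VCF."""
    germline = set(vcf_calls(germline_handle))
    return sum(1 for call in vcf_calls(somatic_handle) if call in germline)


# Toy inputs, invented for this example only.
GERMLINE = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t100\t.\tA\tG\n"
    "2\t200\t.\tC\tT\n"
)
SOMATIC = (
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t100\t.\tA\tG\n"
    "3\t300\t.\tG\tA\n"
)

n_found = count_germline_calls(io.StringIO(GERMLINE), io.StringIO(SOMATIC))
print(n_found)
```

Only the record on chromosome 1 at position 100 is present in both toy inputs, so it is the only one counted as a germline call.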

Features:

  • The most important feature of GermlineFilter is that it runs in an encrypted fashion, which makes it safe to run on any server. All the filtering steps described in the flowchart are done at runtime, and at no point is data written to disk. The program has three options:
    • encrypt_germline_vcf - Encrypt a truth germline VCF (the preprocessing steps in the workflow above).
    • filter - Filter a somatic VCF against an encrypted truth germline VCF. This step is done in an encrypted fashion.
    • get_germline_positions - Get the actual germline positions called in a somatic VCF. This step is done in an unencrypted fashion, against the original truth germline VCF, so it should only be run locally or on an encrypted server. The output is written to a tab-delimited file.
  • Multiple germline VCFs can be preprocessed at the same time, with a common salt file and key file.
  • Multiple somatic VCFs corresponding to the same encrypted truth germline file can be filtered simultaneously. This is considerably faster than running them individually.
  • The user can choose the encryption algorithm (AES or Blowfish; default: AES).
  • The user can choose the hashing algorithm (md5 or sha512; default: sha512).
  • The actual germline positions called in a somatic VCF can be retrieved for plotting and further analysis.
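
The salt and hashing options suggest a comparison on salted digests rather than on raw positions. The following is a minimal sketch of that general idea under stated assumptions, not GermlineFilter's actual code: the function names, the "chrom:pos:ref>alt" key format and the toy calls are all hypothetical. The AES or Blowfish encryption of the preprocessed file is left out, because the Python standard library does not provide those ciphers.

```python
import hashlib
import os


def hash_call(call, salt, algorithm="sha512"):
    """Return the salted hex digest of one (chrom, pos, ref, alt) call."""
    chrom, pos, ref, alt = call
    digest = hashlib.new(algorithm)  # "sha512" or "md5"
    digest.update(salt)
    digest.update(f"{chrom}:{pos}:{ref}>{alt}".encode("utf-8"))
    return digest.hexdigest()


def hash_calls(calls, salt, algorithm="sha512"):
    """Preprocessing step: keep only the salted digests of the germline calls."""
    return {hash_call(call, salt, algorithm) for call in calls}


def count_matches(somatic_calls, germline_hashes, salt, algorithm="sha512"):
    """Filtering step: count somatic calls whose digest is a germline digest."""
    return sum(
        1
        for call in somatic_calls
        if hash_call(call, salt, algorithm) in germline_hashes
    )


# Toy data, invented for this example only.
salt = os.urandom(16)  # stands in for the contents of a shared salt file
germline_calls = [("1", 100, "A", "G"), ("2", 200, "C", "T")]
somatic_calls_a = [("1", 100, "A", "G"), ("3", 300, "G", "A")]
somatic_calls_b = [("2", 200, "C", "T"), ("1", 100, "A", "G")]

# The germline digests are computed once and reused for every somatic input.
germline_hashes = hash_calls(germline_calls, salt)
counts = [
    count_matches(calls, germline_hashes, salt)
    for calls in (somatic_calls_a, somatic_calls_b)
]
print(counts)
```

Because the germline digests are built once and reused, checking several somatic inputs against the same germline set costs only one hashing pass over each somatic input. The filtering step never sees the germline positions themselves, only their salted digests.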

Usage:

After installation, run the following to find out how to use GermlineFilter:

germline_filter --help

For more examples, see the user manual, located in <path-to-dir>/GermlineFilter-1.2/doc

Project details

Version: 1.2

Download files

Filename, size: GermlineFilter-1.2.tar.gz (102.3 kB)
File type: Source
Python version: None
