QC parser for SMaHT

Parses outputs of different QC tools and unifies them for the SMaHT portal

Installation

Run pip install qc-parser to install the package. Python 3.8 or later is required.

To develop this package, clone this repo, make sure Poetry is installed on your system, and run make install.

Usage

After installation the following command can be run from the command line:

parse-qc \
    -n 'BAM Quality Metrics' \
    --metrics samtools /PATH/samtools.stats.txt \
    --metrics picard_CollectInsertSizeMetrics /PATH/picard_cis_metrics.txt \
    --additional-files /PATH/additional_output_1.pdf \
    --additional-files /PATH/additional_output_2.tsv \
    --output-zip metrics.zip \
    --output-json qc_values.json

In this example, the tool will parse the Samtools output file /PATH/samtools.stats.txt and the Picard output file /PATH/picard_cis_metrics.txt. The values extracted from each file are specified in src/metrics_to_extract.py. All metrics are combined and stored in qc_values.json, which is compatible with Tibanna_ff's generic QC functionality.
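
When parse-qc is called from a pipeline script rather than typed by hand, it can help to assemble the argument list programmatically. The following is a minimal sketch that only uses the flags shown in the command above; the helper name build_parse_qc_command is hypothetical and not part of the package.

```python
import shlex
from typing import List, Sequence, Tuple


def build_parse_qc_command(
    name: str,
    metrics: Sequence[Tuple[str, str]],
    additional_files: Sequence[str],
    output_zip: str,
    output_json: str,
) -> List[str]:
    """Assemble an argument list mirroring the parse-qc example call.

    Hypothetical helper: it only strings together the flags from the
    README example and does not validate tool names or paths.
    """
    cmd = ["parse-qc", "-n", name]
    # One --metrics flag per (tool, file) pair
    for tool, path in metrics:
        cmd += ["--metrics", tool, path]
    # One --additional-files flag per extra file to include in the zip
    for path in additional_files:
        cmd += ["--additional-files", path]
    cmd += ["--output-zip", output_zip, "--output-json", output_json]
    return cmd


command = build_parse_qc_command(
    name="BAM Quality Metrics",
    metrics=[
        ("samtools", "/PATH/samtools.stats.txt"),
        ("picard_CollectInsertSizeMetrics", "/PATH/picard_cis_metrics.txt"),
    ],
    additional_files=[
        "/PATH/additional_output_1.pdf",
        "/PATH/additional_output_2.tsv",
    ],
    output_zip="metrics.zip",
    output_json="qc_values.json",
)

# Print a shell-quoted version for inspection or logging
print(shlex.join(command))
```

Passing the resulting list to subprocess.run(command, check=True) would execute it, assuming parse-qc is installed and on the PATH.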

The metrics.zip will contain the following files:

samtools.stats.txt
picard_cis_metrics.txt
additional_output_1.pdf
additional_output_2.tsv
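
A quick way to confirm that the archive holds the expected files is to inspect it with Python's standard zipfile module. This is a sketch under assumptions: the function name missing_from_zip is hypothetical, and the comparison uses base names only, because the internal layout of the archive is not described here.

```python
import zipfile
from pathlib import PurePosixPath
from typing import Iterable, Set


def missing_from_zip(zip_path: str, expected_names: Iterable[str]) -> Set[str]:
    """Return the expected file names that are not found in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare base names only; whether files sit at the archive root
        # or in a subfolder is an assumption we avoid making here.
        present = {PurePosixPath(member).name for member in archive.namelist()}
    return set(expected_names) - present
```

For the example above, missing_from_zip("metrics.zip", ["samtools.stats.txt", "picard_cis_metrics.txt", "additional_output_1.pdf", "additional_output_2.tsv"]) should return an empty set if all four files were packaged.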

The currently supported QC tools are:

  • samtools_stats (Samtools stats)
  • picard_CollectAlignmentSummaryMetrics (Picard CollectAlignmentSummaryMetrics)
  • picard_CollectInsertSizeMetrics (Picard CollectInsertSizeMetrics)
  • picard_CollectWgsMetrics (Picard CollectWgsMetrics)
  • bamstats (bamStats.py)
  • fastqc (FastQC)
  • rnaseqc (RNA-SeQC)
  • nanoplot (NanoPlot)
  • verifybamid2 (VerifyBamID2)
  • kraken2 (Kraken2)

Development

If you want to extract a new metric from an already supported QC tool, add the metric to src/metrics_to_extract.py in the appropriate section.

If you want to add support for a new QC tool, you need to add a parser to src/MetricsParser.py and add the metrics you want to extract from the tool to src/metrics_to_extract.py.
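
To give a rough idea of what a parser paired with a list of wanted metrics can look like, here is a standalone sketch for the summary-number ("SN") lines of Samtools stats output. It does not reproduce the actual classes or data structures in src/MetricsParser.py or src/metrics_to_extract.py; the names WANTED and parse_samtools_summary are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical stand-in for an entry in src/metrics_to_extract.py:
# the summary metrics we want to keep.
WANTED: Set[str] = {"raw total sequences", "reads mapped", "error rate"}


def parse_samtools_summary(
    lines: Iterable[str], wanted: Set[str] = WANTED
) -> Dict[str, float]:
    """Extract selected summary numbers from `samtools stats` output.

    Summary lines are tab-separated: "SN", the metric name followed by a
    colon, the value, and optionally a trailing comment.
    """
    values: Dict[str, float] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "SN":
            continue  # skip comments and non-summary sections
        name = fields[1].rstrip(":")
        if name in wanted:
            values[name] = float(fields[2])
    return values


example = [
    "# This file was produced by samtools stats\n",
    "SN\traw total sequences:\t2000\t# excluding supplementary and secondary reads\n",
    "SN\treads mapped:\t1980\n",
    "SN\taverage length:\t150\n",
    "SN\terror rate:\t2.5e-03\t# mismatches / bases mapped (cigar)\n",
]
result = parse_samtools_summary(example)
print(result)
```

Keeping the list of wanted metrics separate from the parsing logic mirrors the split between the two files described above: a new metric only requires a new entry in the list, while a new tool requires a new parsing function.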

Tests

The command make test will run local tests.
