Excel report from viral sequencing analysis output

xlavir

xlavir generates an Excel report from the viral sequencing analysis output of the nf-core/viralrecon or peterk87/nf-virontus Nextflow pipelines.

Features

  • Collect sample results from an nf-core/viralrecon or peterk87/nf-virontus run into an Excel report
    • Samtools read mapping stats (flagstat)

    • Mosdepth read mapping coverage information

    • Variant calling information (SnpEff and SnpSift results, VCF file information)

    • Consensus sequences

  • QA/QC of sample analysis results (basic PASS/FAIL based on minimum genome coverage and depth)

  • Nextflow workflow execution information

  • Prepend worksheets from other Excel documents into the report (e.g. cover page/sheet, sample sheet, lab results)

  • Add custom images into worksheets with custom names and descriptions (e.g. phylogenetic tree figure PNG)
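
The basic PASS/FAIL QC described in the feature list can be pictured with a minimal sketch like the one below. It is not xlavir's own code: the SampleStats class, the qc_status function, the field names, and the default thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SampleStats:
    """Hypothetical per-sample summary; the field names are illustrative."""
    sample: str
    genome_coverage: float  # fraction of reference positions covered (0.0 to 1.0)
    median_depth: float     # median read depth across the reference


def qc_status(stats, min_coverage=0.95, min_median_depth=10.0):
    """Return ("PASS" or "FAIL", list of failure comments).

    Both thresholds are assumed defaults, not values taken from xlavir.
    """
    comments = []
    if stats.genome_coverage < min_coverage:
        comments.append(
            f"genome coverage {stats.genome_coverage:.1%} below {min_coverage:.1%}"
        )
    if stats.median_depth < min_median_depth:
        comments.append(
            f"median depth {stats.median_depth:.1f} below {min_median_depth:.1f}"
        )
    return ("FAIL" if comments else "PASS"), comments
```

Collecting every failure reason, rather than stopping at the first, lets a report show all QC comments for a sample.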

Roadmap

  • Bcftools variant calling stats sheet

  • Sample metadata table to merge with certain stats?

  • YAML config to info sheet?

  • coverage chart with controls?

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.6.1 (2022-02-01)

  • Added more checks for Medaka VCFs from low coverage samples which may produce ValueError and ZeroDivisionError errors

0.6.0 (2022-01-05)

  • Add support for reading annotated Medaka VCF files (medaka_variant VCF annotated with medaka tools annotate)

  • Changed mutation string format to {gene}:{AA change} ({NT change}{extra}) if there is an AA change

  • Added low coverage filtering of variants for Medaka VCF

  • “Variants Summary” table now sorted by nucleotide position
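
As a rough illustration of the 0.6.0 mutation string format, the sketch below builds {gene}:{AA change} ({NT change}{extra}) when an amino acid change is present. The function name and the fallback used when there is no AA change are assumptions; the changelog does not say what the string looks like in that case, nor what "extra" contains, so it is treated here as an opaque string.

```python
def format_mutation(nt_change, gene=None, aa_change=None, extra=""):
    """Build a mutation label.

    With a gene and AA change: "{gene}:{AA change} ({NT change}{extra})".
    Otherwise (assumed fallback, not confirmed by the changelog):
    "{NT change}{extra}".
    """
    if gene and aa_change:
        return f"{gene}:{aa_change} ({nt_change}{extra})"
    return f"{nt_change}{extra}"
```

For example, format_mutation("A23403G", gene="S", aa_change="D614G") gives "S:D614G (A23403G)".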

0.5.3 (2021-11-09)

  • Fixed shorter consensus sequences not being written to report

  • Improve nf-virontus VCF compatibility

0.5.2 (2021-11-08)

Fixes and changes from PR #15

Fixes:

  • low coverage coordinate output off by one (xlavir.tools.mosdepth.get_interval_coords_bed)

  • error on no Pangolin reports found (e.g. non-SARS-CoV-2 report) (xlavir.tools.pangolin.get_info)

  • user QC thresholds not being used (xlavir.xlavir.run)

  • not showing all QC fail comments (xlavir.qc.create_qc_stats_dataframe)

  • consensus sequences being too long for Excel cell character limit (32,767 characters); longer sequences are chunked into 80 character segments with one segment per line in consensus sheet (xlavir.tools.consensus.read_fasta)
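
The consensus chunking fix can be sketched as follows. This is a simplified stand-in, not the body of xlavir.tools.consensus.read_fasta: the function names are hypothetical, and keeping short sequences in a single cell is an assumption based on the wording "longer sequences are chunked".

```python
EXCEL_CELL_CHAR_LIMIT = 32_767  # maximum number of characters in one Excel cell
CHUNK_WIDTH = 80                # segment width stated in the changelog


def chunk_sequence(seq, width=CHUNK_WIDTH):
    """Split a sequence into consecutive segments of at most `width` characters."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def consensus_rows(header, seq):
    """Return sheet rows for one FASTA record: the header, then the sequence.

    Sequences that fit in one cell stay whole (assumed behaviour); longer
    ones are written as one 80-character segment per row.
    """
    if len(seq) <= EXCEL_CELL_CHAR_LIMIT:
        return [header, seq]
    return [header, *chunk_sequence(seq)]
```

Joining the rows after the header reproduces the original sequence, so no bases are lost by chunking.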

Changes:

  • Ignore and skip unsupported VCFs instead of throwing NotImplementedError (xlavir.tools.variants.get_info)

  • In consensus sheet, only add QC comments on FASTA header rows if necessary (xlavir.io.xl.add_comments)

0.5.1 (2021-08-04)

  • Fixed issue (#12) where iVar ref allele depth corresponds to depth of base before deletion. For indels, ref allele depth is taken from the total depth minus the alt allele depth.

  • Fixed issue (#14) where the total number of reads from samtools flagstat may not be the true number of reads. The unmapped reads may be excluded from the BAM file so the samtools flagstat total number of reads may be equal to the number of mapped reads. There is now a search for fastp JSON files to get the true total number of reads.
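
The indel handling from issue #12 amounts to a small piece of arithmetic, sketched below. The function name and arguments are hypothetical, and clamping the result at zero is an added safeguard rather than something the changelog states.

```python
def ref_allele_depth(ref, alt, total_depth, alt_depth, reported_ref_depth):
    """Return the reference allele depth for a variant call.

    For indels (REF and ALT of different lengths) the reported ref depth
    refers to the base before the deletion, so total depth minus alt depth
    is used instead. For other variants the reported value is kept.
    """
    is_indel = len(ref) != len(alt)
    if is_indel:
        # Clamp at zero in case the depths are inconsistent (assumed safeguard).
        return max(total_depth - alt_depth, 0)
    return reported_ref_depth
```

For a deletion with a total depth of 100 and an alt allele depth of 90, this yields a ref allele depth of 10 regardless of the reported value.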

0.5.0 (2021-07-30)

  • Added support for Nanopolish VCF parsing as generated by the ARTIC pipeline

  • Added deduplication of VCF and SnpSift entries since the ARTIC pipeline may produce VCF files with duplicate variant calls due to overlap between amplicons.

  • Added VCF and SnpSift test data for CLI test to generate Excel report.
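
One simple way to deduplicate variant calls is to keep the first record seen for each variant key, as in the sketch below. The choice of key (CHROM, POS, REF, ALT) and the dict-based record layout are assumptions; the changelog does not say which fields xlavir compares.

```python
def dedupe_variants(records):
    """Drop repeated variant calls, keeping the first occurrence of each.

    `records` is an iterable of dicts with CHROM, POS, REF and ALT keys
    (hypothetical layout). Input order is preserved.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```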

0.4.3 (2021-07-29)

  • Fixed an issue where single-base positions were reported as 0-based while all other ranges were 1-based when reporting low/no coverage regions from Mosdepth per-base BED files (#10).
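
BED intervals are 0-based and half-open, so converting them to 1-based inclusive coordinates means adding one to the start and leaving the end unchanged. The sketch below shows the conversion, including the single-base case that this release fixed. The function name and the "start-end" output format are assumptions.

```python
def bed_to_one_based(start, end):
    """Convert a 0-based, half-open BED interval to a 1-based inclusive label.

    A one-base interval such as (0, 1) becomes "1"; longer intervals such
    as (99, 150) become "100-150".
    """
    if end <= start:
        raise ValueError(f"empty or inverted BED interval: {start}-{end}")
    first, last = start + 1, end
    return str(first) if first == last else f"{first}-{last}"
```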

0.4.2 (2021-05-21)

  • Add support for nf-core/viralrecon version 2.0 (requires Mosdepth bed.gz files be output; needs custom modules.config like this one)

  • Nextclade CLI per sample results parsed into sheet showing useful info like Nextstrain clade, # of mutations, # of PCR primer changes

  • Added check that input directory exists and is a directory

  • Added sheet with xlavir info

  • Added Gene, Variant Effect, Variant Impact, Amino Acid Change to Variant Summary table

0.4.1 (2021-05-14)

  • Add reference sequence length to QC stats table. Get ref seq length from max mosdepth per base BED coverage value.

  • Add more conditional formatting

  • Fix execution_report.html finding

  • Fix version printing; add to help

  • Add epilog with usage info

0.4.0 (2021-04-23)

  • Adds “Variants Summary” sheet summarizing variant information across all samples

  • Adds comments to AF values in “Variant Matrix” sheet

  • Fixes width/height of cell comments to be based on length of comment text

0.3.0 (2021-04-23)

  • Adds support for adding Ct values from a Ct values table (tab-delimited, CSV, ODS, XLSX format) into an xlavir report.

0.2.4 (2021-04-19)

  • Fixes issue with SnpSift table file parsing and variable naming in variants.py (#4, #5)

0.2.3 (2021-04-19)

  • Fixes issue with SnpSift table file parsing. Adds check to see if SnpSift column is dtype object/str before using .str Series methods (#4)

0.2.2 (2021-03-30)

  • Fixes issue with SnpEff/SnpSift AA change parsing.

0.2.1 (2021-03-29)

  • Fix division by zero error due to variants with DP values of 0
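
A guard of the following kind avoids the division by zero when a variant has a DP of 0. The function name is hypothetical, and returning 0.0 for zero depth is an assumption; the changelog does not say which value xlavir reports in that case.

```python
def allele_fraction(alt_depth, total_depth):
    """Return alt_depth / total_depth, or 0.0 when there is no depth.

    Returning 0.0 for DP == 0 is an assumed convention for this sketch.
    """
    if total_depth <= 0:
        return 0.0
    return alt_depth / total_depth
```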

0.2.0 (2021-03-04)

  • Added header comments with descriptions of field content

  • Added comment to Variant Matrix sheet A1 cell describing what is shown in the matrix

  • Added highlighting of samples failing QC in other sheets

  • Fixed image scaling by determining image size with imageio

  • Added Medaka / Longshot VCF parsing

0.1.1 (2021-02-16)

  • Collect sample results from an nf-core/viralrecon or peterk87/nf-virontus run into an Excel report
    • Samtools read mapping stats (flagstat)

    • Mosdepth read mapping coverage information

    • Variant calling information (SnpEff and SnpSift results, VCF file information)

    • Consensus sequences

  • iVar VCF parsing

  • QA/QC of sample analysis results (basic PASS/FAIL based on minimum genome coverage and depth)

  • Nextflow workflow execution information

  • Prepend worksheets from other Excel documents into the report (e.g. cover page/sheet, sample sheet, lab results)

  • Add custom images into worksheets with custom names and descriptions (e.g. phylogenetic tree figure PNG)

