An nCoV package for parsing analysis files

ncov_parser

License: MIT

The ncov_parser package provides a suite of tools that parse the files generated by the Nextflow workflow and produce a QC summary file. The package requires several input files, including:

  • .variants.tsv
  • .variants.norm.vcf
  • .pass.vcf
  • .per_base_coverage.bed
  • .primertrimmed.consensus.fa
  • .consensus.fasta
  • alleles.tsv

An optional metadata file with qPCR Ct and collection date values can be included.

In addition, bedtools should be run to generate a <sample>.per_base_coverage.bed file, which is used to calculate mean and median depth of coverage statistics.
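As a rough illustration of the depth statistics mentioned above, the sketch below computes the mean and median depth from per-base coverage rows. It assumes tab-separated rows with one row per base and the depth in the last column; the actual column layout depends on the bedtools command used, and the function name depth_stats is hypothetical, not part of ncov_parser.

```python
import statistics


def depth_stats(lines):
    """Return (mean, median) depth from per-base coverage rows.

    Assumption: tab-separated rows with the depth in the last column.
    Rows whose last column is not an integer (e.g. a header) are skipped.
    """
    depths = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        try:
            depths.append(int(fields[-1]))
        except ValueError:
            continue  # header, blank, or malformed row
    if not depths:
        return 0.0, 0.0
    return statistics.mean(depths), statistics.median(depths)


# Illustrative rows only; the header and column names are assumptions.
example = [
    "reference\tstart\tend\tdepth\n",
    "MN908947.3\t0\t1\t10\n",
    "MN908947.3\t1\t2\t20\n",
    "MN908947.3\t2\t3\t60\n",
]
mean_depth, median_depth = depth_stats(example)
print(mean_depth, median_depth)
```

For a real file, the same function can be given an open file handle, since it only iterates over lines.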

Installation

After downloading the repository, the package can be installed using pip:

git clone https://github.com/jts/ncov-tools
cd ncov-tools/parser
pip install .

Usage

The library consists of several functions that can be imported.

import ncov.parser

Several classes are available representing the different files that can be processed.

ncov.parser.Alleles
ncov.parser.Consensus
ncov.parser.Lineage
ncov.parser.Meta
ncov.parser.PerBaseCoverage
ncov.parser.Snpeff
ncov.parser.Variants
ncov.parser.Vcf
ncov.parser.primers

Similarly, wrapper functions for creating standard-format output can be found in ncov.parser.qc:

import ncov.parser.qc as qc
qc.write_qc_summary_header()
qc.write_qc_summary()

Top-level scripts

In the bin directory, several wrapper scripts exist to assist in generating QC metrics.

To create sample-level summary QC files, use the get_qc.py script:

get_qc.py --variants <sample>.variants.tsv or <sample>.pass.vcf
--coverage <sample>.per_base_coverage.bed --meta <metadata>.tsv
--consensus <sample>.primertrimmed.consensus.fa [--indel] --sample <samplename>
--platform <illumina or oxford-nanopore> --run_name <run_name> --alleles alleles.tsv
--lineage <Pangolin lineage report> --aa_table <SNPEff annotation table>

Note the --indel flag should only be present if indels will be used in the calculation of variants.
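When get_qc.py is run for many samples, it can help to assemble the argument list programmatically. The sketch below uses only the flags listed above; the helper name build_get_qc_command and the example file names are hypothetical, and whether every flag is required is not stated here, so treat this as an assumption.

```python
import shlex


def build_get_qc_command(sample, variants, coverage, meta, consensus,
                         platform, run_name, alleles, lineage, aa_table,
                         use_indels=False):
    """Assemble a get_qc.py argument list from the documented flags."""
    if platform not in ("illumina", "oxford-nanopore"):
        raise ValueError(f"unsupported platform: {platform}")
    cmd = [
        "get_qc.py",
        "--variants", variants,
        "--coverage", coverage,
        "--meta", meta,
        "--consensus", consensus,
        "--sample", sample,
        "--platform", platform,
        "--run_name", run_name,
        "--alleles", alleles,
        "--lineage", lineage,
        "--aa_table", aa_table,
    ]
    # Only add --indel when indels should be included in the variant calculation.
    if use_indels:
        cmd.append("--indel")
    return cmd


# Placeholder file names for illustration only.
args = dict(
    sample="sampleA",
    variants="sampleA.variants.tsv",
    coverage="sampleA.per_base_coverage.bed",
    meta="metadata.tsv",
    consensus="sampleA.primertrimmed.consensus.fa",
    platform="illumina",
    run_name="run1",
    alleles="alleles.tsv",
    lineage="lineage_report.csv",
    aa_table="sampleA_aa_table.tsv",
)
cmd = build_get_qc_command(**args)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the input files exist.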

Once this is complete, we can use the collect_qc_summary.py script to aggregate the sample-level summary files into a single run-level tab-separated file.

collect_qc_summary.py --path <path to sample.summary.qc.tsv files>
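The aggregation step can be pictured as concatenating the per-sample files while writing the header only once. The sketch below is not the implementation of collect_qc_summary.py; it assumes the files match *.summary.qc.tsv and share an identical header row, and the function name and example column names are hypothetical.

```python
import csv
import glob
import io
import os
import sys
import tempfile


def collect_summaries(path, out=sys.stdout, pattern="*.summary.qc.tsv"):
    """Concatenate per-sample TSV files into one table with a single header.

    Returns the number of files that contributed rows.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    header = None
    count = 0
    for filename in sorted(glob.glob(os.path.join(path, pattern))):
        with open(filename, newline="") as handle:
            rows = list(csv.reader(handle, delimiter="\t"))
        if not rows:
            continue  # skip empty files
        if header is None:
            header = rows[0]
            writer.writerow(header)
        elif rows[0] != header:
            raise ValueError(f"unexpected header in {filename}")
        writer.writerows(rows[1:])
        count += 1
    return count


# Demonstration with two small, made-up summary files.
with tempfile.TemporaryDirectory() as tmp:
    for name, pct in [("sampleA", "99.1"), ("sampleB", "87.5")]:
        with open(os.path.join(tmp, f"{name}.summary.qc.tsv"), "w") as fh:
            fh.write("sample\tgenome_completeness\n")
            fh.write(f"{name}\t{pct}\n")
    buffer = io.StringIO()
    n_files = collect_summaries(tmp, out=buffer)

combined = buffer.getvalue()
print(combined, end="")
```

Raising on a mismatched header is a design choice in this sketch, so that files from different versions of the summary format are not silently mixed.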

To create an amplicon BED file from a primer scheme BED file:

primers_to_amplicons.py --primers <path to primer scheme BED file>
--offset <number of bases to offset> --bed_type <full or no_primers or unique_amplicons>
--output <full path to file to write BED data to>
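To make the difference between the full and no_primers BED types concrete, the sketch below derives amplicon intervals from primer rows. It is an illustration under stated assumptions, not the logic of primers_to_amplicons.py: it assumes columns chrom, start, end, name, and primer names of the form <scheme>_<number>_<LEFT|RIGHT> with no underscore in the scheme name. It does not model the unique_amplicons type or the --offset option, and the amplicon_<number> output names are hypothetical.

```python
def primers_to_amplicons(bed_lines, bed_type="full"):
    """Derive one interval per amplicon from primer BED rows.

    full:       start of the LEFT primer to the end of the RIGHT primer
    no_primers: end of the LEFT primer to the start of the RIGHT primer
    """
    if bed_type not in ("full", "no_primers"):
        raise ValueError(f"bed_type not handled in this sketch: {bed_type}")
    primers = {}
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or truncated rows
        chrom, start, end, name = fields[0], int(fields[1]), int(fields[2]), fields[3]
        parts = name.split("_")
        if len(parts) < 3 or parts[2] not in ("LEFT", "RIGHT"):
            continue  # name does not follow the assumed pattern
        entry = primers.setdefault(
            int(parts[1]), {"chrom": chrom, "LEFT": [], "RIGHT": []}
        )
        entry[parts[2]].append((start, end))
    amplicons = []
    for number in sorted(primers):
        left, right = primers[number]["LEFT"], primers[number]["RIGHT"]
        if not left or not right:
            continue  # an amplicon needs both primers
        if bed_type == "full":
            start = min(s for s, _ in left)
            end = max(e for _, e in right)
        else:
            start = max(e for _, e in left)
            end = min(s for s, _ in right)
        amplicons.append((primers[number]["chrom"], start, end, f"amplicon_{number}"))
    return amplicons


# Illustrative primer rows; coordinates are examples only.
scheme = [
    "MN908947.3\t30\t54\tnCoV-2019_1_LEFT\n",
    "MN908947.3\t385\t410\tnCoV-2019_1_RIGHT\n",
    "MN908947.3\t320\t342\tnCoV-2019_2_LEFT\n",
    "MN908947.3\t704\t726\tnCoV-2019_2_RIGHT\n",
]
full = primers_to_amplicons(scheme, "full")
trimmed = primers_to_amplicons(scheme, "no_primers")
for row in full + trimmed:
    print("\t".join(str(value) for value in row))
```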

Credit and Acknowledgements

Note that this tool has been used in conjunction with the @jts ncov-tools suite of tools.

BED file importing and amplicon site merging obtained from the ARTIC pipeline: https://github.com/artic-network/fieldbioinformatics/blob/master/artic/vcftagprimersites.py

License

MIT
