Detect interesting SARS-CoV-2 spike protein variants from Sanger sequencing data.
Detect interesting SARS-CoV-2 spike protein mutations from Sanger sequencing data
covid-spike-classification is a script to call interesting SARS-CoV-2 spike protein mutations from Sanger sequencing to support the Danish COVID-19 monitoring efforts.
Using Sanger-sequenced RT-PCR product of the spike protein, this tool should pick up all relevant mutations currently tracked (see covid_spike_classification/core.py for the full list of tracked mutations) and give a table with one row per sample and a yes/no/failed column per tracked mutation.
This workflow is built and maintained at https://github.com/kblin/covid-spike-classification
If you found this tool useful, please cite https://www.medrxiv.org/content/10.1101/2021.03.27.21252266v1
Installation
covid-spike-classification is distributed via this git repository, PyPI, or Bioconda.
Bioconda
Installing via bioconda is the fastest way to get up and running:
conda create -n csc -c conda-forge -c bioconda covid-spike-classification
conda activate csc
git & pypi
When installing via git or pypi, you first need to install the external binary dependencies.
covid-spike-classification depends on three excellent tools to do most of the work:
- tracy (versions 0.5.3 & 0.5.7 tested)
- bowtie2 (version 2.4.2 tested)
- samtools (versions 1.10 & 1.11 tested)
If you have conda installed, the easiest way to get started is to install these by running:
git clone https://github.com/kblin/covid-spike-classification.git
cd covid-spike-classification
conda env create -n csc -f environment.yml
conda activate csc
pip install .
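After a git or PyPI install it can be useful to confirm that the three external tools are actually reachable. The following is a minimal sketch, not part of the project; the helper name missing_tools is made up for this example, and it only checks that the executables are on PATH, not that their versions match the tested ones.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with covid-spike-classification):
# print the name of every given tool that is NOT found on PATH,
# one per line; print nothing if all of them are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The three external dependencies listed above.
missing=$(missing_tools tracy bowtie2 samtools)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing from PATH:" $missing >&2
else
    echo "tracy, bowtie2 and samtools are all on PATH"
fi
```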
Docker, Podman, Singularity
While not technically an installation method, covid-spike-classification is also shipped as an OCI container.
To use it, you ideally run the container from a workflow management system like Snakemake
or Nextflow that will take care of mounting filesystems into the container for you.
The OCI container image is available from the Docker Hub kblin/covid-spike-classification repository.
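If you want to try the container without a workflow manager, you have to mount your data into it yourself. The sketch below shows one way this might look with Docker. It rests on several assumptions that are not confirmed by the project: that the image's default tag is usable, that the covid-spike-classification executable is on PATH inside the image, and that /data is an acceptable mount point (it is simply chosen for this example). The function name run_csc_container is hypothetical.

```shell
#!/bin/sh
# Image name as given for the Docker Hub repository above.
IMAGE=kblin/covid-spike-classification

# Hypothetical wrapper: bind-mount one host directory at /data (an arbitrary
# mount point chosen for this sketch) and run the tool on files inside it.
# Set DRY_RUN=1 to print the docker command instead of executing it.
run_csc_container() {
    workdir=$1     # host directory holding the reference, reads and results
    reference=$2   # reference FASTA, relative to workdir
    reads=$3       # reads directory or ZIP file, relative to workdir
    ${DRY_RUN:+echo} docker run --rm \
        -v "$workdir:/data" \
        "$IMAGE" \
        covid-spike-classification \
        --reference "/data/$reference" \
        --outdir /data/results \
        "/data/$reads"
}

# Example (prints the command only):
#   DRY_RUN=1 run_csc_container "$PWD" ref/reference.fasta reads.zip
```

Podman accepts the same run syntax, so replacing docker with podman in the function should work in the same way, under the same assumptions.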
Setup
You also need to generate the samtools and bowtie2 indices for your reference genome. We ship a copy of NC_045512 and a script to generate these indices:
conda activate csc
cd ref
./build_indices.sh
cd ..
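If you want to use your own reference file, the indices have to be built for it as well. The contents of build_indices.sh are not reproduced here; the following is only a sketch of what generating samtools and bowtie2 indices for a FASTA file typically involves. Using the FASTA path itself as the bowtie2 index base name is an assumption of this example, and the function name build_reference_indices is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not the shipped build_indices.sh.
# Set DRY_RUN=1 to print the commands instead of running them.
build_reference_indices() {
    fasta=$1
    # samtools faidx writes the FASTA index next to the input ($fasta.fai).
    ${DRY_RUN:+echo} samtools faidx "$fasta" || return 1
    # bowtie2-build takes the input FASTA and an index base name; reusing the
    # FASTA path as base name is an assumption made for this sketch.
    ${DRY_RUN:+echo} bowtie2-build "$fasta" "$fasta"
}

# Example (prints the two commands only):
#   DRY_RUN=1 build_reference_indices /path/to/your/reference.fasta
```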
Usage
Assuming you used the instructions above to install via conda, you can run the tool like this:
conda activate csc
covid-spike-classification --reference /path/to/your/reference.fasta --outdir /path/to/result/dir /path/to/sanger/reads/dir_or.zip
Notably, you can provide the input either as a ZIP file or as a directory, as long as it directly contains the ab1 files you want to run the analysis on.
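Because the ab1 files need to sit directly in the directory or ZIP file, a quick check before a run can save a failed analysis. This is a minimal sketch, not part of the tool; the function name count_top_level_ab1 is made up, and the check assumes a lowercase .ab1 extension.

```shell
#!/bin/sh
# Hypothetical pre-flight check: count the ab1 files found at the top level
# of a directory or ZIP file (files in subfolders are not counted).
count_top_level_ab1() {
    input=$1
    if [ -d "$input" ]; then
        # -maxdepth 1 keeps find from descending into subdirectories;
        # tr strips the padding some wc implementations add.
        find "$input" -maxdepth 1 -type f -name '*.ab1' | wc -l | tr -d ' '
    else
        # unzip -Z1 lists archive member names, one per line; names that
        # contain "/" live in a subfolder and do not match the pattern.
        unzip -Z1 "$input" | grep -c -E '^[^/]+\.ab1$'
    fi
}

# Example:
#   count_top_level_ab1 /path/to/sanger/reads/dir_or.zip
```

A result of 0 suggests the reads are nested one level too deep or use a different file extension.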
See also the --help output for more detailed usage information.
License
All code is available under the Apache License version 2, see the LICENSE file for details.
Hashes for covid-spike-classification-0.6.3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 408b959280f2092fbc9c467adfb340aa3c6e38bbc17d5ac71c99575c549a9998
MD5 | 0dfca99a7602470460aca92a4e68c41d
BLAKE2b-256 | 76f88cd1cbbbcc9fc144603c0bf8e8ec1be4327d054195b953c9fbdbca0ce709
Hashes for covid_spike_classification-0.6.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 669c387dc7493a5949417d42f7e667bf73034c2edcc8c5de54fce253bc76c059
MD5 | 64336c30f9bb71959bf74b7b9c8fbfa5
BLAKE2b-256 | a70360612fa94395df59ddc21ffb5303a3c16407e94d569fb0c2ba5e3701127d