Identify frequencies of concerning mutations from aligned reads

Alcov

Abundance learning for SARS-CoV-2 variants. The primary purpose of the tool is:

  • Estimating abundance of variants of concern from wastewater sequencing data

You can read more about how Alcov works in the preprint, Alcov: Estimating Variant of Concern Abundance from SARS-CoV-2 Wastewater Sequencing Data

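The tool can also be used for:

  • Converting mutation names between nucleotide and amino acid notation
  • Finding the frequency of mutations of interest in BAM files
  • Getting the read depth for each amplicon
  • Plotting amplicon GC content against amplicon depth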

The tool is under active development. If you have questions or issues, please open an issue on GitHub or email me (email in setup.py).

Installing

The latest release can be downloaded from PyPI

pip install alcov

This will install the Python library and the CLI.

To install the development version, clone the repository and run

pip install .

Usage example

Preprocessing

Alcov expects a BAM file of reads aligned to the SARS-CoV-2 reference genome. For an example of how to process Illumina reads, see the script "prep.py" in the prep directory. It outlines our current preprocessing pipeline, including the generation of the "samples.txt" file used by alcov's "find_lineages" command.

Estimating relative abundance of variants of concern:

alcov find_lineages reads.bam

Finding lineages in BAM files for multiple samples:

alcov find_lineages samples.txt

Where samples.txt looks like:

path/to/reads1.bam	Sample 1 name
path/to/reads2.bam	Sample 2 name
...
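
A samples file like the one above can be generated rather than written by hand. The following is a minimal sketch, not part of alcov: it assumes the two columns are separated by a tab (as the example suggests) and uses each file name without its .bam suffix as the sample name. The function names build_samples_lines and write_samples_file are hypothetical.

```python
from pathlib import Path


def build_samples_lines(bam_paths):
    """Return one 'path<TAB>name' line per BAM file, sorted by path."""
    lines = []
    for bam in sorted(Path(p) for p in bam_paths):
        # Assumption: the sample name is the file name without '.bam'.
        lines.append(f"{bam.as_posix()}\t{bam.stem}")
    return lines


def write_samples_file(bam_dir, out_path="samples.txt"):
    """Write a samples file listing every .bam file directly inside bam_dir."""
    lines = build_samples_lines(Path(bam_dir).glob("*.bam"))
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines
```

Calling write_samples_file("path/to") would then produce a samples.txt that can be passed to find_lineages. Edit the names afterwards if you want more descriptive labels in the heatmap.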

Example usage: To estimate the relative abundance of lineages in a list of samples (samples.txt) while considering only positions with a minimum depth of 10 reads, use the following command. It also saves the heatmap as a .png image and the corresponding frequencies as a CSV file.

alcov find_lineages --min_depth=10 --save_img=True --csv=True samples.txt

Optionally specify which VOCs to look for. Note: This restricts alcov to the lineages listed in the text file. Do not provide this file if you want alcov to consider all lineages for which it has constellation files.

alcov find_lineages reads.bam lineages.txt

Where lineages.txt looks like the example below. Note: These lineages must be chosen from the lineages that alcov has constellation files for (updated weekly), found in "alcov/alcov/data/constellations/".

BA.5-like
BQ.1.1-like
XBB-like
XBB.1.5-like
...

Optionally change minimum read depth (default 40)

alcov find_lineages --min_depth=5 reads.bam

Optionally show how predicted mutation rates agree with observed mutation rates

alcov find_lineages --show_stacked=True reads.bam

Use mutations which are found in multiple VOCs (can help for low-coverage samples). This is now the default behaviour.

alcov find_lineages --unique=False reads.bam

Plotting change in lineage distributions over time for multiple sites

alcov find_lineages --ts samples.txt

Where samples.txt looks like:

reads1.bam	SITE1_2021-09-10
reads2.bam	SITE1_2021-09-12
...
reads3.bam	SITE2_2021-09-10
reads4.bam	SITE2_2021-09-12
...
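
The labels in this example appear to follow a SITE_DATE pattern with ISO dates. As a rough sketch for checking such a file before running the command, the code below groups entries by site and sorts them by date. It assumes tab-separated columns and that the date follows the last underscore. It is not alcov's own parsing, and parse_label and group_by_site are hypothetical names.

```python
from collections import defaultdict
from datetime import date


def parse_label(label):
    """Split 'SITE1_2021-09-10' into ('SITE1', date(2021, 9, 10))."""
    # Assumption: the date is everything after the last underscore.
    site, _, day = label.rpartition("_")
    return site, date.fromisoformat(day)


def group_by_site(lines):
    """Map each site to its (date, bam_path) pairs in chronological order."""
    sites = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        bam_path, label = line.split("\t", 1)
        site, day = parse_label(label)
        sites[site].append((day, bam_path))
    return {site: sorted(entries) for site, entries in sites.items()}
```

A label with a malformed date raises ValueError from date.fromisoformat, which makes typos easy to catch before plotting.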

Converting mutation names:

$ alcov nt A23063T
A23063T causes S:N501Y
$ alcov aa S:E484K
G23012A causes S:E484K
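
Both conversions above are consistent with simple codon arithmetic on the spike gene. The sketch below illustrates that arithmetic only and is not alcov's implementation. It assumes the spike coding sequence spans positions 21563 to 25384 of the reference genome NC_045512.2, and spike_codon is a hypothetical helper.

```python
import re

# Spike (S) coding sequence in the reference genome NC_045512.2,
# 1-based and inclusive.
S_START = 21563
S_END = 25384


def spike_codon(nt_mutation):
    """Return (codon_number, position_in_codon) for a spike substitution.

    Both values are 1-based. Raises ValueError for malformed names or
    positions outside the spike coding sequence.
    """
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", nt_mutation)
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {nt_mutation!r}")
    position = int(match.group(2))
    if not S_START <= position <= S_END:
        raise ValueError(f"position {position} is outside the spike gene")
    offset = position - S_START  # 0-based offset into the coding sequence
    return offset // 3 + 1, offset % 3 + 1
```

For A23063T the offset is 1500, which falls on the first base of codon 501, matching S:N501Y. For G23012A the offset is 1449, the first base of codon 484, matching S:E484K.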

Finding mutations in BAM file:

alcov find_mutants reads.bam

Finding mutations in BAM files for multiple samples:

alcov find_mutants samples.txt

Where samples.txt looks like:

reads1.bam	Sample 1 name
reads2.bam	Sample 2 name
...

Running find_mutants will print the number of reads with and without each mutation in each sample and then generate a heatmap showing the frequencies for all samples.
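
The frequency of a mutation can be read off those two counts. As a minimal sketch, assuming the frequency is the fraction of covering reads that carry the mutation (whether alcov computes it exactly this way is an assumption), with a hypothetical helper name and an optional depth cut-off:

```python
def mutation_frequency(reads_with, reads_without, min_depth=1):
    """Fraction of reads carrying the mutation, or None if depth is too low."""
    depth = reads_with + reads_without
    if depth == 0 or depth < min_depth:
        # Too few reads cover the position to give a meaningful estimate.
        return None
    return reads_with / depth
```

With 25 reads carrying the mutation and 75 without it, the function returns 0.25. With only 5 reads in total and min_depth=10 it returns None.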

You can also specify a custom mutations file:

alcov find_mutants samples.txt mutations.txt

Where mutations.txt looks like:

S:N501Y
G23012A
...
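
The example mixes amino acid names (S:N501Y) and nucleotide names (G23012A). A quick way to sanity-check such a file before running alcov is sketched below. The patterns cover only simple substitutions in the two styles shown above. Other notations that alcov may accept are reported as "unknown", and classify_mutation is a hypothetical helper, not part of alcov.

```python
import re

# Nucleotide substitution, e.g. G23012A.
NT_PATTERN = re.compile(r"[ACGT]\d+[ACGT]")
# Amino acid substitution prefixed by a gene name, e.g. S:N501Y.
AA_PATTERN = re.compile(r"[A-Za-z0-9]+:[A-Z*]\d+[A-Z*]")


def classify_mutation(name):
    """Return 'nt', 'aa', or 'unknown' for one line of a mutations file."""
    name = name.strip()
    if NT_PATTERN.fullmatch(name):
        return "nt"
    if AA_PATTERN.fullmatch(name):
        return "aa"
    return "unknown"
```

Lines that come back as "unknown" are worth a second look, although they are not necessarily invalid.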

Getting the read depth for each amplicon

alcov amplicon_coverage reads.bam

or

alcov amplicon_coverage samples.txt

Plotting amplicon GC content against amplicon depth

alcov gc_depth reads.bam

or

alcov gc_depth samples.txt
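
For reference, GC content is the fraction of bases in a sequence that are G or C. The sketch below shows that calculation for a single amplicon sequence. It is illustrative only, gc_content is a hypothetical name, and how alcov obtains amplicon sequences is not shown here.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0  # avoid dividing by zero for an empty sequence
    return (seq.count("G") + seq.count("C")) / len(seq)
```

For example, gc_content("ATGC") is 0.5.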

