Script to do recombinant read analysis
RAMIFI
Recombinant And Mixed-Infection Finder for SARS-CoV-2 samples. It takes an aligned BAM file (aligned to NC_045512) as input, checks the reads against the mutation list JSON files provided in the repo, and outputs recombinant and parent reads in .bam and .tsv files with an associated stats file.
Design Diagram
Dependencies
Programming/Scripting languages
- Python >=v3.8
- The pipeline has been tested in v3.8.10
Python packages
Optional packages
Installation
Install from source
Clone the ramifi repository.
git clone https://github.com/LANL-Bioinformatics/ramifi
Then change directory to ramifi and install.
cd ramifi
pip install .
If the installation was successful, you should be able to type ramifi -h and get a help message on how to use the tool.
ramifi -h
Usage
usage: ramifi.py [-h] [--refacc [STR]] [--minMixAF [FLOAT]] [--maxMixAF [FLOAT]] [--minMixed_n [INT]] [--minReadCount [INT]]
[--lineageMutation [FILE]] [--variantMutation [FILE]] [--mutations_af_plot] [--verbose] [--version] --bam [FILE]
[--vcf [File]] [--tsv [FILE]] [--outbam [File]] [-eo [PATH]] [--igv [PATH]] [--igv_variants]
Script to do recombinant read analysis
optional arguments:
-h, --help show this help message and exit
--refacc [STR] reference accession used in bam [default: NC_045512.2]
--minMixAF [FLOAT] minimum allelic frequency for checking mixed mutations on vcf [default:0.2]
--maxMixAF [FLOAT] maximum allelic frequency for checking mixed mutations on vcf [default:0.8]
--minMixed_n [INT] threshold of mixed mutations count for vcf.
--minReadCount [INT] threshold of read with variant count when no vcf provided.
--lineageMutation [FILE]
lineage mutation json file [default: variant_mutation.json]
--variantMutation [FILE]
variant mutation json file [default: lineage_mutation.json]
--mutations_af_plot generate mutations_af_plot (when --vcf provided)
--verbose Show more information in log
--version show program's version number and exit
Input:
--bam [FILE] <Required> bam file
--vcf [File] <Optional> vcf file which will infer the two parents of recombinant_variants
Output:
--tsv [FILE] output file name [default: recombinant_reads.tsv]
--outbam [File] output recombinant reads in bam file [default: recombinant_reads.bam]
EDGE COVID-19 Options:
options specific used for EDGE COVID-19
-eo [PATH], --ec19_projdir [PATH]
ec-19 project directory
--igv [PATH] igv.html relative path
--igv_variants add variants igv track
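The options above can be combined into a single call. The following is a minimal sketch, assuming the `ramifi` entry point is on your PATH after installation; the helper `build_ramifi_command` and the input file names `sample.sorted.bam` and `sample.vcf` are hypothetical and only illustrate how the documented flags fit together.

```python
import shlex

def build_ramifi_command(bam, vcf=None,
                         tsv="recombinant_reads.tsv",
                         outbam="recombinant_reads.bam"):
    """Assemble a ramifi argument list from flags listed in the help text.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    cmd = ["ramifi", "--bam", bam, "--tsv", tsv, "--outbam", outbam]
    if vcf is not None:
        # --vcf is optional; per the help text it is used to infer the two parents
        cmd += ["--vcf", vcf]
    return cmd

# Placeholder file names, not files shipped with the package
cmd = build_ramifi_command("sample.sorted.bam", vcf="sample.vcf")
print(shlex.join(cmd))
```

To actually run the analysis, the list could be passed to `subprocess.run(cmd, check=True)`.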
Test
cd tests
./runTest.sh
Outputs
-- recombinant_reads.stats: counts
total | mapped | unmapped | mutation_reads | parents | recomb1_reads | recomb2_reads | recombx_reads | parent1_reads | parent2_reads | recomb1_perc | recomb2_perc | recombx_perc |
---|---|---|---|---|---|---|---|---|---|---|---|---|
64355 | 64355 | 0 | 5203 | Omicron,Delta | 162 | 175 | 18 | 489 | 730 | 10.29 | 11.11 | 1.14 |
-- recombinant_reads.tsv
read_name | start | end | mutaions_json | note |
---|---|---|---|---|
HMVN7DRXY:2:2153:21802:16078 | 21566 | 21859 | {21618: ['Delta'], 21846: ['Iota', 'Mu', 'Omicron']} | recombinant 2 |
HMVN7DRXY:2:2166:28574:36229 | 21732 | 21883 | {21762: ['Eta', 'Omicron'], 21846: ['Iota', 'Mu', 'Omicron']} | parent Omicron |
HMVN7DRXY:2:2215:29749:15217 | 22867 | 22994 | {22917: ['Delta', 'Epsilon', 'Kappa'], 22992: ['rev of Omicron']} | parent Delta |
HMVN7DRXY:2:2105:30572:25160 | 22865 | 23023 | {22917: ['rev of Delta Epsilon Kappa'], 22992: ['rev of Omicron'], 22995: ['Delta', 'Omicron'], 23013: ['rev of Omicron']} | recombinant 1 |
HMVN7DRXY:2:2127:18304:18850 | 24058 | 24518 | {24130: ['Omicron'], 24469: ['rev of Omicron'], 24503: ['Omicron']} | recombinant x |
etc ... |
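The mutation column in the rows above is written as a Python-style dictionary (integer keys, single-quoted strings) rather than strict JSON, so `json.loads` would reject it. The sketch below shows one way to read such rows, assuming the file is tab-separated with one header line and the five columns in the order shown; the function name `read_recombinant_tsv` is illustrative and not part of RAMIFI.

```python
import ast
import csv
import io
from collections import Counter

def read_recombinant_tsv(handle):
    """Yield (read_name, start, end, mutations, note) tuples from the TSV."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 5:
            continue  # ignore blank or truncated lines
        name, start, end, mutations, note = (field.strip() for field in row[:5])
        # The mutation column is a Python dict literal, so parse it safely
        # with ast.literal_eval instead of json.loads.
        yield name, int(start), int(end), ast.literal_eval(mutations), note

# Two rows copied from the example table, joined with tabs (assumed layout)
sample = io.StringIO(
    "read_name\tstart\tend\tmutaions_json\tnote\n"
    "HMVN7DRXY:2:2153:21802:16078\t21566\t21859\t"
    "{21618: ['Delta'], 21846: ['Iota', 'Mu', 'Omicron']}\trecombinant 2\n"
    "HMVN7DRXY:2:2166:28574:36229\t21732\t21883\t"
    "{21762: ['Eta', 'Omicron'], 21846: ['Iota', 'Mu', 'Omicron']}\tparent Omicron\n"
)
rows = list(read_recombinant_tsv(sample))
note_counts = Counter(note for *_, note in rows)
print(note_counts)
```

Replacing the `io.StringIO` sample with `open("recombinant_reads.tsv")` would apply the same parsing to a real output file, provided the layout assumptions hold.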
-- recombinant_reads_by_cross_region.tsv
Cross_region | Reads |
---|---|
11201-11283 | {"recomb1": ["HMVN7DRXY:2:2150:13015:23750", "HMVN7DRXY:2:2124:23746:28776", "HMVN7DRXY:2:2232:6216:33395"], "recomb2": ["HMVN7DRXY:2:2122:27624:23062"]} |
11283-11537 | {"recomb2": ["HMVN7DRXY:2:2126:12825:30154", "HMVN7DRXY:2:2126:15302:29121"]} |
21618-21846 | {"recomb2": ["HMVN7DRXY:2:2153:21802:16078", "HMVN7DRXY:2:2105:22996:5682"]} |
etc ... |
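Unlike the per-read table, the Reads column here is valid JSON. As a rough sketch, assuming a tab between the two columns and a header line starting with `Cross_region`, the number of reads per category in each cross region could be tallied as follows; `count_reads_by_region` is a hypothetical helper name.

```python
import json

def count_reads_by_region(lines):
    """Map each cross region to {category: number of read names}."""
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("Cross_region"):
            continue  # skip blank lines and the header
        region, _, reads_json = line.partition("\t")
        reads = json.loads(reads_json)
        counts[region] = {category: len(names) for category, names in reads.items()}
    return counts

# Rows copied from the example table, joined with a tab (assumed layout)
sample_lines = [
    "Cross_region\tReads",
    '11201-11283\t{"recomb1": ["HMVN7DRXY:2:2150:13015:23750", '
    '"HMVN7DRXY:2:2124:23746:28776", "HMVN7DRXY:2:2232:6216:33395"], '
    '"recomb2": ["HMVN7DRXY:2:2122:27624:23062"]}',
    '21618-21846\t{"recomb2": ["HMVN7DRXY:2:2153:21802:16078", '
    '"HMVN7DRXY:2:2105:22996:5682"]}',
]
region_counts = count_reads_by_region(sample_lines)
print(region_counts)
```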
-- recombinant_reads.parent1.bam
-- recombinant_reads.parent1.bam.bai
-- recombinant_reads.parent2.bam
-- recombinant_reads.parent2.bam.bai
-- recombinant_reads.recomb1.bam
-- recombinant_reads.recomb1.bam.bai
-- recombinant_reads.recomb2.bam
-- recombinant_reads.recomb2.bam.bai
-- recombinant_reads.recombx.bam
-- recombinant_reads.recombx.bam.bai
-- recombinant_reads.mutations_af_plot.html
-- recombinant_reads.mutations_af_plot_genomeview.html
Data visualization
The recombinant_reads.bam, ramifi/data/variants_mutation.gff and ramifi/data/NC_045512.fasta files can be loaded into IGV.
Example: IGV Link: https://chienchi.github.io/ramifi/igv-webapp
Custom mutation list
Users can supply a custom mutation list, formatted like the mutation list JSON files provided in the repo, to check for co-infection or recombination among other variants/lineages. When running ramifi, pass the custom mutation list with the --variantMutation option.
For example:
{
"Alpha": {
"A:23063:T": "S:N501Y",
"A:23403:G": "S:D614G",
...
"del:21991:3": "S:Y144*"
...
},
"Beta": {
"A:10323:G": "ORF1a:K3353R",
"A:21801:C": "S:D80A",
"A:22206:G": "S:D215G",
"A:23063:T": "S:N501Y"
...
},
"BA.2": {
...
}
}
NCBI TRACE Lineage Definitions Weekly Update Site: https://ftp.ncbi.nlm.nih.gov/pub/ACTIV-TRACE/
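Before passing a custom list to ramifi, it can help to check that every key follows the pattern seen in the example. The sketch below rests on an interpretation of that example, not on RAMIFI's own parser: substitution keys are read as `ref:position:alt`, and deletion keys as `del:position:length`. Other key types may exist in the repository's JSON files and are not handled here. The `Mutation` tuple and `parse_mutation` function are hypothetical names.

```python
import json
from typing import NamedTuple

class Mutation(NamedTuple):
    kind: str        # "snp" or "deletion"
    position: int    # reference coordinate taken from the key
    ref: str         # reference base (empty for deletions)
    alt: str         # alternate base (empty for deletions)
    length: int      # number of bases affected
    effect: str      # annotation string, e.g. "S:N501Y"

def parse_mutation(key, effect):
    """Split a key such as 'A:23063:T' or 'del:21991:3' (assumed formats)."""
    first, position, last = key.split(":")
    if first == "del":
        return Mutation("deletion", int(position), "", "", int(last), effect)
    return Mutation("snp", int(position), first, last, 1, effect)

# A tiny custom list reusing entries from the example above
custom = {
    "Alpha": {"A:23063:T": "S:N501Y", "del:21991:3": "S:Y144*"},
    "Beta": {"A:21801:C": "S:D80A"},
}
parsed = {
    variant: [parse_mutation(key, effect) for key, effect in mutations.items()]
    for variant, mutations in custom.items()
}
for variant, mutations in parsed.items():
    print(variant, mutations)

# Serialized form that could be saved to a file and given to --variantMutation
custom_json = json.dumps(custom, indent=2)
```

A key with the wrong number of fields or a non-numeric position raises `ValueError`, which gives an early check before running the tool.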
Remove package:
pip uninstall ramifi
Citing RAMIFI
This work is currently unpublished. If you make use of this package, we would appreciate it if you credited our repository.
License
RAMIFI is distributed as open-source software under the GPLv3 license; see the license file included in the RAMIFI distribution.
LANL open source approval reference C22090.
© 2023. Triad National Security, LLC. All rights reserved. This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.