Script to do recombinant read analysis

RAMIFI

RAMIFI (Recombinant And Mixed-Infection Finder) analyzes SARS-CoV-2 samples. It takes a BAM file of reads aligned to NC_045512 and, using the mutation list JSON files provided in the repo, writes recombinant and parent reads to .bam and .tsv files along with an associated stats file.

Design Diagram

[Figure: RAMIFI design diagram]

Dependencies

Programming/Scripting languages

Python packages

Optional packages

Installation

Install from source

Clone the ramifi repository.

git clone https://github.com/LANL-Bioinformatics/ramifi

Then change directory to ramifi and install.

cd ramifi
pip install .

If the installation was successful, typing ramifi -h should print a help message on how to use the tool.

ramifi -h

Usage

usage: ramifi.py [-h] [--refacc [STR]] [--minMixAF [FLOAT]] [--maxMixAF [FLOAT]] [--minMixed_n [INT]] [--minReadCount [INT]]
                 [--lineageMutation [FILE]] [--variantMutation [FILE]] [--mutations_af_plot] [--verbose] [--version] --bam [FILE]
                 [--vcf [File]] [--tsv [FILE]] [--outbam [File]] [-eo [PATH]] [--igv [PATH]] [--igv_variants]

Script to do recombinant read analysis

optional arguments:
  -h, --help            show this help message and exit
  --refacc [STR]        reference accession used in bam [default: NC_045512.2]
  --minMixAF [FLOAT]    minimum allelic frequency for checking mixed mutations on vcf [default:0.2]
  --maxMixAF [FLOAT]    maximum allelic frequency for checking mixed mutations on vcf [default:0.8]
  --minMixed_n [INT]    threshold of mixed mutations count for vcf.
  --minReadCount [INT]  threshold of read with variant count when no vcf provided.
  --lineageMutation [FILE]
                        lineage mutation json file [default: variant_mutation.json]
  --variantMutation [FILE]
                        variant mutation json file [default: lineage_mutation.json]
  --mutations_af_plot   generate mutations_af_plot (when --vcf provided)
  --verbose             Show more information in log
  --version             show program's version number and exit

Input:
  --bam [FILE]          <Required> bam file
  --vcf [File]          <Optional> vcf file which will infer the two parents of recombinant_variants

Output:
  --tsv [FILE]          output file name [default: recombinant_reads.tsv]
  --outbam [File]       output recombinant reads in bam file [default: recombinant_reads.bam]

EDGE COVID-19 Options:
  options specific used for EDGE COVID-19

  -eo [PATH], --ec19_projdir [PATH]
                        ec-19 project directory
  --igv [PATH]          igv.html relative path
  --igv_variants        add variants igv track
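
A typical invocation can be assembled from the options listed above. The sketch below builds the argument list in Python and leaves the actual run commented out. The helper build_ramifi_command and the file names sample.bam and sample.vcf are placeholders for illustration, not part of ramifi; only flags shown in the help text are used.

```python
import subprocess


def build_ramifi_command(bam, vcf=None, tsv="recombinant_reads.tsv",
                         outbam="recombinant_reads.bam", af_plot=False):
    """Assemble a ramifi command line from flags listed in the help text.

    Hypothetical helper: it only builds the argument list.
    """
    cmd = ["ramifi", "--bam", bam, "--tsv", tsv, "--outbam", outbam]
    if vcf is not None:
        cmd += ["--vcf", vcf]
        if af_plot:
            # Per the help text, the plot is generated when --vcf is provided.
            cmd.append("--mutations_af_plot")
    return cmd


cmd = build_ramifi_command("sample.bam", vcf="sample.vcf", af_plot=True)
print(" ".join(cmd))

# To run it for real (needs ramifi on PATH and existing input files):
# subprocess.run(cmd, check=True)
```

The --tsv and --outbam values shown are the documented defaults, so they can be dropped when the default names are acceptable.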

Test

cd tests
./runTest.sh

Outputs

-- recombinant_reads.stats: counts

total mapped unmapped mutation_reads parents recomb1_reads recomb2_reads recombx_reads parent1_reads parent2_reads recomb1_perc recomb2_perc recombx_perc
64355 64355 0 5203 Omicron,Delta 162 175 18 489 730 10.29 11.11 1.14
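
For downstream scripting, the stats file can be read into a dictionary. This is a minimal sketch that assumes the layout shown above: one whitespace-delimited header line followed by one line of values. The function name parse_stats is hypothetical.

```python
STATS_TEXT = """\
total mapped unmapped mutation_reads parents recomb1_reads recomb2_reads recombx_reads parent1_reads parent2_reads recomb1_perc recomb2_perc recombx_perc
64355 64355 0 5203 Omicron,Delta 162 175 18 489 730 10.29 11.11 1.14
"""


def parse_stats(text):
    """Map each header field to its value, converting counts and percentages."""
    header, values = (line.split() for line in text.strip().splitlines()[:2])
    record = dict(zip(header, values))
    for key in header:
        if key == "parents":
            # Comma-separated parent names, e.g. "Omicron,Delta".
            record[key] = record[key].split(",")
        elif key.endswith("_perc"):
            record[key] = float(record[key])
        else:
            record[key] = int(record[key])
    return record


stats = parse_stats(STATS_TEXT)
print(stats["parents"], stats["recomb1_reads"], stats["recomb1_perc"])
```

With a real run, replace STATS_TEXT with the contents of recombinant_reads.stats.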

-- recombinant_reads.tsv

read_name start end mutaions_json note
HMVN7DRXY:2:2153:21802:16078 21566 21859 {21618: ['Delta'], 21846: ['Iota', 'Mu', 'Omicron']} recombinant 2
HMVN7DRXY:2:2166:28574:36229 21732 21883 {21762: ['Eta', 'Omicron'], 21846: ['Iota', 'Mu', 'Omicron']} parent Omicron
HMVN7DRXY:2:2215:29749:15217 22867 22994 {22917: ['Delta', 'Epsilon', 'Kappa'], 22992: ['rev of Omicron']} parent Delta
HMVN7DRXY:2:2105:30572:25160 22865 23023 {22917: ['rev of Delta Epsilon Kappa'], 22992: ['rev of Omicron'], 22995: ['Delta', 'Omicron'], 23013: ['rev of Omicron']} recombinant 1
HMVN7DRXY:2:2127:18304:18850 24058 24518 {24130: ['Omicron'], 24469: ['rev of Omicron'], 24503: ['Omicron']} recombinant x
etc ...
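
The mutations column in this file is written as a Python dictionary literal (integer keys, single-quoted strings), so it is not valid JSON despite the column name. The sketch below assumes the file is tab-delimited with the five columns in the order shown, and uses ast.literal_eval to decode the fourth column. The function name read_recombinant_tsv is hypothetical.

```python
import ast
import csv
import io
from collections import Counter

SAMPLE_TSV = (
    "read_name\tstart\tend\tmutaions_json\tnote\n"
    "HMVN7DRXY:2:2153:21802:16078\t21566\t21859\t"
    "{21618: ['Delta'], 21846: ['Iota', 'Mu', 'Omicron']}\trecombinant 2\n"
    "HMVN7DRXY:2:2166:28574:36229\t21732\t21883\t"
    "{21762: ['Eta', 'Omicron'], 21846: ['Iota', 'Mu', 'Omicron']}\tparent Omicron\n"
)


def read_recombinant_tsv(handle):
    """Parse rows by column position so the header spelling does not matter."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header line
    rows = []
    for read_name, start, end, mutations, note in reader:
        rows.append({
            "read_name": read_name,
            "start": int(start),
            "end": int(end),
            # Safe evaluation of the dict literal: position -> variant names.
            "mutations": ast.literal_eval(mutations),
            "note": note,
        })
    return rows


rows = read_recombinant_tsv(io.StringIO(SAMPLE_TSV))
print(Counter(row["note"] for row in rows))
```

For a real file, pass an open file handle instead of the io.StringIO wrapper.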

-- recombinant_reads_by_cross_region.tsv

Cross_region Reads
11201-11283 {"recomb1": ["HMVN7DRXY:2:2150:13015:23750", "HMVN7DRXY:2:2124:23746:28776", "HMVN7DRXY:2:2232:6216:33395"], "recomb2": ["HMVN7DRXY:2:2122:27624:23062"]}
11283-11537 {"recomb2": ["HMVN7DRXY:2:2126:12825:30154", "HMVN7DRXY:2:2126:15302:29121"]}
21618-21846 {"recomb2": ["HMVN7DRXY:2:2153:21802:16078", "HMVN7DRXY:2:2105:22996:5682"]}
etc ...
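
Unlike the per-read table, the Reads column here is valid JSON, so the standard json module can decode it. The following sketch counts reads per cross region and category, assuming a tab between the two columns. The function name count_reads_by_region is hypothetical.

```python
import json

SAMPLE_REGIONS = (
    "Cross_region\tReads\n"
    '11201-11283\t{"recomb1": ["HMVN7DRXY:2:2150:13015:23750", '
    '"HMVN7DRXY:2:2124:23746:28776", "HMVN7DRXY:2:2232:6216:33395"], '
    '"recomb2": ["HMVN7DRXY:2:2122:27624:23062"]}\n'
    '11283-11537\t{"recomb2": ["HMVN7DRXY:2:2126:12825:30154", '
    '"HMVN7DRXY:2:2126:15302:29121"]}\n'
)


def count_reads_by_region(text):
    """Return {(start, end): {category: read count}} for each cross region."""
    counts = {}
    for line in text.strip().splitlines()[1:]:  # skip the header line
        region, reads_json = line.split("\t", 1)
        start, end = (int(part) for part in region.split("-"))
        reads = json.loads(reads_json)
        counts[(start, end)] = {cat: len(names) for cat, names in reads.items()}
    return counts


region_counts = count_reads_by_region(SAMPLE_REGIONS)
for region, per_category in sorted(region_counts.items()):
    print(region, per_category)
```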

-- recombinant_reads.parent1.bam

-- recombinant_reads.parent1.bam.bai

-- recombinant_reads.parent2.bam

-- recombinant_reads.parent2.bam.bai

-- recombinant_reads.recomb1.bam

-- recombinant_reads.recomb1.bam.bai

-- recombinant_reads.recomb2.bam

-- recombinant_reads.recomb2.bam.bai

-- recombinant_reads.recombx.bam

-- recombinant_reads.recombx.bam.bai

-- recombinant_reads.mutations_af_plot.html

-- recombinant_reads.mutations_af_plot_genomeview.html
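
A quick check that a run produced the files listed above might look like the sketch below. It assumes the default output names shown in this section and leaves out the two HTML plots, which per the help text depend on --vcf and --mutations_af_plot. The helper names are hypothetical.

```python
from pathlib import Path

# Read categories that each get a .bam and a .bam.bai file in the list above.
CATEGORIES = ("parent1", "parent2", "recomb1", "recomb2", "recombx")


def expected_outputs(prefix="recombinant_reads"):
    """List the file names shown above for the default output prefix."""
    names = [f"{prefix}.stats", f"{prefix}.tsv", f"{prefix}_by_cross_region.tsv"]
    for category in CATEGORIES:
        names += [f"{prefix}.{category}.bam", f"{prefix}.{category}.bam.bai"]
    return names


def missing_outputs(outdir, prefix="recombinant_reads"):
    """Return the expected file names that are absent from outdir."""
    return [name for name in expected_outputs(prefix)
            if not (Path(outdir) / name).exists()]


print(len(expected_outputs()), "files expected")
print("missing in current directory:", missing_outputs("."))
```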

Data visualization

The recombinant_reads.bam, ramifi/data/variants_mutation.gff and ramifi/data/NC_045512.fasta can be loaded into IGV.

Example: IGV Link: https://chienchi.github.io/ramifi/igv-webapp


Custom mutation list

Users can supply a custom mutation list, formatted like the mutation list JSON file provided in the repo, to check for co-infection or recombination among other variants/lineages. When running ramifi, pass the custom mutation list with the --variantMutation option.

For example:

{
    "Alpha": {
        "A:23063:T": "S:N501Y",
        "A:23403:G": "S:D614G",
        ...
        "del:21991:3": "S:Y144*"
        ...
    },
    "Beta": {
        "A:10323:G": "ORF1a:K3353R",
        "A:21801:C": "S:D80A",
        "A:22206:G": "S:D215G",
        "A:23063:T": "S:N501Y"
        ...
    },
    "BA.2": {
        ...
    }
}
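
Before passing a custom list to ramifi, it can help to check that every key follows one of the patterns seen in the example. The sketch below reads the keys as ref:position:alt for substitutions and del:position:length for deletions; that interpretation is inferred from the example above rather than taken from a format specification, and the function name is hypothetical. Keys in other forms are reported for manual review, not treated as invalid.

```python
import json
import re

# Patterns inferred from the example keys, e.g. "A:23063:T" and "del:21991:3".
SNV_KEY = re.compile(r"^([ACGT]):(\d+):([ACGT])$")
DEL_KEY = re.compile(r"^del:(\d+):(\d+)$")


def parse_mutation_key(key):
    """Split a mutation key into its parts, or return None if unrecognized."""
    match = SNV_KEY.match(key)
    if match:
        ref, pos, alt = match.groups()
        return {"type": "snv", "ref": ref, "pos": int(pos), "alt": alt}
    match = DEL_KEY.match(key)
    if match:
        pos, length = match.groups()
        return {"type": "del", "pos": int(pos), "length": int(length)}
    return None


custom = {
    "Alpha": {
        "A:23063:T": "S:N501Y",
        "A:23403:G": "S:D614G",
        "del:21991:3": "S:Y144*",
    },
}

unrecognized = [(variant, key)
                for variant, mutations in custom.items()
                for key in mutations
                if parse_mutation_key(key) is None]
print("keys needing review:", unrecognized)
print(json.dumps(custom, indent=4))
```

The JSON text printed at the end can be saved to a file and passed to --variantMutation.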

NCBI TRACE Lineage Definitions Weekly Update Site: https://ftp.ncbi.nlm.nih.gov/pub/ACTIV-TRACE/

Remove package:

pip uninstall ramifi

Citing RAMIFI

This work is currently unpublished. If you use this package, we would appreciate it if you credited our repository.

License

RAMIFI is distributed as open-source software under the GPLv3 license. The license file is included in the RAMIFI distribution.

LANL open source approval reference C22090.

© 2023. Triad National Security, LLC. All rights reserved. This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.
