VIRA: By-Reference Exon and CDS Viral Genome Annotation
Introduction
VIRA is a fully automated protocol for transferring annotations from reference to target genomes. It is optimized for viral genomes and was primarily developed and tested on HIV and SIV genomes.
The method uses both nucleotide and protein sequence information to search for correct alignments between genomes with high degrees of sequence divergence. VIRA is tailored to take advantage of guide protein annotations to further improve the accuracy of alignments and final annotations.
Publications
Coming soon...
Documentation
Installation
Via PyPI
The easiest way to install VIRA is through PyPI:
$ pip install vira-av
$ vira --help
To uninstall VIRA:
$ pip uninstall vira-av
Building from source
To build from source, clone the git repository:
$ git clone https://github.com/alevar/vira.git --recursive
$ cd vira
$ pip install -r requirements.txt
$ pip install .
Requirements
| Requirement | Details |
|---|---|
| Language support | Python ≥ 3.6 |
| Dependencies | gffread, minimap2, miniprot, snapper |
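Before running VIRA, it can help to confirm that the external tools are discoverable. The following is a minimal sketch, assuming each executable is installed under exactly the name listed in the table and is expected on `PATH`. `find_missing_tools` is a hypothetical helper written for illustration; it is not part of VIRA.

```python
import shutil

# Tool names taken from the requirements table. Assumption: each tool is
# installed under exactly this executable name.
REQUIRED_TOOLS = ["gffread", "minimap2", "miniprot", "snapper"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```

A tool reported as missing is not necessarily a problem: the `--gffread`, `--minimap2`, `--miniprot`, and `--snapper` options accept an explicit path to each executable.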
Getting started
Usage
vira [-h] -a ANNOTATION -g GENOME -t TARGET [-q GUIDE] [-o OUTPUT] [--force-cds]
[--gffread GFFREAD] [--minimap2 MINIMAP2] [--miniprot MINIPROT] [--snapper SNAPPER]
[--keep-tmp] [--tmp-dir TMP_DIR]
Options
| Option | Description |
|---|---|
| -a, --annotation | Path to the query GTF/GFF annotation file. |
| -g, --genome | Path to the query genome FASTA file. |
| -t, --target | Path to the target genome FASTA file. |
| -q, --guide | Optional path to the guide annotation file for the target genome. Transcripts and CDS from the guide will be used to validate the annotation. |
| -o, --output | Path to the output GTF file. |
| --force-cds | Force the CDS from the guide onto the transcript chain, even if that means merging adjacent exons together (can fix alignment artifacts such as spurious introns). If the CDS does not fit the transcript chain, the transcript will be skipped. |
| --gffread | Path to the gffread executable. |
| --minimap2 | Path to the minimap2 executable. |
| --miniprot | Path to the miniprot executable. If not set, minimap2 will be used to align the nucleotide sequence of the CDS instead. |
| --snapper | Path to the snapper executable. |
| --keep-tmp | Keep temporary files. |
| --tmp-dir | Directory to store temporary files. |
Help Options
| Option | Description |
|---|---|
| -h, --help | Prints help message. |
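When VIRA is called from a script, the options above map naturally onto an argument list. The sketch below is one possible wrapper, assuming `vira` is installed and on `PATH`. `build_vira_command` and `run_vira` are hypothetical helper names, and only flags documented in the tables above are used.

```python
import shlex
import subprocess


def build_vira_command(annotation, genome, target, output=None, guide=None,
                       force_cds=False, keep_tmp=False, tmp_dir=None):
    """Assemble a vira argument list from the documented options."""
    # -a, -g and -t are the required arguments in the usage synopsis.
    cmd = ["vira", "--annotation", annotation,
           "--genome", genome, "--target", target]
    if guide:
        cmd += ["--guide", guide]
    if output:
        cmd += ["--output", output]
    if force_cds:
        cmd.append("--force-cds")
    if keep_tmp:
        cmd.append("--keep-tmp")
    if tmp_dir:
        cmd += ["--tmp-dir", tmp_dir]
    return cmd


def run_vira(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_vira_command(
        "example/query.gtf", "example/query.fasta", "example/target.fasta",
        output="example/output.gtf", guide="example/guide.gtf",
    )
    # Print the command for inspection; pass it to run_vira(cmd) to execute.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.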
Example Data
Sample datasets are provided in the "example" directory to test and get familiar with VIRA.
The included example can be run with the following command from the root directory of the repository:
vira --annotation example/query.gtf --output example/output.gtf --genome example/query.fasta --target example/target.fasta --guide example/guide.gtf
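After the example run, a quick sanity check is to count the records in the output by feature type. This is a minimal sketch, assuming the output follows the standard nine-column, tab-separated GTF layout with the feature type in column 3. `count_gtf_features` is a hypothetical helper and not part of VIRA.

```python
from collections import Counter


def count_gtf_features(lines):
    """Count records per feature type (column 3) in GTF-formatted lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a standard 9-column GTF record
        counts[fields[2]] += 1
    return counts
```

For example, `with open("example/output.gtf") as handle: print(count_gtf_features(handle))` prints the per-feature counts for the file written by the command above.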