VICON - Viral Conserved Sequence Extraction Toolkit

VICON - Viral Sequence Analysis Toolkit

VICON is a Python package for processing and analyzing viral sequence data, with specialized tools for viral genome coverage analysis and sequence alignment.

Features

  • Viral sequence alignment and coverage analysis
  • K-mer analysis and sliding window coverage calculations
  • Visualization tools for coverage plots
  • Wrapper scripts for vsearch and ViralMSA

Installation

Option 1: Conda (Recommended)

The easiest way to install VICON with all dependencies:

# Create and activate a new environment
conda create -n vicon python=3.11
conda activate vicon

# Install VICON and all dependencies
conda install -c conda-forge -c bioconda -c eka97 vicon

# Set required permissions
chmod +x "$CONDA_PREFIX/bin/vicon-run"
chmod +x "$CONDA_PREFIX/bin/viralmsa"
chmod +x "$CONDA_PREFIX/bin/minimap2"

Option 2: PyPI (pip)

Install from PyPI:

pip install vicon

Note: When installing via pip, you must manually install these external dependencies:

  • minimap2 (≥2.30)
  • vsearch
  • ViralMSA

Installing External Dependencies

Ubuntu / Debian:

sudo apt-get update
sudo apt-get install -y minimap2 vsearch

macOS (Homebrew):

brew install minimap2 vsearch

ViralMSA:

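# Make sure both target directories exist before downloading and linking
mkdir -p ~/bin ~/.local/bin && cd ~/bin
wget "https://raw.githubusercontent.com/niemasd/ViralMSA/master/ViralMSA.py"
chmod +x ViralMSA.py
# Expose the script as `viralmsa`; ~/.local/bin must be on your PATH
ln -sf "$PWD/ViralMSA.py" ~/.local/bin/viralmsa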
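
After a pip-based install, it can help to confirm that the external tools are actually reachable before running the pipeline. The snippet below is a minimal sketch, not part of VICON: the function name `missing_tools` is hypothetical, and the executable names (`minimap2`, `vsearch`, `viralmsa`) are assumed from the install commands above. It checks only that the tools are on your PATH, not that minimap2 meets the minimum version.

```python
import shutil

# Executable names assumed from the installation steps in this README.
REQUIRED_TOOLS = ("minimap2", "vsearch", "viralmsa")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external dependencies:", ", ".join(missing))
    else:
        print("All external dependencies were found on PATH.")
```

If anything is reported missing, revisit the matching install step or check that `~/.local/bin` is on your PATH.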

Usage

Run the VICON pipeline with:

vicon-run --config path/to/your/config.yaml

Input FASTA Preprocessing

Note:
VICON automatically preprocesses your input FASTA files (both sample and reference) before analysis:

  • Converts all sequences to uppercase
  • Cleans and standardizes FASTA headers
  • Replaces any non-ATCG characters in sequences with 'N'

You do not need to manually edit or check your FASTA files for these issues.
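
To make the three preprocessing steps concrete, here is a rough Python sketch of what such a cleanup could look like. It is not VICON's actual implementation: the function names are hypothetical, and the header rule (collapsing whitespace into underscores) is purely an assumption, since the README does not specify how headers are standardized.

```python
import re


def clean_header(header: str) -> str:
    """Hypothetical header cleanup: trim the line and replace whitespace
    runs with underscores. VICON's real rules may differ."""
    name = header.lstrip(">").strip()
    return ">" + re.sub(r"\s+", "_", name)


def clean_sequence(seq: str) -> str:
    """Uppercase the sequence, then replace every non-ATCG character with 'N'."""
    return re.sub(r"[^ATCG]", "N", seq.upper())


def preprocess_fasta(text: str) -> str:
    """Apply header and sequence cleanup to FASTA text, skipping blank lines."""
    cleaned = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            cleaned.append(clean_header(line))
        else:
            cleaned.append(clean_sequence(line))
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    print(preprocess_fasta(">sample 1 2021\nacgtnry-\n"), end="")
```

Note that, under the rule stated above, ambiguity codes (such as R or Y) and gap characters are also replaced with 'N' in this sketch.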

Example Configuration

Create a configuration file (config.yaml):

project_path: "project_path"
virus_name: "orov"
input_sample: "data/orov/samples/samples.fasta"
input_reference: "data/orov/reference/reference.fasta"
email: "email@address.com"
kmer_size: 150
threshold: 147 # allows 150 - 147 = 3 degenerate positions per k-mer
l_gene_start: 8000
l_gene_end: 16000
coverage_ratio: 0.5
min_year: 2020
threshold_ratio: 0.01
drop_old_samples: false
drop_mischar_samples: true
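
The relationship between `kmer_size` and `threshold` can be checked with a few lines of Python. This is only an illustrative sketch: the function name `mismatch_tolerance` is hypothetical, and the sanity checks (threshold no larger than the k-mer size, gene start before gene end) are assumptions inferred from the example values, not rules documented by VICON.

```python
def mismatch_tolerance(config: dict) -> int:
    """Return kmer_size - threshold after basic sanity checks (assumed rules)."""
    kmer_size = int(config["kmer_size"])
    threshold = int(config["threshold"])
    if not 0 < threshold <= kmer_size:
        raise ValueError("threshold must be between 1 and kmer_size")
    if int(config["l_gene_start"]) >= int(config["l_gene_end"]):
        raise ValueError("l_gene_start must be smaller than l_gene_end")
    return kmer_size - threshold


if __name__ == "__main__":
    # Values copied from the example configuration above.
    example = {
        "kmer_size": 150,
        "threshold": 147,
        "l_gene_start": 8000,
        "l_gene_end": 16000,
    }
    print(mismatch_tolerance(example))  # prints 3
```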

FASTA Header Year Extraction

The pipeline automatically extracts years from FASTA headers using a two-step approach:

  1. Priority extraction: Years following separators (|, _, /, -)
  2. Fallback extraction: Any standalone 4-digit number between 1850 and 2030

Header Example          Year Extracted?   Extracted Year   Reason
>sample|2021            ✅ Yes            2021             After pipe separator
>sample_2020            ✅ Yes            2020             After underscore separator
>sample/2019/data       ✅ Yes            2019             After slash separator
>sample-2022-final      ✅ Yes            2022             After dash separator
>data 2021 sequence     ✅ Yes            2021             Standalone 4-digit number
>sample.2020.version    ✅ Yes            2020             Standalone 4-digit number
>test2021extra          ✅ Yes            2021             Standalone 4-digit number
>sample|202             ❌ No             -                Not a 4-digit number
>sample_1800_old        ❌ No             -                Outside valid range (1850-2030)
>sample20213long        ❌ No             -                5 consecutive digits

Best Practice: Use |YYYY, _YYYY, /YYYY, or -YYYY patterns for reliable year extraction.
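
The two-step extraction described above can be approximated with two regular expressions. The following is a sketch under stated assumptions, not VICON's actual code: the function name `extract_year` and both patterns are hypothetical. They were chosen so that the documented examples behave as listed, with the 1850-2030 range check applied in both steps.

```python
import re
from typing import Optional

MIN_YEAR, MAX_YEAR = 1850, 2030

# Step 1 (priority): exactly four digits directly after |, _, / or -.
AFTER_SEPARATOR = re.compile(r"[|_/-](\d{4})(?!\d)")
# Step 2 (fallback): any run of exactly four digits, not part of a longer number.
STANDALONE = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_year(header: str) -> Optional[int]:
    """Return the first in-range year found, trying the priority pattern first."""
    for pattern in (AFTER_SEPARATOR, STANDALONE):
        for match in pattern.finditer(header):
            year = int(match.group(1))
            if MIN_YEAR <= year <= MAX_YEAR:
                return year
    return None


if __name__ == "__main__":
    for header in (">sample_2020", ">data 2021 sequence", ">sample_1800_old"):
        print(header, "->", extract_year(header))
```

Trying the separator pattern first means a header such as `>run 1999 sample_2021` would resolve to 2021 in this sketch, which is one reason to prefer the separator-based patterns recommended above.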

License

This project is licensed under the terms of the MIT license.
