HUNANA

Hunana

A modular implementation of Hunana. A sub-module of ViVA.

Conserved regions of viral protein sequences are considered candidates for vaccine design against continuously mutating viruses. Nonameric sequences (9-mers) derived from viral proteins are recognized and processed by human leukocyte antigens and T cell receptors. HUNANA is a command-line tool that lists the positions of conserved nonameric (k-mer) sequences in a given set of aligned viral protein sequences, using Shannon's entropy formula to measure variability at each position.
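As an illustration of the entropy measure mentioned above, the following is a minimal sketch of Shannon's entropy over the k-mers that start at one alignment position. It is not HUNANA's own implementation: the function name kmer_entropy and its parameters are hypothetical, and the sampling controlled by the -s and -it arguments is left out entirely.

```python
import math
from collections import Counter


def kmer_entropy(aligned_seqs, start, k=9):
    """Shannon entropy (in bits) of the k-mers starting at 0-based index `start`.

    Hypothetical helper for illustration only; HUNANA's internal
    calculation (which also uses sampling) may differ.
    """
    # Collect the k-mer at this position from every sequence long enough to have one.
    kmers = [seq[start:start + k] for seq in aligned_seqs if len(seq) >= start + k]
    counts = Counter(kmers)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over distinct k-mers of p * log2(1 / p), with p = count / total.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Two sequences that carry different 9-mers at the first position.
seqs = ["SKGKRTVDL", "FHWLMLNPN"]
print(kmer_entropy(seqs, 0))
```

A position where every sequence carries the same k-mer has an entropy of 0, and the value grows as more distinct k-mers share the position more evenly, so low-entropy positions are the conserved ones.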

Installation

OPTION 1

pip install hunana

OPTION 2

pip install git+https://github.com/pu-sds/hunana.git

OPTION 3

git clone https://github.com/pu-sds/hunana.git
cd hunana
python setup.py install

OPTION 4

Download the latest distribution at:

https://github.com/pu-sds/hunana/releases/latest

Install using:

pip install hunana-{version}.whl

Command-Line Usage

Once installation is complete, the hunana executable is added to PATH and can be invoked as follows:

Linux

hunana -h

Windows

hunana.exe -h

Basic Usage

hunana -i sequences.fasta -o output.json -l 9

hunana -i sequences.fasta | grep supports

Basic Usage Output (Example)
[
  {
    "position":1,
    "entropy":1.0002713744986218,
    "supports":2,
    "variants":[
      {
        "position":1,
        "sequence":"SKGKRTVDL",
        "count":1,
        "incidence":50.0,
        "motif_short":"I",
        "motif_long":"Index"
      },
      {
        "position":1,
        "sequence":"FHWLMLNPN",
        "count":1,
        "incidence":50.0,
        "motif_short":"Ma",
        "motif_long":"Major"
      }
    ],
    "kmer_types":{
      "incidence":50.0,
      "types":[
        "FHWLMLNPN"
      ]
    }
  }
]
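The JSON output can be post-processed with a few lines of Python. The sketch below uses only the field names visible in the example output (position, entropy, supports, variants); the helper names and the entropy threshold are hypothetical and chosen purely for illustration.

```python
import json

# A trimmed copy of the example output, embedded so the sketch is self-contained.
# In practice this would be read from the file passed to -o.
EXAMPLE_OUTPUT = """
[
  {
    "position": 1,
    "entropy": 1.0002713744986218,
    "supports": 2,
    "variants": [
      {"position": 1, "sequence": "SKGKRTVDL", "count": 1, "incidence": 50.0},
      {"position": 1, "sequence": "FHWLMLNPN", "count": 1, "incidence": 50.0}
    ]
  }
]
"""


def conserved_positions(results, max_entropy):
    """Return the positions whose entropy does not exceed `max_entropy`."""
    return [record["position"] for record in results if record["entropy"] <= max_entropy]


def summarize(results):
    """Return (position, entropy, supports, number of variants) for each record."""
    return [
        (r["position"], r["entropy"], r["supports"], len(r["variants"]))
        for r in results
    ]


results = json.loads(EXAMPLE_OUTPUT)
# The threshold 0.5 is an arbitrary value for illustration, not a HUNANA default.
print(conserved_positions(results, max_entropy=0.5))
print(summarize(results))
```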

Advanced Usage (Generate Variant Data)

The -he/--header flag, together with the -f/--format option, can be used to generate data for each variant from the metadata in the FASTA sequence header.

hunana -i sequences.fasta -o output.json -he -f "(type)|(accession)|(strain)"

Each component of the header (e.g. id, strain, country) must be wrapped in parentheses. Any separator (e.g. |, /, _) can be used between components.
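To make the header format concrete, here is a minimal sketch of how a format string such as "(type)|(accession)|(strain)" could be matched against a FASTA header. This is an assumption about the general technique, not HUNANA's actual parser; the functions header_pattern and decode_header are hypothetical, and the sketch requires field names that are valid Python identifiers.

```python
import re


def header_pattern(header_format):
    """Build a regex with one named group per "(field)" in the format string."""
    # Splitting on "(name)" with a capture group yields alternating
    # literal separators and field names: sep, name, sep, name, ..., sep.
    parts = re.split(r"\(([A-Za-z_]\w*)\)", header_format)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 1:
            regex += f"(?P<{part}>.*?)"  # a field: capture lazily
        else:
            regex += re.escape(part)     # a separator: match literally
    return re.compile("^" + regex + "$")


def decode_header(header, header_format):
    """Return a dict of field values, or None if the header does not fit the format."""
    match = header_pattern(header_format).match(header)
    return match.groupdict() if match else None


print(decode_header("tr|A0A2Z4MTJ4|A0A2Z4MTJ4_9HIV2", "(type)|(accession)|(strain)"))
```

Because the separators are matched literally, mixed separators such as "(ncbid)/(strain)/(host)|(country)" work the same way, and a header that does not fit the format yields None.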

Advanced Usage Output (Example)
[
  {
    "position":1,
    "entropy":1.0001724373828909,
    "supports":2,
    "variants":[
      {
        "position":1,
        "sequence":"SKGKRTVDL",
        "count":1,
        "incidence":50.0,
        "motif_short":"I",
        "motif_long":"Index",
        "type":[
          "tr"
        ],
        "accession":[
          "A0A2Z4MTJ4"
        ],
        "strain":[
          "A0A2Z4MTJ4_9HIV2_Envelope_glycoprotein_gp160_OS_Human_immunodeficiency_virus_2_OX_11709_GN_env_PE_4_SV_1"
        ]
      },
      {
        "position":1,
        "sequence":"FHWLMLNPN",
        "count":1,
        "incidence":50.0,
        "motif_short":"Ma",
        "motif_long":"Major",
        "type":[
          "tr"
        ],
        "accession":[
          "A0A0K2GVL2"
        ],
        "strain":[
          "A0A2Z4MTJ4_9HIV2_Envelope_glycoprotein_gp160_OS_Human_immunodeficiency_virus_2_OX_11709_GN_env_PE_4_SV_1"
        ]
      }
    ],
    "kmer_types":{
      "incidence":50.0,
      "types":[
        "FHWLMLNPN"
      ]
    }
  }
]

Command-Line Arguments

Argument: -h
  Type: N/A
  Default: N/A
  Example: hunana -h
  Description: Prints a summary of all available command-line arguments.

Argument: -i
  Type: String
  Default: N/A
  Example: hunana -i '/path/to/alignment.fasta'
  Description: Absolute path to the aligned sequences file in FASTA format.

Argument: -o
  Type: String
  Default: N/A
  Example: hunana -i '/path/to/alignment.fasta' -o output.json
  Description: Absolute path to the output JSON file.

Argument: -l
  Type: Integer
  Default: 9
  Example: hunana -i '/path/to/alignment.fasta' -l 12
  Description: The length of the generated k-mers.

Argument: -s
  Type: Integer
  Default: 10000
  Example: hunana -i '/path/to/alignment.fasta' -s 20000
  Description: Maximum number of samples to use when calculating entropy.

Argument: -it
  Type: Integer
  Default: 10
  Example: hunana -i '/path/to/alignment.fasta' -it 100
  Description: Maximum number of iterations to use when calculating entropy.

Argument: -he
  Type: Boolean
  Default: False
  Example: hunana -i '/path/to/alignment.fasta' -he -f '(type)|(accession)|(strain)|(country)'
  Description: Enables decoding of the FASTA headers to derive details for each generated k-mer.

Argument: -f
  Type: String
  Default: N/A
  Example: hunana -i '/path/to/alignment.fasta' -he -f '(type)|(accession)|(strain)|(country)'
  Description: The format of the FASTA headers in the FASTA multiple sequence alignment.

Argument: -no_header_error
  Type: Boolean
  Default: False
  Example: hunana -i '/path/to/alignment.fasta' -he -f '(type)|(accession)|(strain)|(country)' -no_header_error
  Description: Whether to raise an error if empty items are found in any of the FASTA headers.

More Examples

hunana -i sequences.fasta -o output.json -he -f "(ncbid)/(strain)/(host)/(country)"

hunana -i sequences.fasta -o output.json -he -f "(ncbid)/(strain)/(host)|(country)"

hunana -i sequences.fasta -o output.json -he -f "(ab)/(cde)/(fghi)/(jklm)"

hunana -i sequences.fasta -o output.json -he -f "(ab)/(cde)/(fghi)/(jklm)" -no_header_error

Module Usage

Hunana can also be imported and used within your Python projects as below:

from hunana import Hunana
Hunana('/path/to/sequence.fasta').run()

Module Parameters

Argument         Type                          Default  Description
seqs             str, TextIOWrapper, StringIO  N/A      A file handle, a FASTA sequence wrapped in a handle, or a filepath.
kmer_len         int                           9        The length of the kmers to generate.
header_decode    bool                          False    Whether to use FASTA headers to derive kmer information.
header_format    str                           N/A      The format of the header (ex: (id)|(species)|(country)).
json_result      bool                          False    Whether the results should be returned in json format.
max_samples      int                           10000    The maximum number of samples to use when calculating entropy.
iterations       int                           10       The maximum number of iterations to use when calculating entropy.
no_header_error  bool                          False    Whether to raise an error if empty items are found in any of the FASTA headers.
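The parameters above can be combined as in the following sketch, which passes an in-memory FASTA string wrapped in a StringIO handle. It assumes that the parameter names in the table are accepted as keyword arguments and that the hunana package is installed; the wrapper run_hunana and the toy sequences are hypothetical, and the exact shape of the returned value is not documented here.

```python
from io import StringIO

# Two toy sequences for illustration; real input would be an aligned FASTA file.
FASTA_TEXT = ">seq1\nSKGKRTVDL\n>seq2\nFHWLMLNPN\n"


def run_hunana(fasta_text, kmer_len=9):
    """Run HUNANA on a FASTA string and return whatever run() returns.

    Assumes `kmer_len` and `json_result` are valid keyword arguments,
    as suggested by the parameter table.
    """
    # Imported here so the sketch can be loaded even where hunana is not installed.
    from hunana import Hunana

    handle = StringIO(fasta_text)
    return Hunana(handle, kmer_len=kmer_len, json_result=True).run()


# Example call (requires the hunana package):
# result = run_hunana(FASTA_TEXT)
```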
