aaindex is a lightweight Python software package for accessing the data in the various AAIndex databases, which represent the physicochemical and biochemical properties of amino acids as numerical indices.

Python package for working with the AAIndex database (https://www.genome.jp/aaindex/)

Introduction

The AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. The AAindex consists of three sections: AAindex1 for the amino acid index of 20 numerical values, AAindex2 for the amino acid mutation matrix and AAindex3 for the statistical protein contact potentials. All data are derived from published literature [1].

This aaindex Python software package is a lightweight way of accessing the data in the various AAIndex databases. It has minimal requirements and few external dependencies, and any record and its associated data/numerical indices can be accessed in one line. The software currently supports the AAIndex1 database, with plans to include AAIndex2 and AAIndex3 in the future. The format of an AAIndex1 record is shown below.

Format of AAIndex1 record

  ************************************************************************
  *                                                                      *
  * H Accession number                                                   *
  * D Data description                                                   *
  * R Pub med article ID (PMID)                                          *
  * A Author(s)                                                          *
  * T Title of the article                                               *
  * J Journal reference                                                  *
  * * Comment or missing                                                 *
  * C Accession numbers of similar entries with the correlation          *
  *   coefficients of 0.8 (-0.8) or more (less).                         *
  *   Notice: The correlation coefficient is calculated with zeros       *
  *   filled for missing values.                                         *
  * I Amino acid index data in the following order                       *
  *   Ala    Arg    Asn    Asp    Cys    Gln    Glu    Gly    His    Ile *
  *   Leu    Lys    Met    Phe    Pro    Ser    Thr    Trp    Tyr    Val *
  * //                                                                   *
  ************************************************************************
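To make the record layout concrete, here is a minimal sketch of reading one record in this flat-file format. It is not how the aaindex package loads its data (the package reads a pre-parsed JSON), and the function name parse_record, the sample accession DEMO000001 and its placeholder values are all hypothetical. It also assumes that each field starts with a one-letter code in the first column, that continuation lines start with a space, and that missing values are written as NA.

```python
# One-letter codes in the order listed in the "I" field above:
# Ala Arg Asn Asp Cys Gln Glu Gly His Ile / Leu Lys Met Phe Pro Ser Thr Trp Tyr Val
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"


def parse_record(text):
    """Parse a single AAindex1-style record (hypothetical helper)."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("//"):        # end-of-record marker
            break
        if not line.strip():
            continue
        if line[0] != " ":               # a new field: code letter in column 1
            current = line[0]
            fields.setdefault(current, []).append(line[2:].strip())
        elif current is not None:        # continuation of the previous field
            fields[current].append(line.strip())

    # The first "I" line is the column header; the rest hold the 20 values.
    tokens = " ".join(fields.get("I", [])[1:]).split()
    values = {
        aa: (None if tok == "NA" else float(tok))  # "NA" = assumed missing marker
        for aa, tok in zip(AA_ORDER, tokens)
    }
    return {
        "accession": " ".join(fields.get("H", [])),
        "description": " ".join(fields.get("D", [])),
        "values": values,
    }


# Made-up record with placeholder values, for illustration only.
SAMPLE = """\
H DEMO000001
D Placeholder index for illustration only
I    A/L     R/K     N/M     D/F     C/P     Q/S     E/T     G/W     H/Y     I/V
     1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0
    11.0    12.0    13.0    14.0    15.0    16.0    17.0    18.0    19.0    20.0
//
"""

record = parse_record(SAMPLE)
print(record["accession"], record["values"]["A"], record["values"]["V"])
```

The two value rows are read left to right, so the first row fills Ala through Ile and the second fills Leu through Val.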

A demo of the software is available here.

Requirements

Installation

Install the latest version of aaindex using pip:

pip3 install aaindex --upgrade

Install by cloning the repository:

git clone https://github.com/amckenna41/aaindex.git
cd aaindex
python3 setup.py install
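
One way to confirm that the installation worked is to ask Python for the installed version. The sketch below uses the standard-library importlib.metadata module (Python 3.8+); the helper name installed_version is only illustrative and is not part of aaindex.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if aaindex is installed, otherwise None.
print(installed_version("aaindex"))
```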

Usage

The aaindex package is made up of three modules, one for each AAindex database, each with a Python class of the same name. When importing the package, import the module for the database you need:

from aaindex import aaindex1
# from aaindex import aaindex2
# from aaindex import aaindex3

Get record from AAIndex1

The AAindex1 class offers a range of functionality for obtaining any element from any record in the database. The records are imported from a parsed JSON (aaindex_json) in the data folder of the package. You can search for a particular record by its index/record code, description or reference. You can also get the record's category and, importantly, its associated amino acid values:

from aaindex import aaindex1

full_record = aaindex1['CHOP780206']   #get full AAI record
''' full_record ->
{'category': 'sec_struct', 'correlation_coefficients': {}, 'description': 'Normalized frequency of N-terminal non helical region (Chou-Fasman, 1978b)', 'notes': '', 'pmid': '364941', 'references': "Chou, P.Y. and Fasman, G.D. 'Prediction of the secondary structure of proteins from their amino acid sequence' Adv. Enzymol. 47, 45-148 (1978)", 'values': {'-': 0, 'A': 0.7, 'C': 0.65, 'D': 0.98, 'E': 1.04, 'F': 0.93, 'G': 1.41, 'H': 1.22, 'I': 0.78, 'K': 1.01, 'L': 0.85, 'M': 0.83, 'N': 1.42, 'P': 1.1, 'Q': 0.75, 'R': 0.34, 'S': 1.55, 'T': 1.09, 'V': 0.75, 'W': 0.62, 'Y': 0.99}}
'''
#get individual elements of AAIndex record
record_values = aaindex1['CHOP780206']['values'] 
record_values = aaindex1['CHOP780206'].values
#'values': {'-': 0, 'A': 0.7, 'C': 0.65, 'D': 0.98, 'E': 1.04, 'F': 0.93, 'G': 1.41, 'H': 1.22, 'I': 0.78, 'K': 1.01, 'L': 0.85, 'M': 0.83, 'N': 1.42, 'P': 1.1, 'Q': 0.75, 'R': 0.34, 'S': 1.55, 'T': 1.09, 'V': 0.75, 'W': 0.62, 'Y': 0.99}

record_description = aaindex1['CHOP780206']['description']
record_description = aaindex1['CHOP780206'].description
#'description': 'Normalized frequency of N-terminal non helical region (Chou-Fasman, 1978b)'

record_references = aaindex1['CHOP780206']['references']
record_references = aaindex1['CHOP780206'].references
#'references': "Chou, P.Y. and Fasman, G.D. 'Prediction of the secondary structure of proteins from their amino acid sequence' Adv. Enzymol. 47, 45-148 (1978)"

record_notes = aaindex1['CHOP780206']['notes']
record_notes = aaindex1['CHOP780206'].notes
#""

#key name matches the 'correlation_coefficients' key in the full record above
record_correlation_coefficients = aaindex1['CHOP780206']['correlation_coefficients']
record_correlation_coefficients = aaindex1['CHOP780206'].correlation_coefficients
#{}

record_pmid = aaindex1['CHOP780206']['pmid']
record_pmid = aaindex1['CHOP780206'].pmid
#'364941'

record_category = aaindex1['CHOP780206']['category']
record_category = aaindex1['CHOP780206'].category
#sec_struct
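
A common use of a record's values is turning a protein sequence into a list of numbers. The sketch below does this with the CHOP780206 values copied from the record shown above, so it runs without the package installed; with aaindex available, the same dictionary would come from aaindex1['CHOP780206']['values']. The helper name encode_sequence and the choice to map unknown symbols to the '-' entry are assumptions made for illustration.

```python
# Values copied from the CHOP780206 record shown above.
chop780206_values = {
    '-': 0, 'A': 0.7, 'C': 0.65, 'D': 0.98, 'E': 1.04, 'F': 0.93, 'G': 1.41,
    'H': 1.22, 'I': 0.78, 'K': 1.01, 'L': 0.85, 'M': 0.83, 'N': 1.42, 'P': 1.1,
    'Q': 0.75, 'R': 0.34, 'S': 1.55, 'T': 1.09, 'V': 0.75, 'W': 0.62, 'Y': 0.99,
}


def encode_sequence(sequence, values, gap='-'):
    """Map each residue to its index value; unknown symbols use the gap value."""
    return [values.get(residue, values[gap]) for residue in sequence.upper()]


encoded = encode_sequence("ACDG", chop780206_values)
print(encoded)  # [0.7, 0.65, 0.98, 1.41]
```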

Get total number of AAIndex records

#get total number of records in AAI database
aaindex1.num_records()

Get list of all AAIndex record names

#get list of all AAIndex record names
aaindex1.record_names()
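
The list of record names can be combined with record lookup to organise the database, for example by grouping records by category. The sketch below uses a small stand-in dictionary so that it runs on its own; group_by_category, the DEMO accessions and hypothetical_category are made up for illustration, and only the CHOP780206 category comes from the example above. With the package installed, passing aaindex1 and aaindex1.record_names() should work the same way, assuming records support ['category'] lookup as shown earlier.

```python
from collections import defaultdict


def group_by_category(database, names):
    """Group record names by the 'category' field of each record."""
    groups = defaultdict(list)
    for name in names:
        groups[database[name]["category"]].append(name)
    return dict(groups)


# Stand-in data; DEMO records and their categories are hypothetical.
stand_in = {
    "CHOP780206": {"category": "sec_struct"},
    "DEMO000001": {"category": "hypothetical_category"},
    "DEMO000002": {"category": "sec_struct"},
}

groups = group_by_category(stand_in, list(stand_in))
print(groups["sec_struct"])  # ['CHOP780206', 'DEMO000002']
```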

Directories

  • /tests - unit and integration tests for aaindex package.
  • /aaindex - source code and all required external data files for package.
  • /images - images used throughout README.
  • /docs - aaindex documentation.

Tests

To run all tests, from the main aaindex folder run:

python3 -m unittest discover tests

Contact

If you have any questions or comments, please contact amckenna41@qub.ac.uk or raise an issue on the Issues tab.

License

Distributed under the MIT License. See LICENSE for more details.

References

[1]: Shuichi Kawashima, Minoru Kanehisa, AAindex: Amino Acid index database, Nucleic Acids Research, Volume 28, Issue 1, 1 January 2000, Page 374, https://doi.org/10.1093/nar/28.1.374
[2]: https://www.genome.jp/aaindex/
