Proteinko is used for modeling distributions of physicochemical properties of proteins.
Proteinko
About
A protein is a sequence of amino acid residues, each characterized by a set of physical and chemical properties. Proteinko models the properties of individual amino acid residues, maps them onto a single vector representing the protein sequence, and sums the overlapping portions of the modeled residues. The result is a distribution of a physicochemical property along the protein sequence.
Proteinko has built-in schemas for the following properties, and it also allows adding custom schemas for any real or theoretical property of amino acid residues:
- Hydropathy
- Donor hydrogen bonds
- Acceptor hydrogen bonds
- Isoelectric point
- Van der Waals volume
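The modeling idea described in this section can be illustrated with a minimal sketch. It is not proteinko's implementation: the Gaussian kernel, the function names, the parameters (`points_per_residue`, `kernel_width`, `sigma`) and the per-residue values are all assumptions chosen for illustration. Each residue is modeled as a bump scaled by its property value, placed at its position on a shared vector, and overlapping bumps are summed.

```python
import math

# Hypothetical per-residue values, for illustration only
# (not one of proteinko's built-in schemas).
EXAMPLE_SCHEMA = {"A": 1.0, "G": -0.5, "K": -2.0}


def gaussian_kernel(width, sigma):
    """Return `width` samples of a Gaussian bump centred in the window."""
    centre = (width - 1) / 2
    return [math.exp(-((i - centre) ** 2) / (2 * sigma ** 2)) for i in range(width)]


def model_distribution(sequence, schema, points_per_residue=10,
                       kernel_width=30, sigma=5.0):
    """Sum one scaled kernel per residue onto a single vector."""
    length = len(sequence) * points_per_residue
    dist = [0.0] * length
    kernel = gaussian_kernel(kernel_width, sigma)
    half = kernel_width // 2
    for pos, residue in enumerate(sequence):
        # Centre of this residue's segment on the shared vector.
        centre = pos * points_per_residue + points_per_residue // 2
        value = schema[residue]
        for k, weight in enumerate(kernel):
            idx = centre - half + k
            # Parts of the kernel that fall outside the vector are dropped.
            if 0 <= idx < length:
                dist[idx] += value * weight
    return dist


dist = model_distribution("AGK", EXAMPLE_SCHEMA)
print(len(dist))
```

Because the kernel is wider than one residue's segment, neighbouring residues contribute to each other's positions, which is what produces a smooth distribution rather than a step function.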
Installation
pip install proteinko
Usage
To start, we import the class Proteinko from the proteinko package and create an instance of it.
from proteinko import Proteinko
prt = Proteinko()
To list the available physicochemical properties, we can use the built-in method get_schemas(). This should produce the following output.
schemas = prt.get_schemas()
print(schemas)
>>> ['hydropathy', 'acceptors', 'donors', 'pI', 'volume']
This looks fine, but let's add a schema of our own. We are going to use the Kyte-Doolittle hydropathy schema, which is stored in a CSV file located in the local resources/ directory.
prt.add_schema(
    'resources/kyte_doolittle.csv',
    amino_col=0,
    value_col=1,
    key='kd',
    header=1
)
To clarify what we did here: we passed the path to the CSV file, specified which columns contain the amino acid residues and their corresponding values, provided a key under which the data will be stored, and told the parser that the file has 1 header row. If we now print the schemas, we should see the following output.
print(prt.get_schemas())
>>> ['hydropathy', 'acceptors', 'donors', 'pI', 'volume', 'kd']
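The contents of resources/kyte_doolittle.csv are not shown here, so the layout below is an assumption: one header row, the one-letter residue code in column 0 and the hydropathy value in column 1. The helper `load_schema` is hypothetical and is not part of proteinko. It only sketches how the `amino_col`, `value_col` and `header` arguments could be interpreted by a CSV parser.

```python
import csv
import io

# Assumed file layout: one header row, residue in column 0, value in column 1.
# The values are Kyte-Doolittle hydropathy indices for a few residues.
EXAMPLE_CSV = """amino_acid,hydropathy
I,4.5
V,4.2
L,3.8
G,-0.4
P,-1.6
H,-3.2
E,-3.5
K,-3.9
"""


def load_schema(handle, amino_col=0, value_col=1, header=1):
    """Hypothetical parser: map residue codes to float values."""
    rows = csv.reader(handle)
    # Skip the declared number of header rows.
    for _ in range(header):
        next(rows, None)
    schema = {}
    for row in rows:
        if not row:
            continue  # ignore blank lines
        schema[row[amino_col].strip()] = float(row[value_col])
    return schema


kd = load_schema(io.StringIO(EXAMPLE_CSV))
print(kd["I"], kd["K"])
```

With a file in this shape, the call to add_schema shown earlier would point `amino_col` at the residue column and `value_col` at the numeric column.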
Finally, to get the distribution of Kyte-Doolittle hydropathy across a protein sequence, let's first define the sequence and then call the function get_dist(), passing the sequence and the schema key as arguments.
sequence = 'ILKEPVHGV'
dist = prt.get_dist(sequence, 'kd')
If we plot our modeled distribution it should look something like this.