peptides.py
Physicochemical properties, indices and descriptors for amino-acid sequences.
🗺️ Overview
peptides.py is a pure-Python package to compute common descriptors for
protein sequences. It started as a port of Peptides, the R package written by
Daniel Osorio, but now also provides some additional features from EMBOSS,
ExPASy Protein Identification and Analysis Tools, and Rcpi.
This library has no external dependencies and is available for all modern Python versions (3.6+).
📋 Features
A non-exhaustive list of available features:
- Peptide statistics: amino acid counts and frequencies.
- QSAR descriptors: BLOSUM indices, Cruciani properties, FASGAI vectors, Kidera factors, Atchley factors, MS-WHIM scores, PCP descriptors, ProtFP descriptors, Sneath vectors, ST-scales, T-scales, VHSE-scales, Z-scales.
- Sequence profiles: hydrophobicity, hydrophobic moment, membrane position.
- Physicochemical properties: aliphatic index, instability index, theoretical net charge, isoelectric point, molecular weight (with isotope labelling support).
- Biological properties: structural class prediction.
If this library is missing a useful statistic or descriptor, feel free to reach out and open a feature request on the issue tracker of the project repository.
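To make the first item of the list concrete, here is a minimal standalone sketch of what amino acid counts and frequencies mean for a sequence. It only uses the standard library; the helper names `count_residues` and `residue_frequencies` are hypothetical and are not part of the peptides.py API.

```python
from collections import Counter


def count_residues(sequence):
    """Return a dict mapping each residue letter to its number of occurrences."""
    return dict(Counter(sequence))


def residue_frequencies(sequence):
    """Return a dict mapping each residue letter to its relative frequency."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = len(sequence)
    return {aa: n / total for aa, n in Counter(sequence).items()}


if __name__ == "__main__":
    print(count_residues("AACC"))
    print(residue_frequencies("AACC"))
```

Only residues that occur in the sequence appear in the result; a real implementation may prefer to report every letter of the alphabet, including those with a zero count.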
🧊 Vectorization
Most of the descriptors for a protein sequence are simple averages of values
taken in a lookup table, so computing them can be done in a vectorized manner.
If numpy can be imported, relevant functions (like numpy.sum or numpy.take)
will be used; otherwise, a fallback implementation using array.array from the
standard library is available.
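The idea can be sketched as follows. This is an illustration of the lookup-and-average pattern with an optional numpy fast path, not the library's actual code: the table values are made up, and `ALPHABET`, `TABLE`, `encode` and `mean_descriptor` are hypothetical names.

```python
from array import array

try:
    import numpy
except ImportError:  # numpy is optional
    numpy = None

# Hypothetical lookup table: one made-up value per residue, in alphabet order.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}
TABLE = array("d", [float(i) for i in range(len(ALPHABET))])


def encode(sequence):
    """Turn a sequence into an array of indices into the lookup table."""
    return array("B", (INDEX[aa] for aa in sequence))


def mean_descriptor(sequence, table=TABLE):
    """Average the table values of every residue in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    indices = encode(sequence)
    if numpy is not None:
        # Vectorized path: gather all values at once, then sum them.
        values = numpy.take(numpy.asarray(table), numpy.asarray(indices))
        return float(numpy.sum(values)) / len(indices)
    # Fallback path: plain Python loop over the standard-library arrays.
    return sum(table[i] for i in indices) / len(indices)


if __name__ == "__main__":
    print(mean_descriptor("AC"))  # 0.5 with the made-up table above
```

Both paths return the same value, so callers do not need to know whether numpy is installed.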
🔧 Installing
Install the peptides package directly from PyPI, which hosts universal wheels
that can be installed with pip:
$ pip install peptides
Otherwise, peptides.py is also available as a Bioconda package:
$ conda install -c bioconda peptides
📖 Documentation
A complete API reference can be found in the online documentation,
or directly from the command line using pydoc:
$ pydoc peptides.Peptide
💡 Example
Start by creating a Peptide object from a protein sequence:
>>> import peptides
>>> peptide = peptides.Peptide("MLKKRFLGALAVATLLTLSFGTPVMAQSGSAVFTNEGVTPFAISYPGGGT")
Then use the appropriate methods to compute the descriptors you want:
>>> peptide.aliphatic_index()
89.8...
>>> peptide.boman()
-0.2097...
>>> peptide.charge(pH=7.4)
1.99199...
>>> peptide.isoelectric_point()
10.2436...
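To give a sense of what such a descriptor measures, the aliphatic index is commonly defined (Ikai, 1980) from the mole percentages of alanine, valine, isoleucine and leucine. The standalone function below is a sketch of that textbook formula and makes no claim about how the library implements it; for the example sequence it agrees with the value shown above.

```python
def aliphatic_index(sequence):
    """Aliphatic index: A% + 2.9 * V% + 3.9 * (I% + L%), in mole percent."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    n = len(sequence)

    def percent(aa):
        # Mole percentage of one residue type in the sequence.
        return 100.0 * sequence.count(aa) / n

    return percent("A") + 2.9 * percent("V") + 3.9 * (percent("I") + percent("L"))


if __name__ == "__main__":
    seq = "MLKKRFLGALAVATLLTLSFGTPVMAQSGSAVFTNEGVTPFAISYPGGGT"
    print(round(aliphatic_index(seq), 1))  # 89.8
```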
Methods that return more than one scalar value (for instance, Peptide.blosum_indices)
will return a dedicated named tuple:
>>> peptide.ms_whim_scores()
MSWHIMScores(mswhim1=-0.436399..., mswhim2=0.4916..., mswhim3=-0.49200...)
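A named tuple can be read by field name, unpacked like a regular tuple, or converted to a dictionary. The sketch below uses a stand-in built with collections.namedtuple so that it runs without the library installed; the field names are taken from the output above, the values are rounded from it rather than recomputed, and it is an assumption that the library's own type behaves like a standard named tuple.

```python
from collections import namedtuple

# Stand-in for the result type shown above (assumption: same field names).
MSWHIMScores = namedtuple("MSWHIMScores", ["mswhim1", "mswhim2", "mswhim3"])

# Values rounded from the example output, not recomputed.
scores = MSWHIMScores(-0.4364, 0.4916, -0.4920)

first = scores.mswhim1      # access by field name
a, b, c = scores            # positional unpacking
as_dict = scores._asdict()  # mapping of field name to value

if __name__ == "__main__":
    print(first, b, as_dict["mswhim3"])
```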
Use the Peptide.descriptors method to get a dictionary with every available
descriptor. This makes it very easy to create a pandas.DataFrame with
descriptors for several protein sequences:
>>> import pandas
>>> seqs = ["SDKEVDEVDAALSDLEITLE", "ARQQNLFINFCLILIFLLLI", "EGVNDNECEGFFSAR"]
>>> df = pandas.DataFrame([ peptides.Peptide(s).descriptors() for s in seqs ])
>>> df
    BLOSUM1   BLOSUM2  BLOSUM3   BLOSUM4  ...        Z2        Z3        Z4        Z5
0  0.367000 -0.436000   -0.239  0.014500  ... -0.711000 -0.104500 -1.486500  0.429500
1 -0.697500 -0.372500   -0.493  0.157000  ... -0.307500 -0.627500 -0.450500  0.362000
2  0.479333 -0.001333    0.138  0.228667  ... -0.299333  0.465333 -0.976667  0.023333

[3 rows x 66 columns]
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
🏗️ Contributing
Contributions are more than welcome! See CONTRIBUTING.md for more details.
⚖️ License
This library is provided under the GNU General Public License v3.0.
The original R Peptides package was written by Daniel Osorio,
Paola Rondón-Villarreal and Rodrigo Torres, and is licensed under
the terms of the GNU General Public License v2.0.
The EMBOSS applications are released under the GNU General Public License v1.0.
This project is in no way affiliated with, sponsored, or otherwise endorsed
by the original Peptides authors. It was developed by Martin Larralde during
his PhD project at the European Molecular Biology Laboratory in the Zeller team.