Calculates the isoelectric point (pI) of a peptide/protein using the Henderson-Hasselbalch equation with the predicted pKa values of its amino acids.

pIChemiSt

Description

The program calculates the isoelectric point (pI) of proteins or peptides from their 2D molecular structure. The input structure is cut into monomers at its amide bonds, and a pKa value is then assigned to each monomer. For natural amino acids, pKa values are looked up in a dictionary. For non-natural amino acids, they are calculated using either pKaMatcher (a built-in tool based on SMARTS patterns) or the ACD perceptabat GALAS algorithm (a commercial tool that requires a license).

For natural amino acids, the following predefined sets of pKa values are implemented: 'IPC2_peptide', 'IPC_peptide', 'ProMoST', 'Gauci', 'Rodwell', 'Grimsley', 'Thurlkill', 'Solomon', 'Lehninger', 'EMBOSS'. The program also reports the mean and the variation across the sets, as well as the total charge at pH 7.4, and it can plot the pH/Q curve for each input structure.

For more details, please refer to the pIChemiSt publication: https://pubs.acs.org/doi/10.1021/acs.jcim.2c01261
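To illustrate the idea of deriving a pI from pKa values, here is a minimal sketch of a Henderson-Hasselbalch charge model combined with a bisection search for the pH at which the net charge is zero. It is not pIChemiSt's own code: the function names `net_charge` and `isoelectric_point` are hypothetical, and the pKa values in the usage lines are roughly the textbook values for glycine, not taken from any of the sets listed above.

```python
from typing import Iterable


def net_charge(ph: float, acid_pkas: Iterable[float], base_pkas: Iterable[float]) -> float:
    """Net charge Q at a given pH from the Henderson-Hasselbalch equation."""
    # Acidic groups (e.g. a carboxylic acid) contribute -1 when deprotonated.
    negative = sum(-1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in acid_pkas)
    # Basic groups (e.g. an amine) contribute +1 when protonated.
    positive = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in base_pkas)
    return positive + negative


def isoelectric_point(acid_pkas, base_pkas, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH where Q = 0 by bisection (Q decreases monotonically with pH)."""
    acid_pkas, base_pkas = list(acid_pkas), list(base_pkas)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, acid_pkas, base_pkas) > 0:
            lo = mid  # still positively charged: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


# Illustrative values only: one acidic group (pKa 2.34), one basic group (pKa 9.60).
pi_value = isoelectric_point([2.34], [9.60])
print(round(pi_value, 2))
```

With a single acidic and a single basic group, the zero-charge pH is the midpoint of the two pKa values, which gives a quick sanity check for the search.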

How to install the software via PyPI

  • Ensure that you have Python version >=3.8
  • Run pip install pichemist
  • (Optional) To use ACD to predict the pKa of non-natural amino acids, make sure that the command perceptabat resolves to the ACD binary

How to install the software via GitHub

  • Clone the repository
  • Ensure that you have Python version >=3.8
  • Enter the package folder: cd peptide-tools/pIChemiSt
  • Run pip install . to install the Python library and the CLI command
  • (Optional) To use ACD to predict the pKa of non-natural amino acids, make sure that the command perceptabat resolves to the ACD binary

Examples of usage (CLI)

# Run the predictor against a SMILES file using pKaMatcher and output results to console
pichemist -i test/examples/payload_1.smi --method pkamatcher

>
======================================================================================================================================================
pI
---------------------------------
     pI mean  9.02
         err  0.61
         std  1.72
IPC2_peptide  8.05
 IPC_peptide  9.81
     ProMoST  8.38
       Gauci  8.69
    Grimsley  8.94
   Thurlkill  9.06
   Lehninger  9.86
    Toseland  9.41


======================================================================================================================================================
Q at pH7.4
---------------------------------
Q at pH7.4 mean  0.73
         err  0.24
         std  0.67
IPC2_peptide  0.63
 IPC_peptide  0.99
     ProMoST  0.26
       Gauci  0.55
    Grimsley  0.66
   Thurlkill  0.8
   Lehninger  0.99
    Toseland  0.95


pH interval with charge between -0.2 and  0.2 and prediction tool: pkamatcher
pI interval:  8.6 -  9.4
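The "pI interval" line reports the pH range over which the charge stays between -0.2 and 0.2. A minimal sketch of how such a range can be found by scanning a pH grid is shown below. It is an illustration under stated assumptions, not the tool's implementation: `net_charge` and `neutral_interval` are hypothetical names, the grid resolution is arbitrary, and the pKa values are illustrative rather than taken from any pKa set.

```python
def net_charge(ph, acid_pkas, base_pkas):
    """Henderson-Hasselbalch net charge for lists of acidic and basic pKa values."""
    negative = sum(-1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in acid_pkas)
    positive = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in base_pkas)
    return positive + negative


def neutral_interval(acid_pkas, base_pkas, threshold=0.2, steps_per_ph=100):
    """Return (low, high) grid pH values where |Q| <= threshold, or None."""
    # Integer-based grid over pH 0..14 avoids floating-point drift in the step.
    grid = [i / steps_per_ph for i in range(14 * steps_per_ph + 1)]
    inside = [ph for ph in grid if abs(net_charge(ph, acid_pkas, base_pkas)) <= threshold]
    if not inside:
        return None
    # Q decreases monotonically with pH, so the matching points are contiguous.
    return inside[0], inside[-1]


# Illustrative values only: one acidic group (pKa 2.34), one basic group (pKa 9.60).
interval = neutral_interval([2.34], [9.60])
print(interval)
```

A wide interval means the charge curve is flat around the pI, so the molecule stays close to neutral over a broad pH range.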

Other flags and arguments select different input sources and output formats:

# Use SMILES string as input
pichemist -i "N[C@@]([H])(CS)C(=O)N[C@@]([H])(CC(=O)N)C(=O)N[C@@]([H])(CS)C(=O)N[C@@]([H])(CC(=O)N)C(=O)O" -if smiles_stdin

# Use SMILES string as input and output JSON to console
pichemist -i "C([C@@H](C(=O)O)N)SSC[C@@H](C(=O)O)N" -if smiles_stdin -of json

# Use FASTA as input
# Note that FASTA cannot indicate whether the N- and C-termini are capped.
# Most natural peptides have an acid at the C-terminus; synthetic peptides,
# however, may have a (non-ionizable) amide there.
pichemist -i "MNSERSDVTLY" -if fasta_stdin

# Use SDF as input
pichemist -i test/examples/payload_4.sdf -if sdf

# Output SDF
pichemist -i test/examples/payload_1.smi -o results.sdf -of sdf

# Output JSON to console
pichemist -i test/examples/payload_1.smi -of json

# Plot pH/Q curve
pichemist -i test/examples/payload_1.smi --plot_ph_q_curve

# Plot pH/Q curve (with custom prefix 'plot')
pichemist -i test/examples/payload_2.smi --plot_ph_q_curve -pp "plot"

# Print the pKa values of fragments
pichemist -i test/examples/payload_1.smi --print_fragment_pkas

# Use ACD instead of pKaMatcher
pichemist -i test/examples/payload_1.smi --method acd

# Use ACD with SMILES string
pichemist -i "NCCC(=O)N[C@@H](Cc1c[nH]cn1)C(=O)O" --method acd -if smiles_stdin
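The JSON output mode can also be driven from a script. The sketch below wraps the CLI with `subprocess`, using only flags shown in the examples above (-i, -if, -of, --method). The helper names `build_pichemist_command` and `run_pichemist` are hypothetical, and the code assumes that with -of json the command prints a single JSON document to standard output, which should be verified against your installed version.

```python
import json
import shutil
import subprocess
from typing import List


def build_pichemist_command(smiles: str, method: str = "pkamatcher") -> List[str]:
    """Assemble the CLI call for one SMILES string with JSON output."""
    return [
        "pichemist",
        "-i", smiles,
        "-if", "smiles_stdin",
        "-of", "json",
        "--method", method,
    ]


def run_pichemist(smiles: str, method: str = "pkamatcher"):
    """Run the CLI and parse stdout as JSON (assumes stdout holds only JSON)."""
    result = subprocess.run(
        build_pichemist_command(smiles, method),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Only attempt the call when the pichemist command is actually installed.
    if shutil.which("pichemist"):
        print(run_pichemist("C([C@@H](C(=O)O)N)SSC[C@@H](C(=O)O)N"))
```

Passing the arguments as a list avoids shell quoting issues with the brackets and parentheses that are common in SMILES strings.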

Examples of usage (Python API)

from pichemist.io import generate_input
from pichemist.api import pichemist_from_dict

smiles = "C[C@@H](NC(=O)[C@H](CCCCN)NC(=O)[C@](C)(CC(=O)O)NC(=O)[C@H](CCCN)NC(=O)[C@@H](N)Cc1ccccc1)C(=O)O"

args = {
    "input_data": smiles,
    "input_format": "smiles_stdin",
    "plot_ph_q_curve": False,
    "print_fragments": False,
    "method": "pkamatcher",
}

input_dict = generate_input(args["input_format"], args["input_data"])
output = pichemist_from_dict(
    input_dict, args["method"], args["plot_ph_q_curve"], args["print_fragments"]
)

print(output)
>
{1: {'mol_name': 'C[C@@H](NC(=O)[C@H](CCCCN)NC(=O)[C@](C)(CC(=O)O)NC(=O)[C@H](CCCN)NC(=O)[C@@H](N)Cc1ccccc1)C(=O)O', 'pI': {'IPC2_peptide': 8.046875, 'IPC_peptide': 9.8125, 'ProMoST': 8.375, 'Gauci': 8.6875, 'Grimsley': 8.9375, 'Thurlkill': 9.0625, 'Lehninger': 9.859375, 'Toseland': 9.40625, 'pI mean': 9.0234375, 'std': 1.721588565104915, 'err': 0.6086734743994516}, 'QpH7': {'IPC2_peptide': 0.6314906212267486, 'IPC_peptide': 0.9915539516610472, 'ProMoST': 0.26174063515548607, 'Gauci': 0.5540630760817584, 'Grimsley': 0.6645409545014482, 'Thurlkill': 0.797542965316429, 'Lehninger': 0.9932283675959863, 'Toseland': 0.9515959465104951, 'Q at pH7.4 mean': 0.7307195647561748, 'std': 0.6749606913955383, 'err': 0.23863464096007284}, 'pI_interval': (8.624999999999998, 9.362499999999997), 'pI_interval_threshold': 0.2, 'pKa_set': 'IPC2_peptide'}}
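The 'pI' block of the output mixes per-set predictions with summary entries ('pI mean', 'std', 'err'). The sketch below separates the two and recomputes the summary from the per-set values copied from the sample output above. In this sample, 'std' matches the square root of the summed squared deviations from the mean, and 'err' matches that value divided by the square root of the number of sets. This is an observation about this one sample, not a statement of documented behavior, and the helper name `per_set_values` is hypothetical.

```python
import math

# 'pI' block copied from the sample output above.
pi_block = {
    "IPC2_peptide": 8.046875, "IPC_peptide": 9.8125, "ProMoST": 8.375,
    "Gauci": 8.6875, "Grimsley": 8.9375, "Thurlkill": 9.0625,
    "Lehninger": 9.859375, "Toseland": 9.40625,
    "pI mean": 9.0234375, "std": 1.721588565104915, "err": 0.6086734743994516,
}

SUMMARY_KEYS = {"pI mean", "std", "err"}


def per_set_values(block):
    """Keep only the per-pKa-set predictions, dropping the summary entries."""
    return {name: value for name, value in block.items() if name not in SUMMARY_KEYS}


values = list(per_set_values(pi_block).values())
n = len(values)
mean = sum(values) / n
# Square root of the summed squared deviations (matches 'std' in this sample).
root_sum_sq = math.sqrt(sum((v - mean) ** 2 for v in values))
# Divided by sqrt(n) (matches 'err' in this sample).
err = root_sum_sq / math.sqrt(n)

print(mean, root_sum_sq, err)
```

Because of this, it may be safer to recompute any statistic you rely on from the per-set values rather than to assume a particular definition for the 'std' and 'err' entries.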

Contributions

For developers

  • To create a new build, first set the package version in setup.py, then run python setup.py sdist to build the distribution. Do not use bdist_wheel: in this setup.py, that mode does not include the required static files in the wheel distribution
  • The code can be tested automatically with python setup.py test, which requires pytest to be installed
  • Tests can also be run using the Makefile in the root of the repository. The file allows granular testing as follows:
    • make test_core runs only the core tests including pKaMatcher and plots
    • make test_acd runs only the ACD tests (which require an ACD license)
    • make test runs both core and acd tests
  • Before committing new code, please always check that the formatting is consistent using flake8
