Peptide classifier for ChEBI / PubChem

ChemLog is a framework for rule-based ontology extension. This repository implements a classification of peptides on the ChEBI and PubChem datasets.

How are peptides classified?

Four classification methods are implemented:

  1. Using Monadic Second-Order Logic (MSOL) formulas with the MSOL model finder MONA.
  2. Translating the MSOL model finding problem into a QBF satisfiability problem and solving it with CAQE or DepQBF, using the Bloqqer preprocessor.
  3. Translating the MSOL model finding problem partially into First-Order Logic (FOL) and solving it with a custom FOL model checker. Since not all MSOL axioms are translatable, the non-translatable parts are computed algorithmically.
  4. Using an algorithmic implementation.

If you are just interested in the results, we recommend choosing the algorithmic implementation, as it is the fastest and can handle complex molecules.

The classification covers the following aspects:

  1. Number of amino acids (up to 10, except for the algorithmic method, which covers arbitrary sizes)
  2. Charge category (either salt, anion, cation, zwitterion or neutral)
  3. Proteinogenic amino acids present
  4. Emericellamides and 2,5-diketopiperazines
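
As a rough illustration of the charge categories listed above, the following sketch assigns a category from per-atom formal charges. It is not ChemLog's implementation: the function name charge_category, the input format (one list of formal charges per connected fragment) and the decision rules are assumptions made for this example, and ChemLog's own definitions may be stricter.

```python
def charge_category(fragments):
    """Return a charge category for a molecule.

    `fragments` is a hypothetical input format: one list of per-atom
    formal charges for each connected fragment of the molecule.
    The rules below are simplifying assumptions, not ChemLog's definitions.
    """
    net_per_fragment = [sum(charges) for charges in fragments]

    # Assumed rule: separate fragments with opposite net charges form a salt.
    if any(c > 0 for c in net_per_fragment) and any(c < 0 for c in net_per_fragment):
        return "salt"

    total = sum(net_per_fragment)
    if total > 0:
        return "cation"
    if total < 0:
        return "anion"

    # Net charge is zero: distinguish zwitterions from uncharged molecules.
    atom_charges = [q for charges in fragments for q in charges]
    if any(q > 0 for q in atom_charges) and any(q < 0 for q in atom_charges):
        return "zwitterion"
    return "neutral"
```

For example, a single fragment with one positively and one negatively charged atom is reported as a zwitterion, while the same two charges on separate fragments are reported as a salt.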

ChemLog will also return the ChEBI classes that match this classification. Currently supported are:

ChEBI ID  Name
16670     peptide
60194     peptide cation
60334     peptide anion
60466     peptide zwitterion
25676     oligopeptide
46761     dipeptide
47923     tripeptide
48030     tetrapeptide
48545     pentapeptide
15841     polypeptide
90799     dipeptide zwitterion
155837    tripeptide zwitterion
64372     emericellamide
65061     2,5-diketopiperazines
24866     salt
25696     organic anion
25697     organic cation
27369     zwitterion
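
To show how a classification result could be turned into ChEBI classes, here is a minimal sketch that maps a number of amino acids and a charge category to IDs from the list above. The IDs are taken from that list, but the function name matching_chebi_ids, the oligopeptide/polypeptide threshold of 10 and the selection rules are assumptions for illustration; ChemLog's actual mapping may differ.

```python
# ChEBI IDs copied from the list of supported classes.
PEPTIDE, OLIGOPEPTIDE, POLYPEPTIDE = 16670, 25676, 15841
SIZE_CLASSES = {2: 46761, 3: 47923, 4: 48030, 5: 48545}
CHARGED_PEPTIDE = {"cation": 60194, "anion": 60334, "zwitterion": 60466}
SIZED_ZWITTERION = {2: 90799, 3: 155837}
GENERIC_CHARGE = {"salt": 24866, "anion": 25696, "cation": 25697, "zwitterion": 27369}

# Assumption: molecules with at least this many amino acids count as polypeptides.
POLYPEPTIDE_MIN = 10


def matching_chebi_ids(n_amino_acids, charge_category):
    """Return ChEBI IDs for a (size, charge) result under the assumed rules."""
    ids = []
    if n_amino_acids < 2:
        # Not a peptide: only a generic charge class can apply.
        if charge_category in GENERIC_CHARGE:
            ids.append(GENERIC_CHARGE[charge_category])
        return ids

    if charge_category == "neutral":
        ids.append(PEPTIDE)
        ids.append(POLYPEPTIDE if n_amino_acids >= POLYPEPTIDE_MIN else OLIGOPEPTIDE)
        if n_amino_acids in SIZE_CLASSES:
            ids.append(SIZE_CLASSES[n_amino_acids])
    elif charge_category in CHARGED_PEPTIDE:
        ids.append(CHARGED_PEPTIDE[charge_category])
        if charge_category == "zwitterion" and n_amino_acids in SIZED_ZWITTERION:
            ids.append(SIZED_ZWITTERION[n_amino_acids])
    elif charge_category == "salt":
        ids.append(GENERIC_CHARGE["salt"])
    return ids
```

Under these assumptions, a neutral molecule with three amino acids maps to peptide, oligopeptide and tripeptide, and a zwitterion with two amino acids maps to peptide zwitterion and dipeptide zwitterion.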

All implementations are based on the same natural language definitions and have been developed jointly. Therefore, all methods are expected to yield the same result. If you observe diverging results, please open an issue.

If you face problems using ChemLog or have other questions, feel free to open an issue as well.
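
Since all methods are expected to agree, a simple consistency check can help when preparing an issue report. The sketch below compares per-molecule labels across strategies. The input structure (a dictionary from strategy name to a dictionary from molecule ID to label) is hypothetical and does not reflect ChemLog's actual JSON layout, and the strategy name "mona" in the usage example is likewise made up; only "algo" appears in this README.

```python
def find_disagreements(results_by_strategy):
    """Return molecules whose labels differ between strategies.

    `results_by_strategy` is a hypothetical structure:
    {strategy_name: {molecule_id: label}}. A molecule missing from one
    strategy's results is treated as having the label None there.
    """
    all_ids = set()
    for results in results_by_strategy.values():
        all_ids.update(results)

    disagreements = {}
    for mol_id in sorted(all_ids):
        labels = {
            strategy: results.get(mol_id)
            for strategy, results in results_by_strategy.items()
        }
        if len(set(labels.values())) > 1:
            disagreements[mol_id] = labels
    return disagreements


if __name__ == "__main__":
    example = {
        "algo": {1: "dipeptide", 2: "tripeptide"},
        "mona": {1: "dipeptide", 2: "tetrapeptide"},
    }
    print(find_disagreements(example))
```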

Installation

Download the source code from this repository.

Install with

pip install .

If you want to use the MONA reasoner, you have to install it separately (the classifier expects the mona command to be available).
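
Before choosing the MONA-based method, it can be useful to verify that the mona command is on your PATH. The following is a small sketch using only the Python standard library; the helper name mona_available is introduced here for illustration and is not part of ChemLog.

```python
import shutil


def mona_available(command="mona"):
    """Return True if `command` can be found on the PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    if mona_available():
        print("MONA found: the MSOL-based method can be used.")
    else:
        print("MONA not found: install it separately or use another method.")
```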

Run the classification

ChemLog provides a command line interface for the classification. Each run writes its results in JSON format, alongside a log and a config file. Currently, classification of ChEBI and PubChem data is supported. Download and preprocessing of the data are handled automatically. For instance, the following command classifies the 1,000 smallest peptides in ChEBI with the algorithmic method:

python -m chemlog classify-chebi --chebi-version 239 --strategy algo --only-peptides --n-molecules 1000

For more details on the available command line options run

python -m chemlog --help
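
The classify-chebi call shown above can also be launched from a script. The sketch below assembles the same command with the flags from that example and runs it with subprocess. The helper names build_classify_chebi_command and run_classification are made up for this example, and strategy values other than "algo" are not listed here, so check the --help output before passing anything else.

```python
import subprocess
import sys


def build_classify_chebi_command(chebi_version=239, strategy="algo",
                                 n_molecules=1000, only_peptides=True):
    """Build the argument list for the classify-chebi command shown above."""
    cmd = [
        sys.executable, "-m", "chemlog", "classify-chebi",
        "--chebi-version", str(chebi_version),
        "--strategy", strategy,
    ]
    if only_peptides:
        cmd.append("--only-peptides")
    cmd += ["--n-molecules", str(n_molecules)]
    return cmd


def run_classification(**kwargs):
    """Run the classification; requires ChemLog to be installed.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    return subprocess.run(build_classify_chebi_command(**kwargs), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems and keeps the call portable across platforms.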

Download files

Source distribution: chemlog-1.0.5.tar.gz (69.4 kB)
Built distribution: chemlog-1.0.5-py3-none-any.whl (82.8 kB, Python 3)

Both files were published by python-publish.yml on sfluegel05/chemlog-peptides.