Deep Learning for Proteomics

DLOmix

DLOmix is a Python framework for deep learning in proteomics. It was initially built on top of TensorFlow/Keras; support for PyTorch can be integrated once the main API is established.

Usage

Experiment with a simple retention time prediction use-case in a Google Colab notebook.

A version that includes experiment tracking with Weights & Biases is also available as a Colab notebook.

Resources Repository

More learning resources can be found in the dlomix-resources repository.

Installation

Run the following to install:

$ pip install dlomix

To use Weights & Biases for experiment tracking and the Retention Time reports available under /notebooks, install the optional wandb Python dependency together with dlomix:

$ pip install "dlomix[wandb]"
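
After installing, it can help to confirm what is actually present in the environment. The snippet below is a minimal sketch that relies only on the Python standard library; the helper names `installed_version` and `has_module` are illustrative and are not part of DLOmix.

```python
from importlib import metadata, util


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_module(name):
    """Return True if a top-level module with this name can be found for import."""
    return util.find_spec(name) is not None


# Report the state of the current environment (output depends on what is installed).
print("dlomix version:", installed_version("dlomix"))
print("wandb available:", has_module("wandb"))
```

If `wandb available` prints `False`, the optional extra has not been installed in the active environment.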

General Overview

  • data: structures for modeling the input data, processing functions, and feature extractions based on Hugging Face datasets Dataset and DatasetDict
  • eval: classes for evaluating models and reporting results
  • layers: custom layers used for building models, based on tf.keras.layers.Layer
  • losses: custom losses to be used for training with model.fit()
  • models: common model architectures for the relevant use-cases based on tf.keras.Model to allow for using the Keras training API
  • pipelines: an exemplary high-level pipeline implementation
  • reports: classes for generating reports related to the different tasks
  • constants.py: constants and configuration values

Use-cases

  • Retention Time Prediction:
    • a regression problem where the retention time of a peptide sequence is to be predicted.
  • Fragment Ion Intensity Prediction:
    • a multi-output regression problem where the intensity values for fragment ions are predicted given a peptide sequence along with some additional features.
  • Peptide Detectability (Pfly) [4]:
    • a multi-class classification problem where the detectability of a peptide is predicted given the peptide sequence.
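
All three use-cases take a peptide sequence as input. As a rough illustration of what that can look like before a model sees it, the sketch below maps each residue to an integer index and pads to a fixed length. This is a generic preprocessing idea, not DLOmix's own encoding: the alphabet, the padding index 0, the default `max_length`, and the function name `encode_peptide` are all assumptions made for this example.

```python
# Assumed alphabet: the 20 standard amino acids, without modifications.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Index 0 is reserved for padding, so residues start at 1.
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_peptide(sequence, max_length=30):
    """Encode a peptide as a list of integer indices, right-padded with zeros."""
    if len(sequence) > max_length:
        raise ValueError(f"sequence longer than max_length={max_length}")
    unknown = [aa for aa in sequence if aa not in AA_TO_INDEX]
    if unknown:
        raise ValueError(f"unknown residues: {unknown}")
    indices = [AA_TO_INDEX[aa] for aa in sequence]
    return indices + [0] * (max_length - len(indices))


print(encode_peptide("ACD", max_length=5))
```

In such a setup the target would differ by task: a single number for retention time, a vector of intensities for fragment ions, and a class label for detectability.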

To-Do

Functionality:

  • integrate Prosit
  • integrate Hugging Face datasets
  • extend data representation to include modifications
  • add PTM features
  • add residual plots to reporting, possibly other regression analysis tools
  • output reporting results as PDF
  • refactor reporting module to use W&B Report API (Retention Time)
  • add additional detectability task
  • extend pipeline for different types of models and backbones
  • extend pipeline to allow for fine-tuning with custom datasets

Package structure:

  • integrate deeplc.py into models.py, preferably introduce a package structure (e.g. models.retention_time)
  • add references for implemented models in the README
  • introduce formatting and pre-commit hooks
  • plan documentation (Sphinx and Read the Docs)
  • refactor following best practices for cleaner install

Developing DLOmix

To install dlomix, along with the tools needed to develop and run tests, run the following command in your virtualenv:

$ pip install -e ".[dev]"

References:

[Prosit]

[1] Gessulat, S., Schmidt, T., Zolg, D. P., Samaras, P., Schnatbaum, K., Zerweck, J., ... & Wilhelm, M. (2019). Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning. Nature methods, 16(6), 509-518.

[DeepLC]

[2] Bouwmeester, R., Gabriels, R., Hulstaert, N., Martens, L., & Degroeve, S. (2020). DeepLC can predict retention times for peptides that carry as-yet unseen modifications. bioRxiv, 2020.03.28.013003. https://doi.org/10.1101/2020.03.28.013003

[3] Bouwmeester, R., Gabriels, R., Hulstaert, N., Martens, L., & Degroeve, S. (2021). DeepLC can predict retention times for peptides that carry as-yet unseen modifications. Nature methods, 18, 1363-1369. https://doi.org/10.1038/s41592-021-01301-5

[Detectability - Pfly]

[4] Abdul-Khalek, N., Picciani, M., Wimmer, R., Overgaard, M. T., Wilhelm, M., & Gregersen Echers, S. (2024). To fly, or not to fly, that is the question: A deep learning model for peptide detectability prediction in mass spectrometry. bioRxiv, 2024-10.
