Deep Learning for Proteomics
DLOmix
DLOmix is a Python framework for Deep Learning in Proteomics. It is initially built on top of TensorFlow/Keras; support for PyTorch can be integrated once the main API is established.
Usage
Experiment with a simple retention time prediction use-case using Google Colab
A version that includes experiment tracking with Weights and Biases is available here
Resources Repository
More learning resources can be found in the dlomix-resources repository.
Installation
Run the following to install:
$ pip install dlomix
If you would like to use Weights & Biases for experiment tracking and use the available reports for Retention Time under /notebooks, please install the optional wandb Python dependency with dlomix by running:
$ pip install dlomix[wandb]
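To confirm that the install step picked up the optional dependency, a quick check from Python can help. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of dlomix.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


# Report whether the core package and the optional wandb extra are available.
for name in ("dlomix", "wandb"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If wandb is reported as missing, re-running the pip command above with the [wandb] extra should resolve it.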
General Overview
- data: structures for modeling the input data, processing functions, and feature extractions based on Hugging Face datasets Dataset and DatasetDict
- eval: classes for evaluating models and reporting results
- layers: custom layers used for building models, based on tf.keras.layers.Layer
- losses: custom losses to be used for training with model.fit()
- models: common model architectures for the relevant use-cases based on tf.keras.Model to allow for using the Keras training API
- pipelines: an exemplary high-level pipeline implementation
- reports: classes for generating reports related to the different tasks
- constants.py: constants and configuration values
Use-cases
- Retention Time Prediction: a regression problem where the retention time of a peptide sequence is to be predicted.
- Fragment Ion Intensity Prediction: a multi-output regression problem where the intensity values for fragment ions are predicted given a peptide sequence along with some additional features.
- Peptide Detectability (Pfly) [4]: a multi-class classification problem where the detectability of a peptide is predicted given the peptide sequence.
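All three use-cases take a peptide sequence as model input, so the sequence first has to be turned into a fixed-length numeric vector. The sketch below illustrates one common way to do this with plain Python. It is not DLOmix's own encoding: the alphabet (20 standard amino acids, no modifications), the padding index 0, the default maximum length of 30, and the function name encode_peptide are all assumptions made for illustration.

```python
# Hypothetical vocabulary: the 20 standard amino acids; index 0 is reserved for padding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD_INDEX = 0
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS, start=1)}


def encode_peptide(sequence: str, max_len: int = 30) -> list[int]:
    """Map each residue to an integer index and right-pad to max_len."""
    if len(sequence) > max_len:
        raise ValueError(f"sequence longer than max_len={max_len}")
    unknown = [aa for aa in sequence if aa not in VOCAB]
    if unknown:
        raise ValueError(f"unsupported residues: {unknown}")
    indices = [VOCAB[aa] for aa in sequence]
    return indices + [PAD_INDEX] * (max_len - len(indices))


encoded = encode_peptide("PEPTIDE", max_len=10)
print(encoded)  # [13, 4, 13, 17, 8, 3, 4, 0, 0, 0]
```

The same encoded input would then be paired with a different target per task: a single retention time value, a vector of fragment ion intensities, or a class label for detectability.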
To-Do
Functionality:
- integrate Prosit
- integrate Hugging Face datasets
- extend data representation to include modifications
- add PTM features
- add residual plots to reporting, possibly other regression analysis tools
- output reporting results as PDF
- refactor reporting module to use W&B Report API (Retention Time)
- add additional detectability task
- extend pipeline for different types of models and backbones
- extend pipeline to allow for fine-tuning with custom datasets
Package structure:
- integrate deeplc.py into models.py, preferably introduce a package structure (e.g. models.retention_time)
- add references for implemented models in the README
- introduce formatting and precommit hooks
- plan documentation (sphinx and readthedocs)
- refactor following best practices for cleaner install
Developing DLOmix
To install dlomix, along with the tools needed to develop and run tests, run the following command in your virtualenv:
$ pip install -e .[dev]
References:
[Prosit]
[1] Gessulat, S., Schmidt, T., Zolg, D. P., Samaras, P., Schnatbaum, K., Zerweck, J., ... & Wilhelm, M. (2019). Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning. Nature methods, 16(6), 509-518.
[DeepLC]
[2] Bouwmeester, R., Gabriels, R., Hulstaert, N., Martens, L., & Degroeve, S. (2020). DeepLC can predict retention times for peptides that carry as-yet unseen modifications. bioRxiv, 2020.03.28.013003. https://doi.org/10.1101/2020.03.28.013003
[3] Bouwmeester, R., Gabriels, R., Hulstaert, N. et al. DeepLC can predict retention times for peptides that carry as-yet unseen modifications. Nat Methods 18, 1363–1369 (2021). https://doi.org/10.1038/s41592-021-01301-5
[Detectability - Pfly]
[4] Abdul-Khalek, N., Picciani, M., Wimmer, R., Overgaard, M. T., Wilhelm, M., & Gregersen Echers, S. (2024). To fly, or not to fly, that is the question: A deep learning model for peptide detectability prediction in mass spectrometry. bioRxiv, 2024-10.