De novo sequencing for DIA mass spectrometry data
Cascadia
A transformer deep learning model for de novo sequencing of data-independent acquisition mass spectrometry data.
General information
Full documentation and further functionality are still a work in progress. A short demo for running our trained version of Cascadia on your data is available below. Please check back soon for an updated tool!
In the meantime, you can read our preprint here: https://www.biorxiv.org/content/10.1101/2024.06.03.597251v1
Thanks for your patience!
Demo
Dependencies
Cascadia requires the following packages. We recommend using a package manager such as conda to manage your Python environment:
- lightning >= 2.0
- numpy < 2.0
- torch >= 2.1
- pyteomics
- tqdm
- pandas
- tensorboard
Currently, you need to clone the Cascadia GitHub repository and install the above dependencies yourself. Cascadia will be added to pip and conda soon, after which installation, including all dependencies, should be automatic and take a couple of minutes.
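Since the dependencies are installed by hand for now, a quick check of the environment can save a failed run. The following is a minimal sketch, not part of Cascadia: the helper names (`parse_version`, `satisfies`, `check_environment`) are hypothetical, it assumes each package was installed under the distribution name given in the list above, and it only compares the major and minor version numbers.

```python
from importlib import metadata

# Version constraints copied from the dependency list above;
# None means the list gives no version bound.
REQUIREMENTS = {
    "lightning": (">=", (2, 0)),
    "numpy": ("<", (2, 0)),
    "torch": (">=", (2, 1)),
    "pyteomics": None,
    "tqdm": None,
    "pandas": None,
    "tensorboard": None,
}


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into (major, minor)."""
    parts = []
    for piece in text.split(".")[:2]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def satisfies(installed, constraint):
    """Check an installed version string against a ('>=' or '<', bound) pair."""
    if constraint is None:
        return True
    op, bound = constraint
    version = parse_version(installed)
    return version >= bound if op == ">=" else version < bound


def check_environment(requirements=REQUIREMENTS):
    """Return {package: problem} for packages that are missing or out of range."""
    problems = {}
    for name, constraint in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not satisfies(installed, constraint):
            op, bound = constraint
            problems[name] = f"found {installed}, need {op} {bound[0]}.{bound[1]}"
    return problems


if __name__ == "__main__":
    # Prints nothing when every listed dependency is present and in range.
    for name, problem in check_environment().items():
        print(f"{name}: {problem}")
```

An empty result from `check_environment()` means every listed package was found with an acceptable version.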
Run de novo sequencing on new data with a trained model
To run a pretrained Cascadia model on a new dataset, you just need to provide an mzML file. The following example on a small demo dataset should take approximately 5 minutes to run:
python3 cascadia.py \
--mode sequence \
--t demo.mzML \
--checkpoint cascadia.ckpt \
--out demo_results
The demo dataset and model checkpoint are available here. For larger inference jobs, we recommend using a GPU and setting the batch size to the largest value that still fits in GPU memory, which reduces runtime.
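For jobs that cover several mzML files, the demo command can be repeated from a small script. This is a sketch under assumptions rather than a supported interface: `build_command` and `sequence_directory` are hypothetical helpers, they use only the flags shown in the demo command, and they assume `cascadia.py` is in the current working directory. Each output name follows the demo's pattern of the input name plus `_results`.

```python
import subprocess
from pathlib import Path


def build_command(mzml_path, checkpoint, out_prefix):
    """Assemble the demo command for one mzML file, using only the demo's flags."""
    return [
        "python3", "cascadia.py",
        "--mode", "sequence",
        "--t", str(mzml_path),
        "--checkpoint", str(checkpoint),
        "--out", str(out_prefix),
    ]


def sequence_directory(data_dir, checkpoint, dry_run=True):
    """Build one command per *.mzML file in data_dir and run it unless dry_run."""
    commands = []
    for mzml_path in sorted(Path(data_dir).glob("*.mzML")):
        # demo.mzML -> demo_results, written next to the input file
        out_prefix = mzml_path.with_name(mzml_path.stem + "_results")
        command = build_command(mzml_path, checkpoint, out_prefix)
        commands.append(command)
        if not dry_run:
            # check=True stops the loop at the first failing run
            subprocess.run(command, check=True)
    return commands
```

With `dry_run=True` (the default here) nothing is executed, so the returned list can be inspected before starting a long job.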
The output is an mzTab text file containing one row for each Cascadia prediction. The relevant columns are:
- predicted_sequence
- predicted_score
- precursor_mz
- precursor_charge
- precursor_rt
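The columns above can be pulled out of the output with a few lines of Python. The sketch below makes several assumptions that the description above does not confirm: that the file is tab-separated, that the column names appear in a single header row (any lines before it, such as mzTab metadata, are skipped), and that a higher `predicted_score` means a more confident prediction. The function names are hypothetical.

```python
import csv
import io

# Column names taken from the list above.
COLUMNS = [
    "predicted_sequence",
    "predicted_score",
    "precursor_mz",
    "precursor_charge",
    "precursor_rt",
]


def read_predictions(handle):
    """Yield one dict per prediction row from a tab-separated text stream.

    Lines before the header row (for example metadata lines) are skipped.
    """
    header = None
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if header is None:
            # Assumption: the header row is the first line naming this column.
            if "predicted_sequence" in fields:
                header = fields
            continue
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        row = dict(zip(header, fields))
        yield {name: row[name] for name in COLUMNS if name in row}


def filter_by_score(rows, min_score):
    """Keep rows whose predicted_score is at least min_score."""
    return [r for r in rows if float(r["predicted_score"]) >= min_score]
```

To use it, open the output file in text mode, pass the handle to `read_predictions`, and pass the resulting rows to `filter_by_score` with a threshold suited to your data.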