De novo sequencing for DIA mass spectrometry data

Cascadia

A transformer deep learning model for de novo sequencing of data-independent acquisition mass spectrometry data.

General information

Full documentation and further functionality are still a work in progress. A short demo for running our trained version of Cascadia on your data is available below. Please check back soon for an updated tool!

In the meantime, you can read our preprint here: https://www.biorxiv.org/content/10.1101/2024.06.03.597251v1

Thanks for your patience!

Demo

Dependencies

Cascadia requires the following packages (we recommend using a package manager such as conda to manage your Python environment):

  • lightning >= 2.0
  • numpy < 2.0
  • torch >= 2.1
  • pyteomics
  • tqdm
  • pandas
  • tensorboard

Currently, you need to clone the Cascadia GitHub repository and install the above dependencies yourself. Cascadia will be added to pip and conda soon; installing through either will then resolve all dependencies automatically and should take a couple of minutes.
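Because the dependencies are installed by hand for now, it can help to verify the environment before running anything. The following is a minimal sketch, not part of Cascadia: the helper names (release_tuple, satisfies, check_environment) are hypothetical, the distribution names are assumed to match the package names in the list above, and the version comparison is deliberately simplified (it looks only at the leading numeric part of a version string and ignores pre-release tags).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Constraints copied from the dependency list; None means "any version".
REQUIREMENTS = {
    "lightning": (">=", "2.0"),
    "numpy": ("<", "2.0"),
    "torch": (">=", "2.1"),
    "pyteomics": None,
    "tqdm": None,
    "pandas": None,
    "tensorboard": None,
}


def release_tuple(text):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def satisfies(installed, constraint):
    """Check an installed version against a (operator, version) constraint."""
    if constraint is None:
        return True
    operator, required = constraint
    a, b = release_tuple(installed), release_tuple(required)
    # Pad with zeros so that "2.0" and "2.0.0" compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b if operator == ">=" else a < b


def check_environment(requirements=REQUIREMENTS):
    """Map each package name to 'ok', 'missing', or 'incompatible (<version>)'."""
    report = {}
    for name, constraint in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = satisfies(installed, constraint)
        report[name] = "ok" if ok else f"incompatible ({installed})"
    return report


for name, status in check_environment().items():
    print(f"{name}: {status}")
```

Any package reported as missing or incompatible can then be installed or pinned with your package manager before running the demo.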

Run de novo sequencing on new data with a trained model

To run a pretrained Cascadia model on a new dataset, you just need to provide an mzML file. The following example on a small demo dataset should take approximately 5 minutes to run:

python3 cascadia.py \
  --mode sequence \
  --t demo.mzML  \
  --checkpoint cascadia.ckpt \
  --out demo_results

The demo dataset and model checkpoint are available here. For larger inference jobs, we recommend reducing runtime by using a GPU and setting the batch size to the largest value that still fits in GPU memory.
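If you have several mzML files, the command above can be generated once per file from a small script. This is only a sketch under stated assumptions: build_command and run_all are hypothetical helpers rather than part of Cascadia, the script is assumed to be run from the directory containing cascadia.py, and the output name is assumed to be derived from the input file name in the same way as in the demo (demo.mzML to demo_results). Only the flags shown in the demo command are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_command(mzml_path, checkpoint="cascadia.ckpt", script="cascadia.py"):
    """Build the demo command for one mzML file as an argument list."""
    mzml_path = Path(mzml_path)
    return [
        "python3", script,
        "--mode", "sequence",
        "--t", str(mzml_path),
        "--checkpoint", str(checkpoint),
        # Assumption: name the output after the input file, as in the demo.
        "--out", f"{mzml_path.stem}_results",
    ]


def run_all(mzml_paths, dry_run=True, **kwargs):
    """Print each command (dry run) or execute the commands one after another."""
    for path in mzml_paths:
        command = build_command(path, **kwargs)
        if dry_run:
            print(shlex.join(command))
        else:
            # check=True stops the batch at the first failing run.
            subprocess.run(command, check=True)


# Dry run: only prints the command for the demo file.
run_all(["demo.mzML"])
```

Calling run_all with dry_run=False executes the commands sequentially, which keeps a single GPU from being shared between runs.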

The output is an mzTab text file containing one row for each Cascadia prediction. The relevant columns are:

  • predicted_sequence
  • predicted_score
  • precursor_mz
  • precursor_charge
  • precursor_rt
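The columns listed above can be loaded and filtered with a few lines of Python. The sketch below uses only the standard library and rests on assumptions about the file layout that should be checked against a real output file: that metadata lines start with "MTD", that the first remaining line is a tab-separated header containing the column names above, and that a higher predicted_score means a more confident prediction. The function name read_predictions, the score threshold, and the sample rows are made up for illustration.

```python
import csv
import io


def read_predictions(handle, min_score=0.0):
    """Yield one dict per prediction row with a score of at least min_score."""
    # Assumption: skip blank lines and mzTab-style metadata ("MTD") lines;
    # the first remaining line is treated as the header row.
    lines = (
        line for line in handle
        if line.strip() and not line.startswith("MTD")
    )
    for row in csv.DictReader(lines, delimiter="\t"):
        score = float(row["predicted_score"])
        if score < min_score:
            continue
        yield {
            "predicted_sequence": row["predicted_sequence"],
            "predicted_score": score,
            "precursor_mz": float(row["precursor_mz"]),
            "precursor_charge": int(row["precursor_charge"]),
            "precursor_rt": float(row["precursor_rt"]),
        }


# Made-up rows, used only to show the call; real files come from Cascadia.
sample = (
    "MTD\tmzTab-version\t1.0\n"
    "predicted_sequence\tpredicted_score\tprecursor_mz\tprecursor_charge\tprecursor_rt\n"
    "PEPTIDEK\t0.95\t450.25\t2\t1200.5\n"
    "ACDEFGHK\t0.40\t390.10\t3\t800.0\n"
)
for prediction in read_predictions(io.StringIO(sample), min_score=0.8):
    print(prediction["predicted_sequence"], prediction["predicted_score"])
```

For a real run, pass an open file handle for the Cascadia output in place of the io.StringIO object.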
