LC-MS metabolomics data preprocessing

Project description

Asari

Python program for high-resolution LC-MS metabolomics data preprocessing, designed to be trackable and scalable.

  • Only for high-resolution data; prioritizes leveraging high mass resolution.
  • Simple peak detection based on local maxima and prominence.
  • Tracks peak quality and selectivity (on m/z, database, elution).
  • Reproducible; trackable from features to XICs.
  • Peaks of high quality and selectivity are aligned via formula mass and epdTrees.
  • Fast assembly and annotation of serum/plasma metabolomes based on a reference database.
  • Uses integers for RT scan numbers and intensities for computational efficiency.
  • Avoids mathematical curves where possible for computational efficiency.
  • Performance conscious; memory and CPU use are scalable.

Basic concepts follow https://github.com/shuzhao-li/metDataModel, as:

├── Experiment
│   ├── Sample
│   │   ├── MassTrack
│   │   │   ├── Peak
│   │   ├── MassTrack
│   │   │   ├── Peak
│   │   │   ├── Peak
│   │   ...
│   ├── Sample
│   ...
│   ├── Sample

A sample here is an injection in an LC-MS experiment. A MassTrack is an extracted ion chromatogram (EIC or XIC). A peak is specific to a sample, while a feature is defined per experiment.
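The hierarchy above can be sketched with Python dataclasses. This is a hypothetical minimal model for illustration; the field names are assumptions, not asari's actual classes (see metDataModel for the real definitions). Note that scan numbers and intensities are kept as integers, per the design points above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Peak:
    apex_scan: int          # RT expressed as an integer scan number
    apex_intensity: int     # intensities kept as integers for efficiency

@dataclass
class MassTrack:
    mz: float               # representative m/z of the track
    peaks: List[Peak] = field(default_factory=list)

@dataclass
class Sample:
    input_file: str         # one injection = one sample
    mass_tracks: List[MassTrack] = field(default_factory=list)

@dataclass
class Experiment:
    samples: List[Sample] = field(default_factory=list)
```

A peak lives inside one sample's mass track; a feature (not modeled here) would group corresponding peaks across all samples of the experiment.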

This uses mass2chem and JMS for mass search and annotation functions.

Algorithms

  • Chromatogram construction is based on m/z values via flexible bins and frequency counts (in lieu of histograms).
  • Peak detection is based on local maxima and prominence.
  • Align (correspondence) peaks of high selectivity in both measured data and in reference database via formula mass.
  • Use information on observed features and epdTrees in the first few samples to guide data extraction and assembly in the remaining data.
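The peak detection step can be sketched as follows. This is a minimal, pure-Python illustration of prominence-based peak picking, assuming an intensity trace indexed by scan number; asari's actual implementation differs in details (plateau handling, noise filtering, etc.).

```python
def _prominence(intensities, i):
    """Topographic prominence of the local maximum at index i:
    its height above the higher of the two lowest valleys separating
    it from higher terrain (or from the trace boundaries)."""
    v = intensities[i]
    left_min = v
    j = i - 1
    while j >= 0 and intensities[j] <= v:
        left_min = min(left_min, intensities[j])
        j -= 1
    right_min = v
    j = i + 1
    while j < len(intensities) and intensities[j] <= v:
        right_min = min(right_min, intensities[j])
        j += 1
    return v - max(left_min, right_min)

def detect_peaks(intensities, min_prominence):
    """Return apex indices of strict local maxima whose prominence
    meets min_prominence."""
    apexes = []
    for i in range(1, len(intensities) - 1):
        v = intensities[i]
        if v <= intensities[i - 1] or v <= intensities[i + 1]:
            continue  # not a strict local maximum
        if _prominence(intensities, i) >= min_prominence:
            apexes.append(i)
    return apexes
```

Raising `min_prominence` filters shoulder peaks and noise spikes while keeping well-separated apexes, which is what makes a simple local-maximum scan practical on real chromatograms.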

Each sample is checked for mass accuracy. Each sample carries a recorded mass-calibration function and an RT-calibration function.
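As a hypothetical sketch of what such a per-sample calibration function might look like, one could model the mass error as a constant ppm shift, estimated as the median over pairs of observed m/z matched to reference masses. The function name and the constant-shift model are assumptions for illustration, not asari's actual method.

```python
from statistics import median

def fit_mass_calibration(observed_mz, reference_mz):
    """Fit a per-sample mass-calibration function from matched pairs.

    Assumes the instrument error is a constant ppm shift; returns a
    function mapping an observed m/z back to a calibrated value.
    """
    ppm_shift = median(
        (obs - ref) / ref * 1e6 for obs, ref in zip(observed_mz, reference_mz)
    )
    return lambda mz: mz / (1 + ppm_shift * 1e-6)
```

The same pattern applies to RT calibration: fit a correction from landmarks shared with a reference, record it with the sample, and apply it when aligning across samples.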

Selectivity is tracked for

  • mSelectivity, how distinct are m/z measurements
  • cSelectivity, how distinct are chromatographic elution peaks
  • dSelectivity, how distinct are database records
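To make the idea concrete, here is one plausible way to score mSelectivity: penalize each neighboring m/z by a Gaussian of its distance in ppm, so identical masses score 0 and well-separated masses approach 1. The formula and parameter are assumptions for illustration; asari's exact definition may differ.

```python
import math

def m_selectivity(mz_values, index, ppm_tolerance=5.0):
    """Score how distinct mz_values[index] is from its neighbours,
    on a 0 (indistinguishable) to 1 (fully distinct) scale."""
    target = mz_values[index]
    penalty = 0.0
    for j, mz in enumerate(mz_values):
        if j == index:
            continue
        ppm = abs(mz - target) / target * 1e6
        penalty += math.exp(-0.5 * (ppm / ppm_tolerance) ** 2)
    return max(0.0, 1.0 - penalty)
```

cSelectivity and dSelectivity follow the same spirit, applied to elution-time neighbours and database records respectively.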

Use

Currently:

    python3 -m asari.main pos mydir/projectx_dir

The two arguments are ionization_mode and data_directory.

Next to-do

The reference DB is not finalized; add a SQLite DB for storage.

Repository

https://github.com/shuzhao-li/asari

Download files

Download the file for your platform.

Source Distribution

asari-metabolomics-0.9.10.tar.gz (6.0 MB)

Uploaded Source

Built Distribution

asari_metabolomics-0.9.10-py3-none-any.whl (12.3 MB)

Uploaded Python 3

File details

Details for the file asari-metabolomics-0.9.10.tar.gz.

File metadata

  • Download URL: asari-metabolomics-0.9.10.tar.gz
  • Upload date:
  • Size: 6.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.63.0 CPython/3.7.4

File hashes

Hashes for asari-metabolomics-0.9.10.tar.gz
Algorithm Hash digest
SHA256 b878fe8c8b0feb6fea5c95394a4167a754e0c253c22c7cddbbcd9880a7a137bb
MD5 cff430659a4baa0698301a2a0f1b528f
BLAKE2b-256 8d24ba6f1714ff9c046c58d9c5eec3072af3c212f5043d9c0696fe587a967acf


File details

Details for the file asari_metabolomics-0.9.10-py3-none-any.whl.

File metadata

  • Download URL: asari_metabolomics-0.9.10-py3-none-any.whl
  • Upload date:
  • Size: 12.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.63.0 CPython/3.7.4

File hashes

Hashes for asari_metabolomics-0.9.10-py3-none-any.whl
Algorithm Hash digest
SHA256 c53db630b881ace09b400871e2168336c94fcee1e6d17269ed8c63491d2732f9
MD5 e8fb082cf59d99feb6ce347ae63de6ea
BLAKE2b-256 e078805e5b216840deb7fe1d4f5c25bab3c20b0a8dcdd203755f1e752b3dad0a

