Parser for mwtab files from the Metabolomics Workbench
The mwtab package is a Python library that facilitates reading and writing files in mwTab format used by the Metabolomics Workbench for archival of Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) experimental data.
The mwtab package can be used in several ways:
- As a library for accessing and manipulating data stored in mwTab format files.
- As a command-line tool to convert between mwTab format and its equivalent JSON representation.
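To give a feel for what an "equivalent JSON representation" of tab-delimited key-value text can look like, here is a minimal sketch that uses only the Python standard library. It does not use the mwtab package and is not how the library parses files. The sample text, the function name `to_json_sketch`, and the simplified rule (a `#BLOCK` header line followed by `PREFIX:KEY<tab>value` lines) are assumptions made purely for illustration.

```python
import json
from collections import OrderedDict

# Hypothetical, heavily simplified sample; real mwTab files contain many more
# blocks and line types than this sketch handles.
SAMPLE = (
    "#PROJECT\n"
    "PR:PROJECT_TITLE\tExample title\n"
    "PR:INSTITUTE\tExample institute\n"
    "#STUDY\n"
    "ST:STUDY_TITLE\tExample study\n"
)


def to_json_sketch(text):
    """Group 'PREFIX:KEY<tab>value' lines under the preceding '#BLOCK' header."""
    blocks = OrderedDict()
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            # A header line opens a new block.
            current = line[1:].strip()
            blocks[current] = OrderedDict()
        elif current is not None and "\t" in line:
            key, value = line.split("\t", 1)
            # Drop the block prefix (e.g. 'PR:') from the key.
            key = key.split(":", 1)[-1].strip()
            blocks[current][key] = value.strip()
    return json.dumps(blocks, indent=4)


print(to_json_sketch(SAMPLE))
```

For real files, use the package itself rather than a hand-written parser like this one.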
When using the mwtab package in published work, please cite the following paper:
- Smelter, Andrey and Hunter NB Moseley. “A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository.” Metabolomics 2018, 14(5): 64. doi: 10.1007/s11306-018-1356-6.
Install on Linux, Mac OS X
python3 -m pip install mwtab
Install on Windows
py -3 -m pip install mwtab
Upgrade on Linux, Mac OS X
python3 -m pip install mwtab --upgrade
Upgrade on Windows
py -3 -m pip install mwtab --upgrade
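After installing or upgrading, one way to confirm which version is present is to query the package metadata from Python. The sketch below relies only on the standard library (`importlib.metadata`, available in Python 3.8 and later); the helper name `installed_version` is hypothetical and not part of mwtab.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("mwtab")
print("mwtab version:", version if version else "not installed")
```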
>>> import mwtab
>>>
>>> # Here we use ANALYSIS_ID of file to fetch data from URL
>>> for mwfile in mwtab.read_files("1", "2"):
...     print("STUDY_ID:", mwfile.study_id)
...     print("ANALYSIS_ID:", mwfile.analysis_id)
...     print("SOURCE:", mwfile.source)
...     print("Blocks:", list(mwfile.keys()))
>>>
Read the User Guide and the mwtab Tutorial on ReadTheDocs to learn more and to see code examples of using mwtab as a library and as a command-line tool.
File Formatting Issues
Currently, 5 files fail to parse because of the following formatting issues:
- extra tab character on line 360 (‘MS_ALL_DATA:UNITS \t\t’)
- ST:EMAIL line is broken on line 53, 54 (‘ST:EMAIL email@example.com’)
- extra tab on line 155 (‘NMR_BINNED_DATA:UNITS\tppm\t’)
- extra tab character on line 135 (‘CH:CHROMATOGRAPHY_SUMMARY \t\tThe gradient composition was changed linearly from 50% to 100% solvent B’)
- extra tab character on lines 61-78 (‘SP:SAMPLEPREP_SUMMARY \tPreparation of SPE on vacuum manifold: 1.\tClean 60 mg Oasis HLB (Waters) spe …’)
- Header line is broken into two lines on lines 1-2 (‘#METABOLOMICS WORKBENCH hover_20170726_173354 DATATRACK_ID:1171\n STUDY_ID:ST000902 ANALYSIS_ID:AN001468’)
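Problems of the kind listed above can be located before parsing with a simple pre-check. The following is a rough sketch under stated assumptions, not part of the mwtab package: the function name `find_suspect_lines`, the sample text, and the three heuristics (a trailing tab, two consecutive tabs, a first line that starts with `#METABOLOMICS WORKBENCH` but lacks `ANALYSIS_ID:`) are illustrative only. Consecutive tabs can be legitimate in data tables with empty cells, so the results are candidates for manual review rather than definite errors.

```python
# Hypothetical sample reproducing three of the issue types described above.
SAMPLE = (
    "#METABOLOMICS WORKBENCH DATATRACK_ID:0000\n"
    "STUDY_ID:ST000000 ANALYSIS_ID:AN000000\n"
    "MS_ALL_DATA:UNITS\t\t\n"
    "CH:CHROMATOGRAPHY_SUMMARY\t\tSome text\n"
    "ST:STUDY_TITLE\tFine line\n"
)


def find_suspect_lines(text):
    """Return (line_number, reason) pairs for lines worth a manual look."""
    findings = []
    lines = text.split("\n")
    for number, line in enumerate(lines, start=1):
        if line.endswith("\t"):
            findings.append((number, "trailing tab"))
        elif "\t\t" in line:
            findings.append((number, "consecutive tabs"))
    # Assumption: an intact header keeps ANALYSIS_ID on the first line.
    header = lines[0] if lines else ""
    if header.startswith("#METABOLOMICS WORKBENCH") and "ANALYSIS_ID:" not in header:
        findings.append((1, "header may be split across lines"))
    return findings


for number, reason in find_suspect_lines(SAMPLE):
    print("line {}: {}".format(number, reason))
```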
This package is distributed under the BSD license.