Download and parse curated HDX-MS datasets

HDXMS Datasets

Welcome to the HDXMS datasets repository.

The hdxms-datasets package provides tools for handling HDX-MS datasets.

The package offers the following features:

Defining datasets and their experimental metadata
Verification of datasets and metadata
Loading datasets from a local or remote database
Conversion of datasets from various formats (e.g., DynamX, HDExaminer) to a standardized format
Propagation of standard deviations from replicates to fractional relative uptake values
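
As an illustration of the last feature, the sketch below propagates replicate standard deviations into a fractional uptake value. It is not taken from the package source: it assumes fractional uptake is the ratio of uptake to fully deuterated uptake, that the two errors are independent, and that first-order error propagation is used. The function name `fractional_uptake` and the example numbers are hypothetical.

```python
import math


def fractional_uptake(uptake, uptake_sd, fd_uptake, fd_uptake_sd):
    """Return (fractional uptake, propagated SD) for the ratio uptake / fd_uptake.

    Assumes independent errors and first-order (Gaussian) error propagation.
    """
    if fd_uptake <= 0:
        raise ValueError("fd_uptake must be positive")

    frac = uptake / fd_uptake
    # Partial derivatives of u / fd: 1 / fd with respect to u, -u / fd**2 with respect to fd
    frac_sd = math.sqrt(
        (uptake_sd / fd_uptake) ** 2
        + (uptake * fd_uptake_sd / fd_uptake**2) ** 2
    )
    return frac, frac_sd


# Hypothetical peptide: 2.0 +/- 0.1 Da uptake, 8.0 +/- 0.2 Da fully deuterated uptake
frac, frac_sd = fractional_uptake(2.0, 0.1, 8.0, 0.2)
```

Writing the formula in terms of the partial derivatives, rather than relative errors, keeps it defined when the measured uptake is zero.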

A database for open HDX datasets is set up at HDXMS DataBase.

An example front-end, instaGibbs, is also available. It features real-time estimation of HDX-MS ΔG values.

Installation

pip install hdxms-datasets

Example Usage

Loading datasets

from hdxms_datasets import DataBase

db = DataBase('path/to/local_db')
dataset = db.get_dataset('HDX_D9096080')

# Protein identifier information
print(dataset.protein_identifiers.uniprot_entry_name)
#> 'SECB_ECOLI'

# Access HDX states 
print([state.name for state in dataset.states])
#> ['Tetramer', 'Dimer']

# Get the sequence of the first state
state = dataset.states[0]
print(state.protein_state.sequence)
#> 'MSEQNNTEMTFQIQRIYT...'

# Load peptides
peptides = state.peptides[0]

# Access peptide information
print(peptides.deuteration_type, peptides.pH, peptides.temperature)
#> DeuterationType.partially_deuterated 8.0 303.15

# Load the peptide table as a standardized narwhals DataFrame
df = peptides.load(
    convert=True,  # convert column header names to the open HDX standard
    aggregate=True,  # aggregate centroids / uptake values across replicates
)

print(df.columns)
#> ['start', 'end', 'sequence', 'state', 'exposure', 'centroid_mz', 'rt', 'rt_sd', 'uptake', ...
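
This README does not spell out what `aggregate=True` does internally. As a rough mental model only, the sketch below groups replicate measurements by peptide and exposure and reports the mean uptake with its sample standard deviation. The row layout, the function `aggregate_replicates`, and the `n_replicates` key are hypothetical and use only the standard library.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical replicate-level rows: (start, end, exposure, uptake)
replicates = [
    (1, 10, 10.0, 2.0),
    (1, 10, 10.0, 2.2),
    (1, 10, 10.0, 2.4),
    (5, 15, 10.0, 3.0),
]


def aggregate_replicates(rows):
    """Group rows by (start, end, exposure) and summarize uptake per group."""
    groups = defaultdict(list)
    for start, end, exposure, uptake in rows:
        groups[(start, end, exposure)].append(uptake)

    aggregated = {}
    for key, values in groups.items():
        # The sample standard deviation is undefined for a single replicate
        sd = stdev(values) if len(values) > 1 else None
        aggregated[key] = {
            "uptake": mean(values),
            "uptake_sd": sd,
            "n_replicates": len(values),
        }
    return aggregated


result = aggregate_replicates(replicates)
```

Peptides measured only once keep their uptake value but get no standard deviation, which makes missing replicate information explicit downstream.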

Define and process datasets

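from pathlib import Path

from hdxms_datasets import ProteinState, Peptides, verify_sequence, merge_peptides, compute_uptake_metrics
# PeptideFormat and DeuterationType are used below; importing them from the package
# top level is an assumption, so adjust the path if your version exposes them elsewhere
from hdxms_datasets import DeuterationType, PeptideFormat

# Hypothetical directory holding the raw data files
data_dir = Path("path/to/data")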

# Define the protein state
protein_state = ProteinState(
    sequence="MSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDWQPEVKLDLDTASSQLADDVYEVVLRVTVTASLGEETAFLCEVQQGGIFSIAGIEGTQMAHCLGAYCPNILFPYARECITSMVSRGTFPQLNLAPVNFDALFMNYLQQQAGEGTEEHQDA",
    n_term=1,
    c_term=155,
    oligomeric_state=4,
)

# Define the partially deuterated peptides for the SecB state
pd_peptides = Peptides(
    # path to the data file
    data_file=data_dir / "ecSecB_apo.csv",
    # specify the data format
    data_format=PeptideFormat.DynamX_v3_state,
    # specify the deuteration type (partially, fully or not deuterated)
    deuteration_type=DeuterationType.partially_deuterated,
    filters={
        "State": "SecB WT apo",
        # Optionally filter by exposure, leave out to include all exposures
        "Exposure": [0.167, 0.5, 1.0, 10.0, 100.000008],
    },
    # pH read without corrections
    pH=8.0,
    # temperature of the exchange buffer
    temperature=303.15,
    # deuterium percentage of the exchange buffer
    d_percentage=90.0,
)

# check for difference between the protein state sequence and the peptide sequences
mismatches = verify_sequence(pd_peptides.load(), protein_state.sequence, n_term=protein_state.n_term)
print(mismatches)
#> [] # sequences match

# Define the fully deuterated peptides for the SecB state
fd_peptides = Peptides(
    data_file=data_dir / "ecSecB_apo.csv",
    data_format=PeptideFormat.DynamX_v3_state,
    deuteration_type=DeuterationType.fully_deuterated,
    filters={
        "State": "Full deuteration control",
        "Exposure": 0.167,
    },
)

# merge both peptides together in a single dataframe
merged = merge_peptides([pd_peptides, fd_peptides])
print(merged.columns)
#> ['start', 'end', 'sequence', ... 'uptake', 'uptake_sd', 'fd_uptake', 'fd_uptake_sd']

# compute uptake metrics for the merged peptides
# this function computes uptake from centroid mass if not present
# as well as fractional uptake
processed = compute_uptake_metrics(merged)
print(processed.columns)
#> ['start', 'end', 'sequence', ... 'uptake', 'uptake_sd', 'fd_uptake', 'fd_uptake_sd', 'fractional_uptake', 'fractional_uptake_sd']
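
To make the sequence check in the example above more concrete, here is a minimal sketch of the idea: each peptide's sequence is compared against the matching slice of the protein sequence, with `n_term` giving the residue number of the first character. This is not the implementation of `verify_sequence`, and its actual return format may differ. The function `find_mismatches` and the tuple layout are hypothetical.

```python
def find_mismatches(peptides, protein_sequence, n_term=1):
    """Return peptides whose sequence differs from the protein sequence.

    peptides: iterable of (start, end, sequence) with inclusive residue numbers.
    n_term: residue number of the first character of protein_sequence.
    """
    mismatches = []
    for start, end, sequence in peptides:
        lo = start - n_term  # 0-based index of the first residue
        hi = end - n_term + 1  # exclusive end index
        # A peptide starting before the N-terminus cannot match anything
        expected = protein_sequence[lo:hi] if lo >= 0 else ""
        if expected != sequence:
            mismatches.append((start, end, sequence, expected))
    return mismatches


# First 18 residues of the SecB sequence used above; the peptides are made up
protein = "MSEQNNTEMTFQIQRIYT"
peptides = [(1, 5, "MSEQN"), (9, 13, "MTFQI"), (3, 6, "EQNA")]
mismatches = find_mismatches(peptides, protein, n_term=1)
```

An empty list means every peptide agrees with the protein sequence; each entry otherwise records the observed and expected residues for one peptide.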

Release history

0.3.3: Dec 17, 2025
0.3.2: Dec 17, 2025 (this version)
0.3.1: Dec 9, 2025
0.3.0: Dec 9, 2025
0.2.0: Jun 12, 2025
0.1.5: Feb 21, 2024
0.1.4: Jan 11, 2024
0.1.3: Aug 8, 2023
0.1.2: Jul 14, 2023
0.1.1: Jan 2, 2023
0.1.0: Jan 2, 2023
0.1.0a1 (pre-release): Dec 21, 2022
0.0.0: Oct 25, 2022

Download files

Source Distribution

hdxms_datasets-0.3.2.tar.gz (6.9 MB), uploaded Dec 17, 2025

Built Distribution

hdxms_datasets-0.3.2-py3-none-any.whl (81.9 kB, Python 3), uploaded Dec 17, 2025

File details

Details for the file hdxms_datasets-0.3.2.tar.gz.

File metadata

Download URL: hdxms_datasets-0.3.2.tar.gz
Upload date: Dec 17, 2025
Size: 6.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hdxms_datasets-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`34944a520fe49b0c8583fd26fcfd36a48e5fe9000c161f5b6f116ad93970897a`
MD5	`6cc70efe1c325364ae50545c59737a91`
BLAKE2b-256	`e4a5c75b711a6f889b67e96cc698c2669a8588591d3245dff23078fff455cb9e`

File details

Details for the file hdxms_datasets-0.3.2-py3-none-any.whl.

File metadata

Download URL: hdxms_datasets-0.3.2-py3-none-any.whl
Upload date: Dec 17, 2025
Size: 81.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hdxms_datasets-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`01f9b32771bc8c1aec5308c8a3df765cd24326f722dca2c6b8f88d9ec9a69451`
MD5	`b7d3a4c0867b80bd4d39846a68b6219e`
BLAKE2b-256	`5609a0bbbb7f4b23903d41265d19ef361264085815d3dabf759cd0a1a1254992`
