mdb: population-level DNA methylation analysis toolkit
mdb
mdb builds and queries CpG-by-sample methylation matrices from ONT and PacBio BED inputs.
- PyPI package: methdb
- CLI command: mdb
Install
From PyPI:
pip install methdb
From a source checkout:
pip install .
Verify:
mdb --help
mdb --version
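The same check can be scripted, for example at the top of a pipeline. The following is a minimal sketch using only the Python standard library; the helper name `mdb_version` is hypothetical, and the only thing assumed about the CLI is the `--version` flag shown above.

```python
import shutil
import subprocess


def mdb_version(command="mdb"):
    """Return the text printed by `<command> --version`, or None if not on PATH."""
    exe = shutil.which(command)
    if exe is None:
        return None
    # check=False: the exit status is not interpreted here, only the text is returned.
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()
```

A caller can treat a `None` result as "not installed" and stop before any index or bundle is built.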
Core Concepts
- Sample bundle (.smdb): one sample, multiple track views (assay/haplotype/strand).
- Cohort store (.mmdb): merged sample bundles for population-scale queries.
- Backends:
  - zarr (default, compressed, block-aligned merge writes)
  - npy (optional compatibility backend)
Quick Start
1) Build CpG index
mdb index -r GRCh38_no_alt.fa -o GRCh38.cpg_index.npz
Include chrX/chrY:
mdb index -r GRCh38_no_alt.fa -o GRCh38.cpg_index.npz --sex
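To make the idea of a CpG index concrete, the sketch below scans FASTA text for CG dinucleotides and records their 0-based start positions per contig. It only illustrates the concept: the actual contents and layout of the .npz file written by mdb index are not described here, and the function names and the dict-of-lists result are assumptions made for this example.

```python
def cpg_positions(sequence):
    """Return the 0-based start position of every CG dinucleotide in `sequence`."""
    seq = sequence.upper()  # treat soft-masked (lowercase) bases like any other
    positions = []
    start = seq.find("CG")
    while start != -1:
        positions.append(start)
        start = seq.find("CG", start + 1)
    return positions


def read_fasta(lines):
    """Yield (contig_name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def build_index(lines):
    """Map each contig name to the sorted list of its CpG start positions."""
    return {name: cpg_positions(seq) for name, seq in read_fasta(lines)}
```

Joining the sequence lines before scanning means a CG that straddles a FASTA line break is still found.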
2) Create sample bundle
ONT (modkit output file or directory):
mdb create \
-p ont \
-n GRCh38.cpg_index.npz \
-b /path/to/ont_input \
-o sample_ont.smdb \
-c 5 \
--sample-id SAMPLE_ONT
PacBio (prefix or directory):
mdb create \
-p pacbio \
-n GRCh38.cpg_index.npz \
-b /path/to/pacbio_prefix \
-o sample_pb.smdb \
-c 5 \
--sample-id SAMPLE_PB
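When many samples need bundles, the two invocations above can be generated from a sample list. The sketch below assembles the argument list for mdb create using only the flags shown in this section. The helper name `create_command` and the output naming scheme are hypothetical, and the `-c` value is passed through unchanged from the examples.

```python
import shlex


def create_command(platform, index, bed_input, output, c_value, sample_id):
    """Build the argv list for one `mdb create` call (flags as in the examples above)."""
    if platform not in ("ont", "pacbio"):
        raise ValueError(f"unsupported platform: {platform!r}")
    return [
        "mdb", "create",
        "-p", platform,
        "-n", index,
        "-b", bed_input,
        "-o", output,
        "-c", str(c_value),
        "--sample-id", sample_id,
    ]


# Hypothetical sample sheet: (sample id, platform, input path).
samples = [
    ("SAMPLE_ONT", "ont", "/path/to/ont_input"),
    ("SAMPLE_PB", "pacbio", "/path/to/pacbio_prefix"),
]

for sample_id, platform, bed_input in samples:
    cmd = create_command(
        platform, "GRCh38.cpg_index.npz", bed_input,
        f"{sample_id.lower()}.smdb", 5, sample_id,
    )
    # Print a copy-pasteable command line; nothing is executed here.
    print(shlex.join(cmd))
```

To execute a command instead of printing it, pass the list to `subprocess.run(cmd, check=True)`.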
3) Merge sample bundles into a cohort
Default backend (zarr):
mdb merge \
-i sample_ont.smdb sample_pb.smdb \
-o cohort.mmdb \
--workers 2 \
--block-size 64 \
--zarr-row-chunk 65536 \
--zarr-codec zstd \
--zarr-clevel 5 \
--zarr-shuffle bitshuffle \
--zarr-codec-threads 4
NPY backend (explicit):
mdb merge \
-i sample_ont.smdb sample_pb.smdb \
-o cohort_npy.mmdb \
--cohort-backend npy \
--workers 2 \
--block-size 64
Build the modifiedC view (5mC + 5hmC where available) with -m:
mdb merge -i sample_ont.smdb sample_pb.smdb -o cohort_modifiedc.mmdb -m
4) Append new samples to existing cohort
mdb append \
-c cohort.mmdb \
-i new_sample1.smdb new_sample2.smdb
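Steps 3 and 4 differ only in whether the cohort already exists, so a pipeline can choose between them automatically. The sketch below is one way to do that, assuming that the presence of the cohort path on disk is a good enough signal. The helper name `cohort_command` is hypothetical, and the sketch does not check that an existing path is a valid cohort store.

```python
import os


def cohort_command(cohort, bundles, cohort_exists=None):
    """Return argv for `mdb append` if the cohort exists, else for `mdb merge`."""
    bundles = list(bundles)
    if not bundles:
        raise ValueError("at least one .smdb bundle is required")
    if cohort_exists is None:
        # Works whether the cohort store is a single file or a directory.
        cohort_exists = os.path.exists(cohort)
    if cohort_exists:
        return ["mdb", "append", "-c", cohort, "-i", *bundles]
    return ["mdb", "merge", "-i", *bundles, "-o", cohort]
```

The `cohort_exists` argument lets the decision be overridden explicitly, for instance in a dry run where the cohort has not been written yet.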
5) Query values
Point query:
mdb query \
-i cohort.mmdb \
--sample-id SAMPLE_PB \
--assay 5mC \
--haplotype combined \
--strand combined \
--locus chr1:10469
Range query:
mdb query \
-i cohort.mmdb \
--sample-id SAMPLE_PB \
--assay 5mC \
--haplotype combined \
--strand combined \
--region chr1:10469-12000
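The two query forms differ only in the coordinate string: a single position for --locus and a start-end span for --region. The sketch below parses both forms and shows how a range could be mapped to rows of a CpG-by-sample matrix by binary search over sorted CpG positions. It is a conceptual illustration, not mdb's implementation: the function names are hypothetical, and treating both ends of the range as inclusive is an assumption, since the coordinate convention is not stated here.

```python
import bisect
import re

# chrom:pos or chrom:start-end, as in the --locus and --region examples above.
_LOCUS = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)(?:-(?P<end>\d+))?$")


def parse_locus(text):
    """Parse 'chr1:10469' or 'chr1:10469-12000' into (chrom, start, end)."""
    match = _LOCUS.match(text.strip())
    if match is None:
        raise ValueError(f"not a locus or region string: {text!r}")
    start = int(match["start"])
    end = int(match["end"]) if match["end"] is not None else start
    if end < start:
        raise ValueError(f"region end precedes start: {text!r}")
    return match["chrom"], start, end


def rows_in_region(sorted_positions, start, end):
    """Return the row indices whose CpG position lies in [start, end] (assumed inclusive)."""
    lo = bisect.bisect_left(sorted_positions, start)
    hi = bisect.bisect_right(sorted_positions, end)
    return range(lo, hi)
```

Under these assumptions a point query is a range query whose start and end coincide, which is why `parse_locus` returns the same tuple shape for both forms.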
Important Notes
- create --reader currently defaults to scan and the active create path uses scan-based reading.
- merge and append require sample bundles created by current mdb create (manifest-based .smdb layout).
- pca is a legacy command path that expects flat merged .npy matrix layout, not the current view-based cohort store.
License
MIT (LICENSE).