Python package designed to estimate sequencing saturation for reduced-representation bisulfite sequencing (RRBS) data.

🧬 methurator

methurator is a Python package designed to estimate sequencing saturation for
reduced-representation bisulfite sequencing (RRBS) data.

Although optimized for RRBS, methurator can also be used for whole-genome bisulfite sequencing (WGBS) or other genome-wide methylation data (e.g. EM-seq). For whole-genome methylation data, however, we recommend the Preseq package instead.


🧠 Dependencies and Notes

  • methurator uses SAMtools and MethylDackel internally for BAM subsampling, so both must be installed before running it.
  • When --genome is provided, the corresponding FASTA file will be automatically fetched and cached.
  • Temporary intermediate files are deleted by default unless --keep-temporary-files is specified.
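Because both external tools must be present, it can help to check for them before starting a run. The following is a minimal sketch, not part of methurator itself. It assumes the executables are named samtools and MethylDackel and are discoverable on your PATH; the helper name missing_tools is hypothetical.

```python
import shutil

# Assumed executable names; adjust them if your installation differs.
REQUIRED_TOOLS = ("samtools", "MethylDackel")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external tools: " + ", ".join(missing))
    else:
        print("All external tools found.")
```

The which argument exists only so the lookup can be replaced in tests; in normal use the default shutil.which is enough.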

📦 Installation

pip install methurator

🚀 Quick Start

Step 1 — Downsample BAM files

The downsample command subsamples each BAM file at the specified percentages and counts the CpGs that reach the specified minimum coverage.

methurator downsample --genome hg19 --bam test_data/SRX1631721.markdup.sorted.csorted.bam

This command generates two summary files:

  • CpG summary — number of unique CpGs detected in each downsampled BAM
  • Reads summary — number of reads in each downsampled BAM

Example outputs can be found in tests/data.
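To make the idea behind the two summaries concrete, here is a toy illustration in pure Python. It is not how methurator works internally (methurator operates on BAM files through SAMtools and MethylDackel); the read representation, the function name cpgs_at_fraction, and the simulated data are all assumptions made for this sketch.

```python
import random
from collections import Counter


def cpgs_at_fraction(reads, fraction, min_coverage=3, seed=0):
    """Keep a random `fraction` of the reads and return a tuple of
    (number of reads kept, number of CpGs covered >= min_coverage times)."""
    rng = random.Random(seed)
    n_keep = round(len(reads) * fraction)
    kept = rng.sample(reads, n_keep)
    # Count how many kept reads cover each CpG position.
    coverage = Counter(cpg for read in kept for cpg in read)
    n_cpgs = sum(1 for depth in coverage.values() if depth >= min_coverage)
    return n_keep, n_cpgs


# Toy data: each read is a tuple of the CpG positions it covers.
data_rng = random.Random(42)
toy_reads = [tuple(data_rng.sample(range(200), 3)) for _ in range(1000)]

# One row per downsampling fraction: fraction, reads kept, CpGs detected.
for fraction in (0.1, 0.25, 0.5, 0.75):
    n_reads, n_cpgs = cpgs_at_fraction(toy_reads, fraction)
    print(f"{fraction:.2f}\t{n_reads}\t{n_cpgs}")
```

If the number of detected CpGs stops growing as the fraction of reads increases, the library is approaching saturation at that coverage threshold.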


Step 2 — Plot the sequencing saturation curve

Use the plot command to visualize sequencing saturation:

methurator plot \
  --cpgs_file tests/data/cpgs_summary.csv \
  --reads_file tests/data/reads_summary.csv

⚙️ Command Reference

🧩 downsample command

| Argument | Description | Default |
|---|---|---|
| --bam | Path to a single .bam file. | |
| --bamdir | Directory containing multiple BAM files. | |
| --outdir | Output directory. | ./output |
| --fasta | Path to the reference genome FASTA file. If not provided, it will be automatically downloaded based on --genome. | |
| --genome | Genome used for alignment. Available: hg19, hg38, GRCh37, GRCh38, mm10, mm39. | |
| --downsampling-percentages, -ds | Comma-separated list of downsampling percentages between 0 and 1 (exclusive). | 0.1,0.25,0.5,0.75 |
| --minimum-coverage | Minimum CpG coverage to consider for saturation. Can be a single integer or a list (e.g. 1,3,5). | 3 |
| --keep-temporary-files | If set, temporary files will be kept after analysis. | False |
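The value formats accepted by --downsampling-percentages and --minimum-coverage can be checked before launching a long run. The sketch below mirrors the documented constraints only; it is not methurator's own parser. The function names are hypothetical, and the requirement that coverage be at least 1 is an assumption.

```python
def parse_percentages(value):
    """Parse a string such as '0.1,0.25' into a list of floats.

    Each value must lie strictly between 0 and 1, as documented for
    --downsampling-percentages.
    """
    fractions = [float(item) for item in value.split(",")]
    for fraction in fractions:
        if not 0 < fraction < 1:
            raise ValueError(f"{fraction} is not strictly between 0 and 1")
    return fractions


def parse_coverages(value):
    """Parse '3' or '1,3,5' into a list of integers.

    Assumption: a minimum coverage below 1 is treated as invalid.
    """
    coverages = [int(item) for item in value.split(",")]
    if any(coverage < 1 for coverage in coverages):
        raise ValueError("minimum coverage must be at least 1")
    return coverages
```

For example, parse_percentages("0.1,0.25,0.5,0.75") accepts the documented default, while a value such as "0.5,1" raises ValueError because 1 is excluded.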

📊 plot command

| Argument | Description | Default |
|---|---|---|
| --cpgs_file | Path to the CpG coverage summary file. | |
| --reads_file | Path to the reads coverage summary file. | |
| --outdir | Output directory. | ./output |

📘 Example Workflow

# Step 1: Downsample BAM file
methurator downsample --genome hg19 --bam my_sample.bam

# Step 2: Plot saturation curve
methurator plot \
  --cpgs_file output/cpgs_summary.csv \
  --reads_file output/reads_summary.csv
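The same two steps can be driven from Python, which is convenient when looping over many samples. This is a minimal sketch that only uses the flags documented above. It assumes that methurator is on your PATH and that the summary files are written directly into the output directory as cpgs_summary.csv and reads_summary.csv, as the example paths suggest. The helper names build_commands and run_workflow are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(bam, genome, outdir="output"):
    """Return the downsample and plot commands as argument lists."""
    out = Path(outdir)
    downsample = [
        "methurator", "downsample",
        "--genome", genome,
        "--bam", str(bam),
        "--outdir", str(out),
    ]
    plot = [
        "methurator", "plot",
        # Assumed file names, taken from the example paths in this README.
        "--cpgs_file", str(out / "cpgs_summary.csv"),
        "--reads_file", str(out / "reads_summary.csv"),
        "--outdir", str(out),
    ]
    return [downsample, plot]


def run_workflow(bam, genome, outdir="output"):
    """Run both steps in order, stopping at the first failure."""
    for command in build_commands(bam, genome, outdir):
        subprocess.run(command, check=True)
```

Calling run_workflow("my_sample.bam", "hg19") would then run the two commands shown above; check=True makes a failed downsample step raise an error instead of silently continuing to the plot step.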

🧾 Citation

If you use methurator in your research, please cite this repository:

Edoardo Giuili. methurator: A Python package for estimating sequencing saturation in RRBS data.
https://github.com/VIBTOBIlab/methurator

🪪 License

This project is licensed under the MIT License — see the LICENSE file for details.


🧑‍💻 Author

Edoardo Giuili
