A methylation preprocessing pipeline written in Python. Easy to use and highly parallelized.

Project description

PyMethylProcess

pymethylprocess_overview

https://github.com/Christensen-Lab-Dartmouth/PyMethylProcess

Wiki: https://github.com/Christensen-Lab-Dartmouth/PyMethylProcess/wiki

Help documentation: https://christensen-lab-dartmouth.github.io/PyMethylProcess/

Alternatively, you can access the PDF: PyMethylProcess.pdf

What is it:

  • Preprocess 450k and 850k methylation IDAT files in parallel using Minfi, ENmix, and meffil
  • Convenient and scalable implementation
  • Imputation and Feature Selection
  • Preparation for machine learning pipelines
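To make the imputation and feature-selection steps concrete, here is a minimal sketch of the general idea on a small matrix of beta values (rows are samples, columns are CpG sites). It is not PyMethylProcess's API: the function names `impute_column_means` and `top_variance_columns` are hypothetical, and mean imputation with variance ranking is only one common choice, assumed here for illustration. It uses the standard library only; a real pipeline would use array libraries.

```python
import math
from statistics import fmean, pvariance


def impute_column_means(beta):
    """Replace NaN entries with the mean of the observed values in the same column.

    Assumes every column has at least one observed (non-NaN) value.
    """
    n_cols = len(beta[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in beta if not math.isnan(row[j])]
        means.append(fmean(observed))
    return [
        [means[j] if math.isnan(value) else value for j, value in enumerate(row)]
        for row in beta
    ]


def top_variance_columns(beta, k):
    """Return the sorted indices of the k columns with the highest variance."""
    n_cols = len(beta[0])
    variances = [pvariance([row[j] for row in beta]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


# Toy beta matrix: 3 samples x 3 CpG sites, with two missing values.
nan = float("nan")
beta = [
    [0.1, 0.8, nan],
    [0.2, nan, 0.5],
    [0.9, 0.4, 0.5],
]
imputed = impute_column_means(beta)
selected = top_variance_columns(imputed, k=2)
```

The selected column indices can then be used to subset the matrix before handing it to a machine learning model.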

Why:

  • Make DNAm accessible to Python developers and machine-learning-oriented researchers
  • Streamlined analysis makes processing easy

PyMethylProcess is now available in Bioinformatics: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btz594/5542385

Getting Started:

  • Installation:
  • Example usage scripts (in the GitHub repo): located in ./example_scripts/
  • Help docs (in the GitHub repo): https://christensen-lab-dartmouth.github.io/PyMethylProcess/
  • See GitHub wiki for more information on getting started, preprocessing quickly, and setting up and running machine learning analyses / classic methylation analyses (cell type deconvolution, age estimation).
  • Running the CWL tool (temporary instructions until the new Docker upload):
    • Clone this repository.
    • Run sh docker_build.sh
    • Then execute the cwl/pymethylprocess.cwl tool with a CWL runner such as Toil (https://toil.readthedocs.io/en/latest/) or Rabix Composer or Executor (https://github.com/rabix/composer), among others.
    • For quick testing, try this dataset: GSE109541
    • Note: This CWL tool has limited functionality. If you would like to see additional functions automated (e.g., age calculation, cell type deconvolution, running machine learning pipelines), please submit an issue and we will add new features.
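The CWL steps above can be scripted. The sketch below is a minimal Python wrapper, assuming Toil is installed and exposes its `toil-cwl-runner` executable, which takes a CWL file followed by a job (inputs) file. The job file name `job.yml` is hypothetical: the inputs the tool expects are not listed here, so consult cwl/pymethylprocess.cwl in the repository for them.

```python
import shutil
import subprocess


def build_cwl_command(runner, tool_path, job_path):
    """Assemble the argument list: <runner> <tool.cwl> <job file>."""
    return [runner, tool_path, job_path]


def run_cwl_tool(runner="toil-cwl-runner",
                 tool_path="cwl/pymethylprocess.cwl",
                 job_path="job.yml"):  # job.yml is a hypothetical inputs file
    """Run the CWL tool from the repository root, failing early if the runner is missing."""
    if shutil.which(runner) is None:
        raise FileNotFoundError(f"CWL runner not found on PATH: {runner}")
    command = build_cwl_command(runner, tool_path, job_path)
    # check=True raises CalledProcessError if the runner exits with a non-zero status.
    return subprocess.run(command, check=True)
```

Calling `run_cwl_tool()` from the root of the cloned repository, after `sh docker_build.sh` has finished, would launch the tool. Any other CWL runner that accepts the same two positional arguments can be substituted through the `runner` parameter.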

NOTE: Problems installing PyMethylProcess on macOS Mojave have been reported (rpy2). If you run into them, try the Docker installation and please report an issue.

CWL Workflow Visualization
PyMethylProcess CWL Pipeline

Benchmark Results: benchmark

Supplementary Figure Removed from Manuscript: Supplemental

Supplemental Figure 1: UMAP embeddings (colored) of: a) GSE87571 (age), b) GSE81961 (disease status), c) GSE69138 (subtype), d) GSE42861 (disease status), e) GSE112179 (brain disorder), f) GSE90496 (subclass), g) TCGA Pancancer (subtype)

Pipeline figures: pipeline-download, pipeline-format, pipeline-preprocess, pipeline-visualize, pipeline-train-test-split

Download files

Source Distribution

pymethylprocess-0.1.6.tar.gz (40.5 kB)

File hashes

Hashes for pymethylprocess-0.1.6.tar.gz:

  • SHA256: 7a42609df1596371b6868e0468b2628b188acd207d5be1e3abf61da700caac5a
  • MD5: a193296fb35ed1a7f0ce7c9b939172d0
  • BLAKE2b-256: 9bd3108a2c4b40bcb8ae320a539dbcdf5a702689d5db5beb3b46447a270e629a
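
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are illustrative, and the archive is assumed to have been saved locally under its original file name.

```python
import hashlib

# SHA256 digest published for pymethylprocess-0.1.6.tar.gz.
EXPECTED_SHA256 = "7a42609df1596371b6868e0468b2628b188acd207d5be1e3abf61da700caac5a"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare the file's digest with the expected one, ignoring hex letter case."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("pymethylprocess-0.1.6.tar.gz")` should return `True` for an intact download and `False` if the file was corrupted or altered.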
