Python package for processing Illumina DNA methylation arrays

Mepylome: Methylation Array Analysis Toolkit

Mepylome is an efficient Python toolkit for parsing, processing, and analyzing methylation array IDAT files. It can be used as a library for a wide range of methylation analysis tasks, and it includes an interactive GUI for generating UMAP plots and copy number variation (CNV) plots directly from collections of IDAT files.

Note: Mepylome is still under construction.

Features

  • Support for Illumina array types: 450k, EPIC, EPICv2
  • Parsing of IDAT files
  • Extraction of methylation signals
  • Calculation of copy number variations (CNV) with Plotly visualization
  • Methylation analysis tool with a graphical browser interface for UMAP analysis and CNV plots
    • Can be run from the command line with minimal setup or customized through a Python script

Documentation

The mepylome documentation, including installation instructions, a tutorial, and the API reference, is available at https://mepylome.readthedocs.io/

Installation

From PyPI

You can install mepylome directly from PyPI using pip:

pip install mepylome

From Source

If you want the latest version, you can install mepylome directly from source:

git clone https://github.com/brj0/mepylome.git && cd mepylome && pip install .

CNV Segments

To perform segmentation on the CNV plot (horizontal lines identifying significant changes), additional packages are required. These packages depend on a C compiler. Follow the instructions below to install them based on your Python version.

For Python < 3.10, install the necessary packages using the following command:

pip install numpy==1.26.4 cython ailist==1.0.4 cbseg

For Python 3.10 and later, install the linear_segment package instead using the following command:

pip install linear_segment

Make sure you have a C compiler installed on your system to build these packages.
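
The version rule above can be expressed as a small helper. This is only a sketch: the function name and constants are hypothetical and not part of mepylome, and the two commands are copied verbatim from the instructions above.

```python
import sys

# Commands copied from the installation notes above.
LEGACY_COMMAND = "pip install numpy==1.26.4 cython ailist==1.0.4 cbseg"
MODERN_COMMAND = "pip install linear_segment"


def segmentation_install_command(version_info=sys.version_info):
    """Return the pip command that matches the given Python version.

    Hypothetical helper for illustration; not part of mepylome.
    """
    # Only the major and minor components matter for the 3.10 cutoff.
    if tuple(version_info[:2]) < (3, 10):
        return LEGACY_COMMAND
    return MODERN_COMMAND


# Show the command for the interpreter running this script.
print(segmentation_install_command())
```

Running the script with the interpreter you plan to use for mepylome prints the command that applies to that interpreter.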

Usage

Methylation extraction and copy number variation plots

from pathlib import Path

from mepylome import CNV, MethylData

# Sample
analysis_dir = Path("/path/to/idat/directory")
sample_file = analysis_dir / "200925700125_R07C01"

# CNV neutral reference files
reference_dir = Path("/path/to/reference/directory")

# Get methylation data
sample_methyl = MethylData(file=sample_file)
reference_methyl = MethylData(file=reference_dir)

# Beta value
betas = sample_methyl.betas

# Print overview of processed data
print(sample_methyl)

# CNV analysis
cnv = CNV.set_all(sample_methyl, reference_methyl)

# Visualize CNV in the browser
cnv.plot()
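
For readers unfamiliar with beta values, the sketch below shows the conventional definition: the methylated intensity divided by the sum of methylated and unmethylated intensities plus a small stabilizing offset. It is written in plain Python and does not call mepylome. The function name, the default offset of 100, and the example intensities are assumptions made for illustration, and mepylome's own computation may differ in detail.

```python
def beta_values(methylated, unmethylated, offset=100.0):
    """Compute beta = M / (M + U + offset) for paired probe intensities.

    Illustrative only: the offset default is an assumption, not a value
    confirmed from mepylome.
    """
    if len(methylated) != len(unmethylated):
        raise ValueError("intensity lists must have the same length")
    return [m / (m + u + offset) for m, u in zip(methylated, unmethylated)]


# Made-up intensities for three probes: mostly methylated, mostly
# unmethylated, and no signal at all.
example = beta_values([900.0, 100.0, 0.0], [0.0, 800.0, 0.0])
print(example)
```

Values close to 1 indicate a mostly methylated site and values close to 0 a mostly unmethylated one. The offset keeps the ratio defined when both intensities are zero.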

Methylation analysis: Command-line interface

To perform the analysis, you must define an analysis directory that contains the IDAT files you want to analyze. Additionally, you need an annotation file (preferably in CSV format rather than XLSX) with a header where the first column is the Sentrix ID. It is best to place this annotation file within the analysis directory. Furthermore, you should have a directory with CNV-neutral reference cases for CNV analysis.
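
Before starting the interface, it can help to check that the annotation file and the IDAT files agree. The sketch below assumes the usual Illumina naming scheme, in which each sample has a pair of files named `<Sentrix ID>_Grn.idat` and `<Sentrix ID>_Red.idat`. The function names are hypothetical and the check is not part of mepylome.

```python
import csv
from pathlib import Path


def sentrix_ids_in_directory(idat_dir):
    """Collect Sentrix IDs from files named <id>_Grn.idat or <id>_Red.idat."""
    ids = set()
    # Search subdirectories too, in case the IDAT files are nested.
    for path in Path(idat_dir).rglob("*.idat"):
        stem = path.stem
        for suffix in ("_Grn", "_Red"):
            if stem.endswith(suffix):
                ids.add(stem[: -len(suffix)])
    return ids


def sentrix_ids_in_annotation(csv_path):
    """Read the first column of the annotation CSV, skipping the header row."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header
        return {row[0].strip() for row in reader if row and row[0].strip()}


def compare_annotation(idat_dir, csv_path):
    """Return (IDs on disk without annotation, annotated IDs without IDATs)."""
    on_disk = sentrix_ids_in_directory(idat_dir)
    annotated = sentrix_ids_in_annotation(csv_path)
    return sorted(on_disk - annotated), sorted(annotated - on_disk)
```

Calling `compare_annotation("/path/to/idats", "/path/to/idats/annotation.csv")` returns two lists. Both are empty when every sample on disk has an annotation row and every row has matching IDAT files.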

Basic usage:

To start the interface, run the following command (you will need to enter the directory paths manually in the interface):

mepylome

Preferred usage:

For a more streamlined experience, specify the analysis IDAT files directory, reference IDAT directory, and CpG array type. This command also improves UMAP speed by saving betas to disk:

mepylome -a /path/to/idats -r /path/to/ref -c 450k -s
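
The same command can be assembled from a Python script, for example when the paths are computed at run time. This sketch uses only the flags shown above. The function names are hypothetical, and the meaning of `-s` (saving betas to disk) is inferred from the description rather than confirmed. Run `mepylome --help` for the authoritative list of parameters.

```python
import subprocess


def build_mepylome_command(analysis_dir, reference_dir, array_type, save_betas=True):
    """Assemble the argument list for the command shown above."""
    command = [
        "mepylome",
        "-a", str(analysis_dir),   # directory with the IDAT files to analyze
        "-r", str(reference_dir),  # directory with CNV-neutral reference IDATs
        "-c", array_type,          # CpG array type, e.g. "450k"
    ]
    if save_betas:
        command.append("-s")       # assumed to be the flag that saves betas
    return command


def launch(command):
    """Start the interface; raises FileNotFoundError if mepylome is not on PATH."""
    return subprocess.run(command, check=True)
```

For example, `launch(build_mepylome_command("/path/to/idats", "/path/to/ref", "450k"))` starts the interface with the same arguments as the command line above.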

Show All Parameters

To display all available command-line parameters, use:

mepylome --help

C++ parser

Mepylome also includes a C++ parser (_IdatParser) with Python bindings. Because it offers no significant speed gain, it is currently not included by default. To enable it, set the following environment variable and then install from source:

export MEPYLOME_CPP=1
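
If the installation is driven from a Python script rather than a shell, the variable can be passed to pip through the process environment. This is a minimal sketch: the helper name is hypothetical, and it assumes the build reads `MEPYLOME_CPP` from the environment at install time, as the `export` command above implies.

```python
import os
import sys


def cpp_install_invocation(base_env=None):
    """Return (command, env) for a source install with the C++ parser enabled."""
    # Copy the environment so the caller's process is left unchanged.
    env = dict(os.environ if base_env is None else base_env)
    env["MEPYLOME_CPP"] = "1"
    # Use the current interpreter's pip so the right environment is targeted.
    command = [sys.executable, "-m", "pip", "install", "."]
    return command, env
```

The result can be passed to `subprocess.run(command, env=env, cwd="mepylome", check=True)`, where `cwd` points at the cloned repository.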

Contributing

Contributions are welcome! If you have any bug reports, feature requests, or suggestions, please open an issue or submit a pull request.

License

This project is licensed under the GPL-3.0 license.

Acknowledgements

Mepylome is strongly influenced by minfi and conumee2. Some functionalities, such as the manifest handler and parser, are adapted from methylprep.


