Bam2Tensor
bam2tensor is a Python package for converting .bam files to dense representations of methylation data (as .npz NumPy arrays). It evaluates all CpG sites and stores their methylation states so they can be loaded into downstream deep learning pipelines.
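To make "all CpG sites" concrete, here is a minimal sketch of locating CpG dinucleotides in a reference sequence. It is not bam2tensor's implementation: the function name `find_cpg_positions` and the toy sequence are hypothetical and only illustrate the idea of indexing every CG in a genome.

```python
import re
from typing import List


def find_cpg_positions(sequence: str) -> List[int]:
    """Return the 0-based position of the C in every CG dinucleotide."""
    # Upper-case first so soft-masked (lower-case) bases are also matched.
    return [match.start() for match in re.finditer("CG", sequence.upper())]


# Toy reference sequence, made up for illustration only.
toy_reference = "ACGTTCGGACCGA"
cpg_positions = find_cpg_positions(toy_reference)
print(cpg_positions)
```

Each position found this way could serve as one column of a per-read methylation matrix.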
Features
- Parses .bam files using pysam
- Extracts methylation data from all CpG sites
- Easily parallelizable
- Supports any genome (hg38, T2T-CHM13, mm10, etc.)
- Stores methylation data as .npz NumPy arrays
- Stores data in sparse format (COO matrix) for efficient loading
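The last two features can be illustrated with a small round trip through SciPy's sparse .npz format. This is a sketch under assumptions, not bam2tensor's documented output layout: it assumes the file is written with `scipy.sparse.save_npz`, and the reads-by-CpG layout, the value encoding, and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Hypothetical layout: rows are reads, columns are CpG sites.
# Hypothetical encoding: 1 = methylated, -1 = unmethylated,
# and unstored entries (0) = site not covered by the read.
rows = np.array([0, 0, 1, 2])
cols = np.array([0, 2, 1, 3])
vals = np.array([1, -1, 1, -1], dtype=np.int8)
matrix = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 4))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_methylation.npz")
    sparse.save_npz(path, matrix)    # write the COO matrix to disk
    loaded = sparse.load_npz(path)   # read it back in the same sparse format

# Convert to a dense NumPy array only when a pipeline needs one.
dense = loaded.toarray()
print(loaded.format, dense.shape)
```

Keeping the matrix sparse on disk avoids storing the many uncovered positions, and `toarray()` defers the dense conversion until it is actually required.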
Requirements
- Python 3.8+
- pysam, numpy, scipy, tqdm
Installation
You can install Bam2Tensor via pip from PyPI:
pip install bam2tensor
Usage
Please see the Reference Guide for details.
Contributing
Contributions are welcome! Please see the Contributor Guide.
License
Distributed under the terms of the MIT license, Bam2Tensor is free and open source.
Issues
If you encounter any problems, please file an issue along with a detailed description.
Credits
This project is developed and maintained by Nick Semenkovich (@semenko), as part of the Medical College of Wisconsin's Data Science Institute.
This project was generated from Statistics Norway's SSB PyPI Template.