bam2tensor

bam2tensor is a Python package that converts .bam files into compact per-read representations of methylation data, saved as .npz files. It evaluates every CpG site and stores the methylation state of each read so the result can be loaded into downstream deep learning pipelines.

Features

Parses .bam files using pysam
Extracts methylation data from all CpG sites
Supports any genome (hg38, T2T-CHM13, mm10, etc.)
Stores data in sparse format (COO matrix) for efficient loading
Exports methylation data to .npz NumPy arrays
Easily parallelizable

Requirements

Python 3.9+
pysam, numpy, scipy, tqdm

Installation

You can install bam2tensor via pip from PyPI:

pip install bam2tensor

Usage

Please see the Reference Guide for full details.

Data Structure

One .npz file is generated for each input .bam file and can be loaded with scipy.sparse.load_npz(). Each .npz file contains a single SciPy sparse COO matrix.
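
As a minimal sketch of the loading step, the snippet below round-trips a small toy matrix through scipy.sparse.save_npz() and scipy.sparse.load_npz(). The file name example.methylation.npz and the toy values are hypothetical stand-ins for real bam2tensor output, not names or data the package is known to produce.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Toy stand-in for bam2tensor output: 3 reads (rows) x 4 CpG sites (columns).
rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 1, 1, 2, 3])
vals = np.array([1, 0, 1, -1, 1], dtype=np.int8)
toy = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 4))

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; a real run would point at a bam2tensor .npz file.
    path = os.path.join(tmp, "example.methylation.npz")
    sparse.save_npz(path, toy)
    loaded = sparse.load_npz(path)

n_reads, n_cpgs = loaded.shape
print(loaded.format, n_reads, n_cpgs, loaded.nnz)
```

With a real file, only the load_npz() call is needed; the shape then gives the number of reads and the number of CpG sites.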

In the COO matrix, each row represents a read and each column represents a CpG site. The value at each row/column is the methylation state (0 = unmethylated, 1 = methylated, -1 = no data). Note that -1 can represent indels or point mutations.
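
To make the encoding concrete, the sketch below computes a per-CpG methylation fraction from a toy matrix. It assumes that only observed (read, CpG) pairs are stored explicitly, so it works on the stored row/column/value triplets rather than on a dense array, where unstored positions would read as 0 and be indistinguishable from unmethylated calls. The helper name methylation_fraction is hypothetical and is not part of bam2tensor.

```python
import numpy as np
from scipy import sparse


def methylation_fraction(matrix):
    """Per-CpG fraction of methylated calls among stored 0/1 entries.

    Hypothetical helper, not part of bam2tensor. Entries equal to -1
    (no data) are ignored; sites without any 0/1 call yield NaN.
    """
    coo = sparse.coo_matrix(matrix)
    called = coo.data >= 0  # keep 0 (unmethylated) and 1 (methylated)
    site = coo.col[called]
    state = coo.data[called]
    n_sites = coo.shape[1]
    methylated = np.bincount(site, weights=state, minlength=n_sites)
    covered = np.bincount(site, minlength=n_sites)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(covered > 0, methylated / covered, np.nan)


# Toy data: 3 reads x 4 CpG sites, with one explicit -1 (no data) entry.
rows = np.array([0, 0, 1, 1, 2, 2])
cols = np.array([0, 1, 0, 1, 1, 2])
vals = np.array([1, 0, 1, 1, -1, 1], dtype=np.int8)
toy = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 4))

frac = methylation_fraction(toy)
print(frac)
```

In this toy example the second site is covered by one unmethylated and one methylated read (the -1 entry is skipped), and the last site has no coverage at all, so its fraction is NaN.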

Todo

Consider storing a Read ID: Row ID mapping?
Export / more stably store & import embedding mapping? (.npz or other instead of .json?)
Store metadata / object reference in .npz file?

Contributing

Contributions are welcome! Please see the Contributor Guide.

License

Distributed under the terms of the MIT license, bam2tensor is free and open source.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project is developed and maintained by Nick Semenkovich (@semenko), as part of the Medical College of Wisconsin's Data Science Institute.

This project was generated from Statistics Norway's SSB PyPI Template.

Release history

1.3 (Feb 16, 2024, this version)
1.2 (Feb 16, 2024)
1.1 (Jan 25, 2024)
1.0.1 (Jan 20, 2024)
1.0.0 (Jan 19, 2024)
0.0.4 (Jan 18, 2024)
0.0.3 (Jan 17, 2024)
0.0.2 (Jan 3, 2024)
0.0.1 (Dec 15, 2023)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bam2tensor-1.3.tar.gz (15.0 kB)

Uploaded Feb 16, 2024 Source

Built Distribution

bam2tensor-1.3-py3-none-any.whl (14.3 kB)

Uploaded Feb 16, 2024 Python 3

File details

Details for the file bam2tensor-1.3.tar.gz.

File metadata

Download URL: bam2tensor-1.3.tar.gz
Upload date: Feb 16, 2024
Size: 15.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/4.0.2 CPython/3.11.8

File hashes

Hashes for bam2tensor-1.3.tar.gz
Algorithm	Hash digest
SHA256	`4114149d526c895e1506f8fde9259772c66f11a4e893eb16a0262418425fe738`
MD5	`a8077fa772f5c2d0203b68613f603b62`
BLAKE2b-256	`d8d543b328996dd0c1693fe506240fc3437069423abeb41b80a5287cad296af9`


File details

Details for the file bam2tensor-1.3-py3-none-any.whl.

File metadata

Download URL: bam2tensor-1.3-py3-none-any.whl
Upload date: Feb 16, 2024
Size: 14.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/4.0.2 CPython/3.11.8

File hashes

Hashes for bam2tensor-1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7ff29f5b88186a42ca6e48efde6aea7cffe3a6b91b15e875de0c5b274a3c7a40`
MD5	`cbc09e8e6b5129cb49973d10a350c866`
BLAKE2b-256	`4df84cab667140752bd2108ed8e4e5f45ad834af932e19a1c4310156a7ced0c6`

