CNV From BAM

cnv_from_bam is a Rust library for efficiently calculating dynamic Copy Number Variation (CNV) profiles from sequence alignments in BAM files. It exposes Python bindings through PyO3, so it can be used directly in Python-based bioinformatics workflows.

Features

  • Efficient Processing: Optimized for handling large genomic datasets in BAM format.
  • Python Integration: Built with PyO3 for easy integration into Python-based genomic analysis workflows.
  • Multithreading Support: Utilizes Rust's powerful concurrency model for improved performance.
  • Dynamic Binning: Bins the genome dynamically based on total read counts and genome length.
  • CNV Calculation: Accurately calculates CNV values for each bin across different contigs.
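The feature list does not spell out how the bin width is derived or how counts become copy numbers. As a rough illustration only, the sketch below assumes one plausible scheme: pick a bin width so that an average bin holds a target number of reads, then scale counts so the median bin sits at an assumed ploidy of 2. The function names, the target of 100 reads per bin, the 1,000-base rounding step, and the median normalisation are all hypothetical and are not taken from the library.

```python
import math
from statistics import median


def choose_bin_width(genome_length, total_reads, target_reads_per_bin=100, step=1000):
    """Pick a bin width (bases) so an average bin holds about target_reads_per_bin reads.

    Hypothetical rule: the target and the rounding step are illustrative assumptions.
    """
    if total_reads <= 0:
        # No reads at all: fall back to a single bin spanning the genome.
        return genome_length
    raw_width = genome_length * target_reads_per_bin / total_reads
    # Round up to the next multiple of `step`, and never go below one step.
    return max(step, math.ceil(raw_width / step) * step)


def counts_to_copy_number(counts, ploidy=2):
    """Scale per-bin read counts so that the median bin equals `ploidy` (assumed)."""
    med = median(counts)
    if med == 0:
        return [0.0 for _ in counts]
    return [ploidy * count / med for count in counts]


# Fewer reads lead to wider bins; more reads allow narrower bins.
print(choose_bin_width(3_000_000_000, 1_000_000))
print(choose_bin_width(3_000_000_000, 10_000_000))
print(counts_to_copy_number([10, 10, 20, 5]))
```

The point of the sketch is only the relationship between read depth and resolution: with a fixed genome length, more reads allow narrower bins.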

Installation

To use cnv_from_bam in your Rust project, add the following to your Cargo.toml file:

[dependencies]
cnv_from_bam = "0.1.0"  # Replace with the latest version

Usage

Here's a quick example of how to use the iterate_bam_file function:

use cnv_from_bam::iterate_bam_file;
use std::path::PathBuf;

fn main() {
    let bam_path = PathBuf::from("path/to/bam/file.bam");
    // Iterate over the BAM file and calculate CNV values for each bin.
    // The number of threads is set to 4 and the mapping quality filter to 60.
    // If the number of threads is not specified (None), it defaults to the number of logical cores on the machine.
    let result = iterate_bam_file(bam_path, Some(4), Some(60));
    // Process the result...
}

The results in this case are returned as a CnvResult, which has the following structure:

pub struct CnvResult {
    pub cnv: FnvHashMap<String, Vec<f64>>,
    pub bin_width: usize,
    pub genome_length: usize,
}

Here, result.cnv is a hash map from each contig in the reference genome to the copy number of each bin on that contig, result.bin_width is the width of the bins in bases, and result.genome_length is the total length of the genome.
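Because each contig's values are stored as a flat list, a bin's genomic position has to be recovered from its index and bin_width. The sketch below shows one way to do that from Python, using a plain dictionary as a stand-in for result.cnv. It assumes bins are 0-based, start at position 0 of each contig, and are half-open intervals; the helper names bin_interval and bins_above are hypothetical and not part of the library.

```python
def bin_interval(bin_index, bin_width, contig_length=None):
    """Return the half-open interval (start, end) in bases covered by a bin.

    Assumes bins start at position 0 of the contig. If contig_length is given,
    the final bin is clipped to it.
    """
    start = bin_index * bin_width
    end = start + bin_width
    if contig_length is not None:
        end = min(end, contig_length)
    return start, end


def bins_above(cnv, bin_width, threshold):
    """Yield (contig, start, end, value) for every bin whose value exceeds threshold."""
    for contig, values in cnv.items():
        for index, value in enumerate(values):
            if value > threshold:
                start, end = bin_interval(index, bin_width)
                yield contig, start, end, value


# Stand-in for result.cnv and result.bin_width.
example_cnv = {"chr1": [2.0, 3.5, 2.1], "chr2": [1.0, 4.2]}
for hit in bins_above(example_cnv, bin_width=1000, threshold=3.0):
    print(hit)
```

With a real result, `bins_above(result.cnv, result.bin_width, 3.0)` would list candidate gains as coordinate ranges rather than bare indices.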

Note that only the start position of each primary alignment is binned; supplementary and secondary alignments are ignored.
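To make the filtering rule concrete, here is a small pure-Python sketch of binning primary alignment starts. The flag values (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification. Everything else is an assumption for illustration: the record tuples, the function names, and the choice to keep reads whose mapping quality is greater than or equal to the filter. It is not the library's implementation.

```python
from collections import defaultdict

# Flag bits as defined in the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def is_primary(flag):
    """True for a mapped alignment that is neither secondary nor supplementary."""
    return not (flag & (FLAG_UNMAPPED | FLAG_SECONDARY | FLAG_SUPPLEMENTARY))


def bin_primary_starts(records, bin_width, mapq_filter=60):
    """Count primary alignment starts per bin.

    `records` is an iterable of (contig, start, flag, mapq) tuples, a hypothetical
    stand-in for records read from a BAM file. Reads with mapq >= mapq_filter are
    kept, which is an assumed interpretation of the filter.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for contig, start, flag, mapq in records:
        if is_primary(flag) and mapq >= mapq_filter:
            counts[contig][start // bin_width] += 1
    return {contig: dict(bins) for contig, bins in counts.items()}


records = [
    ("chr1", 150, 0, 60),      # primary, forward strand: counted
    ("chr1", 180, 16, 60),     # primary, reverse strand: counted
    ("chr1", 950, 256, 60),    # secondary: ignored
    ("chr1", 1200, 2048, 60),  # supplementary: ignored
    ("chr1", 1300, 0, 10),     # low mapping quality: ignored
    ("chr2", 5, 0, 60),        # primary on another contig: counted
]
print(bin_primary_starts(records, bin_width=1000))
```

Counting only one position per primary alignment means each read contributes to exactly one bin, regardless of its length or how many split alignments it has.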

Example simple plot in Python

from pathlib import Path

import numpy as np
from matplotlib import pyplot as plt

from cnv_from_bam import iterate_bam_file

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8, 3))
total = 0
bam_path = Path("path/to/bam/file.bam")
# Iterate over the BAM file and calculate CNV values for each bin.
# The number of threads is set to 4 and the mapping quality filter to 60. They are passed
# positionally here, on the assumption that the Python binding mirrors the Rust signature.
# If the number of threads is not specified, it defaults to the number of logical cores on the machine.
result = iterate_bam_file(bam_path, 4, 60)
# Plot each contig's bins one after another along the x axis.
for contig, cnv in result.cnv.items():
    ax.scatter(x=np.arange(len(cnv)) + total, y=cnv, s=0.1)
    total += len(cnv)

ax.set_ylim((0, 8))
ax.set_xlim((0, total))

The result should look something like the figure below. The CNV data is just a dictionary of lists, so you can plot it however you like with matplotlib, seaborn, etc.

(Figure: example CNV plot)
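Since the plot places contigs one after another on a shared x axis, it can help to know where each contig begins. The sketch below computes those starting offsets in bin units from a dictionary shaped like result.cnv. The helper name contig_offsets is hypothetical, and the example data is made up for illustration.

```python
def contig_offsets(cnv):
    """Return ({contig: first x position}, total number of bins).

    Contigs are laid out in the dictionary's iteration order, matching a loop
    over cnv.items().
    """
    offsets = {}
    total = 0
    for contig, values in cnv.items():
        offsets[contig] = total
        total += len(values)
    return offsets, total


# Stand-in for result.cnv.
example_cnv = {"chr1": [2.0, 2.1, 1.9], "chr2": [2.0, 3.0], "chrX": [1.0]}
offsets, total = contig_offsets(example_cnv)
print(offsets, total)
```

Each offset can then be passed to `ax.axvline` to draw a boundary between contigs, or used as a tick position labelled with the contig name.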

Documentation

To generate the documentation, run:

cargo doc --open

Contributing

Contributions to cnv_from_bam are welcome!

License

This project is licensed under the MIT License.

