
A method to split BAM files by cell type


Bam2cell


A package to split a BAM file based on cell type annotation in AnnData objects.

Usage and Examples

There are two modes: sequential and parallel. The sequential mode processes cell types one by one and uses less disk space. The parallel mode processes all cell types at the same time, which is faster but requires more disk space.
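
The difference between the two modes can be pictured with a small, generic sketch. It is not bam2cell's implementation: `split_one` is a hypothetical stand-in for the per-cell-type work and only returns the name of the file it would write. The sequential loop handles one cell type at a time, while the pool submits all of them together, which is consistent with the parallel mode needing more disk space while it runs.

```python
from concurrent.futures import ThreadPoolExecutor


def split_one(cell_type: str) -> str:
    """Hypothetical stand-in for extracting one cell type's reads from a BAM."""
    return f"{cell_type}.bam"


def run_sequential(cell_types: list[str]) -> list[str]:
    # One cell type at a time: only one job is active at any moment.
    return [split_one(ct) for ct in cell_types]


def run_parallel(cell_types: list[str], workers: int = 8) -> list[str]:
    # All cell types are submitted together; up to `workers` jobs run at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(split_one, cell_types))  # map() preserves input order


cell_types = ["T_cell", "B_cell", "Monocyte"]
sequential_outputs = run_sequential(cell_types)
parallel_outputs = run_parallel(cell_types, workers=2)
```

Both functions return the same list of output names; only the scheduling differs.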

A minimal example is shown here:

⚠️ Note: The barcodes should not contain a suffix or prefix. Use `clean_bcs()` to remove them.

import bam2cell
import anndata as ad

adata = ad.read_h5ad("data/adata.h5ad")

generator = bam2cell.GenerateCellTypeBAM(adata, 
                                         annot_key="annotation",
                                         output_path="data/",
                                         input_bam="data/AllCellsSorted_toy.bam",
                                         tmp_path="data/",
                                         workers=8,
                                         )
# Call one of the two methods below, depending on the mode you want:
generator.process_all_parallel()  # Case 1 - Process all cell types at the same time
generator.process_cts_sequential() # Case 2 - Process cell types one by one

For more advanced usage, you can use the function `bam2cell`, which can process an AnnData object containing multiple samples.

import bam2cell
import anndata as ad
import pandas as pd

adata = ad.read_h5ad("data/adata.h5ad")
artificial_batch = ["batch1"] * 100 + ["batch2"] * 91
adata.obs["batch"] = pd.Categorical(artificial_batch)
adata.obs["bam_path"] = "data/AllCellsSorted_toy.bam"

bam2cell.bam2cell(adata,
                  annot_key="annotation",
                  input_bam=None,  # Only used when the AnnData contains a single batch
                  output_path="data/",  
                  tmp_path="data/",
                  bam_key="bam_path",  # For each barcode we have the path to the BAM file
                  batch_key="batch",  
                  mode="parallel",
                  suffix=None,  # Suffix in the barcode to be removed (e.g., BC-1-suffix --> BC-1)
                  prefix=None,  # Prefix in the BC to be removed (e.g., prefix-BC-1 --> BC-1) 
                  workers=8
                  )
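
In the example above every cell points to the same toy BAM file. With real multi-sample data, each batch would presumably have its own BAM file, so the `bam_path` column would differ per batch. A minimal sketch of building such a column in plain Python follows; the batch labels and file paths are made up for illustration.

```python
# Hypothetical batch labels, one per cell, in the same order as adata.obs.
batches = ["batch1", "batch1", "batch2"]

# Hypothetical mapping from batch label to the BAM file of that sample.
bam_by_batch = {
    "batch1": "data/batch1.bam",
    "batch2": "data/batch2.bam",
}

# Fail early if a batch has no BAM file assigned.
missing = sorted(set(batches) - set(bam_by_batch))
if missing:
    raise KeyError(f"No BAM path given for batches: {missing}")

# One path per cell; with a real AnnData this list could be stored as
# adata.obs["bam_path"] = bam_paths
bam_paths = [bam_by_batch[batch] for batch in batches]
```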

Installation

You need Python 3.10 or newer installed on your system. There are two ways to install bam2cell:

  1. Install the latest release of bam2cell from PyPI:
     pip install bam2cell
  2. Install the latest development version:
     pip install git+https://github.com/davidrm-bio/bam2cell.git@main
