
A method to split BAM files by cell type


Bam2cell


A package to split a BAM file based on cell type annotation in AnnData objects.

Usage and Examples

There are two modes: sequential and parallel. The sequential mode processes cell types one by one and uses less disk space. The parallel mode processes all cell types at the same time, so it is faster but needs more disk space.

A minimal example is shown here:

import bam2cell
import anndata as ad

adata = ad.read_h5ad("data/adata.h5ad")

generator = bam2cell.GenerateCellTypeBAM(adata, 
                                         annot_key="annotation",
                                         output_path="data/",
                                         input_bam="data/AllCellsSorted_toy.bam",
                                         tmp_path="data/",
                                         workers=8,
                                         )
# Choose one of the two modes:
generator.process_all_parallel()  # Case 1 - Process all cell types at the same time
# generator.process_cts_sequential()  # Case 2 - Process cell types one by one
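The difference between the two modes can be illustrated with a small, generic scheduling sketch. This is not bam2cell's implementation: `split_one_cell_type`, `run_sequential`, and `run_parallel` are hypothetical names, and the thread pool is only an assumption about how concurrent work might be organised. It shows that both strategies produce the same outputs and differ only in how many cell types are in flight at once.

```python
from concurrent.futures import ThreadPoolExecutor


def split_one_cell_type(cell_type: str) -> str:
    """Hypothetical stand-in for the per-cell-type work (not a bam2cell function).

    Returns a made-up output file name instead of writing a BAM file.
    """
    return f"{cell_type}.bam"


def run_sequential(cell_types: list) -> list:
    # One cell type at a time: only one job is in flight at any moment.
    return [split_one_cell_type(ct) for ct in cell_types]


def run_parallel(cell_types: list, workers: int = 8) -> list:
    # All cell types are submitted up front; up to `workers` run concurrently.
    # Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(split_one_cell_type, cell_types))


cell_types = ["T_cell", "B_cell", "Monocyte"]
sequential_outputs = run_sequential(cell_types)
parallel_outputs = run_parallel(cell_types, workers=8)
```

In this sketch `workers` caps how many jobs run concurrently, which is presumably the role of the `workers` argument in the example above.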

For more advanced usage, use the function bam2cell, which can process an AnnData object containing multiple samples.

import bam2cell
import anndata as ad
import pandas as pd

adata = ad.read_h5ad("data/adata.h5ad")
artificial_batch = ["batch1"] * 100 + ["batch2"] * 91  # Length must match adata.n_obs (191 cells here)
adata.obs["batch"] = pd.Categorical(artificial_batch)
adata.obs["bam_path"] = "data/AllCellsSorted_toy.bam"

bam2cell.bam2cell(adata,
                  annot_key="annotation",
                  input_bam=None,  # Set only when the AnnData contains a single batch
                  output_path="data/",
                  tmp_path="data/",
                  bam_key="bam_path",  # Column in adata.obs with the BAM file path for each barcode
                  batch_key="batch",  # Column in adata.obs with the batch of each barcode
                  mode="parallel",
                  suffix=None,  # Suffix in the barcode to be removed (e.g., BC-1-suffix --> BC-1)
                  prefix=None,  # Prefix in the barcode to be removed (e.g., prefix-BC-1 --> BC-1)
                  workers=8
                  )
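The effect of the suffix and prefix arguments described in the comments above can be sketched in plain Python. This is an illustration only: `normalize_barcode` is a hypothetical helper, not part of bam2cell, and it assumes the affix strings include the separator (for example "-suffix" rather than "suffix"). How bam2cell handles separators internally is not stated here.

```python
from typing import Optional


def normalize_barcode(barcode: str,
                      prefix: Optional[str] = None,
                      suffix: Optional[str] = None) -> str:
    """Hypothetical helper: strip an optional prefix and suffix from a barcode."""
    if prefix:
        # Removes the prefix only if the barcode actually starts with it.
        barcode = barcode.removeprefix(prefix)
    if suffix:
        # Removes the suffix only if the barcode actually ends with it.
        barcode = barcode.removesuffix(suffix)
    return barcode


print(normalize_barcode("BC-1-suffix", suffix="-suffix"))  # BC-1
print(normalize_barcode("prefix-BC-1", prefix="prefix-"))  # BC-1
```

Stripping sample-specific affixes in this way is typically needed so that barcodes in the AnnData object match the barcodes stored in the BAM file.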

Installation

You need Python 3.10 or newer installed on your system. There are two ways to install bam2cell:

  1. Install the latest release of bam2cell from PyPI:
pip install bam2cell
  2. Install the latest development version:
pip install git+https://github.com/davidrm-bio/bam2cell.git@main
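After installing, a quick check like the following can confirm that the interpreter meets the Python 3.10 requirement and that the package is visible to it. This is a minimal sketch using only the standard library. `installed_version` is a hypothetical helper name, and the check assumes the distribution is registered under the name "bam2cell", as in the pip commands above.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# The requirement stated above is Python 3.10 or newer.
python_ok = sys.version_info >= (3, 10)

print(f"Python >= 3.10: {python_ok}")
print(f"bam2cell version: {installed_version('bam2cell')}")
```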

