A method to split BAM files by cell type
Bam2cell
A package to split a BAM file based on cell type annotation in AnnData objects.
Usage and Examples
There are two modes: sequential and parallel. Sequential mode processes cell types one at a time and uses less disk space. Parallel mode processes all cell types at the same time, which is faster but needs more disk space.
A minimal example is shown here:
⚠️ Note: The barcodes must not contain a suffix or prefix. Use `clean_bcs()` to remove them.
import bam2cell
import anndata as ad

adata = ad.read_h5ad("data/adata.h5ad")
generator = bam2cell.GenerateCellTypeBAM(adata,
                                         annot_key="annotation",
                                         output_path="data/",
                                         input_bam="data/AllCellsSorted_toy.bam",
                                         tmp_path="data/",
                                         workers=8,
                                         )
generator.process_all_parallel()  # Case 1 - Process all cell types at the same time
generator.process_cts_sequential()  # Case 2 - Process cell types one by one
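Both modes presumably rest on the same lookup: each barcode in the `annot_key` column belongs to one cell type, and that assignment decides which output a read ends up in. The sketch below illustrates only that grouping step in plain Python. It is not bam2cell's implementation, and the barcodes and cell type labels are made up for illustration.

```python
from collections import defaultdict

# Hypothetical barcode -> cell type mapping, standing in for adata.obs["annotation"].
annotation = {
    "AAACCTGAGAAACCAT-1": "T cell",
    "AAACCTGAGAAACCGC-1": "B cell",
    "AAACCTGAGAAACCTA-1": "T cell",
}

# Collect the set of barcodes that belongs to each cell type.
barcodes_by_cell_type = defaultdict(set)
for barcode, cell_type in annotation.items():
    barcodes_by_cell_type[cell_type].add(barcode)

# Report how many barcodes each cell type would contribute.
for cell_type, barcodes in sorted(barcodes_by_cell_type.items()):
    print(cell_type, len(barcodes))
```

A grouping like this only works when the barcodes in the AnnData object match the barcodes in the BAM file exactly, which is why prefixes and suffixes have to be removed first.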
For more advanced usage, use the function `bam2cell`, which can process an AnnData object containing multiple samples.
import bam2cell
import anndata as ad
import pandas as pd

adata = ad.read_h5ad("data/adata.h5ad")
artificial_batch = ["batch1"] * 100 + ["batch2"] * 91
adata.obs["batch"] = pd.Categorical(artificial_batch)
adata.obs["bam_path"] = "data/AllCellsSorted_toy.bam"
bam2cell.bam2cell(adata,
                  annot_key="annotation",
                  input_bam=None,  # Only when we have 1 batch in the AnnData
                  output_path="data/",
                  tmp_path="data/",
                  bam_key="bam_path",  # For each barcode we have the path to the BAM file
                  batch_key="batch",
                  mode="parallel",
                  suffix=None,  # Suffix in the barcode to be removed (e.g., BC-1-suffix --> BC-1)
                  prefix=None,  # Prefix in the BC to be removed (e.g., prefix-BC-1 --> BC-1)
                  workers=8
                  )
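To make the `suffix` and `prefix` comments concrete, here is a minimal sketch of stripping affixes from barcodes. The helper `strip_affixes` is hypothetical and is not part of bam2cell. It treats the prefix and suffix as exact strings that include the separator, which is an assumption: the text above does not say whether bam2cell expects the separator to be included.

```python
def strip_affixes(barcode, prefix=None, suffix=None):
    """Return the barcode without an exact leading prefix and/or trailing suffix."""
    # Hypothetical helper for illustration; not bam2cell's clean_bcs().
    if prefix:
        barcode = barcode.removeprefix(prefix)
    if suffix:
        barcode = barcode.removesuffix(suffix)
    return barcode


# The two cases from the comments above, with the separator included in the affix.
print(strip_affixes("BC-1-suffix", suffix="-suffix"))
print(strip_affixes("prefix-BC-1", prefix="prefix-"))

# Made-up barcodes as they might look after several samples were merged.
raw_barcodes = ["sampleA_AAACCTGAGAAACCAT-1", "sampleA_AAACCTGAGAAACCGC-1"]
cleaned = [strip_affixes(bc, prefix="sampleA_") for bc in raw_barcodes]
print(cleaned)
```

Barcodes that do not carry the given affix are returned unchanged, so the helper can be applied to a mixed list without extra checks.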
Installation
You need Python 3.10 or newer installed on your system. There are several ways to install bam2cell:
- Install the latest release of `bam2cell` from PyPI:
pip install bam2cell
- Install the latest development version:
pip install git+https://github.com/davidrm-bio/bam2cell.git@main
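After installing, a quick check like the one below can confirm that the interpreter meets the Python 3.10 requirement and that the package is visible in the current environment. It is a sketch that uses only the standard library; the name `MIN_PYTHON` is chosen here for illustration.

```python
import sys
from importlib import metadata

# Minimum interpreter version stated in the installation notes.
MIN_PYTHON = (3, 10)
python_ok = sys.version_info >= MIN_PYTHON
print("Python version OK" if python_ok else "Python is older than 3.10")

# Look up the installed distribution without importing the package itself.
try:
    print("bam2cell", metadata.version("bam2cell"))
except metadata.PackageNotFoundError:
    print("bam2cell is not installed in this environment")
```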