A tool to assign identifiers to cell barcodes
geomux
A tool that assigns guides to cell barcodes.
geomux uses a hypergeometric distribution to calculate, for each guide in each barcode, the p-value of observing that guide's count. These p-values determine which guides are assigned to each cell, and therefore the cell's multiplicity of infection (MOI). The resulting dataframe can then be joined with your original data to annotate every cell with its assigned guides, and it lets you filter for the MOI you are interested in working with.
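To make the test above concrete, here is a minimal sketch of an upper-tail hypergeometric p-value on a toy cell x guide count matrix. It is an illustration only and not geomux's actual implementation: the function name `hypergeom_upper_tail`, the toy counts, and the choice of population (all UMIs in the matrix), successes (all UMIs of the guide), and draws (all UMIs in the cell) are assumptions made for this example.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n UMIs are drawn without replacement from N total UMIs,
    K of which belong to the guide of interest (hypothetical helper)."""
    lo = max(k, 0)
    hi = min(K, n)
    favourable = sum(comb(K, x) * comb(N - K, n - x) for x in range(lo, hi + 1))
    return favourable / comb(N, n)


# Toy counts (made up for illustration): rows are cells, columns are guides.
counts = [
    [12, 0, 1],
    [0, 9, 1],
    [1, 1, 1],
]

total = sum(map(sum, counts))                       # all UMIs in the matrix
guide_totals = [sum(col) for col in zip(*counts)]   # UMIs per guide
cell_totals = [sum(row) for row in counts]          # UMIs per cell

# One p-value per (cell, guide) pair.
pvalues = [
    [
        hypergeom_upper_tail(k, total, guide_totals[j], cell_totals[i])
        for j, k in enumerate(row)
    ]
    for i, row in enumerate(counts)
]

for row in pvalues:
    print(["%.3g" % p for p in row])
```

In this toy matrix the first cell is dominated by the first guide, so that pair receives a very small p-value, while the evenly spread third cell does not.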
Installation
geomux can be installed with uv:
uv tool install geomux
geomux --help
Usage
Geomux can be used either as a command-line tool or as a Python module.
Geomux supports two modes of operation:
- Hypergeometric testing
- Gaussian Mixture Model testing
The mode is set with the --method flag on the CLI (geomux or mixture) or by calling the relevant Python function (geomux or gaussian_mixture).
Commandline
When installed via uv, an executable is placed on your PATH, so you can call it directly from anywhere in your filesystem.
# example usage
geomux <input.tab / input.h5ad>
You can also run the help flag to see the help menu for parameter options.
Usage: geomux [OPTIONS] INPUT [OUTPUT]
╭─ Arguments ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * input TEXT Input file path (tsv/h5ad) to assign guides. [required] │
│ output [OUTPUT] Output file path (tsv) to save assignments. [default: geomux.tsv] │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --min-umi-cells INTEGER Minimum UMI count to consider a barcode [default: 5] │
│ --min-umi-guides INTEGER Minimum number of barcodes to consider a guide [default: 5] │
│ --fdr-threshold FLOAT Maximum pvalue (fdr) to consider a guide-assignment [default: 0.05] │
│ --lor-threshold FLOAT Log odds ratio threshold to use (None for adaptive thresholding) │
│ --adaptive-lor-scalar FLOAT Scalar to adaptively set log odds ratio threshold │
│ --subtract --no-subtract Subtract 1 from counts before testing. [default: subtract] │
│ --stats TEXT Output file to write assignment statistics to as json │
│ --method TEXT Method to use for assignment (geomux/mixture) [default: geomux] │
│ --n-jobs INTEGER Number of jobs to use for parallel processing (mixture model only). -1 for all available cores. │
│ [default: -1] │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
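The --fdr-threshold option filters on multiple-testing-adjusted p-values. The exact correction geomux applies is not stated here, so the following is only a sketch of one common choice, the Benjamini-Hochberg procedure. The function name `benjamini_hochberg`, the example p-values, and the `<=` comparison against the threshold are assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order
    (hypothetical helper, not part of geomux's API)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for five guide/cell tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(pvals)

# Keep only tests at or below the default threshold of 0.05.
fdr_threshold = 0.05
significant = [q <= fdr_threshold for q in fdr]
print(significant)
```

Note that the third and fourth tests have raw p-values below 0.05 but are no longer significant after adjustment, which is the practical difference between thresholding on a p-value and on an FDR.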
Python Module
Processing an h5ad file
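import anndata as ad
from geomux import geomux, gaussian_mixture
# `path` is used instead of `input` to avoid shadowing the Python builtin
path = "filename.h5ad"
adata = ad.read_h5ad(path)
# Hypergeometric testing
assignments_geomux = geomux(adata)
print(assignments_geomux)
# Gaussian mixture model testing
assignments_mixture = gaussian_mixture(adata)
print(assignments_mixture)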
Processing a cell x guide sparse matrix
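import anndata as ad
from geomux import geomux
# `path` is used instead of `input` to avoid shadowing the Python builtin
path = "filename.h5ad"
adata = ad.read_h5ad(path)
# Assumes adata.X is stored as a SciPy sparse matrix; convert it to CSR
matrix = adata.X.tocsr()
assignments = geomux(matrix)
print(assignments)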
Outputs
geomux returns an assignment dataframe with one row per input cell.
The columns of this dataframe will include:
| Column Name | Description |
|---|---|
| cell_id | The numerical index of this cell in the count matrix. |
| submatrix_id | The numerical index of this cell in the filtered count matrix. |
| cell | The numerical index of this cell or the name of the cell if provided. |
| moi | The number of assigned guides for this cell. |
| n_umi | The number of total UMIs observed in the cell. |
| assignment | A '|' separated string of the assigned guides for this cell. |
| guide_ids_original | A '|' separated string of the assigned guide numerical indices. |
| umis | A '|' separated string of the assigned guide UMIs. |
| fdr | A '|' separated string of the false discovery rate of each assignment. |
| log_odds | A '|' separated string of the log-odds of each assignment. |
| tested | A bool designating whether this cell met the testing criteria. |
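As a rough sketch of how the '|' separated columns above might be consumed, the example below reads a small tab-separated table, keeps tested cells with an MOI of 1, and unpacks the per-guide UMIs for cells with multiple guides. The rows are made up for illustration, only a subset of the columns is shown, and the serialization details (booleans written as `True`/`False`, empty strings for unassigned cells) are assumptions rather than documented behaviour.

```python
import csv
import io

# Toy output (hypothetical rows; real files contain more columns).
toy_tsv = (
    "cell\tmoi\tn_umi\tassignment\tumis\ttested\n"
    "AAAC\t1\t25\tguideA\t20\tTrue\n"
    "AAAG\t2\t40\tguideA|guideB\t18|17\tTrue\n"
    "AAAT\t0\t3\t\t\tFalse\n"
)


def split_field(value):
    """Split a '|' separated field, treating an empty string as no entries."""
    return value.split("|") if value else []


rows = list(csv.DictReader(io.StringIO(toy_tsv), delimiter="\t"))

# Cells that were tested and received exactly one guide.
single = {
    row["cell"]: split_field(row["assignment"])[0]
    for row in rows
    if row["tested"] == "True" and int(row["moi"]) == 1
}

# For multi-guide cells, pair each assigned guide with its UMI count.
multi = {
    row["cell"]: dict(
        zip(split_field(row["assignment"]), map(int, split_field(row["umis"])))
    )
    for row in rows
    if int(row["moi"]) > 1
}

print(single)
print(multi)
```

The `single` mapping can then be joined onto your own per-cell metadata by cell name to restrict an analysis to singly assigned cells.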
File details
Details for the file geomux-0.5.4.tar.gz.
File metadata
- Download URL: geomux-0.5.4.tar.gz
- Upload date:
- Size: 12.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.8.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 59cd11e22a83085e857cccfde36ad09e0e496065ee971df8e2331d1d245d01e5 |
| MD5 | 00e5da4d7559eb8d94c8ac09d40b3043 |
| BLAKE2b-256 | f0ea13acd73d67446b6bfacfdc22b0ac6169121a277bcbe1f357b494cd7b7a15 |
File details
Details for the file geomux-0.5.4-py3-none-any.whl.
File metadata
- Download URL: geomux-0.5.4-py3-none-any.whl
- Upload date:
- Size: 11.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.8.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b21f1a49f24aed09ac49c09a95a91aa0a43e3dd7778a9785d99c2d462b423cb7 |
| MD5 | 6c39ac5a1053837a48979a43e6876e3e |
| BLAKE2b-256 | bf1633d269486e60234f5e065d7e62893132388a46af91fb28a709d947c5d0c3 |