Narwhals-powered DataFrame-style selection, filtering, and indexing operations on AnnData objects.
annsel
Annsel is a user-friendly library that brings familiar dataframe-style operations to AnnData objects.
It's built on the narwhals compatibility layer for dataframes.
Take a look at the GitHub Projects board for features and future plans: Annsel Features
Getting started
Please refer to the documentation, in particular, the API documentation.
There's also a brief tutorial on how to use all the features of annsel: All of Annsel.
Installation
You need to have Python 3.10 or newer installed on your system. If you don't have
Python installed, we recommend installing uv.
There are several ways to install annsel:
- Install the most recent release:
  With uv: uv add annsel
  With pip: pip install annsel
- Install the latest development version:
  With uv: uv add git+https://github.com/srivarra/annsel
  With pip: pip install git+https://github.com/srivarra/annsel.git@main
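After installing, a quick check along these lines can confirm that the interpreter meets the Python 3.10 requirement and that the package is visible. This is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are illustrative and not part of annsel.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def python_is_supported(minimum=(3, 10)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("annsel version:", installed_version("annsel"))
```

If `installed_version("annsel")` returns `None`, the package is not installed in the active environment.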
Examples
annsel comes with a small dataset from Cell X Gene to help you get familiar with the API.
import annsel as an
adata = an.datasets.leukemic_bone_marrow_dataset()
The dataset looks like this:
AnnData object with n_obs × n_vars = 31586 × 458
obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
Filter
You can filter on obs, var, var_names, obs_names, X and its layers, as well as on obsm and varm matrices. For obsm and varm, pass a key-value pair containing the attribute's key name and the predicate to filter on. Currently, the column names for obsm and varm matrices are numerical indices.
adata.an.filter(
    obs=(
        an.col(["Cell_label"]).is_in(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"]),
        an.col(["sex"]) == "male",
    ),
    var=an.col(["vst.mean"]) >= 3,
    obsm={"X_pca": an.col([0]) > 0},  # PC1 values greater than 0
    copy=False,  # Whether to return a copy of the AnnData object or just a view of it.
)
View of AnnData object with n_obs × n_vars = 736 × 67
obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
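To make the semantics concrete, here is a plain-Python sketch of how several predicates on the same axis can be combined into one boolean mask. It assumes that predicates passed together in a tuple are combined with a logical AND, as narwhals does for `filter`; that is an assumption about annsel's behaviour, and the `combine_predicates` helper and the toy rows are hypothetical.

```python
def combine_predicates(rows, predicates):
    """Return one boolean per row: True only if every predicate holds."""
    return [all(pred(row) for pred in predicates) for row in rows]


# Toy stand-in for a few rows of adata.obs (values are made up).
obs_rows = [
    {"Cell_label": "Classical Monocytes", "sex": "male"},
    {"Cell_label": "Classical Monocytes", "sex": "female"},
    {"Cell_label": "Lymphomyeloid prog", "sex": "male"},
]

wanted_labels = {"Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"}
mask = combine_predicates(
    obs_rows,
    [
        lambda row: row["Cell_label"] in wanted_labels,
        lambda row: row["sex"] == "male",
    ],
)

# Keep only the rows whose mask entry is True.
kept = [row for row, keep in zip(obs_rows, mask) if keep]
print(mask)
```

Only the first toy row satisfies both conditions, so it is the only one kept.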
Select
You can select on obs, var, var_names, obs_names, X and its layers. Selecting returns a new AnnData object. It's useful if you don't need all the columns in obs or var and just want to work with a few.
adata.an.select(
    obs=an.col(["Cell_label"]),
    var=an.col(["vst.mean", "vst.variance.standardized"]),
)
Group By
You can group over obs and var columns. This returns a generator of objects, each containing the grouped data and the grouping parameters.
gb_adata_result = adata.an.group_by(
    obs=an.col(["Cell_label"]),
    var=an.col(["feature_type"]),
    copy=False,
)
Here's what the first group looks like:
next(adata.an.group_by(
    obs=an.col(["Cell_label"]),
    copy=False,
))
GroupByAnnData:
├── Observations:
│ └── Cell_label: Lymphomyeloid prog
├── Variables:
│ └── (all variables)
└── AnnData:
View of AnnData object with n_obs × n_vars = 913 × 458
obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
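Because the result is a generator, groups are produced one at a time and the generator can only be consumed once. The sketch below illustrates that pattern in plain Python; `iter_groups` is a hypothetical stand-in, not annsel's implementation, and the labels are made up.

```python
from collections import defaultdict


def iter_groups(rows, key):
    """Yield (group_value, member_rows) pairs, one group at a time."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row)
    for value, members in buckets.items():
        yield value, members


rows = [
    {"Cell_label": "A", "cell": 0},
    {"Cell_label": "B", "cell": 1},
    {"Cell_label": "A", "cell": 2},
]

groups = iter_groups(rows, "Cell_label")
first_label, first_members = next(groups)  # peek at the first group
remaining = list(groups)                   # consumes the rest of the groups
exhausted = list(groups)                   # the generator is now empty
```

If you need to iterate over the groups more than once, collect them into a list first or call the grouping function again.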
Pipe
There's also a small utility method, pipe, which lets you chain operations together as in Xarray and pandas.
import scanpy as sc
adata.an.pipe(sc.pl.embedding, basis="X_tsneni", color="Cell_label")
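The pattern behind `pipe` is the same one pandas uses: `obj.pipe(func, *args, **kwargs)` calls `func(obj, *args, **kwargs)`. The following is a minimal sketch of that pattern on a toy class; `Wrapper`, `add`, and `scale` are hypothetical, and annsel's own implementation may differ.

```python
class Wrapper:
    """Toy container that exposes a pandas-style pipe method."""

    def __init__(self, value):
        self.value = value

    def pipe(self, func, *args, **kwargs):
        # Pass this object as the first argument and forward the rest.
        return func(self, *args, **kwargs)


def add(obj, amount):
    """Return a new Wrapper with `amount` added to the value."""
    return Wrapper(obj.value + amount)


def scale(obj, factor=1):
    """Return a new Wrapper with the value multiplied by `factor`."""
    return Wrapper(obj.value * factor)


# Chained calls read left to right instead of nesting scale(add(...)).
result = Wrapper(2).pipe(add, 3).pipe(scale, factor=10)
print(result.value)
```

Chaining works here because each function returns a new `Wrapper`; a function that returns something else, such as a plot, ends the chain.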
Release notes
See the changelog.
Contact
For questions and help requests, you can reach out in the scverse discourse or the discussions tab. If you found a bug, please use the issue tracker.
Citation
Varra, S. R. annsel [Computer software]. https://github.com/srivarra/annsel