Narwhals-powered, DataFrame-style selection, filtering and indexing operations on AnnData objects.

annsel

Annsel is a user-friendly library that brings familiar dataframe-style operations to AnnData objects.

It's built on the narwhals compatibility layer for dataframes.

Take a look at the GitHub Projects board for features and future plans: Annsel Features

Getting started

Please refer to the documentation, in particular, the API documentation.

There's also a brief tutorial on how to use all the features of annsel: All of Annsel.

Installation

You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv. There are several ways to install annsel:

  1. Install the most recent release:

    With uv:

    uv add annsel
    

    With pip:

    pip install annsel
    
  2. Install the latest development version:

    With uv:

    uv add git+https://github.com/srivarra/annsel
    

    With pip:

    pip install git+https://github.com/srivarra/annsel.git@main
    

Examples

annsel comes with a small dataset from Cell X Gene to help you get familiar with the API.

import annsel as an

adata = an.datasets.leukemic_bone_marrow_dataset()

The dataset looks like this:

AnnData object with n_obs × n_vars = 31586 × 458
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'

Filter

You can filter on obs, var, var_names, obs_names, X and its layers, as well as obsm and varm matrices, which take a key-value pair containing the attribute's key name and the predicate to filter on. Currently, the column names for obsm and varm matrices are their numerical indices.

adata.an.filter(
    obs=(
        an.col(["Cell_label"]).is_in(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"]),
        an.col(["sex"]) == "male",
    ),
    var=an.col(["vst.mean"]) >= 3,
    obsm={"X_pca": an.col([0]) > 0}, # PC1 values greater than 0
    copy=False, # Whether to return a copy of the AnnData object or just a view of it.
)
View of AnnData object with n_obs × n_vars = 736 × 67
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
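
To build intuition for what the predicates above express, here is a rough pandas and NumPy sketch of the same idea on toy data. It is not annsel's implementation. It assumes that multiple predicates are combined with a logical AND, and it shows why a bare obsm matrix ends up with integer column names. All values, cell names and variable names below are made up.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for adata.obs and adata.obsm["X_pca"] (hypothetical values).
obs = pd.DataFrame(
    {
        "Cell_label": [
            "Classical Monocytes",
            "B cells",
            "Classical Monocytes",
            "Classical Monocytes",
        ],
        "sex": ["male", "male", "female", "male"],
    },
    index=["cell_0", "cell_1", "cell_2", "cell_3"],
)
x_pca = np.array([[1.5, 0.2], [0.7, -0.1], [2.0, 0.3], [-0.4, 0.9]])

# Wrapping a bare matrix in a DataFrame gives integer column labels 0, 1, ...
pca_df = pd.DataFrame(x_pca, index=obs.index)

# One boolean mask per predicate.
label_mask = obs["Cell_label"].isin(["Classical Monocytes"])
sex_mask = obs["sex"] == "male"
pc1_mask = pca_df[0] > 0  # column 0 plays the role of PC1

# Assumption: predicates are AND-ed together.
keep = label_mask & sex_mask & pc1_mask
kept_cells = obs.index[keep].tolist()
print(kept_cells)  # ['cell_0']
```

Only cell_0 satisfies all three conditions, so it is the only observation kept.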

Select

You can select on obs, var, var_names, obs_names, X and its layers. Selecting returns a new AnnData object. This is useful if you don't need all the columns in obs or var and just want to work with a few.

adata.an.select(
    obs=an.col(["Cell_label"]),
    var=an.col(["vst.mean", "vst.variance"]),
)
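
As a point of comparison, selecting keeps every row and narrows the columns, whereas filtering keeps every column and narrows the rows. The pandas sketch below illustrates this on a toy stand-in for adata.var. The gene names and values are hypothetical, and this is not how annsel is implemented.

```python
import pandas as pd

# Toy stand-in for adata.var (hypothetical values).
var = pd.DataFrame(
    {
        "vst.mean": [3.2, 0.4],
        "vst.variance": [1.1, 0.2],
        "feature_type": ["Gene Expression", "Gene Expression"],
    },
    index=["GENE_A", "GENE_B"],
)

# Keep two columns; the rows (variables) are untouched.
selected = var[["vst.mean", "vst.variance"]]
print(selected.shape)  # (2, 2)
```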

Group By

You can group over obs and var columns. This returns a generator of objects, each containing the grouped data and the grouping parameters.

gb_adata_result = adata.an.group_by(
    obs=an.col(["Cell_label"]),
    var=an.col(["feature_type"]),
    copy=False,
)

Here's what the first group looks like:

next(adata.an.group_by(
    obs=an.col(["Cell_label"]),
    copy=False,
))
GroupByAnnData:
  ├── Observations:
     └── Cell_label: Lymphomyeloid prog
  ├── Variables:
     └── (all variables)
  └── AnnData:
      View of AnnData object with n_obs × n_vars = 913 × 458
          obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
          var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
          uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
          obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
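
Because the result is a generator, groups are produced one at a time and can only be consumed once. The standard-library sketch below illustrates that behaviour on a plain list of labels. The function group_rows and its output shape are hypothetical, and the group order in annsel may differ.

```python
from collections import defaultdict
from collections.abc import Iterator


def group_rows(labels: list[str]) -> Iterator[tuple[str, list[int]]]:
    """Yield (label, row positions) pairs, one group at a time."""
    positions: dict[str, list[int]] = defaultdict(list)
    for i, label in enumerate(labels):
        positions[label].append(i)
    # Groups are yielded in order of first appearance.
    for label, idx in positions.items():
        yield label, idx


groups = group_rows(["T cell", "B cell", "T cell", "Monocyte"])
first = next(groups)  # take the first group, as in the example above
rest = list(groups)   # the remaining groups; the generator is now exhausted
print(first)  # ('T cell', [0, 2])
print(rest)   # [('B cell', [1]), ('Monocyte', [3])]
```

If you need to go over the groups more than once, collect them into a list first or call group_by again.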

Pipe

There's also a small utility method called pipe, which lets you chain operations together as in Xarray and Pandas.

import scanpy as sc
adata.an.pipe(sc.pl.embedding, basis="X_tsneni", color="Cell_label")
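
The call above reads like sc.pl.embedding(adata, basis="X_tsneni", color="Cell_label"). The following standard-library sketch shows the general pattern, assuming pipe follows the pandas convention of passing the wrapped object as the first positional argument. ToyAccessor is a hypothetical stand-in and not annsel's actual class.

```python
from typing import Any, Callable


class ToyAccessor:
    """Hypothetical stand-in for the `.an` accessor; not annsel's actual class."""

    def __init__(self, obj: Any) -> None:
        self._obj = obj

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Pass the wrapped object as the first positional argument.
        return func(self._obj, *args, **kwargs)


values = [3, 1, 2]
# Same as sorted(values, reverse=True).
result = ToyAccessor(values).pipe(sorted, reverse=True)
print(result)  # [3, 2, 1]
```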

Release notes

See the changelog.

Contact

For questions and help requests, you can reach out in the scverse discourse or the discussions tab. If you found a bug, please use the issue tracker.

Citation

Varra, S. R. annsel [Computer software]. https://github.com/srivarra/annsel
