Single-Cell Feature Profiler

License: MIT

A powerful, fast, and multi-interface Python package for deep characterization of single-cell feature expression patterns.

scfeatureprofiler provides a suite of statistical tools to analyze single-cell data (e.g., scRNA-seq, CITE-seq) and answer two fundamental questions:

  1. Feature Activity: In which cell groups is a feature actively expressed?
  2. Marker Discovery: Which features are specific markers for each cell group?
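
To make the first question concrete, the sketch below computes, for a toy cells-by-genes table, the percentage of cells in each group with nonzero expression of each feature. It uses only pandas. The toy data, the group labels, and the `pct_expressing_by_group` helper are hypothetical; this illustrates the idea and is not the package's own implementation.

```python
import pandas as pd


def pct_expressing_by_group(expr: pd.DataFrame, groups: pd.Series) -> pd.DataFrame:
    """Percentage of cells per group with expression > 0 (hypothetical helper)."""
    expressed = expr > 0  # boolean cells-by-genes table
    # The mean of a boolean column is the fraction of True values in each group.
    return expressed.groupby(groups).mean() * 100


# Toy data: four cells, two features, two groups
expr = pd.DataFrame(
    {"CD4": [5, 3, 0, 0], "MS4A1": [0, 0, 4, 0]},
    index=["cell1", "cell2", "cell3", "cell4"],
)
groups = pd.Series(["T", "T", "B", "B"], index=expr.index, name="group")

pct = pct_expressing_by_group(expr, groups)
print(pct)
```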

The package is designed for performance, with a parallelized backend that can handle extremely large datasets, including out-of-core analysis for data that doesn't fit into memory.

Key Features

  • Multi-Interface: Use it as a Python library in your Jupyter notebooks or as a command-line tool for script-based workflows.
  • Flexible Input: Works directly with AnnData objects, pandas.DataFrame, or numpy arrays.
  • Comprehensive Statistics: Calculates normalized expression scores, percentage of expressing cells, specificity scores (Tau and Gini), and robust statistical significance (FDR).
  • High Performance: Parallelized using joblib to use all available CPU cores for rapid analysis.
  • Scalable: Natively supports out-of-core computation for AnnData objects stored on disk, enabling analysis of millions of cells.
  • Lightweight: Minimal dependencies, making it easy to integrate into existing analysis environments.
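
For readers unfamiliar with the two specificity scores named above, here is a minimal sketch of the textbook definitions of the Tau index and the Gini coefficient, applied to one feature's mean expression across groups. It assumes non-negative inputs. The function names `tau_index` and `gini_index` are hypothetical, and the package may compute or normalize these scores differently.

```python
import numpy as np


def tau_index(x):
    """Tau specificity: 0 for uniform expression, 1 for a single expressing group."""
    x = np.asarray(x, dtype=float)
    if x.size < 2 or x.max() <= 0:
        return 0.0  # undefined cases are reported as 0 in this sketch
    return float(np.sum(1.0 - x / x.max()) / (x.size - 1))


def gini_index(x):
    """Gini coefficient: 0 for uniform expression, (n - 1) / n at most for n groups."""
    x = np.sort(np.asarray(x, dtype=float))  # ascending order
    n = x.size
    total = x.sum()
    if n == 0 or total <= 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)


# Mean expression of one feature in four groups (toy values)
specific = [0.0, 0.0, 0.0, 1.0]   # expressed in one group only
uniform = [1.0, 1.0, 1.0, 1.0]    # expressed equally everywhere

print(tau_index(specific), gini_index(specific))
print(tau_index(uniform), gini_index(uniform))
```

Both scores rise as expression concentrates in fewer groups, which is why they are useful for ranking candidate markers.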

Installation

You can install scfeatureprofiler directly from PyPI:

pip install scfeatureprofiler

To include support for AnnData objects, install with the [anndata] extra:

pip install "scfeatureprofiler[anndata]"

To install all dependencies for development, use:

# Clone the repository first
git clone https://github.com/zqzneptune/SingleCellFeatureProfiler.git
cd SingleCellFeatureProfiler
pip install -e ".[all]"

Quick Start

scfeatureprofiler is designed to be intuitive. Here are two examples for the most common use cases.

1. Python API: Find Marker Genes for Clusters

This is the most common use case inside a Jupyter notebook after you have performed clustering.

import scanpy as sc
from scfeatureprofiler import find_marker_features

# 1. Load your clustered single-cell data
#    (This example assumes you have an AnnData object)
adata = sc.read_h5ad("path/to/your_clustered_data.h5ad")

# 2. Find marker features for your clusters
#    'leiden' is the column in adata.obs containing cluster labels.
marker_dict = find_marker_features(
    data=adata,
    group_by='leiden'
)

# 3. Print the results
for cluster, markers in marker_dict.items():
    print(f"Cluster {cluster} Markers: {', '.join(markers[:10])}...")

# 4. (Optional) Use the results directly with Scanpy for plotting
#    (scanpy is already imported as sc above)
sc.pl.dotplot(adata, marker_dict, groupby='leiden')
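
The loop above implies that the result maps each cluster label to a list of feature names. Under that assumption, the sketch below flattens such a dictionary into a long-format table that is easy to save or join with other annotations. The dictionary contents here are a hypothetical stand-in, not real output from `find_marker_features`.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary returned by find_marker_features
marker_dict = {
    "0": ["CD4", "IL7R"],
    "1": ["MS4A1", "CD79A"],
    "2": ["GNLY"],
}

# One row per (cluster, feature); "position" is the feature's place in its list
marker_table = pd.DataFrame(
    [
        {"cluster": cluster, "position": position, "feature": feature}
        for cluster, markers in marker_dict.items()
        for position, feature in enumerate(markers, start=1)
    ]
)
print(marker_table)
```

Calling `marker_table.to_csv("markers.csv", index=False)` would then write the table to disk.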

2. Command-Line (CLI): Get Full Profiles for a Gene List

Use the CLI when you have a CSV file of expression data and want a detailed statistical report for a few genes of interest without writing a script.

Input Files:

  • expression.csv: A cells-by-genes matrix.
  • cell_groups.csv: A file mapping cell IDs to group labels.
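
As a rough illustration of the two inputs, the sketch below writes a tiny cells-by-genes matrix and a matching cell-to-group mapping with pandas. The column names `cell_id` and `group`, the toy values, and the temporary output directory are assumptions made for this example; check `scprofiler profile --help` for the layout the tool actually expects.

```python
import tempfile
from pathlib import Path

import pandas as pd

out_dir = Path(tempfile.mkdtemp())  # temporary folder for the example files

# Cells in rows, genes in columns; the first CSV column holds the cell IDs
expression = pd.DataFrame(
    {"CD4": [5, 3, 0], "CD8A": [0, 0, 7], "MS4A1": [0, 1, 0]},
    index=pd.Index(["cell1", "cell2", "cell3"], name="cell_id"),
)
expression.to_csv(out_dir / "expression.csv")

# One row per cell, mapping its ID to a group label (assumed column names)
cell_groups = pd.DataFrame(
    {
        "cell_id": ["cell1", "cell2", "cell3"],
        "group": ["T-cell Helper", "T-cell Helper", "T-cell Cytotoxic"],
    }
)
cell_groups.to_csv(out_dir / "cell_groups.csv", index=False)

# Sanity check: every cell in the matrix has exactly one group label
assert set(cell_groups["cell_id"]) == set(expression.index)
print(sorted(p.name for p in out_dir.iterdir()))
```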

Command:

scprofiler profile \
    --input example/expression.csv \
    --group-by example/cell_groups.csv \
    --features "CD4,CD8A,GNLY,MS4A1" \
    --output gene_profiles.csv

Output (gene_profiles.csv): This will produce a detailed CSV file with statistics for each gene in each cell group, ready for analysis in Excel or another program.

feature_id,group,norm_score,pct_expressing,fdr_presence,fdr_marker,...
CD4,T-cell Helper,0.98,85.4,1.2e-50,3.4e-30,...
CD4,B-cell,0.05,2.1,0.89,1.0,...
CD8A,T-cell Cytotoxic,0.99,92.1,4.5e-60,8.1e-45,...
...
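
The report can be post-processed like any other CSV. The sketch below loads the three sample rows shown above (without the elided columns) and keeps the feature and group pairs whose `fdr_marker` falls below 0.05. The 0.05 cutoff is an illustrative assumption, not a threshold prescribed by the package.

```python
import io

import pandas as pd

# Rows copied from the sample output; the elided trailing columns are omitted
sample = io.StringIO(
    "feature_id,group,norm_score,pct_expressing,fdr_presence,fdr_marker\n"
    "CD4,T-cell Helper,0.98,85.4,1.2e-50,3.4e-30\n"
    "CD4,B-cell,0.05,2.1,0.89,1.0\n"
    "CD8A,T-cell Cytotoxic,0.99,92.1,4.5e-60,8.1e-45\n"
)
profiles = pd.read_csv(sample)

# Keep significant markers, most significant first
markers = (
    profiles[profiles["fdr_marker"] < 0.05]
    .sort_values("fdr_marker")
    .loc[:, ["feature_id", "group", "norm_score", "pct_expressing"]]
)
print(markers)
```

With a real report, replace `sample` with the path to `gene_profiles.csv`.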

Available CLI Commands

  • scprofiler profile: Generate a full statistical profile for features.
  • scprofiler activity: Identify the groups in which each listed feature is "ON".

Use scprofiler --help or scprofiler profile --help for a full list of options.
