A Python package for perturbational single-cell data analysis

pyturbseq

Python 3.9+ | License: MIT

Author: Aidan Winters
Email: aidanw@arcinstitute.org

Description

pyturbseq is a Python package for processing and analyzing single-cell perturbation data, with a particular focus on CRISPR-based perturbation screens. It provides tools for:

  • Perturbation calling and quality control: Identify and validate perturbations in single cells
  • Differential expression analysis: Compare gene expression between perturbed and control cells
  • Interaction analysis: Detect and quantify genetic interactions in dual perturbation experiments
  • Visualization: Generate publication-ready plots for perturbation screen analysis
  • Data processing: Handle large-scale single-cell datasets with efficient algorithms

Key Features

  • Multi-perturbation support: Handle single and dual perturbation experiments
  • Flexible data input: Compatible with standard single-cell formats (AnnData, h5ad)
  • Statistical analysis: Robust statistical methods for differential expression and interaction detection
  • Scalable processing: Efficient algorithms for large datasets
  • Comprehensive visualization: Rich plotting functions for data exploration and publication

Installation

Requirements

  • Python 3.9 or higher
  • Tested extensively with Python 3.9-3.13

Install from PyPI (recommended)

pip install pyturbseq

Install from source

git clone https://github.com/aidanwinters/pyturbseq.git
cd pyturbseq
pip install -e .

Development installation

For development with testing and linting tools:

pip install -e ".[dev,test]"
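
After any of the installation routes above, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of pyturbseq.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pyturbseq")
    print(f"pyturbseq version: {found}" if found else "pyturbseq is not installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that need to decide whether to install the package.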

Quick Start

import pyturbseq as prtb
import scanpy as sc

# Load your single-cell perturbation data
adata = sc.read_h5ad("your_perturbation_data.h5ad")

# Generate perturbation matrix
prtb.utils.get_perturbation_matrix(adata, perturbation_col='feature_call')

# Calculate target gene changes
prtb.utils.calculate_target_change(adata, perturbation_column='feature_call')

# Perform differential expression analysis
# Single comparison: compare specific perturbation vs control
deg_results = prtb.de.get_degs(adata, design_col='feature_call', control_value='NTC')

# Multiple comparisons: test all perturbations vs control
all_deg_results = prtb.de.get_all_degs(adata, design_col='feature_call', control_value='NTC')

# Analyze genetic interactions (for dual perturbation data)
# Single interaction: analyze specific dual perturbation
result, prediction = prtb.interaction.norman_model(adata, 'GENE1|GENE2')

# Multiple interactions: analyze specific dual perturbations
# (placeholder labels; replace with the dual perturbations in your data)
dual_perturbation_list = ['GENE1|GENE2', 'GENE3|GENE4']
interaction_results = prtb.interaction.norman_model(adata, dual_perturbation_list)

# Auto-detect and analyze all dual perturbations (default behavior)
interaction_results = prtb.interaction.norman_model(adata)

# Parallel processing for large datasets
interaction_results = prtb.interaction.norman_model(adata, parallel=True, processes=8)

# Generate visualizations
prtb.plot.target_change_heatmap(adata, perturbation_column='feature_call')
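
The Quick Start refers to dual perturbations by labels such as 'GENE1|GENE2'. As an illustration of how such labels could be collected from a column of perturbation calls, here is a small standard-library sketch. It assumes that '|' separates the two targets and that a label containing the control value 'NTC' should not count as a dual perturbation. Both are assumptions made for this example, and the helper names are hypothetical rather than pyturbseq functions.

```python
from typing import Iterable, List, Tuple

DELIMITER = "|"  # assumed separator, as in 'GENE1|GENE2'
CONTROL = "NTC"  # control label used in the Quick Start


def split_label(label: str) -> Tuple[str, ...]:
    """Split a perturbation label into its individual targets."""
    return tuple(part for part in label.split(DELIMITER) if part)


def find_dual_labels(labels: Iterable[str]) -> List[str]:
    """Return the unique labels naming exactly two non-control targets, sorted."""
    duals = set()
    for label in labels:
        targets = split_label(label)
        if len(targets) == 2 and CONTROL not in targets:
            duals.add(label)
    return sorted(duals)


# Toy perturbation calls, one per cell.
labels = ["NTC", "GENE1", "GENE2", "GENE1|GENE2", "GENE1|NTC", "GENE1|GENE2"]
dual_perturbation_list = find_dual_labels(labels)
print(dual_perturbation_list)  # ['GENE1|GENE2']
```

With real data, the input would be the values of the perturbation column (for example `adata.obs['feature_call']`).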

Main Modules

  • utils: Core utilities for data processing and perturbation matrix generation
  • de: Differential expression analysis tools
  • interaction: Genetic interaction analysis for dual perturbation experiments
  • plot: Visualization functions for perturbation screen data
  • calling: Perturbation calling and quality control
  • cellranger: Integration with Cell Ranger outputs
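
The name `norman_model` suggests the approach of Norman et al. (2019), in which the expression change caused by a dual perturbation is modeled as a weighted sum of the two single-perturbation changes. The sketch below illustrates that general idea only. It is an assumption about the concept, not pyturbseq's implementation: it uses plain ordinary least squares with no intercept, and the function name `fit_additive_model` is hypothetical.

```python
from typing import Sequence, Tuple


def fit_additive_model(
    delta_a: Sequence[float],
    delta_b: Sequence[float],
    delta_ab: Sequence[float],
) -> Tuple[float, float]:
    """Least-squares fit of delta_ab ~ c_a * delta_a + c_b * delta_b (no intercept)."""
    if not (len(delta_a) == len(delta_b) == len(delta_ab)):
        raise ValueError("all three profiles must cover the same genes")
    # Entries of the 2x2 normal equations (X^T X) c = X^T y.
    s_aa = sum(a * a for a in delta_a)
    s_bb = sum(b * b for b in delta_b)
    s_ab = sum(a * b for a, b in zip(delta_a, delta_b))
    s_ay = sum(a * y for a, y in zip(delta_a, delta_ab))
    s_by = sum(b * y for b, y in zip(delta_b, delta_ab))
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("single-perturbation profiles are collinear")
    # Solve the 2x2 system with Cramer's rule.
    c_a = (s_ay * s_bb - s_by * s_ab) / det
    c_b = (s_by * s_aa - s_ay * s_ab) / det
    return c_a, c_b


# Toy profiles over four genes: change in expression relative to control.
delta_a = [1.0, 0.0, 2.0, -1.0]
delta_b = [0.0, 1.0, 1.0, 3.0]
# A dual profile built as 0.5 * a + 1.5 * b, so the fit should recover those weights.
delta_ab = [0.5 * a + 1.5 * b for a, b in zip(delta_a, delta_b)]

c_a, c_b = fit_additive_model(delta_a, delta_b, delta_ab)
residual = [y - (c_a * a + c_b * b) for a, b, y in zip(delta_a, delta_b, delta_ab)]
print(round(c_a, 6), round(c_b, 6))
```

Genes with a large residual are the ones the additive model does not explain, which is one way to flag a candidate interaction. A robust regressor may be preferable to ordinary least squares on noisy single-cell data.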

Documentation

For detailed documentation and tutorials, see the included Tutorial.ipynb notebook, which demonstrates:

  • Data loading and preprocessing
  • Perturbation calling
  • Differential expression analysis
  • Interaction analysis
  • Visualization workflows

Citation

If you use pyturbseq in your research, please cite:

Winters, A. (2024). pyturbseq: A Python package for perturbational single-cell data analysis.

Support

For questions and support, contact the author at aidanw@arcinstitute.org.

License

This project is licensed under the MIT License - see the LICENSE file for details.
