A Python package for perturbational single-cell data analysis
pyturbseq
Author: Aidan Winters
Email: aidanw@arcinstitute.org
Description
pyturbseq is a comprehensive Python package for processing and analysis of single-cell perturbation data, with a particular focus on CRISPR-based perturbation screens. The package provides tools for:
- Perturbation calling and quality control: Identify and validate perturbations in single cells
- Differential expression analysis: Compare gene expression between perturbed and control cells
- Interaction analysis: Detect and quantify genetic interactions in dual perturbation experiments
- Visualization: Generate publication-ready plots for perturbation screen analysis
- Data processing: Handle large-scale single-cell datasets with efficient algorithms
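The differential expression item above comes down to contrasting perturbed cells with control cells. As a rough, library-independent illustration of that idea (not pyturbseq's method), the sketch below computes a log2 fold change of mean expression for a single gene. The function name `log2_fold_change`, the pseudocount, and the toy values are all assumptions made for illustration.

```python
import math
from statistics import mean


def log2_fold_change(perturbed, control, pseudocount=1.0):
    """Log2 ratio of mean expression, perturbed over control.

    The pseudocount keeps the ratio finite when a group mean is zero.
    """
    if not perturbed or not control:
        raise ValueError("both groups need at least one cell")
    ratio = (mean(perturbed) + pseudocount) / (mean(control) + pseudocount)
    return math.log2(ratio)


# Toy normalized expression values for one gene (hypothetical numbers).
control_cells = [7.0, 7.0, 7.0, 7.0]
knockdown_cells = [1.0, 1.0, 1.0, 1.0]

lfc = log2_fold_change(knockdown_cells, control_cells)
print(f"log2 fold change: {lfc:.2f}")
```

A negative value means lower expression in the perturbed group. A real analysis would also need a statistical test across cells, which this sketch leaves out.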
Key Features
- Multi-perturbation support: Handle single and dual perturbation experiments
- Flexible data input: Compatible with standard single-cell formats (AnnData, h5ad)
- Statistical analysis: Robust statistical methods for differential expression and interaction detection
- Scalable processing: Efficient algorithms for large datasets
- Comprehensive visualization: Rich plotting functions for data exploration and publication
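One common way to quantify a genetic interaction is to ask how well the expression change under a dual perturbation is explained by a weighted sum of the two single-perturbation changes. The sketch below fits that additive model by ordinary least squares in plain Python. It is a generic illustration, and an assumption about the kind of model involved, not pyturbseq's implementation; `fit_additive_model` and the toy vectors are hypothetical.

```python
def fit_additive_model(delta_a, delta_b, delta_ab):
    """Least-squares fit of delta_ab ~ c_a * delta_a + c_b * delta_b (no intercept).

    Each argument is a per-gene vector of expression change versus control.
    Returns (c_a, c_b, residual), where residual is the part of delta_ab
    that the additive model does not explain.
    """
    if not (len(delta_a) == len(delta_b) == len(delta_ab)):
        raise ValueError("all vectors must cover the same genes")

    # Sums that make up the 2x2 normal equations.
    s_aa = sum(a * a for a in delta_a)
    s_bb = sum(b * b for b in delta_b)
    s_ab = sum(a * b for a, b in zip(delta_a, delta_b))
    s_ay = sum(a * y for a, y in zip(delta_a, delta_ab))
    s_by = sum(b * y for b, y in zip(delta_b, delta_ab))

    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("single-perturbation profiles are collinear")

    c_a = (s_ay * s_bb - s_ab * s_by) / det
    c_b = (s_aa * s_by - s_ab * s_ay) / det
    residual = [y - (c_a * a + c_b * b)
                for a, b, y in zip(delta_a, delta_b, delta_ab)]
    return c_a, c_b, residual


# Toy per-gene changes (hypothetical): the dual profile is built as
# 0.5 * A + 1.5 * B, so the fit should recover those two weights.
delta_a = [1.0, 0.0, 2.0, -1.0]
delta_b = [0.0, 1.0, 1.0, 2.0]
delta_ab = [0.5 * a + 1.5 * b for a, b in zip(delta_a, delta_b)]

c_a, c_b, residual = fit_additive_model(delta_a, delta_b, delta_ab)
print(f"c_a={c_a:.2f}, c_b={c_b:.2f}")
```

Large residuals, or coefficients far from 1, would point to a non-additive interaction. Ordinary least squares is used here only to keep the example short; a robust regressor may be preferable on noisy data.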
Installation
Requirements
- Python 3.9 or higher
- Tested extensively with Python 3.9-3.13
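A quick way to confirm that an interpreter meets the stated minimum is sketched below. The helper name `python_is_supported` is hypothetical and only the standard library is used.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the requirements above


def python_is_supported(version_info=None):
    """Return True when the interpreter meets the stated minimum version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if python_is_supported():
    print("Python version is supported")
else:
    print("Python 3.9 or higher is required")
```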
Install from PyPI (recommended)
pip install pyturbseq
Install from source
git clone https://github.com/aidanwinters/pyturbseq.git
cd pyturbseq
pip install -e .
Development installation
For development with testing and linting tools:
pip install -e ".[dev,test]"
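After any of the install routes above, the installed version can be checked from Python with the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper, and the distribution name is assumed to match the `pip install pyturbseq` command.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pyturbseq")
if version is None:
    print("pyturbseq is not installed in this environment")
else:
    print(f"pyturbseq {version} is installed")
```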
Quick Start
import pyturbseq as prtb
import scanpy as sc
# Load your single-cell perturbation data
adata = sc.read_h5ad("your_perturbation_data.h5ad")
# Generate perturbation matrix
prtb.utils.get_perturbation_matrix(adata, perturbation_col='feature_call')
# Calculate target gene changes
prtb.utils.calculate_target_change(adata, perturbation_column='feature_call')
# Perform differential expression analysis
# Single comparison: compare specific perturbation vs control
deg_results = prtb.de.get_degs(adata, design_col='feature_call', control_value='NTC')
# Multiple comparisons: test all perturbations vs control
all_deg_results = prtb.de.get_all_degs(adata, design_col='feature_call', control_value='NTC')
# Analyze genetic interactions (for dual perturbation data)
# NOTE: `data` is not defined earlier in this snippet; substitute the dataset
# that holds your dual perturbations (which input type norman_model expects
# is not stated here).
# Single interaction: analyze one specific dual perturbation
result, prediction = prtb.interaction.norman_model(data, 'GENE1|GENE2')
# Multiple interactions: analyze a chosen list of dual perturbations
dual_perturbation_list = ['GENE1|GENE2', 'GENE3|GENE4']  # placeholder labels
interaction_results = prtb.interaction.norman_model(data, dual_perturbation_list)
# Auto-detect and analyze all dual perturbations (default behavior)
interaction_results = prtb.interaction.norman_model(data)
# Parallel processing for large datasets
interaction_results = prtb.interaction.norman_model(data, parallel=True, processes=8)
# Generate visualizations
prtb.plot.target_change_heatmap(adata, perturbation_column='feature_call')
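The quick start writes a dual perturbation as `'GENE1|GENE2'`. Assuming `|` is the separator between targets, which is inferred from that example and not confirmed for every dataset, the sketch below shows how dual perturbation labels could be picked out of a column of calls before building a list to analyze. The helper names and the toy labels are hypothetical and are not part of pyturbseq.

```python
SEPARATOR = "|"  # assumed from the 'GENE1|GENE2' example


def split_label(label, separator=SEPARATOR):
    """Split a perturbation label into its individual targets."""
    return tuple(part for part in label.split(separator) if part)


def find_dual_perturbations(labels, separator=SEPARATOR):
    """Return the sorted unique labels that name exactly two targets."""
    return sorted(
        {label for label in labels if len(split_label(label, separator)) == 2}
    )


# Toy per-cell calls (hypothetical), as they might appear in a 'feature_call' column.
calls = ["NTC", "GENE1", "GENE2", "GENE1|GENE2", "GENE1|GENE2", "GENE3|GENE4"]

duals = find_dual_perturbations(calls)
print(duals)
```

With a real AnnData object, the same function could be fed `adata.obs['feature_call']`, provided the column stores labels in this form.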
Main Modules
- utils: Core utilities for data processing and perturbation matrix generation
- de: Differential expression analysis tools
- interaction: Genetic interaction analysis for dual perturbation experiments
- plot: Visualization functions for perturbation screen data
- calling: Perturbation calling and quality control
- cellranger: Integration with Cell Ranger outputs
Documentation
For detailed documentation and tutorials, see the included Tutorial.ipynb notebook which demonstrates:
- Data loading and preprocessing
- Perturbation calling
- Differential expression analysis
- Interaction analysis
- Visualization workflows
Citation
If you use pyturbseq in your research, please cite:
Winters, A. (2024). pyturbseq: A Python package for perturbational single-cell data analysis.
Support
For questions and support:
- Open an issue on GitHub
- Email: aidanw@arcinstitute.org
License
This project is licensed under the MIT License - see the LICENSE file for details.
Download files
File details
Details for the file pyturbseq-0.1.0.tar.gz.
File metadata
- Download URL: pyturbseq-0.1.0.tar.gz
- Upload date:
- Size: 50.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d741194655562cfe16ef1e1b7dff895af6998fc39abacabae38c32cf5fc3a6ba |
| MD5 | 3ab0b6149aa70fe7149720d1af4ae27f |
| BLAKE2b-256 | 81c87d5fda20c67dfcfcb004ee5788b8b77a84b7e96caa6719011e130922aa4d |
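A downloaded archive can be checked against the SHA256 digest listed above using Python's `hashlib`. The sketch below assumes the file was saved under its published name in the current directory; `sha256_of_file` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the file hashes listed for the source distribution.
EXPECTED_SHA256 = "d741194655562cfe16ef1e1b7dff895af6998fc39abacabae38c32cf5fc3a6ba"
archive = Path("pyturbseq-0.1.0.tar.gz")  # assumed download location

if archive.exists():
    matches = sha256_of_file(archive) == EXPECTED_SHA256
    print("checksum OK" if matches else "checksum MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

The same function works for the wheel by swapping in its file name and SHA256 digest.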
File details
Details for the file pyturbseq-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pyturbseq-0.1.0-py3-none-any.whl
- Upload date:
- Size: 43.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ddda15d5b86900f5d9de0395a8770a6302668cd619b72eaf9bf6ce284733fc0f |
| MD5 | 812532061f8e23809773002720cd20fe |
| BLAKE2b-256 | 9171e6e1ba0693d463368a50411b5aeaed9f760d2586139aac880fa6a091342c |