A Python package for simulating cellular plasticity in single-cell data
PLASTRO
PLASTRO is a Python package for simulating and analyzing cellular plasticity in single-cell data. It provides comprehensive tools for studying how cells transition between different phenotypic states and how these transitions relate to lineage relationships.
Key Features
- Plasticity Simulation: Random walk plasticity and cluster-based transitions
- Lineage Tracing Integration: CRISPR-based lineage tracing simulation with Cassiopeia
- PLASTRO Score: Novel overlap-based metrics for quantifying cellular plasticity
- Phylogenetic Analysis: Neighbor-joining tree construction from single-cell data
- Data Simulation: Generate realistic synthetic datasets with branching differentiation
- High Performance: Optimized overlap computation (10-100x speedup over naive methods)
Installation
Quick Install
Install pybind11 before PLASTRO; it is needed to build graph-walker, a required dependency:
pip install pybind11
pip install plastro
From TestPyPI (Latest Development Version)
pip install pybind11
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ plastro
Development Install
git clone https://github.com/dpeerlab/PLASTRO.git
cd PLASTRO
pip install pybind11
pip install -e .
Conda Install (coming soon)
# Planned command; the conda package is not available yet
conda install -c conda-forge plastro
Quick Start
Basic PLASTRO Score Computation
import plastro
import pandas as pd
# Load your single-cell data and lineage tracing data
character_matrix = pd.read_csv('character_matrix.csv', index_col=0)
adata = plastro.load_data('single_cell_data.h5ad')
# Compute Gini-based plasticity scores (recommended)
gini_scores = plastro.PLASTRO_score(
    character_matrix=character_matrix,
    ad=adata,
    flavor='gini',
    latent_space_key='X_dc'  # or 'X_pca', 'X_umap'
)
print(f"Mean Gini plasticity score: {gini_scores['Gini_Index'].mean():.3f}")
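The README does not spell out how the 'gini' flavor turns overlaps into a score. As background only, here is a minimal, dependency-free sketch of the standard Gini coefficient for a list of non-negative values. The function name `gini_index` is hypothetical and this is not PLASTRO's implementation.

```python
def gini_index(values):
    """Gini coefficient of non-negative numbers (0 = perfectly even)."""
    xs = sorted(float(v) for v in values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on ascending-sorted values, ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


even = gini_index([1, 1, 1, 1])        # all values equal
concentrated = gini_index([0, 0, 0, 1])  # all mass in one entry
```

An even distribution gives 0, and concentrating all mass in one of n entries gives (n - 1) / n.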
Generate Synthetic Data with Plasticity
# Create synthetic single-cell dataset
n_leaves = 8
sample_res = 50
n_dim = 20
# Generate branching structure
sample_structure = plastro.create_random_binary_tree(n_leaves, sample_res)
full_simulated_ad = plastro.generate_ad(sample_structure, n_dim)
# Subset to terminal branches
ad = plastro.subset_to_terminal_branches(full_simulated_ad)
# Simulate lineage tracing
cass_tree = plastro.simulate_lineage_tracing(
    sim_ad=full_simulated_ad,
    terminal_ad=ad,
    latent_space_key='X_dc'
)
Simulate Cellular Plasticity
# Random walk plasticity
plastic_cells = {'6': 0.3, '5': 0.2}  # 30% of cluster 6, 20% of cluster 5
walk_lengths = {'6': 500, '5': 1000}
plastic_walk_ad = plastro.random_walk_plasticity(
    full_simulated_ad, ad, plastic_cells, walk_lengths
)
# Cluster switch plasticity
destination_clusters = {
    '11': {'destination': '7', 'proportion': 0.4},
    '6': {'destination': '10', 'proportion': 0.2}
}
plastic_leiden_ad = plastro.cluster_switch_plasticity(
    full_simulated_ad, ad, destination_clusters, column='leiden'
)
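To illustrate what a walk length means in the random walk example above, here is a minimal sketch of a fixed-length random walk on a toy neighbor graph. It is dependency-free and is not how PLASTRO or graph-walker implement it; the `random_walk` helper and the four-node graph are hypothetical.

```python
import random


def random_walk(neighbors, start, n_steps, seed=None):
    """Return the node reached after up to n_steps uniform random steps."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    node = start
    for _ in range(n_steps):
        options = neighbors.get(node, [])
        if not options:  # dead end: stop early
            break
        node = rng.choice(options)
    return node


# Toy path graph a - b - c - d standing in for a cell neighbor graph.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}
end = random_walk(graph, "a", n_steps=500, seed=0)
```

Longer walks let a cell end up further from its starting state, which is presumably why the example assigns a separate walk length to each cluster.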
Documentation & Examples
Example Notebooks
Complete example workflows are available in docs/notebooks/:
- Plasticity Simulation Example:
  - Generate synthetic single-cell data with branching differentiation
  - Simulate CRISPR-based lineage tracing
  - Apply random walk and cluster switch plasticity
  - Visualize phenotypic changes
- A second notebook covering:
  - In-depth analysis of lineage-phenotype relationships
  - Detailed explanation of overlap computation methods
  - Interpretation of PLASTRO scores
Key Documentation Sections
- Installation Guide: Detailed installation instructions
- API Reference: Complete function documentation
- Tutorials: Step-by-step guides
Core API
Main Functions
# PLASTRO Score Computation
plastro.PLASTRO_score(character_matrix, ad, flavor='gini')
plastro.PLASTRO_overlaps(character_matrix, ad, maximum_radius=500)
# Plasticity Simulation
plastro.random_walk_plasticity(full_ad, subset_ad, plastic_cells, walk_lengths)
plastro.cluster_switch_plasticity(full_ad, subset_ad, destination_clusters)
# Data Generation
plastro.create_random_binary_tree(n_leaves, sample_res)
plastro.generate_ad(sample_structure, n_dim)
plastro.simulate_lineage_tracing(sim_ad, terminal_ad)
# Distance Calculations
plastro.euclidean_distance(coordinates)
plastro.cosine_distance(coordinates)
plastro.manhattan_distance(coordinates)
plastro.archetype_distance(data, archetypes)
# Phylogenetic Analysis
plastro.neighbor_joining(distance_matrix, outgroup=None)
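The three basic metrics named above can be sketched in plain Python for reference. This is a sketch of the textbook definitions only: the helper names are hypothetical, and whether the PLASTRO functions return square matrices, or which input shapes they accept, is not stated in this README.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two points."""
    return math.dist(u, v)


def manhattan(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine(u, v):
    """One minus cosine similarity; undefined for zero vectors."""
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - sum(a * b for a, b in zip(u, v)) / norm


def pairwise(points, metric):
    """Square matrix (list of lists) of distances between all point pairs."""
    return [[metric(p, q) for q in points] for p in points]


coordinates = [(1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
distance_matrix = pairwise(coordinates, euclidean)
```

A square, symmetric matrix like `distance_matrix` is the usual input to neighbor-joining.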
Core Modules
- plastro.overlap: PLASTRO score computation and overlap analysis
- plastro.plasticity: Cellular plasticity simulation methods
- plastro.lineage_simulation: CRISPR-based lineage tracing simulation
- plastro.phenotype_simulation: Synthetic single-cell data generation
- plastro.phylo: Phylogenetic tree construction and analysis
Complete Example Workflow
import plastro
import pandas as pd
# 1. Generate synthetic data (or load your own)
sample_structure = plastro.create_random_binary_tree(n_leaves=8, sample_res=50)
full_simulated_ad = plastro.generate_ad(sample_structure, n_dim=20)
ad = plastro.subset_to_terminal_branches(full_simulated_ad)
# 2. Simulate lineage tracing
cass_tree = plastro.simulate_lineage_tracing(
    sim_ad=full_simulated_ad,
    terminal_ad=ad,
    latent_space_key='X_dc'
)
character_matrix = cass_tree.character_matrix
# 3. Simulate plasticity
plastic_cells = {'6': 0.3}  # fraction of cluster 6 to make plastic, as in the Quick Start
walk_lengths = {'6': 500}
plastic_ad = plastro.random_walk_plasticity(
    full_simulated_ad, ad, plastic_cells, walk_lengths
)
# 4. Compute PLASTRO scores
original_scores = plastro.PLASTRO_score(
    character_matrix, ad, flavor='gini'
)
plastic_scores = plastro.PLASTRO_score(
    character_matrix, plastic_ad, flavor='gini'
)
# 5. Compare plasticity
print(f"Original mean Gini score: {original_scores['Gini_Index'].mean():.3f}")
print(f"Plastic mean Gini score: {plastic_scores['Gini_Index'].mean():.3f}")
Dependencies
Core requirements:
- Python ≥ 3.10
- pybind11 ≥ 2.6.0 (required for building graph-walker)
- graph-walker ≥ 1.0.6 (essential for random walk functionality)
- NumPy ≥ 1.20.0
- Pandas ≥ 1.3.0
- SciPy ≥ 1.7.0
- scikit-learn ≥ 1.0.0
- scikit-bio ≥ 0.5.7 (for robust neighbor-joining trees)
- NetworkX ≥ 2.6.0
- matplotlib ≥ 3.4.0
- scanpy ≥ 1.8.0
- anndata ≥ 0.8.0
- ete3 ≥ 3.1.2 (for phylogenetic tree manipulation)
- tqdm ≥ 4.60.0
- seaborn ≥ 0.11.0
- icecream ≥ 2.1.0
Optional dependencies:
- cassiopeia-lineage (for advanced lineage tracing simulation)
- igraph (for Leiden clustering in phenotype simulation)
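To see which of the core requirements are already present in an environment, a small check with the standard library's `importlib.metadata` can help. This is a sketch: the `installed_versions` helper is hypothetical, the distribution names are copied from the list above, and it only reports installed versions without comparing them against the stated minimums.

```python
from importlib import metadata

# Distribution names copied from the core requirements list.
CORE_REQUIREMENTS = [
    "pybind11", "graph-walker", "numpy", "pandas", "scipy", "scikit-learn",
    "scikit-bio", "networkx", "matplotlib", "scanpy", "anndata", "ete3",
    "tqdm", "seaborn", "icecream",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(CORE_REQUIREMENTS).items():
        print(f"{name}: {version or 'not installed'}")
```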
Data Requirements
PLASTRO works with:
- AnnData objects containing single-cell data with dimensionality reduction coordinates
- Character matrices (pandas DataFrame) from CRISPR lineage tracing with cells as rows
- Distance matrices for lineage and phenotypic relationships
- Cluster annotations (leiden, louvain, etc.) for cluster-based plasticity
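Because the character matrix and the AnnData object describe the same cells, it can be worth checking that their cell names agree before scoring. The sketch below is a hypothetical helper, not part of PLASTRO. Whether PLASTRO aligns cells itself or requires matching order is not stated here.

```python
def check_cell_alignment(lineage_cells, phenotype_cells):
    """Report which cell names are shared and which appear on one side only."""
    lineage = list(lineage_cells)
    phenotype = list(phenotype_cells)
    lineage_set, phenotype_set = set(lineage), set(phenotype)
    return {
        "shared": [c for c in lineage if c in phenotype_set],
        "only_in_lineage": [c for c in lineage if c not in phenotype_set],
        "only_in_phenotype": [c for c in phenotype if c not in lineage_set],
    }


# Toy cell names standing in for real barcodes.
report = check_cell_alignment(
    ["cell_1", "cell_2", "cell_3"],
    ["cell_2", "cell_3", "cell_4"],
)
```

With real data, the two arguments would typically be `character_matrix.index` and `adata.obs_names`.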
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Documentation: plastro.readthedocs.io
- Issues: GitHub Issues
- Discussions: GitHub Discussions
PLASTRO - Comprehensive analysis of cellular plasticity in single-cell data
Download files
File details
Details for the file plastro-0.1.3.tar.gz.
File metadata
- Download URL: plastro-0.1.3.tar.gz
- Upload date:
- Size: 39.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a2cfe2039c7dc9b24edc1dc9d2dd83b4c3044f8bd4784e9b31edec29d8f8351c |
| MD5 | 4ca775842cdb69770a51ba18502d126c |
| BLAKE2b-256 | 59d591b16fa080d8b201157bb901d4e531ed6f3ee2436190ae1fdc0008c1601d |
File details
Details for the file plastro-0.1.3-py3-none-any.whl.
File metadata
- Download URL: plastro-0.1.3-py3-none-any.whl
- Upload date:
- Size: 35.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1ace0e61b39afb99c34802d86cab2cf75de4d876ed3af222ed485ea83f3858aa |
| MD5 | da9a863ad8df8ad2bd6d3254e58decdf |
| BLAKE2b-256 | 742d6a1c339bf5e60365b0d6e52e62b8b88b59fc990fa0cf1548188eef4d577b |