A Python package for simulating cellular plasticity in single-cell data

PLASTRO

PLASTRO is a Python package for simulating and analyzing cellular plasticity in single-cell data. It provides comprehensive tools for studying how cells transition between different phenotypic states and how these transitions relate to lineage relationships.

Key Features

  • Plasticity Simulation: Random walk plasticity and cluster-based transitions
  • Lineage Tracing Integration: CRISPR-based lineage tracing simulation with Cassiopeia
  • PLASTRO Score: Novel overlap-based metrics for quantifying cellular plasticity
  • Phylogenetic Analysis: Neighbor-joining tree construction from single-cell data
  • Data Simulation: Generate realistic synthetic datasets with branching differentiation
  • High Performance: Optimized overlap computation (10-100x speedup over naive methods)

Installation

Quick Install

pip install plastro

Conda Install (coming soon)

conda install -c conda-forge plastro

Development Install

git clone https://github.com/dpeerlab/PLASTRO.git
cd PLASTRO
pip install -e .
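
To confirm which version ended up in the active environment, a small check based on the standard library's importlib.metadata can help. This is a sketch that assumes the distribution name is `plastro`, as in the pip command above; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "plastro" is the distribution name used in the pip command above.
    version = installed_version("plastro")
    print(f"plastro: {version or 'not installed'}")
```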

Quick Start

Basic PLASTRO Score Computation

import plastro
import pandas as pd

# Load your single-cell data and lineage tracing data
character_matrix = pd.read_csv('character_matrix.csv', index_col=0)
adata = plastro.load_data('single_cell_data.h5ad')

# Compute Gini-based plasticity scores (recommended)
gini_scores = plastro.PLASTRO_score(
    character_matrix=character_matrix,
    ad=adata,
    flavor='gini',
    latent_space_key='X_dc'  # or 'X_pca', 'X_umap'
)

print(f"Mean Gini plasticity score: {gini_scores['Gini_Index'].mean():.3f}")
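
For readers unfamiliar with the Gini index, the sketch below computes the textbook Gini coefficient of a list of non-negative values in plain Python. It is only meant to show what kind of quantity a Gini-based score summarizes; how PLASTRO derives the values it feeds into the index is not described here, and the function name `gini_index` is an illustrative assumption, not part of the package.

```python
from typing import Sequence


def gini_index(values: Sequence[float]) -> float:
    """Textbook Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula applied to the values sorted in ascending order.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1) / n


even = gini_index([1, 1, 1, 1])          # all values equal
concentrated = gini_index([0, 0, 0, 1])  # everything in a single entry
print(even, concentrated)
```

For n values the coefficient ranges from 0 (all equal) to (n - 1) / n (all mass in one entry).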

Generate Synthetic Data with Plasticity

# Create synthetic single-cell dataset
n_leaves = 8
sample_res = 50  
n_dim = 20

# Generate branching structure
sample_structure = plastro.create_random_binary_tree(n_leaves, sample_res)
full_simulated_ad = plastro.generate_ad(sample_structure, n_dim)

# Subset to terminal branches
ad = plastro.subset_to_terminal_branches(full_simulated_ad)

# Simulate lineage tracing
cass_tree = plastro.simulate_lineage_tracing(
    sim_ad=full_simulated_ad, 
    terminal_ad=ad,
    latent_space_key='X_dc'
)
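
To build intuition for what a CRISPR lineage-tracing character matrix looks like, here is a toy simulation in plain Python. It is not Cassiopeia's or PLASTRO's simulator: the complete binary tree, the per-generation mutation rate, and all names (`simulate_character_matrix`, `n_sites`, `n_states`) are assumptions chosen for illustration. State 0 means "unedited"; once a site is edited it stays fixed and is inherited by all descendants.

```python
import random
from typing import Dict, List


def simulate_character_matrix(
    depth: int = 3,
    n_sites: int = 10,
    n_states: int = 5,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Dict[str, List[int]]:
    """Return {leaf name: barcode} for a complete binary tree of the given depth."""
    rng = random.Random(seed)
    generation = {"root": [0] * n_sites}
    for _ in range(depth):
        next_generation = {}
        for name, barcode in generation.items():
            for side in "LR":
                child = list(barcode)  # inherit the parent's edits
                for site, state in enumerate(child):
                    # Only unedited sites can acquire a new, irreversible edit.
                    if state == 0 and rng.random() < mutation_rate:
                        child[site] = rng.randint(1, n_states)
                next_generation[name + side] = child
        generation = next_generation
    return generation


leaves = simulate_character_matrix()
for name, barcode in leaves.items():
    print(name, barcode)
```

Closely related leaves share more edits than distant ones, which is the signal lineage reconstruction relies on.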

Simulate Cellular Plasticity

# Random walk plasticity
plastic_cells = {'6': 0.3, '5': 0.2}  # 30% of cluster 6, 20% of cluster 5
walk_lengths = {'6': 500, '5': 1000}
plastic_walk_ad = plastro.random_walk_plasticity(
    full_simulated_ad, ad, plastic_cells, walk_lengths
)

# Cluster switch plasticity  
destination_clusters = {
    '11': {'destination': '7', 'proportion': 0.4},
    '6': {'destination': '10', 'proportion': 0.2}
}
plastic_leiden_ad = plastro.cluster_switch_plasticity(
    full_simulated_ad, ad, destination_clusters, column='leiden'
)
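
The random-walk idea can be illustrated without the package: pick a fraction of cells per cluster, then move each picked cell a fixed number of steps along a neighbor graph. The sketch below is a toy version under stated assumptions (an adjacency-list graph, uniform neighbor choice, hypothetical helper names `pick_plastic_cells` and `random_walk`); it does not reproduce how plastro.random_walk_plasticity is implemented.

```python
import random
from typing import Dict, Hashable, List


def pick_plastic_cells(
    clusters: Dict[str, List[str]], proportions: Dict[str, float], seed: int = 0
) -> Dict[str, List[str]]:
    """Sample the requested fraction of cells from each listed cluster."""
    rng = random.Random(seed)
    chosen = {}
    for cluster, fraction in proportions.items():
        cells = clusters[cluster]
        chosen[cluster] = rng.sample(cells, round(fraction * len(cells)))
    return chosen


def random_walk(
    graph: Dict[Hashable, List[Hashable]], start: Hashable, n_steps: int, seed: int = 0
) -> Hashable:
    """Take n_steps uniform random steps along the graph and return the end node."""
    rng = random.Random(seed)
    node = start
    for _ in range(n_steps):
        neighbors = graph[node]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        node = rng.choice(neighbors)
    return node


# Toy phenotype graph: a path of five states, 0 - 1 - 2 - 3 - 4.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
clusters = {"6": [f"cell{i}" for i in range(10)]}
plastic = pick_plastic_cells(clusters, {"6": 0.3})
print(plastic, random_walk(graph, start=0, n_steps=500))
```

Longer walks let a cell drift further from its starting phenotype, which is why the walk length is configured per cluster in the example above.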

Documentation & Examples

Example Notebooks

Complete example workflows are available in docs/notebooks/:

  1. Plasticity Simulation Example:

    • Generate synthetic single-cell data with branching differentiation
    • Simulate CRISPR-based lineage tracing
    • Apply random walk and cluster switch plasticity
    • Visualize phenotypic changes
  2. PLASTRO Overlap Analysis:

    • In-depth analysis of lineage-phenotype relationships
    • Detailed explanation of overlap computation methods
    • Interpretation of PLASTRO scores
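
As a rough illustration of what an overlap between lineage and phenotype neighborhoods can mean, the sketch below compares a cell's k nearest neighbors under two distance tables. This is a simplified stand-in, not the computation performed by PLASTRO: the fixed-k neighborhoods, the dict-of-dicts distance format, and the names `symmetric`, `nearest_neighbors`, and `neighborhood_overlap` are all assumptions made for this example.

```python
from typing import Dict, Set, Tuple

Distances = Dict[str, Dict[str, float]]


def symmetric(pairs: Dict[Tuple[str, str], float]) -> Distances:
    """Expand {(a, b): d} into a symmetric dict-of-dicts distance table."""
    dist: Distances = {}
    for (a, b), d in pairs.items():
        dist.setdefault(a, {})[b] = d
        dist.setdefault(b, {})[a] = d
    return dist


def nearest_neighbors(dist: Distances, cell: str, k: int) -> Set[str]:
    """The k closest other cells; ties are broken alphabetically."""
    others = [c for c in dist[cell] if c != cell]
    others.sort(key=lambda c: (dist[cell][c], c))
    return set(others[:k])


def neighborhood_overlap(lineage: Distances, phenotype: Distances, cell: str, k: int) -> float:
    """Fraction of the k lineage neighbors that are also phenotype neighbors."""
    shared = nearest_neighbors(lineage, cell, k) & nearest_neighbors(phenotype, cell, k)
    return len(shared) / k


lineage = symmetric({("A", "B"): 1, ("A", "C"): 4, ("A", "D"): 4,
                     ("B", "C"): 4, ("B", "D"): 4, ("C", "D"): 1})
# Phenotype agrees with lineage (A stays next to its sister B) ...
concordant = lineage
# ... or disagrees with it (A now resembles the unrelated cell C).
switched = symmetric({("A", "B"): 4, ("A", "C"): 1, ("A", "D"): 4,
                      ("B", "C"): 4, ("B", "D"): 1, ("C", "D"): 4})

print(neighborhood_overlap(lineage, concordant, "A", k=1))
print(neighborhood_overlap(lineage, switched, "A", k=1))
```

In this toy setting, a cell whose phenotypic neighbors are no longer its lineage relatives gets a low overlap.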

Core API

Main Functions

# PLASTRO Score Computation
plastro.PLASTRO_score(character_matrix, ad, flavor='gini')
plastro.PLASTRO_overlaps(character_matrix, ad, maximum_radius=500)

# Plasticity Simulation
plastro.random_walk_plasticity(full_ad, subset_ad, plastic_cells, walk_lengths)
plastro.cluster_switch_plasticity(full_ad, subset_ad, destination_clusters)

# Data Generation
plastro.create_random_binary_tree(n_leaves, sample_res)
plastro.generate_ad(sample_structure, n_dim)
plastro.simulate_lineage_tracing(sim_ad, terminal_ad)

# Distance Calculations  
plastro.euclidean_distance(coordinates)
plastro.cosine_distance(coordinates)
plastro.manhattan_distance(coordinates)
plastro.archetype_distance(data, archetypes)

# Phylogenetic Analysis
plastro.neighbor_joining(distance_matrix, outgroup=None)
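
The three standard distances named above have simple definitions, sketched here in plain Python for reference. These are the generic textbook formulas; the helper names and the list-of-lists input are assumptions for this example, and the package's own functions may differ in input format, output type, and edge-case handling.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u: Vector, v: Vector) -> float:
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine(u: Vector, v: Vector) -> float:
    """Cosine distance, defined as 1 minus the cosine similarity."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def pairwise(coordinates: Sequence[Vector], metric: Callable[[Vector, Vector], float]) -> List[List[float]]:
    """Full cells-by-cells distance matrix for the chosen metric."""
    return [[metric(u, v) for v in coordinates] for u in coordinates]


coords = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
for row in pairwise(coords, euclidean):
    print(row)
```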

Core Modules

  • plastro.overlap: PLASTRO score computation and overlap analysis
  • plastro.plasticity: Cellular plasticity simulation methods
  • plastro.lineage_simulation: CRISPR-based lineage tracing simulation
  • plastro.phenotype_simulation: Synthetic single-cell data generation
  • plastro.distances: Phenotypic distance calculations
  • plastro.phylo: Phylogenetic tree construction and analysis
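
For context on the phylogenetic module, the sketch below performs the first joining step of the standard neighbor-joining algorithm on a small distance matrix: compute the Q criterion for every pair, join the pair with the smallest value, and derive the two branch lengths to the new internal node. It is a generic illustration with a hypothetical function name (`nj_first_join`), not the code behind plastro.neighbor_joining.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def nj_first_join(names: Sequence[str], d: List[List[float]]) -> Tuple[str, str, float, float]:
    """Return (taxon_i, taxon_j, branch_i, branch_j) for the first NJ join.

    Requires at least three taxa and a symmetric matrix with a zero diagonal.
    """
    n = len(names)
    row_sums = [sum(row) for row in d]
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        # Q criterion: small values mark pairs that are close to each other
        # and far from everything else.
        q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (i, j), q
    i, j = best_pair
    branch_i = 0.5 * d[i][j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    branch_j = d[i][j] - branch_i
    return names[i], names[j], branch_i, branch_j


names = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_first_join(names, distances))  # → ('a', 'b', 2.0, 3.0)
```

A full implementation would replace the joined pair with the new node, update the distances, and repeat until the tree is resolved.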

Complete Example Workflow

import plastro
import pandas as pd

# 1. Generate synthetic data (or load your own)
sample_structure = plastro.create_random_binary_tree(n_leaves=8, sample_res=50)
full_simulated_ad = plastro.generate_ad(sample_structure, n_dim=20)
ad = plastro.subset_to_terminal_branches(full_simulated_ad)

# 2. Simulate lineage tracing
cass_tree = plastro.simulate_lineage_tracing(
    sim_ad=full_simulated_ad,
    terminal_ad=ad,
    latent_space_key='X_dc'
)
character_matrix = cass_tree.character_matrix

# 3. Simulate plasticity
plastic_cells = {'6': 0.3}  # 30% of cluster 6, as in the Quick Start
walk_lengths = {'6': 500}
plastic_ad = plastro.random_walk_plasticity(
    full_simulated_ad, ad, plastic_cells, walk_lengths
)

# 4. Compute PLASTRO scores
original_scores = plastro.PLASTRO_score(
    character_matrix, ad, flavor='gini'
)
plastic_scores = plastro.PLASTRO_score(
    character_matrix, plastic_ad, flavor='gini'
)

# 5. Compare plasticity
print(f"Original mean Gini score: {original_scores['Gini_Index'].mean():.3f}")
print(f"Plastic mean Gini score: {plastic_scores['Gini_Index'].mean():.3f}")

Dependencies

Core requirements:

  • Python ≥ 3.8
  • NumPy ≥ 1.20.0
  • Pandas ≥ 1.3.0
  • SciPy ≥ 1.7.0
  • scikit-learn ≥ 1.0.0
  • NetworkX ≥ 2.6.0
  • matplotlib ≥ 3.4.0
  • scanpy ≥ 1.8.0
  • tqdm ≥ 4.60.0

Optional dependencies:

  • cassiopeia-lineage (for lineage tracing simulation)
  • ete3 (for phylogenetic tree manipulation)
  • scikit-bio (for robust neighbor-joining trees)
  • numba (for additional performance optimization)
  • faiss-cpu (for approximate nearest neighbors on large datasets)
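
A quick way to see which optional dependencies are importable in the current environment is to probe for their modules with importlib. The import names below (`cassiopeia`, `skbio`, `faiss`) are the usual module names for those distributions, stated here as an assumption, and the helper name is made up for illustration.

```python
from importlib.util import find_spec
from typing import Dict

# Distribution name -> top-level module name (assumed usual import names).
OPTIONAL_MODULES = {
    "cassiopeia-lineage": "cassiopeia",
    "ete3": "ete3",
    "scikit-bio": "skbio",
    "numba": "numba",
    "faiss-cpu": "faiss",
}


def available_optional_dependencies(modules: Dict[str, str] = OPTIONAL_MODULES) -> Dict[str, bool]:
    """Report which modules can be found, without importing them."""
    return {package: find_spec(module) is not None for package, module in modules.items()}


for package, present in available_optional_dependencies().items():
    print(f"{package}: {'available' if present else 'missing'}")
```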

Data Requirements

PLASTRO works with:

  • AnnData objects containing single-cell data with dimensionality reduction coordinates
  • Character matrices (pandas DataFrame) from CRISPR lineage tracing with cells as rows
  • Distance matrices for lineage and phenotypic relationships
  • Cluster annotations (leiden, louvain, etc.) for cluster-based plasticity
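
Since the character matrix and the AnnData object describe the same cells, it can be worth checking that their cell identifiers line up before scoring. The helper below is a sketch that works on any two collections of identifiers; the function name and the assumption that identifiers must match exactly are illustrative, not documented package behavior.

```python
from typing import Iterable, List, Tuple


def shared_cells(
    character_matrix_ids: Iterable[str], adata_obs_names: Iterable[str]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (shared, missing_from_phenotype, missing_from_lineage)."""
    lineage = list(character_matrix_ids)
    phenotype = set(adata_obs_names)
    shared = [cell for cell in lineage if cell in phenotype]
    missing_from_phenotype = [cell for cell in lineage if cell not in phenotype]
    missing_from_lineage = sorted(phenotype - set(lineage))
    return shared, missing_from_phenotype, missing_from_lineage


shared, no_phenotype, no_lineage = shared_cells(
    ["cell1", "cell2", "cell3"], ["cell2", "cell3", "cell4"]
)
print(shared, no_phenotype, no_lineage)
```

With real data, the two arguments would typically be `character_matrix.index` and `adata.obs_names`.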

License

This project is licensed under the MIT License - see the LICENSE file for details.

PLASTRO - Comprehensive analysis of cellular plasticity in single-cell data
