Analyze, visualize, and interpret root trait outputs from sleap-roots.
SLEAP Roots Analyze
Statistical analysis tools for root trait data from SLEAP Roots.
Installation
pip install sleap-roots-analyze
Or with uv:
uv add sleap-roots-analyze
For development:
git clone https://github.com/talmolab/sleap-roots-analyze.git
cd sleap-roots-analyze
uv sync --group dev
Quick Start
New Analysis? Start here.
Use the interactive /configure-run-all slash command (in Claude Code) to create a complete,
scientifically validated set of configs for a new analysis:
/configure-run-all
It inspects your CSV, walks you through every parameter with statistical guardrails, writes QC + Viz + run manifest configs, and commits them to git as a reproducibility anchor.
Optionally validate a specific config before running:
/validate-config configs/active/qc/<your_analysis>.yaml
Then run:
/run-pipelines --manifest configs/active/run_manifest_<your_analysis>.yaml
See configs/templates/README.md for manual config authoring.
Command-Line Interface
The package provides a CLI for running QC and visualization pipelines:
# Run QC pipeline
sleap-roots-analyze qc configs/qc_turface_150genotypes.yaml
# Run with custom output directory
sleap-roots-analyze qc configs/qc_turface_150genotypes.yaml -o ./my_results
# Validate configuration
sleap-roots-analyze config validate configs/qc_turface_150genotypes.yaml
# List example configs
sleap-roots-analyze config list
# Run all pipelines from a manifest
sleap-roots-analyze run-all configs/active/run_manifest.yaml
# Get help
sleap-roots-analyze --help
sleap-roots-analyze qc --help
See docs/QC_PIPELINE_GUIDE.md for a complete guide to using the QC pipeline.
Python API
Load and Clean Data
from sleap_roots_analyze.data_cleanup import (
    load_trait_data,
    get_trait_columns,
    remove_nan_samples,
)
# Load data
df = load_trait_data("path/to/traits.csv")
# Get trait columns (excludes metadata automatically)
trait_cols = get_trait_columns(df)
# Remove samples with >20% missing data
df_clean, df_removed, stats = remove_nan_samples(
    df, trait_cols, max_nan_fraction=0.2
)
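To make the max_nan_fraction rule concrete, here is a minimal standard-library sketch of the same idea: a sample is dropped when the share of its trait values that are missing exceeds the threshold. The helper names nan_fraction and split_by_nan_fraction are hypothetical and are not part of sleap-roots-analyze; the strict ">" comparison is an assumption based on the ">20%" wording above.

```python
import math


def nan_fraction(row, trait_cols):
    """Fraction of trait values in one sample that are missing (None or NaN)."""
    missing = 0
    for col in trait_cols:
        value = row.get(col)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            missing += 1
    return missing / len(trait_cols)


def split_by_nan_fraction(rows, trait_cols, max_nan_fraction=0.2):
    """Split samples into (kept, removed) using a strict '>' threshold (assumed)."""
    kept, removed = [], []
    for row in rows:
        if nan_fraction(row, trait_cols) > max_nan_fraction:
            removed.append(row)
        else:
            kept.append(row)
    return kept, removed


nan = float("nan")
traits = ["t1", "t2", "t3", "t4", "t5"]
rows = [
    {"Barcode": "BC001", "t1": 1.0, "t2": 2.0, "t3": 3.0, "t4": 4.0, "t5": 5.0},
    {"Barcode": "BC002", "t1": nan, "t2": 2.0, "t3": 3.0, "t4": 4.0, "t5": 5.0},
    {"Barcode": "BC003", "t1": nan, "t2": nan, "t3": 3.0, "t4": 4.0, "t5": 5.0},
]
kept, removed = split_by_nan_fraction(rows, traits, max_nan_fraction=0.2)
kept_ids = [r["Barcode"] for r in kept]
removed_ids = [r["Barcode"] for r in removed]
```

In this toy input, BC002 is missing exactly 20% of its traits and is kept, while BC003 is missing 40% and is removed.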
Calculate Heritability
from sleap_roots_analyze.statistics import calculate_heritability_estimates
# Calculate heritability for all traits
h2_results = calculate_heritability_estimates(
    df_clean,
    trait_cols,
    genotype_col="geno",
    replicate_col="rep"
)
# Filter low heritability traits
h2_results, df_filtered, removed, details = calculate_heritability_estimates(
    df_clean,
    trait_cols,
    remove_low_h2=True,
    h2_threshold=0.3
)
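For readers unfamiliar with broad-sense heritability, the sketch below shows one common textbook estimator for a single trait in a balanced design, based on one-way ANOVA variance components: the genotypic variance is (MS_between - MS_within) / r and H² is the genotypic share of the total. This is an illustration only. The estimator used inside calculate_heritability_estimates is not described here and may differ, and the function name broad_sense_h2 is hypothetical.

```python
from statistics import mean


def broad_sense_h2(values_by_genotype):
    """Plot-level H² from a balanced one-way ANOVA (illustrative estimator).

    values_by_genotype maps each genotype to its list of replicate values.
    """
    groups = list(values_by_genotype.values())
    g = len(groups)      # number of genotypes
    r = len(groups[0])   # replicates per genotype
    if g < 2 or r < 2 or any(len(v) != r for v in groups):
        raise ValueError("need a balanced design with >=2 genotypes and >=2 replicates")

    grand_mean = mean(x for v in groups for x in v)
    group_means = [mean(v) for v in groups]

    # Mean squares between and within genotypes
    ms_between = r * sum((m - grand_mean) ** 2 for m in group_means) / (g - 1)
    ms_within = sum(
        (x - m) ** 2 for v, m in zip(groups, group_means) for x in v
    ) / (g * (r - 1))

    var_g = max((ms_between - ms_within) / r, 0.0)  # clamp negative estimates to 0
    var_e = ms_within
    total = var_g + var_e
    return var_g / total if total > 0 else 0.0


h2_strong = broad_sense_h2({"A": [10.0, 12.0], "B": [20.0, 22.0], "C": [30.0, 32.0]})
h2_none = broad_sense_h2({"A": [1.0, 3.0], "B": [3.0, 1.0], "C": [2.0, 2.0]})
```

In the first toy dataset the genotype means differ far more than the replicates within a genotype, so H² is close to 1; in the second all genotype means are equal, so the estimate is 0. A trait like the second would fall below an h2_threshold of 0.3.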
PCA Analysis
from sleap_roots_analyze.pca import perform_pca_analysis
# Run PCA with automatic component selection
# (named pca_result because the visualization examples refer to it by that name)
pca_result = perform_pca_analysis(
    df_filtered,
    standardize=True,
    explained_variance_threshold=0.95
)
# Access results
pca_model = pca_result['pca']
transformed_data = pca_result['transformed_data']
loadings = pca_result['loadings']
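The explained_variance_threshold parameter suggests that components are kept until their cumulative explained variance reaches the threshold. A minimal sketch of that selection rule, assuming the per-component explained variance ratios are already known, might look like the following. The function name n_components_for_threshold is hypothetical, and the package's actual selection logic may differ in detail.

```python
from itertools import accumulate


def n_components_for_threshold(explained_variance_ratio, threshold=0.95):
    """Smallest number of leading components whose cumulative ratio >= threshold."""
    for k, cumulative in enumerate(accumulate(explained_variance_ratio), start=1):
        if cumulative >= threshold:
            return k
    # Threshold never reached: fall back to keeping every component
    return len(explained_variance_ratio)


# Example ratios, sorted from the largest component to the smallest
ratios = [0.60, 0.25, 0.12, 0.03]
k_95 = n_components_for_threshold(ratios, threshold=0.95)
k_80 = n_components_for_threshold(ratios, threshold=0.80)
k_50 = n_components_for_threshold(ratios, threshold=0.50)
```

With these example ratios, three components are needed to pass 0.95, two to pass 0.80, and one to pass 0.50.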
Outlier Detection
from sleap_roots_analyze.outlier_detection import (
    detect_outliers_mahalanobis,
    detect_outliers_isolation_forest,
    remove_outliers_from_data
)
# Detect outliers using Mahalanobis distance
outliers_maha = detect_outliers_mahalanobis(
    df_filtered[trait_cols],
    use_robust=True
)
# Or use Isolation Forest for complex patterns
outliers_iso = detect_outliers_isolation_forest(
    df_filtered[trait_cols],
    contamination=0.1
)
# Remove outliers from data
df_clean, df_outliers = remove_outliers_from_data(
    df_filtered,
    outliers_maha['outlier_indices'],
    return_outliers=True
)
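As background on the Mahalanobis approach, the sketch below computes squared Mahalanobis distances for two traits with the classical sample mean and covariance, then flags points above a chi-square cutoff. It is a simplified illustration, not the package's implementation: it handles only two dimensions, uses the non-robust covariance (the use_robust=True option above presumably swaps in a robust estimate), and the helper names and the 0.975 quantile are assumptions.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance entries (ddof = 1)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d using the closed-form inverse of a 2x2 matrix
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_quantile_df2(p):
    """Chi-square quantile for 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - p)


# 24 inliers on a small grid plus one distant sample at index 24
inliers = [(float(x), float(y)) for x in range(6) for y in range(4)]
points = inliers + [(50.0, 50.0)]

d2 = mahalanobis_sq_2d(points)
threshold = chi2_quantile_df2(0.975)
flagged = [i for i, d in enumerate(d2) if d > threshold]
```

Because the classical covariance is itself inflated by extreme samples, a large outlier can partly mask itself or others; this is the usual motivation for robust covariance estimates.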
Visualization
from sleap_roots_analyze.visualization import (
    create_heritability_plot,
    create_pca_biplot,
    create_feature_contribution_heatmap,
    create_phenotype_variation_plot,
    save_publication_figure
)
# Create heritability plot
fig = create_heritability_plot(h2_results, threshold=0.3)
# Create PCA biplot
# (pca_result is the dictionary returned by perform_pca_analysis; see PCA Analysis)
fig_biplot = create_pca_biplot(
    pca_result,
    color_by="geno",
    metadata_df=df_filtered[["Barcode", "geno"]]
)
# Create feature contribution heatmap
fig_heatmap = create_feature_contribution_heatmap(
    pca_result['feature_contributions'],
    n_components=5
)
# Save in publication format
save_publication_figure(fig, "heritability", formats=["pdf", "png"])
Comprehensive PCA Analysis with Export
from sleap_roots_analyze.pca import run_pca_and_export_artifacts
# Run comprehensive PCA analysis with CSV exports
results = run_pca_and_export_artifacts(
    df_filtered,
    trait_cols=trait_cols,
    analysis_dir="pca_results",
    n_components=10,
    save_csv=True,
    save_prefix="experiment1_"
)
# Access results DataFrames
loadings_df = results['loadings_df']
pc_scores_df = results['pc_scores_df']
variance_df = results['variance_explained_df']
contributions_df = results['trait_variance_contributions_df']
Interactive Visualization
from sleap_roots_analyze.interactive_visualization import (
    create_interactive_pca_with_images,
    create_interactive_umap_with_hover_highlight,
    create_trait_explorer_dashboard,
    create_interactive_image_gallery
)
# Create interactive PCA with sample images
# (pca_result is the dictionary returned by perform_pca_analysis; see PCA Analysis)
fig = create_interactive_pca_with_images(
    pca_result,
    image_paths,  # Dict mapping sample IDs to image paths
    show_images=True,
    metadata_df=df_filtered[["Barcode", "geno"]]
)
# Interactive UMAP with hover highlights
# (umap_result is assumed to come from an earlier UMAP run, not shown here)
fig_umap = create_interactive_umap_with_hover_highlight(
    umap_result,
    highlight_on_hover=True,
    size=8
)
# Create comprehensive trait explorer dashboard
dashboard = create_trait_explorer_dashboard(
    df_filtered,
    trait_cols,
    groupby_col="geno"
)
# Generate interactive HTML gallery with images
html = create_interactive_image_gallery(
    image_paths,
    metadata_df=df_filtered[["Barcode", "geno", "trait1"]],
    images_per_row=4,
    image_width=200
)
Features
- Data Cleaning: Automatic metadata detection, NaN handling, zero-inflated trait removal
- Statistical Analysis: Broad-sense heritability (H²), ANOVA, trait statistics
- PCA Analysis: Dimensionality reduction with automatic component selection, comprehensive export artifacts
- Outlier Detection: Mahalanobis, PCA reconstruction, and Isolation Forest methods
- Visualization: Publication-ready plots for heritability, PCA, outliers, and phenotype variation
- Interactive Visualization: Plotly-based interactive plots with image integration and hover effects
- UMAP Analysis: Non-linear dimensionality reduction for complex trait relationships
- Cross-Experiment Analysis: Compare and correlate data across multiple experiments
Data Format
Expected CSV structure:
Barcode,geno,rep,trait1,trait2,trait3,...
BC001,Genotype1,1,100.5,200.3,50.2,...
BC002,Genotype1,2,102.3,195.8,48.9,...
Required columns:
- Genotype: geno (configurable)
- Replicate: rep (configurable)
- Sample ID: Barcode (configurable)
- Traits: Any numeric columns
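Before running a pipeline, it can help to confirm that a CSV matches the layout above. The following standard-library sketch checks for the three default metadata columns and treats every remaining column whose non-empty values all parse as numbers as a trait. The function check_trait_csv is hypothetical and is not part of sleap-roots-analyze; the package's own metadata detection in get_trait_columns may apply different rules.

```python
import csv
import io

REQUIRED_COLUMNS = ("Barcode", "geno", "rep")  # default names from the format above


def _is_numeric(value):
    """True if the string parses as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def check_trait_csv(text, required=REQUIRED_COLUMNS):
    """Validate required columns and return the list of numeric trait columns."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = list(reader)
    trait_cols = []
    for col in header:
        if col in required:
            continue
        # Ignore empty cells; require at least one value and all values numeric
        values = [row[col] for row in rows if row[col] not in ("", None)]
        if values and all(_is_numeric(v) for v in values):
            trait_cols.append(col)
    return trait_cols


sample_csv = (
    "Barcode,geno,rep,trait1,trait2,notes\n"
    "BC001,Genotype1,1,100.5,200.3,ok\n"
    "BC002,Genotype1,2,102.3,195.8,\n"
)
detected_traits = check_trait_csv(sample_csv)
```

Here the free-text notes column is excluded because its values do not parse as numbers, leaving trait1 and trait2 as the detected traits.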
Development
# Run tests
uv run pytest
# Format code
uv run black src tests
# Lint code
uv run ruff check src tests
# Coverage report
uv run pytest --cov --cov-branch
Project Structure
sleap-roots-analyze/
├── src/sleap_roots_analyze/
│ ├── cli.py # Command-line interface
│ ├── data_cleanup.py # Data loading and cleaning
│ ├── statistics.py # Statistical analysis
│ ├── pca.py # PCA analysis
│ ├── outlier_detection.py # Outlier detection
│ ├── visualization.py # Plotting and visualization
│ ├── outlier_visualization.py # Outlier-specific plots
│ ├── interactive_visualization.py # Interactive Plotly visualizations
│ ├── cross_experiment_analysis.py # Cross-experiment comparisons
│ ├── depth_profile_plots.py # Depth profile visualizations
│ ├── pipeline_runner.py # Pipeline orchestration (run-all)
│ ├── umap.py # UMAP dimensionality reduction
│ ├── data_utils.py # Utility functions
│ └── pipeline/ # QC/Viz pipeline steps
├── configs/ # Pipeline configurations
│ ├── active/ # Active run manifests
│ └── examples/ # Example configs for different use cases
├── tests/ # Test suite (1900+ tests)
├── docs/ # Documentation
└── pyproject.toml # Project configuration
License
GNU General Public License v3.0 - see LICENSE file.
Citation
@software{sleap_roots_analyze,
title = {SLEAP Roots Analyze},
author = {Elizabeth Berrigan},
year = {2026},
url = {https://github.com/talmolab/sleap-roots-analyze}
}