Analyze, visualize, and interpret root traits output from sleap-roots.

SLEAP Roots Analyze

Python: 3.11+ | License: GPL v3 | Tests: 1900+

Statistical analysis tools for root trait data from SLEAP Roots.

Installation

pip install sleap-roots-analyze

Or with uv:

uv add sleap-roots-analyze

Optional: input-contract validation

To validate analysis input against the sleap-roots-contracts schema at the data-load boundary, install the optional contracts extra:

pip install "sleap-roots-analyze[contracts]"

This enables the data.validate_input: off | warn | strict config flag for the QC pipeline, and the matching validate_input flag on CrossPlatformConfig for the cross-platform pipeline. When the extra is not installed, validation degrades to a logged no-op (never an error).
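
The three modes and the "logged no-op" fallback can be pictured with a minimal sketch. It is not the package's implementation: validate_at_load, the validator callable, and the probed module name sleap_roots_contracts are hypothetical names chosen for illustration. Only the off / warn / strict modes and the no-error fallback come from the description above.

```python
import importlib.util
import logging

logger = logging.getLogger("validate_input_sketch")


def validate_at_load(rows, validator, mode="warn", extra_module="sleap_roots_contracts"):
    """Run an optional validator and return the list of problems it found.

    `validator` is any callable mapping rows to a list of problem strings.
    `extra_module` is an assumed import name for the optional extra.
    """
    if mode not in {"off", "warn", "strict"}:
        raise ValueError(f"unknown validate_input mode: {mode!r}")
    if mode == "off":
        return []
    # Optional dependency missing: log and continue, never raise.
    if importlib.util.find_spec(extra_module) is None:
        logger.info("%s is not installed; skipping input validation", extra_module)
        return []
    problems = list(validator(rows))
    if problems and mode == "strict":
        raise ValueError("; ".join(problems))
    for problem in problems:
        logger.warning("input validation: %s", problem)
    return problems
```

In this sketch, warn reports problems and lets the load continue, while strict stops at the first load that has any.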

For development:

git clone https://github.com/talmolab/sleap-roots-analyze.git
cd sleap-roots-analyze
uv sync --group dev

Quick Start

New Analysis? Start here.

Use the interactive /configure-run-all slash command (in Claude Code) to create a complete, scientifically validated set of configs for a new analysis:

/configure-run-all

It inspects your CSV, walks you through every parameter with statistical guardrails, writes the QC, visualization, and run-manifest configs, and commits them to git as a reproducibility anchor.

Optionally validate a specific config before running:

/validate-config configs/active/qc/<your_analysis>.yaml

Then run:

/run-pipelines --manifest configs/active/run_manifest_<your_analysis>.yaml

See configs/templates/README.md for manual config authoring.

Command-Line Interface

The package provides a CLI for running QC and visualization pipelines:

# Run QC pipeline
sleap-roots-analyze qc configs/qc_turface_150genotypes.yaml

# Run with custom output directory
sleap-roots-analyze qc configs/qc_turface_150genotypes.yaml -o ./my_results

# Validate configuration
sleap-roots-analyze config validate configs/qc_turface_150genotypes.yaml

# List example configs
sleap-roots-analyze config list

# Run all pipelines from a manifest
sleap-roots-analyze run-all configs/active/run_manifest.yaml

# Get help
sleap-roots-analyze --help
sleap-roots-analyze qc --help

See docs/QC_PIPELINE_GUIDE.md for a complete guide to using the QC pipeline.

Python API

Load and Clean Data

from sleap_roots_analyze.data_cleanup import (
    load_trait_data,
    get_trait_columns,
    remove_nan_samples,
)

# Load data
df = load_trait_data("path/to/traits.csv")

# Get trait columns (excludes metadata automatically)
trait_cols = get_trait_columns(df)

# Remove samples with >20% missing data
df_clean, df_removed, stats = remove_nan_samples(
    df, trait_cols, max_nan_fraction=0.2
)
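
To make the max_nan_fraction rule concrete, here is a dependency-free sketch of the filtering idea on plain dictionaries. It is an illustration only: split_by_nan_fraction is a hypothetical helper, and treating a sample at exactly the threshold as kept is an assumption based on the ">20%" wording above, not confirmed package behaviour.

```python
import math


def _is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def split_by_nan_fraction(rows, trait_cols, max_nan_fraction=0.2):
    """Split rows into (kept, removed) by the fraction of missing trait values."""
    if not trait_cols:
        raise ValueError("trait_cols must not be empty")
    kept, removed = [], []
    for row in rows:
        n_missing = sum(1 for col in trait_cols if _is_missing(row.get(col)))
        fraction = n_missing / len(trait_cols)
        # Strictly greater than the threshold means the sample is dropped.
        (removed if fraction > max_nan_fraction else kept).append(row)
    return kept, removed
```

With five traits, a sample missing one value (20%) stays and a sample missing two (40%) is removed.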

Calculate Heritability

from sleap_roots_analyze.statistics import calculate_heritability_estimates

# Calculate heritability for all traits
h2_results = calculate_heritability_estimates(
    df_clean,
    trait_cols,
    genotype_col="geno",
    replicate_col="rep"
)

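# Filter low-heritability traits. With remove_low_h2=True the call
# returns four values instead of one.
h2_results, df_filtered, removed, details = calculate_heritability_estimates(
    df_clean,
    trait_cols,
    genotype_col="geno",
    replicate_col="rep",
    remove_low_h2=True,
    h2_threshold=0.3
)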
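
For readers new to the quantity, the sketch below shows one textbook way to estimate broad-sense heritability from a balanced one-way ANOVA: the genetic variance is (MS_genotype - MS_error) / r for r replicates, and H² on a single-plot basis is that variance divided by itself plus MS_error. This is an assumption about the general technique, not a description of what calculate_heritability_estimates does internally; the package may use a different estimator. The helper name broad_sense_h2 is hypothetical.

```python
from statistics import fmean


def broad_sense_h2(values_by_genotype):
    """Plot-basis H² from a balanced one-way ANOVA.

    `values_by_genotype` maps each genotype to a list of replicate values.
    """
    groups = [list(v) for v in values_by_genotype.values()]
    g = len(groups)
    r = len(groups[0]) if groups else 0
    if g < 2 or r < 2 or any(len(v) != r for v in groups):
        raise ValueError("need at least 2 genotypes, each with the same number (>= 2) of replicates")
    grand_mean = fmean([x for v in groups for x in v])
    means = [fmean(v) for v in groups]
    # Mean squares between genotypes and within genotypes.
    ms_genotype = r * sum((m - grand_mean) ** 2 for m in means) / (g - 1)
    ms_error = sum((x - m) ** 2 for v, m in zip(groups, means) for x in v) / (g * (r - 1))
    # Negative variance estimates are clipped to zero.
    var_g = max((ms_genotype - ms_error) / r, 0.0)
    total = var_g + ms_error
    return var_g / total if total > 0 else 0.0
```

A trait whose H² falls below a cutoff such as 0.3 varies mostly within genotypes rather than between them.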

PCA Analysis

from sleap_roots_analyze.pca import perform_pca_analysis

# Run PCA with automatic component selection
pca_result = perform_pca_analysis(
    df_filtered,
    standardize=True,
    explained_variance_threshold=0.95
)

# Access results (later examples reuse pca_result)
pca_model = pca_result['pca']
transformed_data = pca_result['transformed_data']
loadings = pca_result['loadings']
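
As a rough picture of what a threshold like explained_variance_threshold=0.95 means, the sketch below picks the smallest number of components whose cumulative explained-variance ratio reaches it. This is the usual reading of such a parameter and an assumption about the package's behaviour; n_components_for_threshold is a hypothetical helper.

```python
from itertools import accumulate


def n_components_for_threshold(explained_variance_ratio, threshold=0.95):
    """Smallest k such that the first k ratios sum to at least `threshold`."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    cumulative = accumulate(explained_variance_ratio)
    for k, total in enumerate(cumulative, start=1):
        # Small tolerance so floating-point rounding does not add a component.
        if total >= threshold - 1e-12:
            return k
    # Threshold never reached: keep every component.
    return len(explained_variance_ratio)
```

For ratios of 0.6, 0.25, 0.1, and 0.05, a threshold of 0.8 keeps two components and a threshold of 0.9 keeps three.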

Outlier Detection

from sleap_roots_analyze.outlier_detection import (
    detect_outliers_mahalanobis,
    detect_outliers_isolation_forest,
    remove_outliers_from_data
)

# Detect outliers using Mahalanobis distance
outliers_maha = detect_outliers_mahalanobis(
    df_filtered[trait_cols],
    use_robust=True
)

# Or use Isolation Forest for complex patterns
outliers_iso = detect_outliers_isolation_forest(
    df_filtered[trait_cols],
    contamination=0.1
)

# Remove outliers from data
df_clean, df_outliers = remove_outliers_from_data(
    df_filtered,
    outliers_maha['outlier_indices'],
    return_outliers=True
)
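
The idea behind Mahalanobis-based flagging can be sketched for the two-trait case, where the covariance inverse and the chi-square cutoff both have closed forms. This sketch uses the classical sample covariance, not a robust estimate, so it is not equivalent to use_robust=True; the helper names and the 0.975 quantile are assumptions made for illustration.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Closed-form inverse of the 2x2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers_2d(points, quantile=0.975):
    """Indices whose squared distance exceeds the chi-square(2) quantile."""
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p).
    cutoff = -2.0 * math.log(1.0 - quantile)
    return [i for i, d in enumerate(mahalanobis_sq_2d(points)) if d > cutoff]
```

Because a single extreme point inflates the classical covariance, its distance is capped at (n - 1)² / n. With few samples it can fall under the cutoff, which is one reason robust covariance estimates are used in practice.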

Visualization

from sleap_roots_analyze.visualization import (
    create_heritability_plot,
    create_pca_biplot,
    create_feature_contribution_heatmap,
    create_phenotype_variation_plot,
    save_publication_figure
)

# Create heritability plot
fig = create_heritability_plot(h2_results, threshold=0.3)

# Create PCA biplot
fig_biplot = create_pca_biplot(
    pca_result,
    color_by="geno",
    metadata_df=df_filtered[["Barcode", "geno"]]
)

# Create feature contribution heatmap
fig_heatmap = create_feature_contribution_heatmap(
    pca_result['feature_contributions'],
    n_components=5
)

# Save in publication format
save_publication_figure(fig, "heritability", formats=["pdf", "png"])

Comprehensive PCA Analysis with Export

from sleap_roots_analyze.pca import run_pca_and_export_artifacts

# Run comprehensive PCA analysis with CSV exports
results = run_pca_and_export_artifacts(
    df_filtered,
    trait_cols=trait_cols,
    analysis_dir="pca_results",
    n_components=10,
    save_csv=True,
    save_prefix="experiment1_"
)

# Access results DataFrames
loadings_df = results['loadings_df']
pc_scores_df = results['pc_scores_df']
variance_df = results['variance_explained_df']
contributions_df = results['trait_variance_contributions_df']

Interactive Visualization

from sleap_roots_analyze.interactive_visualization import (
    create_interactive_pca_with_images,
    create_interactive_umap_with_hover_highlight,
    create_trait_explorer_dashboard,
    create_interactive_image_gallery
)

# Create interactive PCA with sample images
fig = create_interactive_pca_with_images(
    pca_result,
    image_paths,  # Dict mapping sample IDs to image paths
    show_images=True,
    metadata_df=df_filtered[["Barcode", "geno"]]
)

# Interactive UMAP with hover highlights
# (umap_result is assumed to come from an earlier UMAP run; see umap.py)
fig_umap = create_interactive_umap_with_hover_highlight(
    umap_result,
    highlight_on_hover=True,
    size=8
)

# Create comprehensive trait explorer dashboard
dashboard = create_trait_explorer_dashboard(
    df_filtered,
    trait_cols,
    groupby_col="geno"
)

# Generate interactive HTML gallery with images
html = create_interactive_image_gallery(
    image_paths,
    metadata_df=df_filtered[["Barcode", "geno", "trait1"]],
    images_per_row=4,
    image_width=200
)

Features

  • Data Cleaning: Automatic metadata detection, NaN handling, zero-inflated trait removal
  • Statistical Analysis: Broad-sense heritability (H²), ANOVA, trait statistics
  • PCA Analysis: Dimensionality reduction with automatic component selection, comprehensive export artifacts
  • Outlier Detection: Mahalanobis, PCA reconstruction, and Isolation Forest methods
  • Visualization: Publication-ready plots for heritability, PCA, outliers, and phenotype variation
  • Interactive Visualization: Plotly-based interactive plots with image integration and hover effects
  • UMAP Analysis: Non-linear dimensionality reduction for complex trait relationships
  • Cross-Experiment Analysis: Compare and correlate data across multiple experiments

Data Format

Expected CSV structure:

Barcode,geno,rep,trait1,trait2,trait3,...
BC001,Genotype1,1,100.5,200.3,50.2,...
BC002,Genotype1,2,102.3,195.8,48.9,...

Required columns:

  • Genotype: geno (configurable)
  • Replicate: rep (configurable)
  • Sample ID: Barcode (configurable)
  • Traits: Any numeric columns
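
Before running an analysis it can help to check a file against this layout. The sketch below is a standalone illustration using only the standard library. check_trait_csv is a hypothetical helper, not part of the package, and its rule for what counts as a numeric trait column (every non-empty cell parses as a float) is an assumption.

```python
import csv
import io


def _is_number(text):
    """True if the cell parses as a float."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def check_trait_csv(text, sample_id_col="Barcode", genotype_col="geno", replicate_col="rep"):
    """Return (missing_required_columns, numeric_trait_columns) for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    rows = list(reader)
    required = [sample_id_col, genotype_col, replicate_col]
    missing = [col for col in required if col not in header]
    numeric = []
    for col in header:
        if col in required:
            continue
        # Empty cells count as missing values, not as non-numeric data.
        cells = [row[col] for row in rows if row[col] not in ("", None)]
        if cells and all(_is_number(cell) for cell in cells):
            numeric.append(col)
    return missing, numeric
```

An empty first list and a non-empty second list mean the file has the required metadata columns and at least one usable trait.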

Development

# Run tests
uv run pytest

# Format code
uv run black src tests

# Lint code
uv run ruff check src tests

# Coverage report
uv run pytest --cov --cov-branch

Project Structure

sleap-roots-analyze/
├── src/sleap_roots_analyze/
│   ├── cli.py                        # Command-line interface
│   ├── data_cleanup.py               # Data loading and cleaning
│   ├── statistics.py                 # Statistical analysis
│   ├── pca.py                        # PCA analysis
│   ├── outlier_detection.py          # Outlier detection
│   ├── visualization.py              # Plotting and visualization
│   ├── outlier_visualization.py      # Outlier-specific plots
│   ├── interactive_visualization.py  # Interactive Plotly visualizations
│   ├── cross_experiment_analysis.py  # Cross-experiment comparisons
│   ├── depth_profile_plots.py        # Depth profile visualizations
│   ├── pipeline_runner.py            # Pipeline orchestration (run-all)
│   ├── umap.py                       # UMAP dimensionality reduction
│   ├── data_utils.py                 # Utility functions
│   └── pipeline/                     # QC/Viz pipeline steps
├── configs/                          # Pipeline configurations
│   ├── active/                       # Active run manifests
│   └── examples/                     # Example configs for different use cases
├── tests/                            # Test suite (1900+ tests)
├── docs/                             # Documentation
└── pyproject.toml                    # Project configuration

License

GNU General Public License v3.0 - see LICENSE file.

Citation

@software{sleap_roots_analyze,
  title = {SLEAP Roots Analyze},
  author = {Elizabeth Berrigan},
  year = {2026},
  url = {https://github.com/talmolab/sleap-roots-analyze}
}
