GEMspa: Advanced single-particle tracking analysis pipeline with MSD, diffusion analysis, and visualization tools

GEMspa-CLI: Single-Particle Tracking Analysis (v2.0.0)

Advanced, modular single-particle tracking (SPT) and ensemble diffusion analysis for microscopy data.


Overview

gemspa-cli is a command-line interface for GEMspa, a modular single-particle tracking and diffusion analysis suite. It performs trajectory extraction, per-track MSD fitting, ensemble averaging, step-size statistics, and condition-wise comparisons from single-molecule tracking data.

New in v2.0.0

  • Condition-grouped metrics — automatically generates split CSV files per condition
  • Steps and tracks analysis — step-size vs. brightness heatmaps and colored track overlays
  • Flexible condition extraction — strips date codes and handles arbitrary dataset naming conventions
  • Unified filtering system — consistent D and α filtering across all pipeline stages
  • Enhanced CLI — new flags for advanced visualization and analysis

Features

  • Advanced group analysis with automatic metrics splitting by condition
  • TrackMate cleaning utility (--clean-trackmate)
  • Unified global filtering of D and α applied consistently across the entire pipeline
  • Step-size vs. brightness heatmaps with customizable binning and color scaling
  • Track overlays colored by step size for visual quality control
  • Cross-condition comparison plots with KS-test statistical annotations

Installation

From PyPI (recommended)

pip install GEMspa-CLI

Virtual environment (optional but recommended)

python3 -m venv ~/venvs/gemspa
source ~/venvs/gemspa/bin/activate

Windows PowerShell:

python -m venv "$env:USERPROFILE\venvs\gemspa"
& "$env:USERPROFILE\venvs\gemspa\Scripts\Activate.ps1"

Input File Formats

GEMspa-CLI accepts trajectory data in CSV format.

TrackMate export format

  • File pattern: *Spots in tracks*.csv (pass with --csv-pattern "*Spots in tracks*.csv")
  • Required columns: POSITION_X, POSITION_Y, FRAME, TRACK_ID
  • Optional columns: MEAN_INTENSITY_CH1 (for brightness analysis)

GEMspa format

  • File pattern: Traj_*.csv (default)
  • Required columns: x, y, frame, track_id
  • Optional columns: brightness (for step-size analysis)
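
The two layouts above differ mainly in column names, so a TrackMate table can be mapped onto the GEMspa names with a plain rename. The following is a minimal sketch using only the Python standard library. It is not the package's cleaner: the function name rename_trackmate_columns and the mapping of MEAN_INTENSITY_CH1 to brightness are assumptions made for illustration, and real TrackMate exports may carry extra header rows that this sketch does not handle.

```python
import csv
import io

# Assumed one-to-one mapping between the column names listed above.
TRACKMATE_TO_GEMSPA = {
    "POSITION_X": "x",
    "POSITION_Y": "y",
    "FRAME": "frame",
    "TRACK_ID": "track_id",
    "MEAN_INTENSITY_CH1": "brightness",  # optional column; this mapping is an assumption
}
REQUIRED = ("POSITION_X", "POSITION_Y", "FRAME", "TRACK_ID")


def rename_trackmate_columns(csv_text):
    """Return a list of dicts keyed by GEMspa column names; unmapped columns are dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return [
        {new: row[old] for old, new in TRACKMATE_TO_GEMSPA.items() if old in row}
        for row in reader
    ]


example = "TRACK_ID,FRAME,POSITION_X,POSITION_Y\n0,0,1.5,2.5\n0,1,1.7,2.4\n"
rows = rename_trackmate_columns(example)
print(rows[0])
```

Values stay as strings here; a real loader would convert them to numbers before analysis.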

Naming conventions

GEMspa-CLI automatically strips date codes and replicate suffixes to extract condition labels:

  • Traj_20220706_G12V_4.csv → condition: G12V
  • Traj_20220708_G12V_13.csv → condition: G12V (correctly pooled with the above)
  • Traj_20220706_HKWT_2.csv → condition: HKWT
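
The naming rule illustrated above can be sketched with a regular expression. The package's actual parsing rules are not documented here, so the pattern below is an assumption that covers only names shaped like the three examples (an optional 6- or 8-digit date code, a condition label, and a numeric replicate suffix). The helper names condition_from_filename and group_by_condition are hypothetical.

```python
import re
from collections import defaultdict

# Assumed layout: Traj_<optional YYMMDD or YYYYMMDD>_<condition>_<replicate>.csv
_TRAJ_NAME = re.compile(
    r"^Traj_(?:(?:\d{8}|\d{6})_)?(?P<condition>.+?)_(?P<replicate>\d+)\.csv$"
)


def condition_from_filename(filename):
    """Extract the condition label, dropping the date code and replicate suffix."""
    match = _TRAJ_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized trajectory filename: {filename!r}")
    return match.group("condition")


def group_by_condition(filenames):
    """Map each condition label to the list of files that belong to it."""
    groups = defaultdict(list)
    for name in filenames:
        groups[condition_from_filename(name)].append(name)
    return dict(groups)


files = [
    "Traj_20220706_G12V_4.csv",
    "Traj_20220708_G12V_13.csv",
    "Traj_20220706_HKWT_2.csv",
]
print(group_by_condition(files))
```

Files acquired on different dates end up in the same group, which is the pooling behaviour described in the examples.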

Pipeline Stages

1. Data discovery and preparation

  • Scans the working directory for trajectory CSV files
  • Optionally cleans TrackMate exports to GEMspa column format
  • Groups files by condition, removing date codes automatically

2. Per-replicate analysis

  • Loads each trajectory file and computes per-track MSD curves
  • Fits diffusion parameters (D, α, r²) for every track
  • Writes per-replicate result tables and diagnostic plots

3. Ensemble analysis

  • Pools tracks by condition (raw and filtered variants)
  • Computes ensemble-averaged MSD curves per condition
  • Applies global filter parameters consistently

4. Advanced group analysis (default: ON)

  • Computes VACF, confinement index, convex-hull area, and tortuosity
  • Splits all metrics by condition into separate CSV files
  • Generates comprehensive condition-level visualizations

5. Steps and tracks analysis (optional: --steps-tracks)

  • Produces step-size vs. brightness heatmaps
  • Generates track overlays colored by step size
  • Supports both per-file and pooled outputs

6. Cross-condition comparison

  • KS tests and boxplots comparing conditions
  • Publication-ready figures saved to comparison/
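
For readers unfamiliar with the KS test, the two-sample statistic is the largest vertical gap between the empirical cumulative distributions of two samples. The sketch below computes only that statistic with the standard library; ks_statistic is a hypothetical name, and this is not necessarily how the package computes it. A library routine such as scipy.stats.ks_2samp additionally returns a p-value.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(v) - ECDF_b(v)|."""
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for v in a_sorted + b_sorted:
        ecdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        ecdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        largest_gap = max(largest_gap, abs(ecdf_a - ecdf_b))
    return largest_gap


# Made-up D values (µm²/s) for two conditions, for illustration only.
d_condition_1 = [0.10, 0.20, 0.30, 0.40]
d_condition_2 = [0.30, 0.40, 0.50, 0.60]
print(ks_statistic(d_condition_1, d_condition_2))
```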

Workflow Overview

SMT data export  -->  gemspa-cli pipeline
        |
        +-- (optional) --clean-trackmate
        |        --> standardized Traj_*.csv
        |
        +-- Per-replicate trajectory analysis
        |        --> D_fit, alpha_fit, r2, MSD curves, rainbow overlay
        |
        +-- Ensemble pooling by condition
        |        --> grouped_raw/ and grouped_filtered/
        |
        +-- (optional) --step-size-analysis
        |        --> KDEs, non-Gaussian (alpha2) stats, KS tests
        |
        +-- (optional) --steps-tracks
        |        --> brightness_stepsize/   (heatmaps)
        |        --> tracks_stepsize_map/   (track overlays)
        |
        +-- Advanced group analysis (default: ON)
        |        --> grouped_advanced_analysis/
        |        --> split_metrics_by_condition/
        |
        +-- Cross-condition comparison
                 --> comparison/*.png

Core Analysis

1. Mean-square displacement (MSD)

For a trajectory of N frames, the time-averaged MSD at a lag of n frames is:

MSD(tau) = < (x[i+n] - x[i])^2 + (y[i+n] - y[i])^2 >_i

where tau = n x dt (dt is set by --time-step), and the average is taken over the N - n valid start frames i.
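
The definition above translates almost directly into code. This is a minimal sketch, not the package's implementation: time_averaged_msd is a hypothetical name, and the track is assumed to have one position per consecutive frame with no gaps.

```python
def time_averaged_msd(x, y, dt, max_lag):
    """Return [(tau, msd), ...] for frame lags 1..max_lag of a single track.

    x, y: positions per frame (same length, consecutive frames assumed).
    dt: seconds per frame.
    """
    n_frames = len(x)
    curve = []
    for lag in range(1, max_lag + 1):
        n_pairs = n_frames - lag  # number of valid start frames for this lag
        if n_pairs < 1:
            break
        total = 0.0
        for i in range(n_pairs):
            total += (x[i + lag] - x[i]) ** 2 + (y[i + lag] - y[i]) ** 2
        curve.append((lag * dt, total / n_pairs))
    return curve


# A particle moving one unit per frame along x: MSD grows as lag squared.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 0.0, 0.0, 0.0]
print(time_averaged_msd(x, y, dt=0.5, max_lag=3))
```

Larger lags are averaged over fewer frame pairs, which is one reason fits are usually restricted to short lags.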


2. Diffusion coefficient (D)

Estimated from a linear fit to the early MSD regime:

MSD(tau) ~ 4 * D * tau   =>   D = (1/4) * d(MSD)/d(tau)

Units: µm²/s
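
As an illustration of the relation D = (1/4) * d(MSD)/d(tau), the sketch below takes the slope of an ordinary least-squares line through the first few MSD points. Whether the package fits with or without an intercept is not stated here; this sketch includes one, which is an assumption, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def diffusion_coefficient(taus, msds):
    """D in µm²/s from MSD in µm² and tau in s, for 2D motion (MSD ~ 4 D tau)."""
    slope, _intercept = fit_line(taus, msds)
    return slope / 4.0


# Synthetic MSD curve with D = 0.5 µm²/s plus a constant offset.
taus = [0.03 * k for k in range(1, 5)]
msds = [4.0 * 0.5 * t + 0.01 for t in taus]
print(diffusion_coefficient(taus, msds))
```

The constant offset in the synthetic data does not affect the slope, so the recovered D is close to 0.5.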


3. Anomalous exponent (alpha)

Extracted from the log-log slope of MSD vs. tau:

log10[ MSD(tau) ] = alpha * log10(tau) + log10(4D)
  • alpha ≈ 1: normal (Brownian) diffusion
  • alpha < 1: subdiffusive (confined or hindered)
  • alpha > 1: superdiffusive (directed or active)
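
The log-log relation above can be fitted the same way as a straight line, with alpha as the slope and log10(4D) as the intercept. This is a sketch under that model only; fit_anomalous is a hypothetical name, and the package's own fitting options are not reproduced here.

```python
import math


def fit_anomalous(taus, msds):
    """Fit log10(MSD) = alpha * log10(tau) + log10(4 D); return (alpha, D)."""
    log_tau = [math.log10(t) for t in taus]
    log_msd = [math.log10(m) for m in msds]
    n = len(log_tau)
    mean_x = sum(log_tau) / n
    mean_y = sum(log_msd) / n
    sxx = sum((x - mean_x) ** 2 for x in log_tau)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_tau, log_msd))
    alpha = sxy / sxx
    intercept = mean_y - alpha * mean_x
    return alpha, 10.0 ** intercept / 4.0


# Synthetic subdiffusive curve: MSD = 4 * D * tau**alpha with D = 0.2, alpha = 0.7.
taus = [0.01 * k for k in range(1, 6)]
msds = [4.0 * 0.2 * t ** 0.7 for t in taus]
alpha, d_coeff = fit_anomalous(taus, msds)
print(alpha, d_coeff)
```

When alpha differs from 1, the fitted prefactor has units of µm²/s^alpha rather than µm²/s, so it is not directly comparable with D from the linear fit.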

4. Non-Gaussian parameter (alpha2)

Quantifies deviation from Gaussian (Brownian) step-size statistics:

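alpha2 = <r^4> / (3 * <r^2>^2) - 1

Here r is a step displacement. The factor 3 is the Gaussian value for one-dimensional displacement components; for two-dimensional radial displacements the corresponding factor is 2. alpha2 = 0 for a Gaussian distribution; positive values indicate heterogeneous or anomalous dynamics.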
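
A minimal sketch of the alpha2 formula follows. The function name and the gaussian_factor parameter are assumptions made for illustration; the default of 3 matches the formula above, and the input is taken to be a flat list of displacements pooled over tracks.

```python
def non_gaussian_parameter(displacements, gaussian_factor=3.0):
    """alpha2 = <r^4> / (gaussian_factor * <r^2>^2) - 1."""
    n = len(displacements)
    second_moment = sum(r ** 2 for r in displacements) / n
    fourth_moment = sum(r ** 4 for r in displacements) / n
    return fourth_moment / (gaussian_factor * second_moment ** 2) - 1.0


# Steps of identical magnitude are narrower-tailed than a Gaussian,
# so alpha2 comes out negative (1/3 - 1 with the default factor).
print(non_gaussian_parameter([1.0, -1.0, 1.0, -1.0]))
```

A mixture of slow and fast populations broadens the tails of the step distribution and pushes the value above zero.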


5. Velocity autocorrelation function (VACF)

Used in advanced group analysis to assess directional persistence:

VACF(k) = < v_i . v_{i+k} > / < v_i . v_i >

where v_i is the displacement vector at step i and k is the lag in frames.
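
The normalized VACF above can be sketched for a single track as follows. The function name vacf is hypothetical, frames are assumed consecutive, and a track with no movement at all would divide by zero and is not handled.

```python
def vacf(x, y, max_lag):
    """Return [VACF(0), VACF(1), ...] up to max_lag for one track."""
    # Displacement vectors between consecutive frames.
    vx = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    vy = [y[i + 1] - y[i] for i in range(len(y) - 1)]
    n = len(vx)
    norm = sum(vx[i] ** 2 + vy[i] ** 2 for i in range(n)) / n  # < v_i . v_i >
    values = []
    for k in range(max_lag + 1):
        n_pairs = n - k
        if n_pairs < 1:
            break
        dot = sum(vx[i] * vx[i + k] + vy[i] * vy[i + k] for i in range(n_pairs)) / n_pairs
        values.append(dot / norm)
    return values


# A particle hopping back and forth reverses direction at every step.
print(vacf([0.0, 1.0, 0.0, 1.0, 0.0], [0.0] * 5, max_lag=2))
```

Values near +1 at lag 1 indicate persistent motion, and negative values indicate steps that tend to reverse.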


Usage

Graphical interface

gemspa-gui

The GUI exposes all CLI parameters through organized input sections, includes real-time validation, and supports saving and loading parameter sets for reproducible analysis.

Command-line interface

gemspa-cli -d /path/to/folder [options]

Required

  • -d, --work-dir — directory containing trajectory CSV files

Input discovery

  • --csv-pattern — glob for input CSVs (default: Traj_*.csv); for TrackMate use "*Spots in tracks*.csv"

Acquisition and units

  • --time-step — seconds per frame
  • --micron-per-px — pixel size in µm
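
To show how these two values enter the analysis, the sketch below converts pixel and frame quantities to physical units. It assumes input coordinates are in pixels; the function names are hypothetical and are not part of the CLI.

```python
def px_to_um(value_px, micron_per_px):
    """Convert a length from pixels to micrometres."""
    return value_px * micron_per_px


def lag_to_seconds(frame_lag, time_step):
    """Convert a lag in frames to seconds (tau = frame_lag x dt)."""
    return frame_lag * time_step


def d_to_um2_per_s(d_px2_per_frame, micron_per_px, time_step):
    """Convert a diffusion coefficient from px²/frame to µm²/s."""
    return d_px2_per_frame * micron_per_px ** 2 / time_step


# Values taken from the example commands: 0.11 µm per pixel, 0.03 s per frame.
print(d_to_um2_per_s(1.0, micron_per_px=0.11, time_step=0.03))
```

Because the pixel size enters squared, an error in --micron-per-px affects D more strongly than the same relative error in --time-step.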

Track and fit constraints

  • --min-track-len — minimum frames per track
  • --tlag-cutoff — maximum lag in frames used for MSD fitting

Parallelism

  • -j, --n-jobs — parallel processes across replicates
  • --threads-per-rep — threads per replicate

Rainbow track overlays (optional)

  • --rainbow-tracks — enable D-colored track overlays
  • --img-prefix — prefix for background TIFF images (e.g., MAX_)
  • --rainbow-min-D, --rainbow-max-D — D range for colormap
  • --rainbow-colormap, --rainbow-scale, --rainbow-dpi

Ensemble filters (applied globally)

  • --filter-D-min, --filter-D-max — D bounds (µm²/s)
  • --filter-alpha-min, --filter-alpha-max — alpha bounds
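
The effect of these bounds can be pictured as a simple per-track predicate. This is a sketch only: the inclusive bounds and the treatment of an omitted bound as "no limit" are assumptions, and the function names are hypothetical. The dictionary keys follow the D_fit and alpha_fit column names used in the outputs.

```python
def passes_filter(track, d_min=None, d_max=None, alpha_min=None, alpha_max=None):
    """Return True if a track's fitted D and alpha fall inside all given bounds."""
    d, alpha = track["D_fit"], track["alpha_fit"]
    if d_min is not None and d < d_min:
        return False
    if d_max is not None and d > d_max:
        return False
    if alpha_min is not None and alpha < alpha_min:
        return False
    if alpha_max is not None and alpha > alpha_max:
        return False
    return True


def filter_tracks(tracks, **bounds):
    """Keep only the tracks that satisfy every bound."""
    return [t for t in tracks if passes_filter(t, **bounds)]


# Made-up fit results for illustration.
tracks = [
    {"track_id": 0, "D_fit": 0.001, "alpha_fit": 0.2},
    {"track_id": 1, "D_fit": 0.30, "alpha_fit": 0.9},
    {"track_id": 2, "D_fit": 5.00, "alpha_fit": 1.8},
]
kept = filter_tracks(tracks, d_min=0.01, d_max=2.0, alpha_min=0.0, alpha_max=2.0)
print([t["track_id"] for t in kept])
```

Using one predicate everywhere is what keeps the filtered track set identical across pipeline stages.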

Optional analyses

  • --step-size-analysis — step-size KDE and KS plots
  • --clean-trackmate — run TrackMate CSV cleaner and exit
  • --no-advanced-group — skip the advanced group analysis stage

Steps and tracks analysis

  • --steps-tracks — enable step-size vs. brightness heatmaps and track overlays
  • --steps-tracks-mode {both,heatmaps,tracks} — what to generate (default: both)
  • --stepsize-max — max step size for plots and LUT in pixels (default: 3.0)
  • --bins-x, --bins-y — heatmap bin counts (default: 150 each)
  • --count-cap — count cap for heatmap color scale (default: 300)
  • --line-width — line width for track overlays (default: 0.7)
  • --min-track-length — minimum track length for overlays (default: 10)
  • --brightness-col — brightness column name (default: MEAN_INTENSITY_CH1)
  • --invert-lut-tracks — invert the colormap for track overlays
  • --strip-datecodes — strip date codes from output filenames (default: true)
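
The roles of --bins-x, --bins-y, --stepsize-max and --count-cap can be illustrated with a small two-dimensional histogram. Everything below is a sketch: the axis assignment (step size on x, brightness on y), the explicit brightness range, and the function name are assumptions, and the package's plotting code is not reproduced.

```python
def capped_histogram2d(steps, brightness, bins_x, bins_y,
                       step_max, bright_range, count_cap):
    """Bin (step size, brightness) pairs and clip counts at count_cap.

    Returns a list of bins_y rows, each with bins_x counts.
    """
    b_min, b_max = bright_range
    grid = [[0] * bins_x for _ in range(bins_y)]
    for s, b in zip(steps, brightness):
        if not (0.0 <= s <= step_max and b_min <= b <= b_max):
            continue  # points outside the plotted range are ignored
        ix = min(int(s / step_max * bins_x), bins_x - 1)
        iy = min(int((b - b_min) / (b_max - b_min) * bins_y), bins_y - 1)
        grid[iy][ix] += 1
    # Clipping keeps a few very dense bins from dominating the color scale.
    return [[min(count, count_cap) for count in row] for row in grid]


steps = [0.1, 0.1, 0.1, 2.9]
brightness = [10.0, 10.0, 10.0, 90.0]
grid = capped_histogram2d(steps, brightness, bins_x=3, bins_y=2,
                          step_max=3.0, bright_range=(0.0, 100.0), count_cap=2)
print(grid)
```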

Outputs

Per replicate (<COND>_<REP>/)

  • msd_results.csv — per-track D_fit, alpha_fit, r2_fit
  • msd_vs_tau.png — linear MSD vs. tau with D estimate
  • msd_vs_tau_loglog.png — log-log MSD vs. tau with alpha slope
  • D_fit_distribution.png — histogram of D (log x-axis)
  • alpha_vs_logD.png — scatter of alpha vs. log10(D)
  • rainbow_tracks.png — D-colored trajectory overlay (if enabled)

Ensemble level

  • grouped_raw/ and grouped_filtered/ subdirectories
  • Ensemble-averaged MSD plots (ensemble_msd_vs_tau_<COND>.png)
  • Step-size KDEs (step_kde_<COND>_(ensemble).png and filtered variants)

Comparison (comparison/)

  • ensemble_filtered_D_histograms.png — log-scale D distributions with KS annotation
  • ensemble_filtered_alpha_histograms.png — alpha distributions with KS annotation
  • replicate_median_D_boxplot.png — per-replicate median D by condition

Advanced group analysis (grouped_advanced_analysis/)

Runs automatically unless --no-advanced-group is specified.

Per-track metrics table columns:

track_id, condition, D_fit, alpha_fit, r2_fit, vacf_lag1,
confinement_idx, hull_area_um2, tortuosity, n_frames

Plots: D_fit and alpha_fit box/violin plots by condition, VACF histograms and mean curves, convex-hull area vs. tortuosity scatterplots.
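
The exact definitions behind hull_area_um2 and tortuosity are not spelled out here. The sketch below uses common choices, which are assumptions: the area of the convex hull of all track positions (Andrew's monotone chain plus the shoelace formula) and tortuosity as path length divided by end-to-end distance. Function names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area enclosed by the convex hull (shoelace formula)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % len(hull)][1]
        - hull[(i + 1) % len(hull)][0] * hull[i][1]
        for i in range(len(hull))
    )
    return abs(twice_area) / 2.0


def tortuosity(points):
    """Total path length divided by the start-to-end distance (assumed definition)."""
    path = sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))
    net = math.dist(points[0], points[-1])
    return path / net if net > 0 else float("inf")


track = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.0, 1.0)]
print(hull_area(track), tortuosity(track))
```

With positions in µm, the hull area comes out in µm², matching the unit in the column name.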

Split metrics (split_metrics_by_condition/) — one CSV per metric, conditions as columns:

  • D_fit.csv, alpha_fit.csv, r2_fit.csv
  • vacf_lag1.csv, confinement_idx.csv
  • hull_area_um2.csv, tortuosity.csv, n_frames.csv
  • _index.csv — parameter-to-file mapping
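
The "one CSV per metric, conditions as columns" layout can be sketched as a simple pivot of the per-track table. This is an illustration with the standard library, not the package's writer: the function name is hypothetical, and padding shorter columns with empty cells is an assumption about how unequal track counts are handled.

```python
import csv
import io
from itertools import zip_longest


def split_metric_by_condition(rows, metric):
    """Return CSV text with one column per condition holding that metric's values."""
    columns = {}
    for row in rows:
        columns.setdefault(row["condition"], []).append(row[metric])
    conditions = sorted(columns)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(conditions)
    # Conditions rarely have equal track counts, so pad short columns.
    for values in zip_longest(*(columns[c] for c in conditions), fillvalue=""):
        writer.writerow(values)
    return buffer.getvalue()


# Made-up per-track metrics for illustration.
per_track = [
    {"track_id": 0, "condition": "G12V", "D_fit": 0.5, "alpha_fit": 0.9},
    {"track_id": 1, "condition": "G12V", "D_fit": 0.7, "alpha_fit": 1.0},
    {"track_id": 2, "condition": "HKWT", "D_fit": 0.3, "alpha_fit": 0.8},
]
print(split_metric_by_condition(per_track, "D_fit"))
```

This column-per-condition layout pastes directly into plotting and statistics tools that expect one group per column.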

Steps and tracks analysis

Generated when --steps-tracks is specified.

Heatmaps (brightness_stepsize/):

  • heatmap_all.png, heatmap_<filename>.png
  • steps_vs_brightness_all.csv, steps_vs_brightness_<filename>.csv

Track overlays (tracks_stepsize_map/):

  • overlay_all.png, overlay_<filename>.png
  • tracks_stepsize_combined.pdf

TrackMate cleaner (--clean-trackmate)

Normalizes TrackMate exports to GEMspa column format (x, y, frame, track_id), writing cleaned files in place while preserving original filenames.

Options:

  • --clean-out-dir — write outputs to a separate directory
  • --clean-include-date — include date codes in output names (YYMMDD or YYYYMMDD)
  • --clean-move — move rather than copy when renaming legacy Traj_* files
  • --clean-dry-run — preview actions without writing any files

Example Commands

# Clean TrackMate CSVs only
gemspa-cli -d /data/TrackMateExports --clean-trackmate

# Standard run with physical units
gemspa-cli -d /data/GEMspa --time-step 0.03 --micron-per-px 0.11 \
  --min-track-len 4 --tlag-cutoff 4

# Add step-size KDEs and rainbow overlays
gemspa-cli -d /data/GEMspa --rainbow-tracks --step-size-analysis

# Steps and tracks analysis with default settings
gemspa-cli -d /data/GEMspa --steps-tracks --time-step 0.01 --micron-per-px 0.11

# Heatmaps only with custom binning
gemspa-cli -d /data/GEMspa --steps-tracks --steps-tracks-mode heatmaps \
  --stepsize-max 5.0 --bins-x 200 --bins-y 200 --count-cap 500

# Track overlays only with inverted colormap
gemspa-cli -d /data/GEMspa --steps-tracks --steps-tracks-mode tracks \
  --line-width 1.0 --invert-lut-tracks --min-track-length 15

# Skip advanced analysis
gemspa-cli -d /data/GEMspa --no-advanced-group

# TrackMate data with custom CSV pattern
gemspa-cli -d /data/TrackMateExports --csv-pattern "*Spots in tracks*.csv" \
  --time-step 0.01 --micron-per-px 0.11 --steps-tracks

Symbol Reference

Symbol     Definition                          Units
tau        Time lag (frame_lag x dt)           s
MSD(tau)   Mean-square displacement            µm²
D          Diffusion coefficient               µm²/s
alpha      Anomalous exponent                  -
alpha2     Non-Gaussian parameter              -
VACF       Velocity autocorrelation function   -
T          Tortuosity                          -
A_hull     Convex-hull area                    µm²

Output Directory Structure

<work_dir>/
├── <COND>_<REP>/                     # per-replicate analysis
│   ├── msd_results.csv
│   ├── msd_vs_tau.png
│   └── ...
├── grouped_raw/                      # raw ensemble analysis
│   ├── msd_results.csv
│   └── step_kde/
├── grouped_filtered/                 # filtered ensemble analysis
│   ├── msd_results.csv
│   └── step_kde/
├── grouped_advanced_analysis/        # advanced metrics (default: ON)
│   ├── all_conditions_advanced_metrics.csv
│   ├── figures/
│   └── split_metrics_by_condition/
│       ├── D_fit.csv
│       ├── alpha_fit.csv
│       └── ...
├── brightness_stepsize/              # (--steps-tracks)
│   ├── heatmap_all.png
│   ├── heatmap_<filename>.png
│   └── steps_vs_brightness_*.csv
├── tracks_stepsize_map/              # (--steps-tracks)
│   ├── overlay_all.png
│   ├── overlay_<filename>.png
│   └── tracks_stepsize_combined.pdf
└── comparison/                       # cross-condition comparisons
    └── *.png

Citation

If you use this software, please cite:

Bazley A., Keegan S. et al. GEMspa-CLI (PyPI)


Acknowledgements

Developed by:

  1. Andrew Bazley and Sarah Keegan, Holt and Fenyo Labs, Institute for Systems Genetics, NYU Langone Health
  2. David Duran, Holt Lab, Institute for Systems Genetics, NYU Langone Health

Original package: gemspa-spt (PyPI)
Primary reference: Keegan et al., bioRxiv 2023.06.26.546612


© 2025 GEMspa Project · MIT License
