GEMspa: Advanced single-particle tracking analysis pipeline with MSD, diffusion analysis, and visualization tools
GEMspa-CLI: Single-Particle Tracking Analysis (v2.0.0)
Advanced, modular single-particle tracking (SPT) and ensemble diffusion analysis for microscopy data.
Overview
gemspa-cli is a command-line interface for GEMspa, a modular single-particle tracking and diffusion analysis suite. It performs trajectory extraction, per-track MSD fitting, ensemble averaging, step-size statistics, and condition-wise comparisons from single-molecule tracking data.
New in v2.0.0
- Condition-grouped metrics — automatically generates split CSV files per condition
- Steps and tracks analysis — step-size vs. brightness heatmaps and colored track overlays
- Flexible condition extraction — strips date codes and handles arbitrary dataset naming conventions
- Unified filtering system — consistent D and α filtering across all pipeline stages
- Enhanced CLI — new flags for advanced visualization and analysis
Features
- Advanced group analysis with automatic metrics splitting by condition
- TrackMate cleaning utility (--clean-trackmate)
- Unified global filtering of D and α applied consistently across the entire pipeline
- Step-size vs. brightness heatmaps with customizable binning and color scaling
- Track overlays colored by step size for visual quality control
- Cross-condition comparison plots with KS-test statistical annotations
Installation
From PyPI (recommended)
pip install GEMspa-CLI
Virtual environment (optional but recommended)
python3 -m venv ~/venvs/gemspa
source ~/venvs/gemspa/bin/activate
Windows PowerShell:
python -m venv $env:USERPROFILE\venvs\gemspa
& "$env:USERPROFILE\venvs\gemspa\Scripts\Activate.ps1"
Input File Formats
GEMspa-CLI accepts trajectory data in CSV format.
TrackMate export format
- File pattern: *Spots in tracks*.csv (pass with --csv-pattern "*Spots in tracks*.csv")
- Required columns: POSITION_X, POSITION_Y, FRAME, TRACK_ID
- Optional columns: MEAN_INTENSITY_CH1 (for brightness analysis)
GEMspa format
- File pattern: Traj_*.csv (default)
- Required columns: x, y, frame, track_id
- Optional columns: brightness (for step-size analysis)
Naming conventions
GEMspa-CLI automatically strips date codes and replicate suffixes to extract condition labels:
- Traj_20220706_G12V_4.csv → condition: G12V
- Traj_20220708_G12V_13.csv → condition: G12V (correctly pooled with the above)
- Traj_20220706_HKWT_2.csv → condition: HKWT
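The naming rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not GEMspa's actual parser: the function name `condition_from_filename` and both regular expressions are assumptions, and the sketch assumes names of the form Traj_<date>_<condition>_<replicate>.csv.

```python
import re
from pathlib import Path

# Hypothetical patterns (assumptions, not taken from the GEMspa source).
_DATE_CODE = re.compile(r"^\d{6}(\d{2})?$")  # YYMMDD or YYYYMMDD
_REPLICATE = re.compile(r"^\d+$")            # purely numeric suffix


def condition_from_filename(name: str) -> str:
    """Return the condition label embedded in a trajectory file name."""
    tokens = Path(name).stem.split("_")
    # Drop the leading "Traj" prefix if present.
    if tokens and tokens[0] == "Traj":
        tokens = tokens[1:]
    # Drop date-code tokens wherever they appear.
    tokens = [t for t in tokens if not _DATE_CODE.match(t)]
    # Drop a trailing numeric replicate suffix, keeping at least one token.
    if len(tokens) > 1 and _REPLICATE.match(tokens[-1]):
        tokens = tokens[:-1]
    return "_".join(tokens)
```

With this sketch, the two G12V files acquired on different dates map to the same label, which is what allows them to be pooled.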
Pipeline Stages
1. Data discovery and preparation
- Scans the working directory for trajectory CSV files
- Optionally cleans TrackMate exports to GEMspa column format
- Groups files by condition, removing date codes automatically
2. Per-replicate analysis
- Loads each trajectory file and computes per-track MSD curves
- Fits diffusion parameters (D, α, r²) for every track
- Writes per-replicate result tables and diagnostic plots
3. Ensemble analysis
- Pools tracks by condition (raw and filtered variants)
- Computes ensemble-averaged MSD curves per condition
- Applies global filter parameters consistently
4. Advanced group analysis (default: ON)
- Computes VACF, confinement index, convex-hull area, and tortuosity
- Splits all metrics by condition into separate CSV files
- Generates comprehensive condition-level visualizations
5. Steps and tracks analysis (optional: --steps-tracks)
- Produces step-size vs. brightness heatmaps
- Generates track overlays colored by step size
- Supports both per-file and pooled outputs
6. Cross-condition comparison
- KS tests and boxplots comparing conditions
- Publication-ready figures saved to comparison/
Workflow Overview
SMT data export --> gemspa-cli pipeline
  |
  +-- (optional) --clean-trackmate
  |     --> standardized Traj_*.csv
  |
  +-- Per-replicate trajectory analysis
  |     --> D_fit, alpha_fit, r2, MSD curves, rainbow overlay
  |
  +-- Ensemble pooling by condition
  |     --> grouped_raw/ and grouped_filtered/
  |
  +-- (optional) --step-size-analysis
  |     --> KDEs, non-Gaussian (alpha2) stats, KS tests
  |
  +-- (optional) --steps-tracks
  |     --> brightness_stepsize/ (heatmaps)
  |     --> tracks_stepsize_map/ (track overlays)
  |
  +-- Advanced group analysis (default: ON)
  |     --> grouped_advanced_analysis/
  |     --> split_metrics_by_condition/
  |
  +-- Cross-condition comparison
        --> comparison/*.png
Core Analysis
1. Mean-square displacement (MSD)
For a trajectory of N frames, the time-averaged MSD at lag tau is:
MSD(tau) = < (x[i+tau] - x[i])^2 + (y[i+tau] - y[i])^2 >_i
where tau = frame_lag x dt (set by --time-step), and the average is taken over all valid frame pairs i.
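As a minimal sketch of the definition above, the time-averaged MSD of one track can be computed with plain Python. The function names are illustrative, and the sketch assumes positions sampled at consecutive frames with no gaps; how GEMspa handles missing frames is not covered here.

```python
def time_averaged_msd(xs, ys, max_lag):
    """Return [MSD(1), ..., MSD(max_lag)] in squared position units.

    xs, ys: positions at consecutive frames (assumed gap-free).
    """
    n = len(xs)
    if len(ys) != n:
        raise ValueError("xs and ys must have the same length")
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < number of frames")
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement for every valid frame pair (i, i + lag).
        sq = [
            (xs[i + lag] - xs[i]) ** 2 + (ys[i + lag] - ys[i]) ** 2
            for i in range(n - lag)
        ]
        msd.append(sum(sq) / len(sq))
    return msd


def lag_times(max_lag, dt):
    """Convert frame lags 1..max_lag to seconds (dt as given by --time-step)."""
    return [lag * dt for lag in range(1, max_lag + 1)]
```

For a particle moving one unit per frame along x, the MSD grows as the square of the lag, which is the ballistic limit.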
2. Diffusion coefficient (D)
Estimated from a linear fit to the early MSD regime:
MSD(tau) ~ 4 * D * tau => D = (1/4) * d(MSD)/d(tau)
Units: µm²/s
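The relation above can be illustrated with an ordinary least-squares fit. This sketch assumes an unweighted fit with a free intercept over whatever lags are passed in; the exact fitting options GEMspa uses are not stated here, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def diffusion_coefficient(taus, msd):
    """Estimate D from MSD(tau) ~ 4 * D * tau (2D diffusion).

    taus in seconds and msd in µm² give D in µm²/s.
    """
    slope, _intercept = fit_line(taus, msd)
    return slope / 4.0
```

Restricting `taus` and `msd` to the first few lags mirrors the role of --tlag-cutoff, since long lags are averaged over few frame pairs and are noisier.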
3. Anomalous exponent (alpha)
Extracted from the log-log slope of MSD vs. tau:
log10[ MSD(tau) ] = alpha * log10(tau) + log10(4D)
- alpha ≈ 1: normal (Brownian) diffusion
- alpha < 1: subdiffusive (confined or hindered)
- alpha > 1: superdiffusive (directed or active)
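A hedged sketch of the log-log fit follows. It assumes strictly positive MSD values and lag times, uses an unweighted least-squares line, and reads D back from the intercept log10(4D); the function names are illustrative rather than GEMspa's own.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def anomalous_fit(taus, msd):
    """Return (alpha, D) from log10(MSD) = alpha * log10(tau) + log10(4D)."""
    log_tau = [math.log10(t) for t in taus]
    log_msd = [math.log10(m) for m in msd]
    alpha, intercept = fit_line(log_tau, log_msd)
    d_coeff = 10 ** intercept / 4.0
    return alpha, d_coeff
```

Note that when alpha differs from 1, the D obtained from the intercept carries units of µm²/s^alpha rather than µm²/s.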
4. Non-Gaussian parameter (alpha2)
Quantifies deviation from Gaussian (Brownian) step-size statistics:
alpha2 = <r^4> / (3 * <r^2>^2) - 1
alpha2 = 0 for a Gaussian distribution; positive values indicate heterogeneous or anomalous dynamics.
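The formula above can be evaluated directly from a list of displacements. One caveat worth stating: the prefactor 3 is the Gaussian value for one-dimensional displacement components (for example Δx), so this sketch assumes r is a single Cartesian component; for radial step lengths in 2D the Gaussian prefactor would be 2. Which convention GEMspa applies internally is not stated here.

```python
def non_gaussian_parameter(displacements):
    """alpha2 = <r^4> / (3 * <r^2>^2) - 1 for 1D displacement components."""
    n = len(displacements)
    if n == 0:
        raise ValueError("need at least one displacement")
    m2 = sum(r ** 2 for r in displacements) / n
    m4 = sum(r ** 4 for r in displacements) / n
    if m2 == 0:
        raise ValueError("all displacements are zero")
    return m4 / (3.0 * m2 ** 2) - 1.0
```

A pool that mixes slow and fast particles has heavier tails than a single Gaussian, so it yields a positive value, while a single Gaussian population gives a value near zero up to sampling noise.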
5. Velocity autocorrelation function (VACF)
Used in advanced group analysis to assess directional persistence:
VACF(k) = < v_i . v_{i+k} > / < v_i . v_i >
where v_i is the displacement vector at step i and k is the lag in frames.
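A minimal sketch of the normalized VACF for one track, assuming gap-free positions at consecutive frames. The function name and the choice to average over all valid step pairs are assumptions made for illustration.

```python
def vacf(xs, ys, max_lag):
    """Return [VACF(0), ..., VACF(max_lag)] normalized so that VACF(0) = 1."""
    # Per-step displacement vectors v_i.
    vx = [xs[i + 1] - xs[i] for i in range(len(xs) - 1)]
    vy = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
    n = len(vx)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be smaller than the number of steps")
    # Normalization: mean squared step length <v_i . v_i>.
    c0 = sum(vx[i] ** 2 + vy[i] ** 2 for i in range(n)) / n
    if c0 == 0:
        raise ValueError("track has no motion")
    result = []
    for k in range(max_lag + 1):
        dots = [vx[i] * vx[i + k] + vy[i] * vy[i + k] for i in range(n - k)]
        result.append((sum(dots) / len(dots)) / c0)
    return result
```

Persistent motion keeps the VACF positive at lag 1, whereas a particle bouncing back and forth gives a negative lag-1 value.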
Usage
Graphical interface
gemspa-gui
The GUI exposes all CLI parameters through organized input sections, includes real-time validation, and supports saving and loading parameter sets for reproducible analysis.
Command-line interface
gemspa-cli -d /path/to/folder [options]
Required
- -d, --work-dir: directory containing trajectory CSV files
Input discovery
- --csv-pattern: glob for input CSVs (default: Traj_*.csv); for TrackMate use "*Spots in tracks*.csv"
Acquisition and units
- --time-step: seconds per frame
- --micron-per-px: pixel size in µm
Track and fit constraints
- --min-track-len: minimum frames per track
- --tlag-cutoff: maximum lag in frames used for MSD fitting
Parallelism
- -j, --n-jobs: parallel processes across replicates
- --threads-per-rep: threads per replicate
Rainbow track overlays (optional)
- --rainbow-tracks: enable D-colored track overlays
- --img-prefix: prefix for background TIFF images (e.g., MAX_)
- --rainbow-min-D, --rainbow-max-D: D range for colormap
- --rainbow-colormap, --rainbow-scale, --rainbow-dpi
Ensemble filters (applied globally)
- --filter-D-min, --filter-D-max: D bounds (µm²/s)
- --filter-alpha-min, --filter-alpha-max: alpha bounds
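Conceptually, the global filter keeps a track only if both fitted parameters fall inside the requested bounds. The sketch below assumes inclusive bounds, treats an unset bound as unlimited, and drops tracks whose fit produced NaN; these choices and the function names are assumptions, while the keys D_fit and alpha_fit follow the column names of msd_results.csv.

```python
import math


def within(value, lower, upper):
    """True if value is finite and inside [lower, upper]; None disables a bound."""
    if value is None or not math.isfinite(value):
        return False
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


def filter_tracks(tracks, d_min=None, d_max=None, alpha_min=None, alpha_max=None):
    """Keep tracks (dicts with 'D_fit' and 'alpha_fit') that pass both bounds."""
    return [
        t for t in tracks
        if within(t["D_fit"], d_min, d_max)
        and within(t["alpha_fit"], alpha_min, alpha_max)
    ]
```

Applying one such function at every stage is a simple way to guarantee that ensemble, advanced, and comparison outputs all see the same set of tracks.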
Optional analyses
- --step-size-analysis: step-size KDE and KS plots
- --clean-trackmate: run TrackMate CSV cleaner and exit
- --no-advanced-group: skip the advanced group analysis stage
Steps and tracks analysis
- --steps-tracks: enable step-size vs. brightness heatmaps and track overlays
- --steps-tracks-mode {both,heatmaps,tracks}: what to generate (default: both)
- --stepsize-max: max step size for plots and LUT in pixels (default: 3.0)
- --bins-x, --bins-y: heatmap bin counts (default: 150 each)
- --count-cap: count cap for heatmap color scale (default: 300)
- --line-width: line width for track overlays (default: 0.7)
- --min-track-length: minimum track length for overlays (default: 10)
- --brightness-col: brightness column name (default: MEAN_INTENSITY_CH1)
- --invert-lut-tracks: invert the colormap for track overlays
- --strip-datecodes: strip date codes from output filenames (default: true)
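To make the roles of the binning options concrete, here is a rough sketch of how a step-size vs. brightness grid could be accumulated. It is not GEMspa's plotting code: the axis assignment (step size on x, brightness on y), the decision to drop rather than clip steps beyond the maximum, and the function name are all assumptions. The keyword defaults mirror the documented defaults of --stepsize-max, --bins-x, --bins-y, and --count-cap.

```python
def binned_counts(steps, brightness, stepsize_max=3.0,
                  bins_x=150, bins_y=150, count_cap=300):
    """Return a bins_y x bins_x grid of counts, capped at count_cap."""
    b_low, b_high = min(brightness), max(brightness)
    span = (b_high - b_low) or 1.0  # avoid division by zero for flat data
    grid = [[0] * bins_x for _ in range(bins_y)]
    for step, bright in zip(steps, brightness):
        # Assumption: steps outside [0, stepsize_max] are dropped.
        if not 0 <= step <= stepsize_max:
            continue
        ix = min(int(step / stepsize_max * bins_x), bins_x - 1)
        iy = min(int((bright - b_low) / span * bins_y), bins_y - 1)
        grid[iy][ix] += 1
    # Cap counts so a few dense bins do not saturate the color scale.
    return [[min(count, count_cap) for count in row] for row in grid]
```

Raising the cap reveals contrast among dense bins, while lowering it makes sparse regions easier to see.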
Outputs
Per replicate (<COND>_<REP>/)
- msd_results.csv: per-track D_fit, alpha_fit, r2_fit
- msd_vs_tau.png: linear MSD vs. tau with D estimate
- msd_vs_tau_loglog.png: log-log MSD vs. tau with alpha slope
- D_fit_distribution.png: histogram of D (log x-axis)
- alpha_vs_logD.png: scatter of alpha vs. log10(D)
- rainbow_tracks.png: D-colored trajectory overlay (if enabled)
Ensemble level
- grouped_raw/ and grouped_filtered/ subdirectories
- Ensemble-averaged MSD plots (ensemble_msd_vs_tau_<COND>.png)
- Step-size KDEs (step_kde_<COND>_(ensemble).png and filtered variants)
Comparison (comparison/)
- ensemble_filtered_D_histograms.png: log-scale D distributions with KS annotation
- ensemble_filtered_alpha_histograms.png: alpha distributions with KS annotation
- replicate_median_D_boxplot.png: per-replicate median D by condition
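The KS annotation on these figures is based on the two-sample Kolmogorov-Smirnov statistic, the largest vertical gap between two empirical cumulative distributions. The sketch below computes only the statistic in plain Python as an illustration; it is not GEMspa's code, and in practice a library routine such as scipy.stats.ks_2samp would also supply a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(v) - ECDF_b(v)| over pooled values."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    for value in a + b:
        ecdf_a = bisect_right(a, value) / len(a)
        ecdf_b = bisect_right(b, value) / len(b)
        largest_gap = max(largest_gap, abs(ecdf_a - ecdf_b))
    return largest_gap
```

A value of 0 means the two empirical distributions coincide, and a value of 1 means the samples do not overlap at all.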
Advanced group analysis (grouped_advanced_analysis/)
Runs automatically unless --no-advanced-group is specified.
Per-track metrics table columns:
track_id, condition, D_fit, alpha_fit, r2_fit, vacf_lag1,
confinement_idx, hull_area_um2, tortuosity, n_frames
Plots: D_fit and alpha_fit box/violin plots by condition, VACF histograms and mean curves, convex-hull area vs. tortuosity scatterplots.
Split metrics (split_metrics_by_condition/) — one CSV per metric, conditions as columns:
- D_fit.csv, alpha_fit.csv, r2_fit.csv
- vacf_lag1.csv, confinement_idx.csv
- hull_area_um2.csv, tortuosity.csv, n_frames.csv
- _index.csv: parameter-to-file mapping
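The "one CSV per metric, conditions as columns" layout amounts to a pivot of the per-track table. As a sketch only: the function below builds the header and rows for one metric, padding shorter columns with empty cells because conditions rarely have the same number of tracks. The function name, the alphabetical column order, and the padding choice are assumptions.

```python
from collections import defaultdict
from itertools import zip_longest


def split_metric_by_condition(rows, metric):
    """Pivot per-track rows into (header, body) with one column per condition.

    rows: iterable of dicts containing 'condition' and the metric key.
    """
    values_by_condition = defaultdict(list)
    for row in rows:
        values_by_condition[row["condition"]].append(row[metric])
    header = sorted(values_by_condition)
    columns = [values_by_condition[cond] for cond in header]
    # zip_longest pads shorter columns so every row has one cell per condition.
    body = [list(cells) for cells in zip_longest(*columns, fillvalue="")]
    return header, body
```

The returned header and body can be written with csv.writer, one file per metric, which is a layout that pastes directly into spreadsheet or plotting tools that expect column-wise groups.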
Steps and tracks analysis
Generated when --steps-tracks is specified.
Heatmaps (brightness_stepsize/):
- heatmap_all.png, heatmap_<filename>.png
- steps_vs_brightness_all.csv, steps_vs_brightness_<filename>.csv
Track overlays (tracks_stepsize_map/):
- overlay_all.png, overlay_<filename>.png
- tracks_stepsize_combined.pdf
TrackMate cleaner (--clean-trackmate)
Normalizes TrackMate exports to GEMspa column format (x, y, frame, track_id), writing cleaned files in place while preserving original filenames.
Options:
- --clean-out-dir: write outputs to a separate directory
- --clean-include-date: include date codes in output names (YYMMDD or YYYYMMDD)
- --clean-move: move rather than copy when renaming legacy Traj_* files
- --clean-dry-run: preview actions without writing any files
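The column normalization performed by the cleaner can be pictured as a simple rename. The sketch below is an illustration, not the shipped cleaner: the mapping is inferred from the required-column lists in this README, the function name is hypothetical, and it assumes the CSV has a single header row (extra label or unit rows, if an export contains them, would need to be skipped first).

```python
import csv
import io

# Mapping inferred from the required columns listed for each format.
COLUMN_MAP = {
    "POSITION_X": "x",
    "POSITION_Y": "y",
    "FRAME": "frame",
    "TRACK_ID": "track_id",
}


def clean_trackmate(src, dst):
    """Copy rows from a TrackMate CSV stream to a GEMspa-style CSV stream."""
    reader = csv.DictReader(src)
    missing = [col for col in COLUMN_MAP if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing TrackMate columns: {missing}")
    writer = csv.DictWriter(dst, fieldnames=list(COLUMN_MAP.values()),
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep only the four required columns, renamed to GEMspa names.
        writer.writerow({new: row[old] for old, new in COLUMN_MAP.items()})


# Usage with in-memory streams; real files would be opened with newline="".
source = io.StringIO("TRACK_ID,FRAME,POSITION_X,POSITION_Y,QUALITY\n0,1,2.5,3.5,9\n")
target = io.StringIO()
clean_trackmate(source, target)
```

Running the real cleaner with --clean-dry-run first is the safer way to confirm what will be written before any files change.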
Example Commands
# Clean TrackMate CSVs only
gemspa-cli -d /data/TrackMateExports --clean-trackmate
# Standard run with physical units
gemspa-cli -d /data/GEMspa --time-step 0.03 --micron-per-px 0.11 \
--min-track-len 4 --tlag-cutoff 4
# Add step-size KDEs and rainbow overlays
gemspa-cli -d /data/GEMspa --rainbow-tracks --step-size-analysis
# Steps and tracks analysis with default settings
gemspa-cli -d /data/GEMspa --steps-tracks --time-step 0.01 --micron-per-px 0.11
# Heatmaps only with custom binning
gemspa-cli -d /data/GEMspa --steps-tracks --steps-tracks-mode heatmaps \
--stepsize-max 5.0 --bins-x 200 --bins-y 200 --count-cap 500
# Track overlays only with inverted colormap
gemspa-cli -d /data/GEMspa --steps-tracks --steps-tracks-mode tracks \
--line-width 1.0 --invert-lut-tracks --min-track-length 15
# Skip advanced analysis
gemspa-cli -d /data/GEMspa --no-advanced-group
# TrackMate data with custom CSV pattern
gemspa-cli -d /data/TrackMateExports --csv-pattern "*Spots in tracks*.csv" \
--time-step 0.01 --micron-per-px 0.11 --steps-tracks
Symbol Reference
| Symbol | Definition | Units |
|---|---|---|
| tau | Time lag (frame_lag x dt) | s |
| MSD(tau) | Mean-square displacement | µm² |
| D | Diffusion coefficient | µm²/s |
| alpha | Anomalous exponent | — |
| alpha2 | Non-Gaussian parameter | — |
| VACF | Velocity autocorrelation function | — |
| T | Tortuosity | — |
| A_hull | Convex-hull area | µm² |
Output Directory Structure
<work_dir>/
├── <COND>_<REP>/ # per-replicate analysis
│ ├── msd_results.csv
│ ├── msd_vs_tau.png
│ └── ...
├── grouped_raw/ # raw ensemble analysis
│ ├── msd_results.csv
│ └── step_kde/
├── grouped_filtered/ # filtered ensemble analysis
│ ├── msd_results.csv
│ └── step_kde/
├── grouped_advanced_analysis/ # advanced metrics (default: ON)
│ ├── all_conditions_advanced_metrics.csv
│ ├── figures/
│ └── split_metrics_by_condition/
│ ├── D_fit.csv
│ ├── alpha_fit.csv
│ └── ...
├── brightness_stepsize/ # (--steps-tracks)
│ ├── heatmap_all.png
│ ├── heatmap_<filename>.png
│ └── steps_vs_brightness_*.csv
├── tracks_stepsize_map/ # (--steps-tracks)
│ ├── overlay_all.png
│ ├── overlay_<filename>.png
│ └── tracks_stepsize_combined.pdf
└── comparison/ # cross-condition comparisons
└── *.png
Citation
If you use this software, please cite:
Bazley A., Keegan S. et al. GEMspa-CLI (PyPI)
Acknowledgements
Developed by:
- Andrew Bazley and Sarah Keegan — Holt and Fenyo Labs, Institute for Systems Genetics, NYU Langone Health
- David Duran — Holt Lab, Institute for Systems Genetics, NYU Langone Health
Original package: gemspa-spt (PyPI)
Primary reference: Keegan et al., bioRxiv 2023.06.26.546612
© 2025 GEMspa Project · MIT License