A framework for clustering longitudinal data -- Python port of the R package latrend.
latrend (Python)
Python port of the R package latrend for longitudinal trajectory clustering.
This project is an explicit effort to give Python users the core functionality and workflow of the R latrend package, while keeping a familiar Python API.
latrend provides a standardised framework to cluster longitudinal (trajectory) data.
The name is short for latent-class trend analysis. This Python port reproduces the
core pipeline, plotting theme, and API conventions of the upstream R package so that
analyses are interchangeable between languages.
Installation
# Install from PyPI (recommended for users)
pip install latrend
# Install latest from GitHub (before/without PyPI release)
pip install "git+https://github.com/s-rani1/latrend-py.git"
# Editable install (development)
pip install -e ".[dev,plot]"
Quickstart
import latrend as lt
# Built-in demo dataset (mirrors R's data(latrendData))
data = lt.latrendData()
# Or generate synthetic trajectories
data = lt.generateLongData(nIndividuals=200, nClusters=3, seed=1)
# Cluster with Linear-Mixed K-Means
method = lt.lcMethodLMKM(formula="Y ~ Time", nClusters=3, seed=1)
model = lt.latrendCluster(method, data)
# Visualise cluster trajectories (ggplot2-style if plotnine installed)
p = lt.plotClusterTrajectories(model, ci=True)
# Save the plot
try:
    p.save("cluster_trajectories.png", dpi=150)  # plotnine
except AttributeError:
    p.figure.savefig("cluster_trajectories.png", dpi=150)  # matplotlib
Features
Clustering methods
| Method | Class | Description |
|---|---|---|
| Random baseline | lcMethodRandom | Assigns trajectories to clusters uniformly at random |
| KML-style | lcMethodKML | KMeans clustering on trajectory vectors (kml_fast/kml_strict) |
| Linear-mixed K-means | lcMethodLMKM | Per-individual linear regression + KMeans on coefficients |
| Feature-based | lcMethodFeatures | 20+ trajectory features + KMeans |
| R backend (any) | lcMethodR / dynamic lcMethod* | Delegates to the upstream R package via rpy2 |
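The Linear-mixed K-means row describes a two-step idea: fit a straight line to each individual, then cluster the fitted coefficients. The following is a minimal, dependency-free sketch of that idea only. It is not the code used by lcMethodLMKM; the helper names fit_line and kmeans and the toy data are assumptions made for illustration.

```python
import random


def fit_line(times, values):
    """Ordinary least squares for y = intercept + slope * t on one trajectory."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    return (mean_y - slope * mean_t, slope)


def kmeans(points, k, seed=1, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


# Hypothetical toy data: {individual id: [(time, outcome), ...]}
trajectories = {
    "a": [(0, 0.0), (1, 1.1), (2, 1.9)],  # rising
    "b": [(0, 0.2), (1, 1.0), (2, 2.1)],  # rising
    "c": [(0, 5.0), (1, 5.1), (2, 4.9)],  # flat, high level
    "d": [(0, 5.2), (1, 4.9), (2, 5.0)],  # flat, high level
}
ids = sorted(trajectories)
# Step 1: one (intercept, slope) pair per individual
coefs = [fit_line(*zip(*trajectories[i])) for i in ids]
# Step 2: cluster individuals in coefficient space
assignment = dict(zip(ids, kmeans(coefs, k=2)))
```

Clustering coefficients instead of raw observations means individuals can be compared even when they are measured at different times, at the cost of assuming each trajectory is roughly linear.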
Pipeline
# Single model
model = lt.latrendCluster(method, data)
# Batch: sweep over k = 1..6
models = lt.latrendBatchCluster(method, data, nClusters=range(1, 7))
# Repeated runs (different seeds) for stability
models = lt.latrendRepCluster(method, data, nRep=10)
# Model selection
best = models.bestModel(key="silhouette", maximize=True)
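To make the model selection step concrete, here is a small sketch of how a mean silhouette width can be computed and used to pick the best of several candidate labellings. It illustrates the metric in general; it is not the package's implementation, and the function name silhouette and the toy points are assumptions.

```python
from math import dist


def silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)
            continue
        # a: mean distance to the other members of p's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom else 0.0)
    return sum(widths) / len(widths)


# Hypothetical data: two obvious groups on a line
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
candidates = {"good": [0, 0, 1, 1], "bad": [0, 1, 0, 1]}
scores = {name: silhouette(points, labs) for name, labs in candidates.items()}
best_name = max(scores, key=scores.get)  # higher silhouette is better
```

Note that the silhouette is undefined for a single cluster, so a sweep that includes k = 1 needs some convention for that case; how latrend handles it is not stated here.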
KML Parity Mode
Use kml_strict to better match R KML behavior via multi-start selection:
method = lt.lcMethodKML(
    nClusters=4,
    mode="kml_strict",  # or: "kml_fast"
    nStarts=20,
    nInit=100,
    maxIter=500,
    center=True,
    scale=False,
    distance="euclidean",
    seed=265368763,
)
model = lt.latrendCluster(method, data)
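The general idea behind multi-start selection can be sketched as follows: run k-means from several random initialisations and keep the run with the lowest within-cluster sum of squares. This is an assumption about what "multi-start selection" means in general, not a description of what kml_strict does internally; the names kmeans_1d and best_of_n_starts are hypothetical, and the sketch works on scalars to stay short.

```python
import random


def kmeans_1d(values, k, rng, max_iter=100):
    """One Lloyd's run on scalars; returns (inertia, labels)."""
    centers = rng.sample(values, k)
    labels = []
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = sum(members) / len(members)
    inertia = sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))
    return inertia, labels


def best_of_n_starts(values, k, n_starts=20, seed=1):
    """Repeat k-means from different random starts and keep the lowest inertia."""
    rng = random.Random(seed)
    runs = (kmeans_1d(values, k, rng) for _ in range(n_starts))
    return min(runs, key=lambda run: run[0])


values = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
best_inertia, best_labels = best_of_n_starts(values, k=3)
```

A single k-means run can settle in a poor local optimum depending on its starting centers, which is why taking the best of several starts tends to give more reproducible partitions.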
Plotting (R ggplot2-matching theme)
All plots use theme_light() styling and the ggplot2 default discrete colour palette
(#F8766D, #00BA38, #619CFF, ...) so output closely matches the look of the R package's plots.
lt.plotTrajectories(data) # Spaghetti plot
lt.plotTrajectories(model, facet=True) # Faceted by cluster
lt.plotClusterTrajectories(model, ci=True) # Mean + 95% CI ribbon
lt.plotClusterTrajectories(model, trajectories=True) # With individual overlay
lt.plotMetric(models) # Elbow / silhouette plot
lt.plotClassProportions(model) # Cluster size bar chart
lt.plotClassProbabilities(model) # Posterior histograms
Backends: Uses plotnine (ggplot2-like) when installed; falls back to matplotlib otherwise.
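One common way to implement this kind of optional-dependency fallback is to check whether the preferred package is importable before choosing a backend. The sketch below shows that pattern with the standard library only; pick_backend is a hypothetical helper, not a latrend function.

```python
from importlib.util import find_spec


def pick_backend(preferred="plotnine", fallback="matplotlib"):
    """Return the preferred backend name if it is importable, else the fallback."""
    return preferred if find_spec(preferred) is not None else fallback


backend = pick_backend()
```

Because find_spec only locates the package without importing it, the check stays cheap even when the preferred backend is a heavy dependency.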
Reproducing R plot(kmlModel4)
from pathlib import Path
import pandas as pd
import latrend as lt
from plotnine import labs
# Example paths (repo-local)
repo = Path(".")
df = pd.read_csv(repo / "tests" / "data" / "latrend_data.csv").drop(
    columns=["Unnamed: 0"], errors="ignore"
)
assign = pd.read_csv(repo / "tests" / "data" / "kml_model4_assignments.csv")
# Build LCModel from fixed assignments
clusters = assign.set_index("Id")["Cluster"]
method = lt.LCMethod(id="Id", time="Time", outcome="Y", name="KML")
model = lt.LCModel(method=method, data=df[["Id", "Time", "Y"]], clusters=clusters)
# Equivalent of R's plot(kmlModel4): faceted assigned trajectories + black mean line
p = lt.plotClusterTrajectories(
    model,
    trajectories=True,
    backend="plotnine",
    figure_size=(7, 5.8),
    base_size=11,
)
p = p + labs(
    subtitle="Cluster trajectories for KML model with 4 clusters, along with the assigned trajectories."
)
p.save("docs/images/kml_model4_python_generated.png", dpi=150)
Data utilities
lt.latrendData() # Built-in 200-trajectory dataset
lt.generateLongData(...) # Custom synthetic data
lt.tsmatrix(data) # Long -> wide format
lt.tsframe(wide_matrix) # Wide -> long format
lt.trajectories(method, data) # Per-individual trajectory dict
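For readers unfamiliar with the long and wide layouts, the sketch below shows the reshaping on plain Python lists. It only illustrates the two formats; it says nothing about the actual signatures or return types of tsmatrix and tsframe, and the names long_to_wide and wide_to_long are hypothetical.

```python
def long_to_wide(rows):
    """rows: list of (id, time, value) tuples. Returns (ids, times, matrix).

    matrix[r][c] holds the value of individual ids[r] at times[c],
    or None where that individual has no observation.
    """
    ids = sorted({r[0] for r in rows})
    times = sorted({r[1] for r in rows})
    row_of = {i: k for k, i in enumerate(ids)}
    col_of = {t: j for j, t in enumerate(times)}
    matrix = [[None] * len(times) for _ in ids]
    for i, t, y in rows:
        matrix[row_of[i]][col_of[t]] = y
    return ids, times, matrix


def wide_to_long(ids, times, matrix):
    """Inverse of long_to_wide; missing cells (None) are dropped."""
    return [
        (i, t, y)
        for i, row in zip(ids, matrix)
        for t, y in zip(times, row)
        if y is not None
    ]


long_rows = [("a", 0, 1.0), ("a", 1, 2.0), ("b", 0, 3.0)]
ids, times, wide = long_to_wide(long_rows)
round_trip = wide_to_long(ids, times, wide)
```

With pandas, DataFrame.pivot and DataFrame.melt perform the same kind of reshaping on data frames.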
Reporting
lt.lcModelReport(model, "output/") # Markdown report + PNG plots
Optional R backend
If you have R + the R package latrend installed, any missing lcMethod* constructor
is automatically delegated to R via rpy2:
pip install -e ".[r]"
method = lt.lcMethodLcmmGMM(formula="Y ~ Time", nClusters=3)
model = lt.latrendCluster(method, data) # runs in R
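One way such automatic delegation can be built in Python is a module-level `__getattr__` (PEP 562), which is called only when normal attribute lookup fails. The sketch below demonstrates the mechanism on a throwaway module object. It is an assumption about a possible design, not the package's actual code, and it returns a plain dict where a real implementation would call into R via rpy2.

```python
import types


def make_module_with_r_fallback(name="latrend_sketch"):
    """Build a demo module whose unknown lcMethod* names resolve to R-backed stubs."""
    module = types.ModuleType(name)

    def lcMethodRandom(**kwargs):
        # Stands in for a method implemented natively in Python
        return {"backend": "python", "method": "lcMethodRandom", "args": kwargs}

    def __getattr__(attr):
        # Reached only when `attr` is not found in the module's namespace
        if attr.startswith("lcMethod"):
            def r_constructor(**kwargs):
                # Placeholder for a call into the R package
                return {"backend": "R", "method": attr, "args": kwargs}
            return r_constructor
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    module.lcMethodRandom = lcMethodRandom
    module.__getattr__ = __getattr__
    return module


lt_sketch = make_module_with_r_fallback()
native = lt_sketch.lcMethodRandom(nClusters=2)
delegated = lt_sketch.lcMethodLcmmGMM(formula="Y ~ Time", nClusters=3)
```

Names that do not start with lcMethod still raise AttributeError, so typos in other function names are not silently swallowed.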
Project structure
latrend_py/
  src/latrend/
    __init__.py        # Public API
    core/              # LCMethod, LCModel, pipeline, matrix converters
    data/              # Data generation + built-in latrendData
    methods/           # lcMethodRandom, lcMethodKML, lcMethodLMKM, lcMethodFeatures, lcMethodR
    metrics/           # Silhouette score
    plots/             # All plotting functions + theme
    backends/          # rpy2-based R integration
    report.py          # Markdown report generator
  tests/
  .github/workflows/   # CI (Python 3.9-3.12)
Running tests
pytest -q
Contributing
See CONTRIBUTING.md for development setup and guidelines.
Citation
If you use latrend (Python) in academic work, please cite this repository.
Citation metadata is provided in CITATION.cff (GitHub will expose this via "Cite this repository").
License
GPL-2.0-or-later (aligned with the upstream R package).
Acknowledgements
This package is a Python port of the latrend R package by Niek Den Teuling (Philips Research).
File details
Details for the file latrend-0.1.1.tar.gz.
File metadata
- Download URL: latrend-0.1.1.tar.gz
- Upload date:
- Size: 37.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 675422dbb7a043f114fdb1e8d6c2b64204068945921313bd9b29216e44ec42e3 |
| MD5 | c0d0b0215657959fbbfd17aa3c81d4e9 |
| BLAKE2b-256 | 46f822a025edbdf2c6bb1f7e58dcaab8ff79a1ccd140723d3dbe9240512a1c6f |
File details
Details for the file latrend-0.1.1-py3-none-any.whl.
File metadata
- Download URL: latrend-0.1.1-py3-none-any.whl
- Upload date:
- Size: 33.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c0eafbdeaaf292f0a9fa3a8ec086e8881c6feb0ef90093bf3a641d90e99fa25 |
| MD5 | b089047d46891f46f2630cc9fb38ccf3 |
| BLAKE2b-256 | 509584187e2613545098fc24a84d9a209aa67d61ae07af988591e444a8b14d8f |