A package for unsupervised representation and principal component analysis of irregularly sampled time series of variable length, based on the shape analysis literature.

PCA for time series

Authors: Samuel Gruffaz, Thibaut Germain

This repository gathers the functions developed for the paper “Shape Analysis for Time Series”. They are located in the TS_PCA directory.

The package represents irregularly sampled time series of different lengths and applies kernel PCA to these representations in order to identify the main modes of shape variation in the time series.

TS-LDDMM Scheme

Time series graphs $(\mathsf{G}_i)_{i\in[5]}$ are represented as deformations of a reference time series graph $\mathsf{G}_0$ by transformations $(\chi_{\alpha_i})_{i\in[5]}$ parameterized by $(\alpha_i)_{i\in[5]}$.

These methods work particularly well when the analyzed dataset is homogeneous in terms of shapes, for example when each time series corresponds to:

  • a heartbeat recording,
  • a respiratory cycle,
  • an electricity consumption pattern,
  • or a heating load curve.

Dataset format

The main requirement is to represent the time series dataset as a collection of time series graphs. Each time series graph should be an array T of shape (n_samples, d+1), where T[:, 0] contains the time points, and T[:, 1:] contains the time series values of dimension d.

The full dataset should be an array of fixed shape (n_time_series, n_samples_max, d+1) along with a corresponding mask of shape (n_time_series, n_samples_max, 1), where n_samples_max is the maximum number of samples among all time series. This accommodates the fact that each time series may have a different number of samples.
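The padding step described above can be sketched with NumPy as follows. This is only an illustration: the helper name `pad_time_series`, the zero padding value, and the boolean mask convention (True for real samples) are assumptions, and the package may expect a different padding value or mask dtype.

```python
import numpy as np

def pad_time_series(graphs):
    """Stack variable-length time series graphs into a padded array and a mask.

    `graphs` is a list of arrays of shape (n_samples_i, d+1), time in column 0.
    Hypothetical helper, not part of the package.
    """
    n_series = len(graphs)
    n_samples_max = max(g.shape[0] for g in graphs)
    n_cols = graphs[0].shape[1]

    # Assumption: padded entries are filled with zeros and flagged False in the mask.
    dataset = np.zeros((n_series, n_samples_max, n_cols))
    mask = np.zeros((n_series, n_samples_max, 1), dtype=bool)
    for i, g in enumerate(graphs):
        n = g.shape[0]
        dataset[i, :n] = g
        mask[i, :n] = True
    return dataset, mask

# Two irregularly sampled one-dimensional series of different lengths
t_a = np.array([0.0, 1.0, 2.5, 3.0])
t_b = np.array([0.0, 0.8, 2.1])
graph_a = np.column_stack([t_a, np.sin(t_a)])
graph_b = np.column_stack([t_b, np.cos(t_b)])

dataset, dataset_mask = pad_time_series([graph_a, graph_b])
print(dataset.shape, dataset_mask.shape)  # (2, 4, 2) (2, 4, 1)
```

Here the second series has only three samples, so its fourth row is padding and the corresponding mask entry is False.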

Default parameters work well when the distance between two consecutive time points is approximately 1.
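If the raw time stamps are in other units (seconds, milliseconds, and so on), one way to get close to this regime is to rescale the time column so that the typical step is 1. The sketch below is an assumption about how this could be done, not a function of the package: the helper `rescale_time`, the use of the median step, and the shift of the first time point to 0 are all illustrative choices.

```python
import numpy as np

def rescale_time(graph, scale=None):
    """Return a copy of a (n_samples, d+1) graph with a rescaled time column.

    Hypothetical helper. If `scale` is None, the median time step of the
    graph itself is used, so the rescaled median step equals 1.
    """
    if scale is None:
        scale = np.median(np.diff(graph[:, 0]))
    rescaled = graph.copy()
    # Assumption: time is shifted to start at 0 before being divided by the scale.
    rescaled[:, 0] = (graph[:, 0] - graph[0, 0]) / scale
    return rescaled

# A signal sampled roughly every 0.01 time units, with one irregular gap
t = np.array([0.00, 0.01, 0.02, 0.035, 0.045])
graph = np.column_stack([t, np.sin(2 * np.pi * 10 * t)])

rescaled = rescale_time(graph)
print(np.diff(rescaled[:, 0]))
```

For a whole dataset it is probably preferable to pass one shared `scale` to every series, so that relative durations between time series are preserved.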

TS-PCA: Basic Usage Example

This example demonstrates the basic workflow of using the TS-PCA package to analyze time-series data using TS-LDDMM representations and Kernel PCA.

# TS_PCA_ and generate_easy_dataset must be imported from the package beforehand.
# Import or generate a toy dataset
N = 8
dataset, dataset_mask, graph_ref, graph_ref_mask = generate_easy_dataset(N=N)

# dataset is an array of shape (8, 200, 2) and dataset_mask an array of shape (8, 200, 1)

# Initialize the TS-PCA class
class_test = TS_PCA_()

# Step 1: Fit TS-LDDMM representations
# This learns the temporal-shape embeddings of the dataset.
# Set learning_graph_ref=True to learn the reference graph; here we keep it fixed.
class_test.fit_TS_LDDMM_representations(
    dataset,
    dataset_mask,
    learning_graph_ref=False,
    graph_ref=graph_ref,
    graph_ref_mask=graph_ref_mask
)

# Step 2: Fit Kernel PCA on the learned representations
class_test.fit_kernel_PCA()

# Step 3: Visualize the principal components
class_test.plot_components()

Example of principal component deformation

After applying Kernel PCA to the TS-LDDMM features $(\alpha_j)_{j \in [N]}$ extracted from a dataset of mouse respiratory cycles under drug exposure, we visualize the deformations $\chi_\alpha \cdot \mathsf{G}_0$ of the reference time series graph $\mathsf{G}_0$ as $\alpha$ varies along the principal component $PC_0$. Notably, $\alpha = -1.5\,\sigma \times PC_0$ captures the deformation accounting for the effect of the drug on the respiratory cycle.
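To make the expression $\alpha = t\,\sigma \times PC_0$ concrete, here is a minimal sketch of how feature vectors along a principal component can be generated. It uses ordinary linear PCA computed with NumPy's SVD as a stand-in for the package's kernel PCA, and random placeholder features instead of real TS-LDDMM features. The variable names are illustrative, and $\sigma$ is assumed to be the standard deviation of the scores along $PC_0$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder features: one row per time series (not real TS-LDDMM features)
alphas = rng.normal(size=(50, 6))

# Center the features and take the SVD: rows of vt are the principal directions
centered = alphas - alphas.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
pc0 = vt[0]

# Assumption: sigma is the standard deviation of the scores along pc0
sigma = singular_values[0] / np.sqrt(len(alphas) - 1)

# Feature vectors at -1.5, 0 and +1.5 standard deviations along pc0,
# following the formula in the text (no mean offset is added)
steps = np.array([-1.5, 0.0, 1.5])
alphas_along_pc0 = steps[:, None] * sigma * pc0
print(alphas_along_pc0.shape)  # (3, 6)
```

Each row of `alphas_along_pc0` would then parameterize one deformation $\chi_\alpha$ applied to the reference graph $\mathsf{G}_0$. The middle row is the zero vector, which in this sketch is taken to leave $\mathsf{G}_0$ undeformed.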

The Docs directory contains the files used to build the package documentation.

The pages directory contains the pages used to launch a Streamlit application from the menu, allowing users to test the different building blocks of the code.

Coming next:

  • Complete documentation
  • New kernels
