Package for selecting the relevant and non-redundant channels for multivariate time series classification.

TSelect

Installation

Option 1: Pip install

TSelect can be installed with pip.

pip install tselect

Option 2: Clone repository

Alternatively, the repository can be cloned with:

git clone https://github.com/ML-KULeuven/TSelect.git

Then install the requirements:

pip install -r requirements.txt

Known issues

On Windows, the installation of the pycatch22 package can fail. Installing the package with the following command usually fixes this.

pip install pycatch22==0.4.2 --use-deprecated=legacy-resolver

Quick start

TSelect is a package for selecting relevant and non-redundant channels from multivariate time series data (n instances, t timepoints, d channels). It accepts the following data formats as input:

  • MultiIndex Pandas DataFrame (with index levels: (n, t) and d columns)
  • 3D NumPy array (with shape: (n, d, t))
  • Dictionary of TSFuse Collection objects (see https://github.com/arnedb/tsfuse for more information)
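
As a rough illustration of the first two formats, the sketch below builds a small random 3D NumPy array of shape (n, d, t) and converts it to a MultiIndex DataFrame with index levels (n, t) and d columns. The level names `instance` and `timepoint` and the column names `channel_0`, `channel_1`, ... are assumptions made for this example, not names that TSelect is known to require.

```python
import numpy as np
import pandas as pd

n, d, t = 4, 3, 5  # instances, channels, timepoints
rng = np.random.default_rng(0)
x_numpy = rng.normal(size=(n, d, t))  # 3D array with shape (n, d, t)

# Move the channel axis last, then flatten (n, t) into the rows.
values = x_numpy.transpose(0, 2, 1).reshape(n * t, d)

# Hypothetical level and column names, chosen only for this example.
index = pd.MultiIndex.from_product(
    [range(n), range(t)], names=["instance", "timepoint"]
)
columns = [f"channel_{c}" for c in range(d)]

x_frame = pd.DataFrame(values, index=index, columns=columns)
```

Each row of `x_frame` then holds the d channel values of one instance at one timepoint.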

The general set-up is as follows:

from tselect.channel_selectors.tselect import TSelect

# Load your data, split in train and test set, etc.
x_train, x_test = ... 
y_train, y_test = ...

channel_selector = TSelect(irrelevant_percentage_to_keep=0.6,
                           redundant_correlation_threshold=0.7)
channel_selector.fit(x_train, y_train)
x_train_selected = channel_selector.transform(x_train)
x_test_selected = channel_selector.transform(x_test)

clf = ...  # Placeholder: any classifier for multivariate time series classification
clf.fit(x_train_selected, y_train)
y_pred = clf.predict(x_test_selected)

Hyperparameters

TSelect has several hyperparameters that can be adapted to the specific dataset and use case.

The hyperparameters to configure the irrelevant channel selector:

  • irrelevant_selector: bool, default=True
    • Whether to use the irrelevant channel selector.
  • irrelevant_percentage_to_keep: float, default=0.6
    • The fraction of channels that is expected to be relevant. After the irrelevant channel selector step, TSelect keeps this fraction of the channels.
    • A value between 0 and 1, where 1 means all channels are kept.
  • irrelevant_hard_threshold: float, default=0.5
    • All channels with an evaluation metric (e.g. ROC AUC) below this threshold are considered worse than random and are removed, unless this would remove all channels.
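
To make the interaction between these two settings concrete, here is a minimal sketch of one way they could combine. It is not TSelect's actual implementation: the function name `select_relevant`, the order of the two steps, and rounding the number of kept channels up are all assumptions.

```python
import math

def select_relevant(scores, percentage_to_keep=0.6, hard_threshold=0.5):
    """Return the channel names kept under the assumed two-step rule."""
    # Step 1 (assumed): drop channels scoring below the hard threshold,
    # unless that would remove every channel.
    above = {c: s for c, s in scores.items() if s >= hard_threshold}
    candidates = above if above else dict(scores)

    # Step 2 (assumed): keep at most the given fraction of all channels,
    # ranked from best to worst score. Rounding up is an assumption.
    nb_to_keep = max(1, math.ceil(percentage_to_keep * len(scores)))
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return ranked[:nb_to_keep]

# Hypothetical per-channel scores (e.g. ROC AUC on a validation set).
scores = {"a": 0.91, "b": 0.48, "c": 0.75, "d": 0.62}
kept = select_relevant(scores, percentage_to_keep=0.5)
```

With these made-up scores, channel `b` falls below the hard threshold, and half of the four channels are kept from the remainder.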

The hyperparameters to configure the redundant channel selector:

  • redundant_selector: bool, default=True
    • Whether to use the redundant channel selector.
  • redundant_correlation_threshold: float, default=0.7
    • The correlation threshold to use for the redundant channel selector step. Channels that make predictions with a correlation higher than this threshold are considered redundant.
    • A value between 0 and 1, where 1 means that the predictions have to be identical.
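
The idea of redundancy based on correlated predictions can be sketched as follows. This is an illustrative greedy procedure, not the method TSelect is confirmed to use: the function name `drop_redundant`, the use of Pearson correlation, and the best-first ordering are assumptions.

```python
import numpy as np

def drop_redundant(predictions, scores, threshold=0.7):
    """Greedy sketch: visit channels from best to worst score and keep a
    channel only if its predictions are not highly correlated with those
    of a channel that was already kept."""
    kept = []
    for channel in sorted(scores, key=scores.get, reverse=True):
        correlated = any(
            # Pearson correlation between two prediction vectors (assumed measure).
            np.corrcoef(predictions[channel], predictions[other])[0, 1] > threshold
            for other in kept
        )
        if not correlated:
            kept.append(channel)
    return kept

# Made-up per-channel predictions for 200 validation instances.
rng = np.random.default_rng(0)
base = rng.random(200)
predictions = {
    "a": base,
    "b": base + 0.01 * rng.random(200),  # near-duplicate of "a"
    "c": rng.random(200),                # generated independently of "a"
}
scores = {"a": 0.9, "b": 0.8, "c": 0.7}
kept = drop_redundant(predictions, scores)
```

In this toy setting, channel `b` is discarded because its predictions almost coincide with those of the better-scoring channel `a`.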

Other hyperparameters:

  • validation_size: float, default=None
    • The size of the validation set used to compute the evaluation metric. If None, the validation size is set to max(100, 0.25*nb_instances). The training set then consists of the remaining instances.
  • random_state: int, default=0
    • The random state to use for reproducibility.
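
The default rule for `validation_size` can be written out as a small helper. The function name `default_validation_size` is hypothetical, and truncating the result to an integer is an assumption; only the formula max(100, 0.25*nb_instances) comes from the description above.

```python
def default_validation_size(nb_instances):
    """Number of validation instances when validation_size is None,
    following the documented rule max(100, 0.25 * nb_instances)."""
    # Truncation to an integer is an assumption of this sketch.
    return int(max(100, 0.25 * nb_instances))

size_large = default_validation_size(1000)  # 0.25 * 1000 = 250 exceeds 100
size_small = default_validation_size(200)   # 0.25 * 200 = 50, so 100 is used
```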
