Package for selecting the relevant and non-redundant channels for multivariate time series classification.
TSelect
Installation
Option 1: Pip install
TSelect can be installed with pip.
pip install tselect
Option 2: Clone repository
Alternatively, the repository can be cloned with:
git clone https://github.com/ML-KULeuven/TSelect.git
Afterward, the requirements should be installed:
pip install -r requirements.txt
Known issues
On Windows, the installation of the pycatch22 package can fail. Installing the package with the following command usually fixes this.
pip install pycatch22==0.4.2 --use-deprecated=legacy-resolver
Quick start
TSelect is a package for selecting relevant and non-redundant channels from multivariate time series data (n instances, t timepoints, d channels). It accepts the following data formats as input:
- MultiIndex Pandas DataFrame (with index levels: (n, t) and d columns)
- 3D NumPy array (with shape: (n, d, t))
- Dictionary of TSFuse Collection objects (see https://github.com/arnedb/tsfuse for more information)
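As an illustration of the first two formats, the sketch below builds a random 3D NumPy array of shape (n, d, t) and reshapes it into a MultiIndex DataFrame with one row per (instance, timepoint) pair and one column per channel. The index level names and column names are hypothetical and chosen only for this example; the README does not state which names TSelect expects.

```python
import numpy as np
import pandas as pd

n, d, t = 4, 3, 5  # instances, channels, timepoints

# Format 2: 3D NumPy array with shape (n, d, t)
rng = np.random.default_rng(0)
x_array = rng.normal(size=(n, d, t))

# Format 1: MultiIndex DataFrame with index levels (n, t) and d columns.
# The level names and column names below are assumptions for illustration.
index = pd.MultiIndex.from_product(
    [range(n), range(t)], names=["instance", "time"]
)
x_frame = pd.DataFrame(
    # Move channels to the last axis, then flatten (n, t) into rows
    x_array.transpose(0, 2, 1).reshape(n * t, d),
    index=index,
    columns=[f"channel_{i}" for i in range(d)],
)

print(x_array.shape)  # (4, 3, 5)
print(x_frame.shape)  # (20, 3)
```

Both objects hold the same values: `x_frame.loc[(i, j), "channel_k"]` equals `x_array[i, k, j]`.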
The general set-up is as follows:
from tselect.channel_selectors.tselect import TSelect
# Load your data, split in train and test set, etc.
x_train, x_test = ...
y_train, y_test = ...
channel_selector = TSelect(irrelevant_percentage_to_keep=0.6,
                           redundant_correlation_threshold=0.7)
channel_selector.fit(x_train, y_train)
x_train_selected = channel_selector.transform(x_train)
x_test_selected = channel_selector.transform(x_test)
clf = ...  # Placeholder: replace with any classifier for multivariate time series classification (MTSC)
clf.fit(x_train_selected, y_train)
y_pred = clf.predict(x_test_selected)
Hyperparameters
TSelect has several hyperparameters that can be adapted to the specific dataset and use case.
The hyperparameters to configure the irrelevant channel selector:
irrelevant_selector: bool, default=True - Whether to use the irrelevant channel selector.
irrelevant_percentage_to_keep: float, default=0.6 - The fraction of channels that are expected to be relevant. TSelect keeps this fraction of the channels after the irrelevant channel selector step.
- A value between 0 and 1, where 1 means all channels are kept.
irrelevant_hard_threshold: float, default=0.5 - All channels with an evaluation metric (e.g. ROC AUC) below this threshold are considered worse than random and are removed, unless this would remove all channels.
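To make the interplay of these two hyperparameters concrete, here is a minimal sketch that applies them to a dictionary of per-channel scores. It is not TSelect's implementation: the function name `filter_irrelevant`, the rounding up of the number of channels to keep, and the order in which the two rules are applied are all assumptions made for illustration.

```python
import math

def filter_irrelevant(scores, percentage_to_keep=0.6, hard_threshold=0.5):
    """Illustrative only: return the kept channel names, best score first."""
    # Rank channels from highest to lowest evaluation metric
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep the top fraction of channels (rounding up is an assumption)
    n_keep = max(1, math.ceil(percentage_to_keep * len(ranked)))
    kept = ranked[:n_keep]
    # Remove channels below the hard threshold,
    # unless that would remove every channel
    above = [c for c in kept if scores[c] >= hard_threshold]
    return above if above else kept

# Hypothetical per-channel scores (e.g. ROC AUC on a validation set)
scores = {"ch0": 0.91, "ch1": 0.48, "ch2": 0.75, "ch3": 0.66, "ch4": 0.52}
print(filter_irrelevant(scores))  # ['ch0', 'ch2', 'ch3']
```

With five channels and the default fraction of 0.6, three channels remain; `ch1` would have been dropped by the hard threshold in any case because its score is below 0.5.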
The hyperparameters to configure the redundant channel selector:
redundant_selector: bool, default=True - Whether to use the redundant channel selector.
redundant_correlation_threshold: float, default=0.7 - The correlation threshold to use for the redundant channel selector step. Channels whose predictions have a correlation higher than this threshold are considered redundant.
- A value between 0 and 1, where 1 means that the predictions have to be identical.
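The effect of the correlation threshold can be sketched with a simple greedy filter over per-channel predictions. This is an illustration under stated assumptions, not TSelect's actual procedure: the function name `drop_redundant`, the use of Pearson correlation, and the greedy best-first strategy are all choices made for this example, and the package may group redundant channels differently.

```python
import numpy as np

def drop_redundant(predictions, scores, threshold=0.7):
    """Illustrative only: greedily keep channels whose predictions are not
    too correlated with those of any better channel already kept."""
    ranked = sorted(predictions, key=scores.get, reverse=True)
    kept = []
    for channel in ranked:
        # Pearson correlation with every channel kept so far (an assumption)
        corrs = [
            np.corrcoef(predictions[channel], predictions[k])[0, 1]
            for k in kept
        ]
        if all(c <= threshold for c in corrs):
            kept.append(channel)
    return kept

# Hypothetical per-channel predicted probabilities on a validation set
base = np.linspace(0.05, 0.95, 10)
predictions = {
    "ch0": base,
    "ch1": 0.9 * base + 0.05,        # linear function of ch0: correlation ~1
    "ch2": np.tile([0.2, 0.8], 5),   # alternating pattern: weakly correlated
}
scores = {"ch0": 0.90, "ch1": 0.85, "ch2": 0.70}
print(drop_redundant(predictions, scores))  # ['ch0', 'ch2']
```

Here `ch1` is treated as redundant with `ch0` because their predictions are almost perfectly correlated, while `ch2` stays because its correlation with `ch0` is well below 0.7.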
Other hyperparameters:
validation_size: float, default=None - The size of the validation set used to compute the evaluation metric. If None, the validation size is derived from max(100, 0.25*nb_instances). The train set then includes the remaining instances.
random_state: int, default=0 - The random state to use for reproducibility.
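As a worked example of the default validation size, the snippet below evaluates the formula max(100, 0.25*nb_instances) for a few dataset sizes. The helper name `default_validation_size` is hypothetical, and how the package rounds the result or handles datasets with fewer than 100 instances is not stated here, so the sketch only evaluates the formula as written.

```python
def default_validation_size(nb_instances):
    """Evaluate the documented formula; rounding is not specified."""
    return max(100, 0.25 * nb_instances)

for nb_instances in (200, 400, 2000):
    size = default_validation_size(nb_instances)
    print(nb_instances, size, nb_instances - size)
```

For 200 instances the floor of 100 applies, leaving 100 instances for training; for 2000 instances the 25% rule gives 500 validation instances and 1500 for training.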