Toolkit for flexible processing & feature extraction on time-series data
tsflex is a toolkit for flexible time series processing & feature extraction, making few assumptions about input data.
Installation
If you are using pip, just execute the following command:
pip install tsflex
Or, if you are using conda, then execute this command:
conda install -c conda-forge tsflex
Usage
tsflex is built to be intuitive, so we encourage you to copy-paste this code and toy with some parameters!
Feature extraction
import numpy as np
import pandas as pd
import scipy.stats as ss
from tsflex.features import MultipleFeatureDescriptors, FeatureCollection

# 1. -------- Get your time-indexed data --------
url = "https://github.com/predict-idlab/tsflex/raw/main/examples/data/empatica/"
# Contains 1 column; ["TMP"] - 4 Hz sampling rate
data_tmp = pd.read_parquet(url + "tmp.parquet").set_index("timestamp")
# Contains 3 columns; ["ACC_x", "ACC_y", "ACC_z"] - 32 Hz sampling rate
data_acc = pd.read_parquet(url + "acc.parquet").set_index("timestamp")

# 2. -------- Construct your feature collection --------
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=[np.min, np.max, np.mean, np.std, np.median, ss.skew, ss.kurtosis],
        series_names=["TMP", "ACC_x", "ACC_y"],  # Use 3 multimodal signals
        windows=["5min", "7.5min"],  # Use 5 minutes and 7.5 minutes
        strides="2.5min",  # With steps of 2.5 minutes
    )
)

# 3. -------- Calculate features --------
fc.calculate(data=[data_tmp, data_acc])
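To make the windows and strides arguments more concrete, here is a minimal, standard-library-only sketch of strided-window feature extraction. It is not tsflex's implementation: the helper strided_window_features, the toy data, the choice to label each window by its start time, and the choice to drop windows that extend past the last sample are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean


def strided_window_features(samples, window, stride, func):
    """Apply `func` to each window of length `window`, stepping by `stride`.

    samples: list of (timestamp, value) tuples, sorted by timestamp.
    Returns a list of (window_start, feature_value) tuples.
    Hypothetical helper for illustration only; not part of tsflex.
    """
    if not samples:
        return []
    results = []
    start = samples[0][0]
    last = samples[-1][0]
    # Assumption: keep only windows whose end does not exceed the last sample.
    while start + window <= last:
        end = start + window
        # Half-open interval [start, end): each sample belongs to a window
        # if its timestamp falls inside that interval.
        values = [v for t, v in samples if start <= t < end]
        if values:
            results.append((start, func(values)))
        start += stride
    return results


# Toy signal: one sample per minute for 10 minutes, value equals the minute.
t0 = datetime(2021, 1, 1)
samples = [(t0 + timedelta(minutes=i), float(i)) for i in range(10)]

features = strided_window_features(
    samples,
    window=timedelta(minutes=4),
    stride=timedelta(minutes=2),
    func=mean,
)
for window_start, value in features:
    print(window_start.time(), value)  # features: 1.5, 3.5, 5.5
```

Because the stride is smaller than the window, consecutive windows overlap, which is also the case in the example above (5 and 7.5 minute windows with a 2.5 minute stride).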
More examples
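For processing, see the project documentation.
Other examples can be found in the examples directory of the repository (https://github.com/predict-idlab/tsflex).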
Why tsflex? ✨
- flexible:
  - handles multivariate/multimodal time series
  - versatile function support
    => integrates natively with many packages for processing (e.g., scipy.signal, statsmodels.tsa) & feature extraction (e.g., numpy, scipy.stats, seglearn¹, tsfresh¹, tsfel¹)
  - feature extraction handles multiple strides & window sizes
- efficient view-based operations
  => extremely low memory peak & fast execution times (see benchmarks)
- maintains the time-index of the data
- makes little to no assumptions about the time series data
¹ These integrations are shown in integration-example notebooks.
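As a rough illustration of why view-based windowing keeps memory usage low, the sketch below uses plain NumPy. It is a generic example and does not reproduce tsflex's internal code; the toy signal and the window and stride sizes (expressed in samples) are arbitrary assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

signal = np.arange(12, dtype=float)  # stand-in for one sensor channel
window, stride = 4, 2                # sizes in samples, chosen arbitrarily

# Every length-4 window, then every 2nd one. Both steps return views,
# so no sample of `signal` is copied.
windows = sliding_window_view(signal, window)[::stride]

print(windows.shape)                      # (5, 4)
print(np.shares_memory(windows, signal))  # True: same underlying buffer

# A feature function is then applied per window (one row per window).
window_means = windows.mean(axis=1)
print(window_means)
```

Only the per-window results (here, five means) are newly allocated; the windows themselves reuse the memory of the original array.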
Future work 🔨
- scikit-learn integration for both processing and feature extraction
  note: this is actively being developed on the sklearn integration branch.
- support for time series segmentation (exposing the under-the-hood strided-rolling functionality)
  note: see more here.
- support for multi-indexed dataframes
Contributing 👪
We are thrilled to see your contributions to further enhance tsflex.
See this guide for more instructions on how to contribute.
Referencing our package
If you use tsflex in a scientific publication, we would highly appreciate it if you cite us as:
@article{vanderdonckt2021tsflex,
    author = {Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
    title = {tsflex: flexible time series processing \& feature extraction},
    journal = {SoftwareX},
    year = {2021},
    url = {https://github.com/predict-idlab/tsflex},
    publisher = {Elsevier}
}
👤 Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost