Toolkit for flexible processing & feature extraction on time-series data
tsflex is a toolkit for flexible time series processing & feature extraction that is efficient and makes few assumptions about sequence data.
Installation
package manager | command |
---|---|
pip | pip install tsflex |
conda | conda install -c conda-forge tsflex |
Usage
tsflex is built to be intuitive, so we encourage you to copy-paste this code and toy with some parameters!
Feature extraction
import numpy as np
import pandas as pd
import scipy.stats as ss
from tsflex.features import MultipleFeatureDescriptors, FeatureCollection
from tsflex.utils.data import load_empatica_data

# 1. Load sequence-indexed data (in this case a time-index)
df_tmp, df_acc, df_ibi = load_empatica_data(['tmp', 'acc', 'ibi'])

# 2. Construct your feature extraction configuration
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=[np.min, np.mean, np.std, ss.skew, ss.kurtosis],
        series_names=["TMP", "ACC_x", "ACC_y", "IBI"],
        windows=["15min", "30min"],
        strides="15min",
    )
)

# 3. Extract features
fc.calculate(data=[df_tmp, df_acc, df_ibi], approve_sparsity=True)
Note that the feature extraction is performed on multivariate data with varying sample rates.
signal | columns | sample rate |
---|---|---|
df_tmp | ["TMP"] | 4Hz |
df_acc | ["ACC_x", "ACC_y", "ACC_z"] | 32Hz |
df_ibi | ["IBI"] | irregularly sampled |
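To make the idea of one window/stride configuration over series with different sample rates more concrete, here is a minimal NumPy-only sketch. It is not tsflex's implementation: the helper `strided_window_feature`, its parameters, and the data are all hypothetical and made up for illustration. It only shows that when window boundaries are located by time rather than by sample count, a regular 4 Hz series and an irregularly sampled series produce one feature value per shared window.

```python
import numpy as np


def strided_window_feature(times, values, func, window, stride, start, end):
    """Apply `func` to the samples in each window [s, s + window).

    Hypothetical helper for illustration only; it is not part of tsflex.
    `times` must be sorted and is expressed in seconds; it may be irregular.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Window start positions, spaced `stride` apart, that fit fully in [start, end].
    starts = np.arange(start, end - window + 1e-9, stride)
    # Locate window boundaries by time, so the sample rate does not matter.
    lo = np.searchsorted(times, starts, side="left")
    hi = np.searchsorted(times, starts + window, side="left")
    feats = np.array(
        [func(values[a:b]) if b > a else np.nan for a, b in zip(lo, hi)]
    )
    return starts, feats


# Made-up data: a regular 4 Hz ramp and an irregularly sampled series.
tmp_times = np.arange(0, 60, 0.25)
tmp_values = tmp_times.copy()
ibi_times = np.array([0.8, 1.7, 2.5, 11.0, 12.2, 31.0, 45.5])
ibi_values = np.array([0.8, 0.9, 0.8, 1.0, 1.2, 0.9, 1.1])

# Same window (20 s) and stride (10 s) for both series.
starts, tmp_mean = strided_window_feature(tmp_times, tmp_values, np.mean, 20, 10, 0, 60)
_, ibi_count = strided_window_feature(ibi_times, ibi_values, len, 20, 10, 0, 60)

print(starts)     # window start times, shared by both series
print(tmp_mean)   # mean of the 4 Hz series per window
print(ibi_count)  # number of irregular samples per window
```

Because the window is longer than the stride, consecutive windows overlap, which is why a single irregular sample can be counted in two windows.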
Processing
Why tsflex? ✨
Flexible:
- handles multivariate/multimodal time series
- versatile function support => integrates with many packages for:
  - processing (e.g., scipy.signal, statsmodels.tsa)
  - feature extraction (e.g., numpy, scipy.stats, antropy, nolds, seglearn¹, tsfresh¹, tsfel¹)
- feature extraction handles multiple strides & window sizes
Efficient:
- view-based operations for processing & feature extraction => extremely low memory peak & fast execution time
Intuitive:
- maintains the sequence-index of the data
- feature extraction constructs interpretable output column names
- intuitive API
Few assumptions about the sequence data:
- no assumptions about sampling rate
- able to deal with multivariate asynchronous data, i.e. data with small time-offsets between the modalities
Advanced functionalities:
- apply FeatureCollection.reduce after feature selection for faster inference
- use function execution time logging to discover processing and feature extraction bottlenecks
- embedded SeriesPipeline & FeatureCollection serialization
- time series chunking
¹ These integrations are shown in integration-example notebooks.
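As a rough illustration of why view-based operations keep the memory peak low, the sketch below builds strided windows over a NumPy array without copying the underlying data. It uses NumPy's `sliding_window_view` and is only an assumption about the general technique; tsflex's own strided-rolling internals may work differently. The signal, window size, and stride are made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Made-up signal of 12 samples.
signal = np.arange(12, dtype=float)
window, stride = 4, 2

# All length-4 windows as a read-only view, then keep every 2nd window.
windows = sliding_window_view(signal, window)[::stride]

# The windows reuse the memory of `signal`; no per-window copy is made.
shares = np.shares_memory(windows, signal)

# A feature (here the mean) is computed per window, directly on the view.
means = windows.mean(axis=1)

print(windows.shape)  # (number of windows, window size)
print(shares)
print(means)
```

Only the feature output (one value per window) is newly allocated; the windowed input itself costs no extra memory, however much the windows overlap.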
Future work 🔨
- scikit-learn integration for both processing and feature extraction
  note: this is actively being developed on the sklearn integration branch.
- Support time series segmentation (exposing the under-the-hood strided-rolling functionality); see this issue
- Support for multi-indexed dataframes
=> Also see the enhancement issues
Contributing 👪
We are thrilled to see your contributions to further enhance tsflex.
See this guide for more instructions on how to contribute.
Referencing our package
If you use tsflex in a scientific publication, we would highly appreciate it if you cited us as:
@article{vanderdonckt2021tsflex,
    author = {Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
    title = {tsflex: flexible time series processing \& feature extraction},
    journal = {SoftwareX},
    year = {2021},
    url = {https://github.com/predict-idlab/tsflex},
    publisher = {Elsevier}
}
Link to the paper: https://www.sciencedirect.com/science/article/pii/S2352711021001904
👤 Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost