
Toolkit for flexible processing & feature extraction on time-series data

tsflex

tsflex is a toolkit for flexible time series processing & feature extraction, making few assumptions about input data.

Installation

If you are using pip, just execute the following command:

pip install tsflex

Or, if you are using conda, then execute this command:

conda install -c conda-forge tsflex

Usage

tsflex is built to be intuitive, so we encourage you to copy-paste this code and toy with some parameters!

Feature extraction

import numpy as np
import pandas as pd
import scipy.stats as ss
from tsflex.features import MultipleFeatureDescriptors, FeatureCollection

# 1. -------- Get your time-indexed data --------
url = "https://github.com/predict-idlab/tsflex/raw/main/examples/data/empatica/"
# Contains 1 column; ["TMP"] - 4 Hz sampling rate
data_tmp = pd.read_parquet(url+"tmp.parquet").set_index("timestamp")
# Contains 3 columns; ["ACC_x", "ACC_y", "ACC_z"] - 32 Hz sampling rate
data_acc = pd.read_parquet(url+"acc.parquet").set_index("timestamp")

# 2. -------- Construct your feature collection --------
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=[np.min, np.max, np.mean, np.std, np.median, ss.skew, ss.kurtosis],
        series_names=["TMP", "ACC_x", "ACC_y"],  # Use 3 multimodal signals
        windows=["5min", "7.5min"],  # Use 5 minutes and 7.5 minutes
        strides="2.5min",  # With steps of 2.5 minutes
    )
)

# 3. -------- Calculate features --------
fc.calculate(data=[data_tmp, data_acc])
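
To make the windows and strides arguments concrete, here is a minimal sketch of strided-window feature extraction written in plain pandas. It is not how tsflex is implemented: the helper name strided_features and the synthetic 4 Hz signal are assumptions made for illustration, and the choice to label each row with the end of its window may differ from what tsflex returns.

```python
import numpy as np
import pandas as pd


def strided_features(series, window, stride, funcs):
    """Apply each function to consecutive time windows of `series` (hypothetical helper)."""
    window, stride = pd.Timedelta(window), pd.Timedelta(stride)
    rows, index = [], []
    start = series.index[0]
    # Slide a [start, start + window) window forward by `stride` until it no longer fits.
    while start + window <= series.index[-1]:
        in_window = (series.index >= start) & (series.index < start + window)
        chunk = series[in_window].to_numpy()
        rows.append({name: func(chunk) for name, func in funcs.items()})
        index.append(start + window)  # label the row with the window end
        start += stride
    return pd.DataFrame(rows, index=pd.DatetimeIndex(index, name="timestamp"))


# Synthetic 4 Hz signal spanning 15 minutes (assumed data, not the Empatica set).
idx = pd.date_range("2021-01-01", periods=15 * 60 * 4 + 1, freq="250ms")
tmp = pd.Series(np.linspace(30.0, 33.0, len(idx)), index=idx, name="TMP")

feats = strided_features(
    tmp,
    window="5min",
    stride="2.5min",
    funcs={"min": np.min, "max": np.max, "mean": np.mean},
)
print(feats)
```

With a 5 minute window and a 2.5 minute stride, consecutive windows overlap by half, so each sample contributes to up to two feature rows.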

More examples

For processing look here
Other examples can be found here

Why tsflex? ✨

  • flexible
  • efficient view-based operations
    => very low peak memory usage & fast execution times (see benchmarks)
  • maintains the time-index of the data
  • makes little to no assumptions about the time series data

¹ These integrations are shown in integration-example notebooks.
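
The "view-based" point in the list above can be illustrated with NumPy alone. The sketch below uses numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or newer) to build strided windows that share memory with the original array instead of copying it. It shows the general technique only; that tsflex works this way internally is an assumption, and the window and stride sizes are arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

signal = np.arange(20, dtype=np.float64)  # stand-in for one sensor channel

window, stride = 8, 4  # sizes in samples (arbitrary values for illustration)

# All length-8 windows as a read-only view, then keep every 4th one.
windows = sliding_window_view(signal, window)[::stride]

print(windows.shape)                      # (4, 8)
print(np.shares_memory(windows, signal))  # True: no window data was copied

# Reducing along the last axis yields one feature value per window.
means = windows.mean(axis=1)
print(means)
```

Because the windows are views, memory use stays close to the size of the input signal regardless of how much the windows overlap.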

Future work 🔨

  • scikit-learn integration for both processing and feature extraction
    note: this is under active development on the sklearn integration branch.
  • support for time series segmentation (exposing the under-the-hood strided-rolling functionality)
    note: see more here.
  • support for multi-indexed dataframes

Contributing 👪

We are thrilled to see your contributions to further enhance tsflex.
See this guide for more instructions on how to contribute.

Referencing our package

If you use tsflex in a scientific publication, we would greatly appreciate it if you cited us as:

@article{vanderdonckt2021tsflex,
    author = {Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
    title = {tsflex: flexible time series processing \& feature extraction},
    journal = {SoftwareX},
    year = {2021},
    url = {https://github.com/predict-idlab/tsflex},
    publisher={Elsevier}
}

👤 Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost
