torchchronos
torchchronos is an experimental PyTorch and Lightning compatible library that provides easy and flexible access to various time-series datasets for classification and regression tasks. It also provides a simple and extensible transform API to preprocess data. It is inspired by the much more complicated torchtime.
Installation
You can install torchchronos via pip:
pip install torchchronos
Usage
Datasets
torchchronos currently provides access to several popular time-series datasets, including:
- UCR/UEA Time Series Classification Repository: torchchronos.datasets.UCRUEADataset
- Time series as preprocessed in the TFC paper: torchchronos.datasets.TFCPretrainDataset (datasets Gesture and EMG)
To use a dataset, you can simply import the corresponding dataset class and create an instance:
from pathlib import Path

from torchchronos.datasets import UCRUEADataset
from torchchronos.download import download_uea_ucr
from torchchronos.transforms import PadFront

# Download the dataset files once, then load them from the same directory.
data_dir = Path(".cache") / "data"
download_uea_ucr("ECG5000", data_dir)
dataset = UCRUEADataset("ECG5000", path=data_dir, transforms=PadFront(10))
Data Modules
torchchronos also provides Lightning compatible DataModules to make it easy to load and preprocess data. They support common use cases like (multi-)GPU training and train/test/val splitting out of the box. For example:
from torchchronos.lightning import UCRUEADataModule
# Assumption: Compose is exported by torchchronos.transforms.
from torchchronos.transforms import Compose, PadFront, PadBack
module = UCRUEADataModule("ECG5000", split_ratio=(0.75, 0.15), batch_size=32,
                          transforms=Compose([PadFront(10), PadBack(10)]))
Analogous to the datasets above, the following data modules are currently supported, each wrapping the respective dataset:
torchchronos.lightning.UCRUEADataModule
torchchronos.lightning.TFCPretrainDataModule
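The README does not spell out how split_ratio is interpreted. The sketch below shows one plausible reading, in which the two values are the train and validation fractions and the remainder becomes the test set. This interpretation is an assumption, and split_sizes is a hypothetical helper that is not part of torchchronos.

```python
def split_sizes(n_samples, split_ratio):
    """Return (train, val, test) sizes for an assumed (train, val) ratio pair.

    Hypothetical helper for illustration only; not a torchchronos function.
    """
    train_ratio, val_ratio = split_ratio
    if train_ratio < 0 or val_ratio < 0 or train_ratio + val_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")

    n_train = round(n_samples * train_ratio)
    # Cap the validation size so the three parts never exceed n_samples.
    n_val = min(round(n_samples * val_ratio), n_samples - n_train)
    # Whatever is left over is assumed to form the test set.
    n_test = n_samples - n_train - n_val
    return n_train, n_val, n_test


sizes = split_sizes(5000, (0.75, 0.15))
print(sizes)  # (3750, 750, 500)
```

Under this reading, a collection of 5000 series with split_ratio=(0.75, 0.15) would yield 3750 training, 750 validation and 500 test samples.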
Transforms
torchchronos provides a flexible transform API to preprocess time-series data. For example, to normalize a dataset, you can define a custom Transform like this:
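from typing import Self  # Python 3.11+; use typing_extensions.Self on older versions

from torchchronos.transforms import Transform


class Normalize(Transform):
    def __init__(self, mean=None, std=None):
        self.mean = mean
        self.std = std

    def fit(self, data) -> Self:
        # Estimate the statistics from the data the transform is fitted on.
        self.mean = data.mean()
        self.std = data.std()
        return self

    def __call__(self, data):
        return (data - self.mean) / self.std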
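A fitted transform is typically fitted on the training data and then applied unchanged to held-out data, so that no test statistics leak into preprocessing. The following is a minimal, library-free sketch of that pattern. The Transform base class here is a hypothetical stand-in, not the torchchronos class, and plain lists of floats stand in for tensors.

```python
from statistics import fmean, pstdev


class Transform:
    """Hypothetical stand-in for a transform base class (not torchchronos's)."""

    def fit(self, data):
        return self

    def __call__(self, data):
        raise NotImplementedError


class Normalize(Transform):
    def __init__(self, mean=None, std=None):
        self.mean = mean
        self.std = std

    def fit(self, data):
        # Population standard deviation is used here; tensor libraries may
        # default to the sample standard deviation instead.
        self.mean = fmean(data)
        self.std = pstdev(data)
        return self

    def __call__(self, data):
        return [(x - self.mean) / self.std for x in data]


train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 11.0]

# Fit on the training series only, then reuse the same statistics for test data.
normalize = Normalize().fit(train)
train_scaled = normalize(train)
test_scaled = normalize(test)
```

After fitting, train_scaled has zero mean and unit standard deviation, while test_scaled is expressed relative to the training statistics rather than its own.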
Known issues
- The dataset SpokenArabicDigits does not seem to work due to a mismatch between the TRAIN and TEST sizes
- The dataset UrbanSound does not seem to work due to missing ts files
Roadmap
The following features are planned for future releases of torchchronos:
- Support for additional time-series datasets, including:
  - Energy consumption dataset
  - Traffic dataset
  - PhysioNet Challenge 2012 (in-hospital mortality)
  - PhysioNet Challenge 2019 (sepsis prediction)
- Additional transform classes, including:
  - Resampling
  - Missing value imputation
If you have any feature requests or suggestions, please open an issue on our GitHub page.