Time series data sets for PyTorch
torchtime provides ready-to-go time series data sets for use in PyTorch. See the documentation for the current list of supported data sets.
The package follows the batch first convention. Data tensors are therefore of shape (n, s, c) where n is the batch size, s is the trajectory length and c is the number of channels.
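As a quick sketch of the batch first convention, the illustrative tensor below (random placeholder values, not torchtime output) has n = 4 trajectories, each of length s = 3 with c = 2 channels:

```python
import torch

# Illustrative batch in the batch first (n, s, c) layout:
# 4 trajectories, each 3 time steps long, with 2 channels.
n, s, c = 4, 3, 2
batch = torch.randn(n, s, c)
print(batch.shape)  # torch.Size([4, 3, 2])
```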
Installation
$ pip install torchtime
Using torchtime
The example below uses the torchtime.data.UEA class. The data set is specified using the dataset argument (see the documentation for the list of supported data sets). The split argument determines whether training, validation or test data are returned. The sizes of the splits are controlled with the train_split and val_split arguments.
For example, to load training data for the ArrowHead data set with a 70% training, 20% validation and 10% testing split:
from torch.utils.data import DataLoader
from torchtime.data import UEA

arrowhead = UEA(
    dataset="ArrowHead",
    split="train",
    train_split=0.7,
    val_split=0.2,
)
dataloader = DataLoader(arrowhead, batch_size=32)
Batches are dictionaries of tensors X, y and length. X holds the time series data with an additional time stamp in the first channel, y holds one-hot encoded labels, and length gives the length of each trajectory.
ArrowHead is a univariate time series with 251 observations in each trajectory. X therefore has two channels: the time stamp followed by the time series. A batch size of 32 was specified above, so X has shape (32, 251, 2).
>> next(iter(dataloader))["X"].shape
torch.Size([32, 251, 2])
>> next(iter(dataloader))["X"]
tensor([[[  0.0000,  -1.8295],
         [  1.0000,  -1.8238],
         [  2.0000,  -1.8101],
         ...,
         [248.0000,  -1.7759],
         [249.0000,  -1.8088],
         [250.0000,  -1.8110]],

        ...,

        [[  0.0000,  -2.0147],
         [  1.0000,  -2.0311],
         [  2.0000,  -1.9471],
         ...,
         [248.0000,  -1.9901],
         [249.0000,  -1.9913],
         [250.0000,  -2.0109]]])
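Because the time stamp occupies the first channel, it is easy to split it off from the data channels with standard tensor slicing. A minimal sketch (the shapes mirror the ArrowHead batch above; X here is a random placeholder, not torchtime output):

```python
import torch

# Placeholder batch with the same shape as the ArrowHead example:
# (n=32, s=251, c=2), channel 0 = time stamp, channel 1 = time series.
X = torch.randn(32, 251, 2)

time_stamps = X[:, :, 0]   # shape (32, 251)
values = X[:, :, 1:]       # shape (32, 251, 1), data channels only
print(time_stamps.shape, values.shape)
```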
There are three classes therefore y has shape (32, 3).
>> next(iter(dataloader))["y"].shape
torch.Size([32, 3])
>> next(iter(dataloader))["y"]
tensor([[0, 0, 1],
        ...,
        [1, 0, 0]])
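Since y is one-hot encoded, losses that expect class indices (for example torch.nn.CrossEntropyLoss) need the labels converted with argmax. A small sketch using a hypothetical two-row batch:

```python
import torch

# Hypothetical one-hot labels for two samples with three classes.
y = torch.tensor([[0, 0, 1],
                  [1, 0, 0]])

# Recover class indices from the one-hot encoding.
labels = y.argmax(dim=-1)
print(labels)  # tensor([2, 0])
```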
Finally, length is the length of each trajectory (before any padding for data sets of irregular length) and therefore has shape (32,).
>> next(iter(dataloader))["length"].shape
torch.Size([32])
>> next(iter(dataloader))["length"]
tensor([251, ..., 251])
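For data sets with irregular trajectory lengths, one common use of length is to pack the padded batch before feeding it to an RNN. A sketch under assumed shapes (the tensors below are placeholders, not torchtime output; note that pack_padded_sequence with enforce_sorted=True expects lengths in descending order):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Placeholder batch: 4 trajectories padded to length 10, 2 channels.
X = torch.randn(4, 10, 2)
length = torch.tensor([10, 7, 5, 3])  # true lengths before padding

# Pack the padded batch so an RNN skips the padded time steps.
packed = pack_padded_sequence(X, length, batch_first=True, enforce_sorted=True)
print(packed.data.shape)  # (sum of lengths, channels) = (25, 2)
```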
Learn more
Other features include missing data simulation for UEA data sets. See the API for more information.
This work is based on some of the data processing ideas in Kidger et al., 2020 and Che et al., 2018.
License
Released under the MIT license.