TensorFlow time-series Dataset
About The Project
This Python package helps you create TensorFlow datasets for time-series data.
Installation
This package is available on PyPI. You can install it and all of its dependencies using pip:
pip install tensorflow_time_series_dataset
Usage
Example Data
Suppose you have a dataset in the following form:
import numpy as np
import pandas as pd
# make things deterministic
np.random.seed(1)
columns=['x1', 'x2', 'x3']
# 14 days of half-hourly samples
periods=48 * 14
test_df=pd.DataFrame(
    index=pd.date_range(
        start='1/1/1992',
        periods=periods,
        freq='30min'
    ),
    data=np.stack(
        [
            np.random.normal(0,0.5,periods),
            np.random.normal(1,0.5,periods),
            np.random.normal(2,0.5,periods)
        ],
        axis=1
    ),
    columns=columns
)
test_df.head()
x1 x2 x3
1992-01-01 00:00:00 0.812173 1.205133 1.578044
1992-01-01 00:30:00 -0.305878 1.429935 1.413295
1992-01-01 01:00:00 -0.264086 0.550658 1.602187
1992-01-01 01:30:00 -0.536484 1.159828 1.644974
1992-01-01 02:00:00 0.432704 1.159077 2.005718
Single-Step Prediction
The factory class WindowedTimeSeriesDatasetFactory is used to create a TensorFlow dataset from pandas dataframes or, as we will see later, other data sources.
We will now use it to create a dataset that takes 48 historical time-steps as input to predict a single time-step in the future.
from tensorflow_time_series_dataset.factory import WindowedTimeSeriesDatasetFactory as Factory
factory_kwds=dict(
    history_size=48,
    prediction_size=1,
    history_columns=['x1', 'x2', 'x3'],
    prediction_columns=['x3'],
    batch_size=4,
    drop_remainder=True,
)
factory=Factory(**factory_kwds)
ds1=factory(test_df)
ds1
This returns the following TensorFlow Dataset:
<_PrefetchDataset element_spec=(TensorSpec(shape=(4, 48, 3), dtype=tf.float32, name=None), TensorSpec(shape=(4, 1, 1), dtype=tf.float32, name=None))>
We can plot the result with the utility function plot_patch:
from tensorflow_time_series_dataset.utils.visualisation import plot_patch
fig=plot_patch(
    ds1,
    figsize=(8,4),
    **factory_kwds
)
fname='.images/example1.svg'
fig.savefig(fname)
fname
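To make the windowing idea more concrete, the following is a minimal NumPy-only sketch of how 48 history steps and one prediction step can be cut out of a (time, features) array. It is not the package's implementation: the helper make_windows and its target_cols parameter are hypothetical, and a stride of one step between consecutive windows is an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(data, history_size, prediction_size, target_cols):
    """Split a (time, features) array into history and target windows.

    Hypothetical helper for illustration; assumes consecutive windows
    are shifted by exactly one time-step.
    """
    window = history_size + prediction_size
    # sliding_window_view appends the window axis last: (n, features, window)
    windows = sliding_window_view(data, window_shape=window, axis=0)
    # reorder the axes to (n, window, features)
    windows = np.moveaxis(windows, -1, 1)
    history = windows[:, :history_size, :]
    target = windows[:, history_size:, target_cols]
    return history, target


rng = np.random.default_rng(1)
data = rng.normal(size=(48 * 14, 3))  # same length and width as test_df

# 48 history steps of all three columns, one future step of the last column
history, target = make_windows(data, 48, 1, target_cols=[2])
print(history.shape, target.shape)
```

Each window has the per-sample shapes seen in the element_spec above, (48, 3) for the input and (1, 1) for the target; batching with batch_size=4 adds the leading dimension of 4.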
Multi-Step Prediction
Let's now increase the prediction size to 6 half-hour time-steps.
factory_kwds.update(dict(
    prediction_size=6
))
factory=Factory(**factory_kwds)
ds2=factory(test_df)
ds2
This returns the following TensorFlow Dataset:
<_PrefetchDataset element_spec=(TensorSpec(shape=(4, 48, 3), dtype=tf.float32, name=None), TensorSpec(shape=(4, 6, 1), dtype=tf.float32, name=None))>
Again, let's plot the results to see what changed:
fig=plot_patch(
    ds2,
    figsize=(8,4),
    **factory_kwds
)
fname='.images/example2.svg'
fig.savefig(fname)
fname
Preprocessing: Add Metadata features
Preprocessors can be used to transform the data before it is fed into the model.
A Preprocessor can be any python callable.
In this case we will use a class called CyclicalFeatureEncoder to encode one-dimensional cyclical features, such as the time of day or the weekday, as two-dimensional coordinates using a sine and cosine transformation, as suggested in this blogpost.
import itertools
from tensorflow_time_series_dataset.preprocessors import CyclicalFeatureEncoder
encs = {
    "weekday": dict(cycl_max=6),
    "dayofyear": dict(cycl_max=366, cycl_min=1),
    "month": dict(cycl_max=12, cycl_min=1),
    "time": dict(
        cycl_max=24 * 60 - 1,
        # minutes since midnight, derived from the datetime index
        cycl_getter=lambda df, k: df.index.hour * 60 + df.index.minute,
    ),
}
# every encoded feature yields a sine and a cosine column
factory_kwds.update(dict(
    meta_columns=list(itertools.chain(*[[c+'_sin', c+'_cos'] for c in encs.keys()]))
))
factory=Factory(**factory_kwds)
for name, kwds in encs.items():
    factory.add_preprocessor(CyclicalFeatureEncoder(name, **kwds))
ds3=factory(test_df)
ds3
This returns the following TensorFlow Dataset:
<_PrefetchDataset element_spec=((TensorSpec(shape=(4, 48, 3), dtype=tf.float32, name=None), TensorSpec(shape=(4, 1, 8), dtype=tf.float32, name=None)), TensorSpec(shape=(4, 6, 1), dtype=tf.float32, name=None))>
Again, let's plot the results to see what changed:
fig=plot_patch(
    ds3,
    figsize=(8,4),
    **factory_kwds
)
fname='.images/example3.svg'
fig.savefig(fname)
fname
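The sine and cosine transformation used for the cyclical features can be sketched in a few lines of NumPy. This is an illustration of the general technique, not the code of CyclicalFeatureEncoder: the function encode_cyclical is hypothetical, and the period cycl_max - cycl_min + 1 is an assumption about how the two bounds are combined.

```python
import numpy as np


def encode_cyclical(values, cycl_min, cycl_max):
    """Map a cyclical feature onto the unit circle.

    Hypothetical helper; assumes the period is cycl_max - cycl_min + 1,
    so that cycl_max is one step away from wrapping back to cycl_min.
    """
    period = cycl_max - cycl_min + 1
    angle = 2 * np.pi * (np.asarray(values, dtype=float) - cycl_min) / period
    return np.sin(angle), np.cos(angle)


# minutes since midnight for 00:00, 06:00, 12:00 and 23:59
minutes = np.array([0, 6 * 60, 12 * 60, 23 * 60 + 59])
time_sin, time_cos = encode_cyclical(minutes, cycl_min=0, cycl_max=24 * 60 - 1)

# distance between 23:59 and 00:00 in the encoded space
wrap_distance = np.hypot(time_sin[3] - time_sin[0], time_cos[3] - time_cos[0])
print(time_sin, time_cos, wrap_distance)
```

The point of the two-dimensional encoding is that 23:59 and 00:00 end up next to each other on the circle, whereas the raw values 1439 and 0 are as far apart as possible. With four encoded features, this gives the eight meta columns visible in the element_spec above.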
Contributing
Any contributions are greatly appreciated! If you have a question, an issue, or would like to contribute, please read our contributing guidelines.
License
Distributed under the Apache License 2.0
Contact
Marcel Arpogaus - marcel.arpogaus@gmail.com
Project Link: https://github.com/MArpogaus/tensorflow_time_series_dataset
Acknowledgments
Parts of this work have been funded by the Federal Ministry for the Environment, Nature Conservation and Nuclear Safety due to a decision of the German Federal Parliament (AI4Grids: 67KI2012A).