Enhanced spatio-temporal predictions using less data with active deep learning

## Project description

The name altility stands for 'actively learning utility'. The package was first developed to help electric utilities decide where to place new smart meters and when to collect their data. It can now be used, and further developed, for any type of spatio-temporal prediction task.

### Installation:

pip install altility


### Docker:

To use altility within an Ubuntu Docker container:

docker run -it aryandoustarsam/altility


To use altility with Jupyter Notebook inside a Docker container:

docker run -it -p 3333:1111 -v ~/path_to_data/data:/data aryandoustarsam/altility:jupyter
[inside running container]: jupyter notebook --ip 0.0.0.0 --port 1111 --no-browser --allow-root
[in local machine browser]: localhost:3333
[in local machine browser, type token shown in terminal]


### Usage guide:

At the core of the altility package is the class altility.ADL_model. It bundles the properties and methods of the active deep learning (ADL) model that we want to train. Below is a list of the parameters it takes when initialized, the methods it contains and the results it can generate.

Parameters
string
The name of the active deep learning (ADL) model.
path_to_results (='results'):
string
The path to where resulting plots and values are supposed to be stored.
Methods
initialize(y, x_t=None, x_s=None, x_st=None, **kwargs): Initializes prediction model.
collect(x_t_cand=None, x_s_cand=None, x_st_cand=None, **kwargs): Collects candidate data with embedding uncertainty active learning.
train(y_picked, x_t_picked=None, x_s_picked=None, x_st_picked=None, **kwargs): Trains the model with the queried labels of the chosen candidate data points.
predict(y_pred=None, x_t_cand=None, x_s_cand=None, x_st_cand=None, **kwargs): Predicts labels for unqueried candidate data points. If you are testing the model and have labels available, you can pass them to see the difference between the true and predicted labels of the unqueried candidate data points.

### Methods:

A complete list of keyword arguments or parameters that can be passed to ADL_model.initialize()

Parameters
y (required):
numpy array
Array or matrix of labels.
x_t (=None):
numpy array
Array or matrix of time-variant features.
x_s (=None):
numpy array
Array or matrix of space-variant features.
x_st (=None):
numpy array
Array or matrix of space-time-variant features.
encoder_layers (=1):
int
Choose how many neural network layers you want to use for encoding features.
network_layers (=1):
int
Choose how many layers you want to use after encoders. This is your network depth.
encoding_nodes_x_t (=100):
int
Choose the dimension of the encoding outcome of temporal features.
encoding_nodes_x_s (=100):
int
Choose the dimension of the encoding outcome of spatial features.
encoding_nodes_x_st (=100):
int
Choose the dimension of the encoding outcome of spatio-temporal features.
encoding_nodes_joint (=100):
int
Choose the dimension of the encoding outcome of entire concatenated feature vector.
nodes_per_layer_dense (=1000):
int
Choose how many nodes per dense layer you want to use. This determines the width of your network.
filters_per_layer_cnn (=16):
int
Choose how many filters per convolutional layer you want to use.
states_per_layer_rnn (=200):
int
Choose how many states per recurrent layer you want to use.
activation_encoding (='relu'):
string
Choose which activation function to use on last encoding layer. Choose from None, 'relu', 'tanh', 'selu', 'elu', 'exponential'.
activation_dense (='relu'):
string
Choose which activation function to use in each dense layer. Choose from None, 'relu', 'tanh', 'selu', 'elu', 'exponential'.
activation_cnn (='relu'):
string
Choose which activation function to use in each convolutional layer. Choose from None, 'relu', 'tanh', 'selu', 'elu', 'exponential'.
activation_rnn (='tanh'):
string
Choose which activation function to use in each recurrent layer. Choose from None, 'relu', 'tanh', 'selu', 'elu', 'exponential'.
layer_type_x_st (='CNN'):
string
Choose which layers to use for x_st inputs. Choose one from 'ANN', 'CNN', 'LSTM'.
initialization_method (='glorot_normal'):
string
Choose how to initialize weights for Conv1D, Conv2D and Dense layers. Choose from 'glorot_normal'.
initialization_method_rnn (='orthogonal'):
string
Choose how to initialize weights for LSTM layers. Choose from 'orthogonal'.
regularizer (='l1_l2'):
string
Choose how to regularize weights. Choose from None, 'l1', 'l2', 'l1_l2'.
batch_normalization (=False):
bool
Choose whether or not to use batch normalization on each layer in your NN.
train_split (=0.7):
float
Choose the splitting ratio between training and validation datasets. Choose a value between 0 and 1.
split_intervals (=0.05):
float
Decide at which interval to perform the train-validation split. A value of 1 equals one data point per bin; 0.5 equals two data points per bin.
random_seed (=None):
float
Provide a seed for reproducibility of your experiments. It is used when initializing the weights of the deep learning model, when choosing random data sequences during training, and anywhere else that stochastic processes play a role.
epochs (=30):
int
Choose for how many epochs you want to train your model.
patience (=10):
int
Choose how many epochs to wait without a decrease in the monitored loss before stopping training early.
batch_size (=16):
int
Choose how large your data batch size should be during training. Choose a power of 2.
monitor (='val_loss'):
string
Choose which value to monitor for early stopping. Choose from 'val_loss' and 'train_loss'.
silent (=True):
bool
Decide whether or not to print out progress.
plot (=False):
bool
Decide whether or not to visualize process.
Results
models:
list of Tensorflow models
List of computational graphs that make up our active deep learning embedding network.
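
The interplay of epochs, patience and monitor can be sketched in plain Python. This is not altility's implementation: it assumes that training stops once the monitored loss has failed to reach a new minimum for `patience` consecutive epochs. The function name `epochs_until_stop` and the loss values are hypothetical.

```python
def epochs_until_stop(monitored_losses, epochs=30, patience=10):
    """Return the number of epochs run before early stopping triggers.

    monitored_losses holds one loss value per epoch, for example the
    validation loss when monitor='val_loss'.
    Assumption: training stops after `patience` consecutive epochs
    without a new minimum of the monitored loss.
    """
    best_loss = float('inf')
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in monitored_losses[:epochs]:
        epochs_run += 1
        if loss < best_loss:
            # New minimum: remember it and reset the patience counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return epochs_run


# Hypothetical validation losses: improvement stops after the third epoch.
val_losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74]
print(epochs_until_stop(val_losses, epochs=30, patience=3))
```

With a larger patience than the number of non-improving epochs, the loop simply runs until the loss values or the epoch limit are exhausted.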

A complete list of keyword arguments or parameters that can be passed to ADL_model.collect()

Parameters
x_t_cand (=None):
numpy array
Array or matrix of time-variant features for candidate data points.
x_s_cand (=None):
numpy array
Array or matrix of space-variant features for candidate data points.
x_st_cand (=None):
numpy array
Array or matrix of space-time-variant features for candidate data points.
budget (=0.5):
float
Choose which share of the candidate data pool to select. This is the data budget for querying new data points. Choose a value between 0 and 1.
method (='embedding_uncertainty'):
string
Choose which active learning method to use. Currently, only queries with embedding uncertainty are supported.
method_variant (='max_uncertainty'):
string
Choose which variant of the active learning method to use. Choose from 'max_uncertainty', 'min_uncertainty', 'avg_uncertainty' and 'rnd_uncertainty'.
method_distance (='laplacian_kernel'):
string
Choose which distance metric to use for calculating embedding uncertainty to cluster centers. Choose from 'rbf_kernel', 'laplacian_kernel' and 'cosine_similarity'.
method_cluster (='KMeans'):
string
Choose which clustering method to use for clustering embedded candidate data points. Choose from 'rbf_kernel', 'laplacian_kernel' and 'cosine_similarity'.
subsample (=None):
int
Choose None or a subsample size of uniformly chosen candidates.
silent (=True):
bool
Decide whether or not to print out progress.
plot (=False):
bool
Decide whether or not to visualize process.
Results
batch_index_list:
list of integers
List of indices for most informative data points suggested to collect.
inf_score_list:
list of floats
List of information scores for most informative data points suggested to collect.
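
To make budget, method_variant and method_distance more concrete, here is a minimal NumPy sketch of one plausible reading of 'max_uncertainty' selection. It is an illustration under stated assumptions, not altility's implementation: the helper names `laplacian_kernel`, `kmeans` and `pick_candidates` are hypothetical, the embeddings are random stand-ins for encoder outputs, the number of clusters is chosen arbitrarily, and uncertainty is assumed to be one minus the kernel similarity to the nearest cluster center.

```python
import numpy as np


def laplacian_kernel(a, b, gamma=None):
    """Laplacian kernel exp(-gamma * L1 distance) between rows of a and b."""
    if gamma is None:
        gamma = 1.0 / a.shape[1]
    l1_distance = np.abs(a[:, None, :] - b[None, :, :]).sum(axis=2)
    return np.exp(-gamma * l1_distance)


def kmeans(points, n_clusters, n_iter=10, seed=0):
    """Plain k-means (Lloyd's algorithm); returns the cluster centers."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=n_clusters, replace=False)
    centers = points[start].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = points[labels == k]
            if len(members) > 0:  # keep the old center for empty clusters
                centers[k] = members.mean(axis=0)
    return centers


def pick_candidates(embeddings, budget=0.5, n_clusters=3):
    """Return indices and scores of the most uncertain candidates."""
    centers = kmeans(embeddings, n_clusters)
    # Assumed score: low similarity to the nearest center = high uncertainty.
    similarity = laplacian_kernel(embeddings, centers).max(axis=1)
    uncertainty = 1.0 - similarity
    n_pick = int(budget * len(embeddings))
    batch_index_list = np.argsort(-uncertainty)[:n_pick].tolist()
    inf_score_list = uncertainty[batch_index_list].tolist()
    return batch_index_list, inf_score_list


# Random stand-in for embedded candidate data: 20 points, 4 dimensions.
embeddings = np.random.default_rng(42).normal(size=(20, 4))
batch_index_list, inf_score_list = pick_candidates(embeddings, budget=0.5)
print(len(batch_index_list))
```

Under this reading, a budget of 0.5 on a pool of 20 candidates yields 10 indices, ordered from the highest to the lowest score.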

A complete list of keyword arguments or parameters that can be passed to ADL_model.train()

Parameters
y_picked (required):
numpy array
Array or matrix of labels.
x_t_picked (=None):
numpy array
Array or matrix of time-variant features.
x_s_picked (=None):
numpy array
Array or matrix of space-variant features.
x_st_picked (=None):
numpy array
Array or matrix of space-time-variant features.
silent (=True):
bool
Decide whether or not to print out progress.
plot (=False):
bool
Decide whether or not to visualize process.
Results
models:
list of Tensorflow models
List of computational graphs that make up our active deep learning embedding network, further trained on the passed dataset of picked candidate data.
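
The picked arrays passed to train() are subsets of the candidate pool. A minimal sketch of how they can be extracted with a boolean mask follows; the candidate arrays and the index list are hypothetical stand-ins for real data and for the indices suggested by collect().

```python
import numpy as np

# Hypothetical candidate pool: six labels and six two-dimensional features.
y_cand = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
x_t_cand = np.arange(12).reshape(6, 2)

# Hypothetical indices, standing in for the suggestions made by collect().
batch_index_list = [4, 1, 5]

# Mark picked points; everything else remains available for prediction.
picked_array = np.zeros(len(y_cand), dtype=bool)
picked_array[batch_index_list] = True
pred_array = np.invert(picked_array)

# Split labels and features into picked and remaining points.
y_picked, x_t_picked = y_cand[picked_array], x_t_cand[picked_array]
y_pred, x_t_pred = y_cand[pred_array], x_t_cand[pred_array]

print(y_picked)
print(y_pred)
```

Note that boolean masking returns the picked points in the order of the candidate pool, not in the order of the index list.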

A complete list of keyword arguments or parameters that can be passed to ADL_model.predict()

Parameters
y_pred (=None):
numpy array
Array or matrix of labels.
x_t_pred (=None):
numpy array
Array or matrix of time-variant features.
x_s_pred (=None):
numpy array
Array or matrix of space-variant features.
x_st_pred (=None):
numpy array
Array or matrix of space-time-variant features.
silent (=True):
bool
Decide whether or not to print out progress.
plot (=False):
bool
Decide whether or not to visualize process.
Results
predictions:
list of floats
List of predictions made for passed features.
testing_loss:
float
Testing loss score calculated from true vs. predicted labels. Only calculated if true labels 'y_pred' are provided.
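
As an illustration of how a testing loss relates true and predicted labels, the sketch below computes a mean squared error with NumPy. Which loss function altility actually uses is not stated here, so mean squared error is an assumption, and the label values are made up.

```python
import numpy as np


def mean_squared_error(y_true, y_predicted):
    """Average of the squared differences between true and predicted labels."""
    y_true = np.asarray(y_true, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return float(np.mean((y_true - y_predicted) ** 2))


# Made-up true labels and predictions for three data points.
y_true = [1.0, 2.0, 3.0]
predictions = [1.0, 2.5, 2.0]

testing_loss = mean_squared_error(y_true, predictions)
print(testing_loss)
```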

### Datasets:

The package can be tested on datasets that are either publicly available, or which we make public for making spatio-temporal predictions. A first dataset consists of electric load data that we provide in our GitHub repository. To prepare the data for usage with altility, use the prep_load_forecasting_data() function provided in load_forecasting.py with the following parameters and return values:

Parameters
string
The path to where data is stored. This is 'data/public/electric load forecasting/' in our original repository.
dataset_name (='profiles_100'):
string
Choose between 'profiles_100' and 'profiles_400'. These are two distinct datasets containing load profiles from either 100 or 400 industrial, commercial, and residential buildings of different sizes, shapes, consumption and occupancy patterns in Switzerland.
label_type (='feature_scaled'):
string
Decide which labels to consider. Choose from 'random_scaled' and 'feature_scaled'.
spatial_features (='histogram'):
string
Decide how to treat aerial imagery. Choose one from 'average' and 'histogram'.
meteo_types :
list
Decide which meteo data types to consider. Choose from 'air_density', 'cloud_cover', 'precipitation', 'radiation_surface', 'radiation_toa', 'snow_mass', 'snowfall', 'temperature' and 'wind_speed'. The default is a list of all meteorological conditions.
timestamp_data :
list
Decide which time stamp information to consider. Choose from: '15min', 'hour', 'day', 'month' and 'year'.
time_encoding (='ORD'):
string
Decide how to encode time stamp data. Choose one of 'ORD', 'ORD-1D' or 'OHE'.
histo_bins (=100):
int
Set the number of histogram bins that you want to use. Applied if parameter spatial_features = 'histogram'.
grey_scale (=False):
bool
Decide whether you want to consider underlying RGB images in grey-scale.
profiles_per_year (=1):
float
Decide how many building-year profiles you want to consider for each year. Choose a share between 0 and 1. A value of 1 corresponds to about 100 profiles for the profiles_100 and 400 profiles for the profiles_400 dataset.
points_per_profile (=0.003):
float
Decide how many data points per building-year profile you want to consider. Choose a share between 0 and 1. A value of 0.01 corresponds to approximately 350 points per profile.
history_window_meteo (=24):
int
Choose past time window for the meteo data. Resolution is hourly.
prediction_window (=96):
int
Decide how many time steps to predict consumption into the future. Resolution is 15 min. A value of 96 corresponds to 24h.
test_split (=0.7):
float
Decide what share of the buildings and of the time period to separate for testing.
normalization (=True):
bool
Decide whether or not to normalize features.
standardization (=True):
bool
Decide whether to standardize features to zero mean and unit variance.
silent (=True):
bool
Decide whether or not to print out progress of data processing.
plot (=False):
bool
Decide whether or not to visualize examples of processed data.
Returns
datasets:
dict
A dictionary containing available and candidate data, that are stored with the keys 'avail_data' and 'cand_data'. These are dictionaries themselves, and store variables under keys 'x_t', 'x_s', 'x_st' and 'y'. These stand for only time-variant features 'x_t', only space-variant features 'x_s', space- and time-variant features 'x_st' and labels 'y'.
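
The two resolution statements above (96 steps correspond to 24h, and a share of 0.01 corresponds to roughly 350 points per profile) follow from simple arithmetic, sketched below. The sketch assumes a 365-day building-year profile at 15 min resolution; the helper names are hypothetical, and the rounding is for illustration only and may differ from what prep_load_forecasting_data() does internally.

```python
MINUTES_PER_STEP = 15
STEPS_PER_DAY = 24 * 60 // MINUTES_PER_STEP   # 96 steps of 15 min per day
STEPS_PER_YEAR = 365 * STEPS_PER_DAY          # 35040 steps per 365-day year


def horizon_hours(prediction_window):
    """Convert a prediction window in 15 min steps to hours."""
    return prediction_window * MINUTES_PER_STEP / 60


def points_for_share(points_per_profile):
    """Approximate number of data points for a given share of a profile."""
    return round(points_per_profile * STEPS_PER_YEAR)


print(horizon_hours(96))
print(points_for_share(0.01))
print(points_for_share(0.003))
```

Under these assumptions, the default points_per_profile of 0.003 corresponds to roughly 105 points per building-year profile.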

A second dataset consists of travel time data provided by the Uber movement project. Note: This data is licensed under Creative Commons, Attribution Non-Commercial (https://creativecommons.org/licenses/by-nc/3.0/us/). This is different from the MIT license we provide for our package here. To prepare the data for usage with altility, use the prep_travel_forecasting_data() function provided in travel_forecasting.py with the following parameters and return values.

Parameters
path_to_data (='data/public/travel time forecasting/'):
string
The path to where data is stored. This is 'data/public/travel time forecasting/' in our original repository.
dataset_name (='Uber movement'):
string
This is currently the only dataset source we provide for travel time data. An alternative source is the Google Maps API.
city_name (='Amsterdam'):
string
Choose a city for which you want to predict missing travel time data between its individual city zones. All available cities can be seen under the path 'data/public/travel time forecasting/Uber movement/'.
test_split (=0.7):
float
Decide what share of the data to separate for creating the candidate data pool.
time_encoding (='ORD'):
string
Decide how to encode time stamp data. Choose one of 'ORD' for ordinal encoding or 'OHE' for one-hot encoding.
normalization (=True):
bool
Decide whether or not to normalize features.
standardization (=True):
bool
Decide whether to standardize features to zero mean and unit variance.
silent (=True):
bool
Decide whether or not to print out progress of data processing.
plot (=False):
bool
Decide whether or not to visualize examples of processed data.
Returns
datasets:
dict
A dictionary containing available and candidate data, that are stored with the keys 'avail_data' and 'cand_data'. These are dictionaries themselves, and store variables under keys 'x_t', 'x_s' and 'y'. These stand for only time-variant features 'x_t', only space-variant features 'x_s' and labels 'y'.
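
The difference between the 'ORD' and 'OHE' options for time_encoding can be illustrated with a small NumPy sketch. It shows the general techniques of ordinal and one-hot encoding for an hour-of-day feature; the helper names are hypothetical and the exact feature layout produced by altility may differ.

```python
import numpy as np


def encode_ordinal(values):
    """Ordinal encoding: one numeric column holding the raw value."""
    return np.asarray(values, dtype=float).reshape(-1, 1)


def encode_one_hot(values, n_categories):
    """One-hot encoding: one column per category, a single 1 per row."""
    values = np.asarray(values, dtype=int)
    encoded = np.zeros((len(values), n_categories))
    encoded[np.arange(len(values)), values] = 1.0
    return encoded


# Hour of day for three hypothetical time stamps.
hours = [0, 6, 23]

print(encode_ordinal(hours).shape)
print(encode_one_hot(hours, 24).shape)
```

Ordinal encoding keeps the feature vector small, while one-hot encoding avoids implying that hour 23 is far from hour 0, at the cost of 24 columns for this feature.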

### Examples:

An example for forecasting electric consumption of single buildings.

import numpy as np

import altility.adl_model as adl_model
import altility.datasets.load_forecasting as load_forecasting

### Import and prepare load forecasting data
datasets = load_forecasting.prep_load_forecasting_data(
    silent=False,
    plot=True
)

### Get features and labels for available data
y = datasets['avail_data']['y']
x_t = datasets['avail_data']['x_t']
x_s = datasets['avail_data']['x_s']
x_st = datasets['avail_data']['x_st']

### Get features and labels for candidate data from spatio-temporal test set
y_cand = datasets['cand_data']['y']
x_t_cand = datasets['cand_data']['x_t']
x_s_cand = datasets['cand_data']['x_s']
x_st_cand = datasets['cand_data']['x_st']

### Create a class instance (the name passed here is an arbitrary example)
model = adl_model.ADL_model('load_forecasting_example')

### Initialize model by creating and training it
model.initialize(
    y,
    x_t,
    x_s,
    x_st,
    silent=True,
    plot=True
)

### Collect candidate data; the results are the index and score lists
### described for ADL_model.collect()
batch_index_list, inf_score_list = model.collect(
    x_t_cand,
    x_s_cand,
    x_st_cand,
    silent=True,
    plot=False
)

### Create one array for picked and one for unpicked data to be predicted
picked_array = np.zeros([len(y_cand),], dtype=bool)
picked_array[batch_index_list] = True
pred_array = np.invert(picked_array)

### Extract selected data from candidate data pool for training
y_picked = y_cand[picked_array]
x_t_picked = x_t_cand[picked_array]
x_s_picked = x_s_cand[picked_array]
x_st_picked = x_st_cand[picked_array]

### Train model with picked data
model.train(
    y_picked,
    x_t_picked,
    x_s_picked,
    x_st_picked,
    silent=False,
    plot=True
)

### Extract not selected data from candidate data pool for testing/predicting
y_pred = y_cand[pred_array]
x_t_pred = x_t_cand[pred_array]
x_s_pred = x_s_cand[pred_array]
x_st_pred = x_st_cand[pred_array]

### Predict on remaining data; the results are the predictions and testing
### loss described for ADL_model.predict()
predictions, testing_loss = model.predict(
    y_pred,
    x_t_pred,
    x_s_pred,
    x_st_pred,
    silent=False,
    plot=True
)


An example for forecasting travel times between single city zones.

import numpy as np

import altility.adl_model as adl_model
import altility.datasets.travel_forecasting as travel_forecasting

### Import and prepare travel forecasting data
datasets = travel_forecasting.prep_travel_forecasting_data(
    silent=False,
    plot=True
)

### Get features and labels for available data
n_points = 1000
y = datasets['avail_data']['y'][:n_points]
x_t = datasets['avail_data']['x_t'][:n_points]
x_s = datasets['avail_data']['x_s'][:n_points]

### Get features and labels for candidate data from spatio-temporal test set
y_cand = datasets['cand_data']['y'][:n_points]
x_t_cand = datasets['cand_data']['x_t'][:n_points]
x_s_cand = datasets['cand_data']['x_s'][:n_points]

### Create a class instance (the name passed here is an arbitrary example)
model = adl_model.ADL_model('travel_forecasting_example')

### Initialize model by creating and training it
model.initialize(
    y,
    x_t=x_t,
    x_s=x_s,
    silent=True,
    plot=True
)

### Show us if we created all models
### (assumes the created models are exposed through a `models` attribute)
for model_name in model.models:
    print(model_name)

### Collect candidate data; the results are the index and score lists
### described for ADL_model.collect()
batch_index_list, inf_score_list = model.collect(
    x_t_cand,
    x_s_cand,
    silent=False,
    plot=True
)

### Create one array for picked and one for unpicked data to be predicted
picked_array = np.zeros([len(y_cand),], dtype=bool)
picked_array[batch_index_list] = True
pred_array = np.invert(picked_array)

### Extract selected data from candidate data pool for training
y_picked = y_cand[picked_array]
x_t_picked = x_t_cand[picked_array]
x_s_picked = x_s_cand[picked_array]

### Train model with picked data
model.train(
    y_picked,
    x_t_picked,
    x_s_picked,
    silent=False,
    plot=True
)

### Extract not selected data from candidate data pool for testing/predicting
y_pred = y_cand[pred_array]
x_t_pred = x_t_cand[pred_array]
x_s_pred = x_s_cand[pred_array]

### Predict on remaining data; the results are the predictions and testing
### loss described for ADL_model.predict()
predictions, testing_loss = model.predict(
    y_pred,
    x_t_pred,
    x_s_pred,
    silent=False,
    plot=True
)

