Karhunen Loève decomposed Gaussian processes with forward variable selection

About FoKL
Installation and Setup
Use Cases
User Documentation
Benchmarks and Papers
Future Development
Contact Us
License
Citations

About FoKL

FoKL-GPy, or FoKL, is a Python package intended for use in machine learning. The name comes from a unique implementation of Forward variable selection using Karhunen-Loève decomposed Gaussian Processes (GPs) in Python (i.e., FoKL-GPy).

The primary advantages of FoKL are:

Fast inference on static and dynamic datasets using scalable GP regression
Significant accuracy retained despite being fast

Some other advantages of FoKL include:

Export modeled non-linear dynamics as a symbolic equation (i.e., use a GP model in Pyomo)
Take first and second derivatives of model with respect to any input variable (e.g., gradient)
User-friendly (e.g., automatic handling of various dataset formats, automatic creation of training set, etc.)
Easy adjusting of hyperparameters for sweeping through variations in order to find optimal settings
Ability to save, share, and load models
Ability to import and evaluate a model without known data (i.e., without training)

To read more about FoKL, please see the Benchmarks and Papers section.

Installation and Setup

From your command-line terminal, FoKL is available through PyPI:

pip install FoKL

Alternatively, the GitHub repository may be cloned to create a local copy in which the examples and documentation will be included:

git clone https://github.com/ESMS-Group-Public/FoKL-GPy

Once installed, import the FoKL module in Python with:

from FoKL import FoKLRoutines

From here, the FoKL class object may be created and its methods accessed. Please see Use Cases to learn more about working with FoKL models.

Use Cases

Please first refer to the following for tutorials and examples:

Then, see User Documentation as needed.

User Documentation

FoKLRoutines
- load
- FoKL
  - clean
  - generate_trainlog
  - trainset
  - bss_derivatives
  - evaluate_basis
  - evaluate
  - coverage3
  - fit
  - clear
  - to_pyomo
  - save
fokl_to_pyomo
getKernels
GP_integrate

FoKLRoutines

The FoKLRoutines module houses the primary routines for a FoKL model. Namely, these are the load function and FoKL class object.

load

model = FoKLRoutines.load(filename, directory=None)

Load a FoKL class from a file. If failing to load a file and/or directory relative to the run script, ensure the terminal directory is set to that of the run script.

By default, directory is the current working directory that contains the script calling this method. An absolute or relative directory may be defined if the model to load is located elsewhere.

For simplicity, enter the returned output from save as the argument here, i.e., for filename. Do this while leaving directory blank since filename can simply include the directory itself.

FoKL

model = FoKLRoutines.FoKL(**kwargs)

This creates a class object that contains all information relevant to and defining a FoKL model.

Upon initialization, hyperparameters and some other settings are defined with default values as attributes of the FoKL class. These attributes are as follows, and any or all may be specified as a keyword or later updated by redefining the class attribute.

Type	Keyword Argument	Default Value	Description
hyperparameter	`kernel`	`'Cubic Splines'`	Format of basis functions from the BSS-ANOVA kernel to use for training a model (`'Cubic Splines'` or `'Bernoulli Polynomials'`)
"	`phis`	$f($ `kernel` $)$	Data structure with coefficients for basis functions
"	`relats_in`	`[]`	Boolean matrix indicating which input variables and/or interactions should be excluded from the model
"	`a`	`4`	Shape parameter of the initial-guess distribution for the observation error variance of the data
"	`b`	$f($ `a`, `data` $)$	Scale parameter of the initial-guess distribution for the observation error variance of the data
"	`atau`	`4`	Parameter of the initial-guess distribution for the $\tau^2$ parameter
"	`btau`	$f($ `atau`, `data` $)$	Parameter of the initial-guess distribution for the $\tau^2$ parameter
"	`tolerance`	`3`	Influences how long to continue training after additional terms yield diminishing returns
"	`burnin`	`1000`	Total number of draws from the posterior for each tested model before the `draws` draws
"	`draws`	`1000`	Total number of draws from the posterior for each tested model after the `burnin` draws
"	`gimmie`	`False`	Boolean to return the most complex model tried instead of the model with the optimum Bayesian information criterion (BIC)
"	`way3`	`False`	Boolean to include three-way interactions
"	`threshav`	`0.05`	Threshold to propose terms for elimination. Increase to propose and eliminate more terms
"	`threshstda`	`0.5`	Threshold to eliminate terms based on standard deviation relative to mean
"	`threshstdb`	`2`	Threshold to eliminate terms based on standard deviation independent of mean
"	`aic`	`False`	Boolean to use Akaike information criterion (AIC)
setting	`UserWarnings`	`True`	Boolean to print user-warnings (i.e., FoKL warnings) to command terminal
"	`ConsoleOutput`	`True`	Boolean to print progress of model training to command terminal

The following methods are embedded within the class object:

Method	Description
clean	Automatically format and normalize user-provided dataset.
generate_trainlog	Generate random indices of the dataset to use as a training set.
trainset	Return the training set.
bss_derivatives	Algebraically calculate partial derivatives of model with respect to input variables.
evaluate_basis	Calculate value of specified basis function at single point along normalized domain.
evaluate	Calculate values of FoKL model for all requested sets of datapoints.
coverage3	Evaluate FoKL model, calculate confidence bounds, calculate RMSE, and produce plot.
fit	Train new FoKL model to best-fit training dataset according to hyperparameters.
clear	Delete attributes from FoKL class so that new models may be trained without new class objects.
to_pyomo	Convert a FoKL model to an expression in a Pyomo model.
save	Save FoKL class with all its attributes to retain model and avoid re-training.

Each method has optional inputs that allow for flexibility in how FoKL is used so that you may leverage these methods for your specific requirements. Please refer to the Use Cases first, then explore the following documentation of each method as needed.

clean

model.clean(inputs, data=None, **kwargs)

Automatically format and normalize datasets. Note that data is not required but should be entered if available; otherwise, leave blank. Multiple options are available to govern the normalization of inputs. See Automatically formatting and normalizing datasets for example usage.

Input	Type	Description	Default
`inputs`	any	$n \times m$ input matrix $\mathbf{x}$ of $n$ instances by $m$ features in model $\overline{y}=f(\overline{x}_1,...,\overline{x}_m)$	n/a
`data`	any	$n \times 1$ output vector $\overline{y}$ of $n$ instances in model $\overline{y}=f(\overline{x}_1,...,\overline{x}_m)$	`None`

Keyword	Type	Description	Default
`train`	scalar	(0,1] fraction of $n$ instances to use for training	`1`
`AutoTranspose`	boolean	assumes $n > m$ and transposes dataset accordingly	`True`
`bit`	integer	(16, 32, 64) floating point bits to save dataset as	`64`
`normalize`	boolean	to pass formatted dataset to `_normalize()`	`True`
`minmax`	list of [min, max] lists	upper/lower bounds of each input variable	model.minmax
`pillow`	list of [lower, upper] lists	fraction of span by which to expand [min, max]; or, values on 0-1 scale that [min, max] should map to	`0`

After calling clean, the now normalized and formatted dataset gets saved as attributes of the FoKL class. Be sure to use these attributes in place of the originally entered inputs and data so that normalization and formatting errors are avoided. The attributes are as follows:

Attribute	Type	Description
`model.inputs`	$n \times m$ ndarray	normalized and formatted `inputs`
`model.data`	$n \times 1$ ndarray	formatted `data`
`model.minmax`	list of $m$ lists	[min, max] factors used to normalize `inputs` to `model.inputs`
`model.trainlog`	$n \times 1$ ndarray	logical index of instances from dataset to use as training set

To then access the training set [traininputs, traindata], see trainset.
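
As a rough illustration of the role model.minmax plays, the sketch below scales each input column to the 0-1 range and reuses the same [min, max] pairs for new points. It is a minimal stand-in written in plain Python, not FoKL's internal routine; the function name minmax_normalize is hypothetical, it ignores pillow, and it assumes every column has max > min.

```python
def minmax_normalize(inputs, minmax=None):
    """Scale each column of `inputs` (a list of rows) to [0, 1].

    `minmax` is a list of [min, max] pairs, one per column; if omitted it is
    computed from the data. Illustrative only, not FoKL's internal routine.
    """
    m = len(inputs[0])
    if minmax is None:
        minmax = [[min(row[j] for row in inputs), max(row[j] for row in inputs)]
                  for j in range(m)]
    normalized = [[(row[j] - minmax[j][0]) / (minmax[j][1] - minmax[j][0])
                   for j in range(m)] for row in inputs]
    return normalized, minmax


x = [[10.0, 1.0], [20.0, 3.0], [30.0, 5.0]]
x_norm, mm = minmax_normalize(x)
# Reuse `mm` so that new points land on the same scale as the original data.
x_new, _ = minmax_normalize([[25.0, 2.0]], minmax=mm)
```

Keeping the [min, max] pairs alongside the normalized values is what allows later inputs to be evaluated on a consistent scale.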

generate_trainlog

model.trainlog = model.generate_trainlog(train, n=None)

Generate a random logical vector of length $n$ in which a fraction train of the entries is True. It is expected that generate_trainlog will be called internally by clean and not by the user, though this method is available when sweeping through values of train in order to compare the accuracy of models fitted to training sets of different sizes.
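
The idea of a random logical index can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the helper make_trainlog and its seed argument are hypothetical, and the rounding of train * n may differ from what FoKL does.

```python
import random


def make_trainlog(train, n, seed=None):
    """Return a list of n booleans with round(train * n) entries set to True."""
    if not 0 < train <= 1:
        raise ValueError("train must be in (0, 1]")
    rng = random.Random(seed)
    n_train = round(train * n)
    chosen = set(rng.sample(range(n), n_train))
    return [i in chosen for i in range(n)]


# Sweep over training fractions and record the resulting training-set sizes.
sizes = {train: sum(make_trainlog(train, n=20, seed=0)) for train in (0.25, 0.5, 1)}
```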

trainset

traininputs, traindata = model.trainset()

Run this line to access the training set, which is simply model.inputs and model.data indexed by model.trainlog. See clean for how model.inputs and model.data get defined and/or generate_trainlog for how model.trainlog gets defined.
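
Indexing by a logical vector amounts to keeping the rows flagged True. A minimal sketch with plain lists follows; the helper apply_trainlog is hypothetical, and with ndarrays the same selection would typically be written as boolean indexing.

```python
def apply_trainlog(inputs, data, trainlog):
    """Keep only the rows of `inputs` and `data` where `trainlog` is True."""
    traininputs = [row for row, keep in zip(inputs, trainlog) if keep]
    traindata = [y for y, keep in zip(data, trainlog) if keep]
    return traininputs, traindata


inputs = [[0.0, 0.1], [0.5, 0.2], [1.0, 0.3]]
data = [1.0, 2.0, 3.0]
traininputs, traindata = apply_trainlog(inputs, data, [True, False, True])
```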

bss_derivatives

dy = model.bss_derivatives(**kwargs)

Returns the gradient of the modeled function with respect to each, or each specified, input variable. If the default settings are overridden, first and second partial derivatives can be returned for any variables.

Keyword	Type	Description	Default
`inputs`	-	see `model.inputs` of clean	`model.inputs`
`kernel`	-	see `kernel` of FoKL	`model.kernel`
`d1`	integer (for single) or list of booleans (for multiple)	index of input variable(s) (i.e., state(s)) to use for first partial derivative; see tip below	`True`
`d2`	integer (for single) or list of booleans (for multiple)	index of input variable(s) (i.e., state(s)) to use for second partial derivative; see tip below	`False`
`draws`	-	see `draws` of FoKL	`model.draws`
`betas`	-	see `betas` of FoKL	`model.betas`
`phis`	-	see `phis` of FoKL	`model.phis`
`mtx`	$(terms-1) \times m$ ndarray	interaction matrix defining terms in FoKL model by indexing basis function order for each term and input variable combination	`model.mtx`
`minmax`	-	see `minmax` of clean	`model.minmax`
`IndividualDraws`	boolean	for returning derivative(s) at each draw	`False`
`ReturnFullArray`	boolean	for returning $n \times m \times 2$ array with zeros for non-requested states such that indexing is preserved; otherwise, only requested states are squeezed into a 2D matrix where columns correspond to increasing input variable index and derivative order	`False`

Output	Type	Description	Default
`dy`	$n \times m \times 2$ ndarray if `ReturnFullArray=True`, else $n \times m_{\delta}$ where $m_{\delta}$ is the number of partial derivatives requested	derivative of model with respect to input variable(s) (i.e., state(s)) defined by `d1` and `d2`	gradient (i.e., $n \times m_{\delta}$ ndarray where $m_{\delta} =m$ because `d1=True, d2=False`)

Tip:

To turn off all first-derivatives, set d1=False instead of d1=0. The reason is d1 and d2, if set to an integer, will return the derivative with respect to the input variable indexed by that integer using Python indexing. In other words, for a two-input FoKL model, setting d1=1 and d2=0 will return the first-derivative with respect to the second input (d1=1) and the second-derivative with respect to the first input (d2=0). Alternatively, d1=[False, True] and d2=[True, False] will function the same so that boolean lists may be used in cases where the derivative with respect to more than one state, but not all states, is required.
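
The distinction between False and 0 can be made concrete by converting each accepted form into a boolean list. The helper as_state_mask below is a hypothetical sketch of that interpretation, not code taken from FoKL.

```python
def as_state_mask(d, m):
    """Convert a d1/d2-style argument into a list of m booleans."""
    if isinstance(d, bool):          # check bool first: bool is a subclass of int
        return [d] * m
    if isinstance(d, int):           # an integer selects one input by Python index
        return [j == d for j in range(m)]
    mask = [bool(v) for v in d]      # otherwise assume a list of booleans
    if len(mask) != m:
        raise ValueError("mask length must equal the number of inputs")
    return mask


# Two-input model: d1=1 and d2=0 match the boolean lists given in the tip.
d1_mask = as_state_mask(1, 2)
d2_mask = as_state_mask(0, 2)
off_mask = as_state_mask(False, 2)
```

Because Python treats False == 0 as true, the bool check has to come before the int check; otherwise d1=False would be read as "the first input".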

evaluate_basis

basis = model.evaluate_basis(c, x, kernel=None, d=0)

Evaluate a basis function at a single point by providing coefficients, $x$ value(s), and (optionally) the kernel. This method is primarily used internally by other methods and so is not expected to be used by the user, but is available for testing purposes and to provide insight toward how the basis functions get evaluated.

For evaluating a FoKL model, see evaluate.

Input	Type	Description
`c`	list of scalars	coefficients of the basis function or its derivative
`x`	scalar	value of independent variable at which to evaluate the basis function or its derivative

Keyword	Type	Description	Default
`kernel`	-	see `kernel` of FoKL	`model.kernel`
`d`	integer	order of derivative (where 0 is no derivative)	`0`

Output	Type	Description
`basis`	scalar	evaluation of basis function or its derivative at `x`

If insightful for understanding how to define c, the values of kernel and order d correspond to the following equations at which basis is evaluated:

Kernel	Order	Basis Function $B_i$ or its Derivative
`'Cubic Splines'`	`d=0`	$B_i=c_0+c_1 \cdot x+c_2 \cdot x^2+c_3 \cdot x^3 \implies$ `c[0] + c[1] * x + c[2] * (x ** 2) + c[3] * (x ** 3)`
"	`d=1`	$\frac{\partial}{\partial x}(B_i)=c_1+2\cdot c_2\cdot x+3\cdot c_3\cdot x^2 \implies$ `c[1] + 2 * c[2] * x + 3 * c[3] * (x ** 2)`
"	`d=2`	$\frac{\partial^2}{\partial x^2}(B_i)=2\cdot c_2+6\cdot c_3\cdot x \implies$ `2 * c[2] + 6 * c[3] * x`
`'Bernoulli Polynomials'`	`d=0`	$B_i=\sum_{k=0}^{i} (c_k \cdot x^k)\implies$ `c[0] + sum(c[k] * (x ** k) for k in range(1, len(c)))`
"	`d=1`	$\frac{\partial}{\partial x}(B_i)=\sum_{k=1}^{i} (k \cdot c_k \cdot x^{k-1})\implies$ `c[1] + sum(k * c[k] * (x ** (k - 1)) for k in range(2, len(c)))`
"	`d=2`	$\frac{\partial^2}{\partial x^2}(B_i)=\sum_{k=2}^{i} (k \cdot (k-1) \cdot c_k \cdot x^{k-2})\implies$ `sum((k - 1) * k * c[k] * (x ** (k - 2)) for k in range(2, len(c)))`

When called internally by evaluate, the coefficients c (i.e., $\overline{c}_i$) automatically correspond to $i$ such that $B_i=f(\overline{c}_i)$. For 'Cubic Splines', this is achieved by c = list(model.phis[i - 1][k][phind] for k in range(4)) where phind $=f(x)$. For 'Bernoulli Polynomials', this is achieved by c = model.phis[i - 1].
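
The expressions in the table above all have the form of a polynomial in x or one of its first two derivatives. A compact sketch that reproduces them for a given coefficient list is shown here; evaluate_polynomial is a hypothetical name, and the selection of c from model.phis is left out.

```python
def evaluate_polynomial(c, x, d=0):
    """Evaluate sum_k c[k] * x**k, or its d-th derivative (d in 0, 1, 2), at x."""
    if d not in (0, 1, 2):
        raise ValueError("d must be 0, 1, or 2")
    total = 0.0
    for k in range(d, len(c)):
        # Falling factorial k * (k - 1) * ... from differentiating x**k d times.
        factor = 1
        for j in range(d):
            factor *= k - j
        total += factor * c[k] * x ** (k - d)
    return total


c = [1.0, 2.0, 3.0, 4.0]  # hypothetical cubic coefficients c0..c3
values = [evaluate_polynomial(c, 2.0, d) for d in (0, 1, 2)]
```

For these coefficients at x = 2, the three values agree with the 'Cubic Splines' rows evaluated by hand: 1 + 4 + 12 + 32, then 2 + 12 + 48, then 6 + 48.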

evaluate

mean = model.evaluate(inputs=None, betas=None, mtx=None, **kwargs)

Evaluate the FoKL model for provided inputs and (optionally) calculate bounds.

Input	Type	Description	Default
`inputs`	-	see `model.inputs` of clean	`model.inputs`
`betas`	-	see `betas` of fit	`model.betas`
`mtx`	-	see `mtx` of fit	`model.mtx`

Keyword	Type	Description	Default
`minmax`	-	see `minmax` of clean	`None`
`draws`	-	see `draws` of FoKL	`model.draws`
`clean`	boolean	pass `inputs` to clean if true; note this will override `minmax` and result in `inputs` scaled to 0-1	`False`
`ReturnBounds`	boolean	return 95% confidence bounds as second output if true	`False`

If clean=True, then any keywords documented for clean may be used here.

Output	Type	Description
`mean`	$n \times 1$ ndarray	prediction of $\overline{y}$ in $\overline{y}=f(\overline{x}_1,...,\overline{x}_m)$ for provided `inputs`; prediction of `model.data` defined in clean by default (i.e., `inputs=model.inputs`)
`bounds` (optional)	$n \times 2$ ndarray	upper and lower bounds for 95% confidence interval of predicting; returned if `ReturnBounds=True`

coverage3

mean, bounds, rmse = model.coverage3(**kwargs)

For validation testing of a FoKL model. The default behavior is to evaluate all inputs (i.e., both training and test sets) using evaluate. Returned are the predicted output (mean), the 95% confidence bounds (bounds), and the root mean square error (rmse). A plot may be generated by setting plot=True; or, for a potentially more meaningful plot in terms of judging accuracy, plot='sorted' will plot the data in increasing order of value.

To govern what is passed to evaluate:

Keyword	Type	Description	Default
`inputs`	-	see `model.inputs` of clean	`model.inputs`
`data`	-	see `model.data` of clean	`model.data`
`draws`	-	see `draws` of FoKL	`model.draws`
`nrmse`	-	normalized root mean square error	False

To govern basic plot controls:

Keyword	Type	Description	Default
`plot`	boolean or string	for generating plot; set to `'sorted'` for plot of ordered data	`False`
`bounds`	boolean	for plotting bounds	`True`
`xaxis`	integer	index of the input variable to plot along the x-axis	indices
`labels`	boolean	for adding labels to plot	`True`
`xlabel`	string	x-axis label	`'Index'`
`ylabel`	string	y-axis label	`'Data'`
`title`	string	plot title	`'FoKL'`
`legend`	boolean	for adding legend to plot	`True`
`LegendLabelFoKL`	string	FoKL's label in legend	`'FoKL'`
`LegendLabelData`	string	Data's label in legend	`'Data'`
`LegendLabelBounds`	string	Bounds's label in legend	`'Bounds'`

To govern detailed plot controls:

Keyword	Type	Description	Default
`PlotTypeFoKL`	string	FoKL's color and line type	`'b'`
`PlotSizeFoKL`	scalar	FoKL's line size	`2`
`PlotTypeBounds`	string	Bounds' color and line type	`'k--'`
`PlotSizeBounds`	scalar	Bounds' line size	`2`
`PlotTypeData`	string	Data's color and line type	`'ro'`
`PlotSizeData`	scalar	Data's line size	`2`

Output	Type	Description
`mean`	-	see `mean` of evaluate
`bounds`	-	see `bounds` of evaluate
`rmse`	scalar	Root Mean Squared Error (RMSE) of prediction in relation to `data`
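
For reference, the RMSE of a prediction relative to data follows the standard definition sketched below. This is the textbook formula in plain Python; the function name is hypothetical, and the normalization applied when nrmse is enabled is not covered.

```python
import math


def rmse(prediction, data):
    """Root mean square error between two equal-length sequences."""
    if len(prediction) != len(data):
        raise ValueError("prediction and data must have the same length")
    squared = [(p - y) ** 2 for p, y in zip(prediction, data)]
    return math.sqrt(sum(squared) / len(squared))


error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```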

fit

betas, mtx, evs = model.fit(inputs=None, data=None, **kwargs)

Training routine for fitting model to known inputs and data.

Input	Type	Description	Default
`inputs`	-	see `traininputs` of trainset	`traininputs, _ = model.trainset()`
`data`	-	see `traindata` of trainset	`_, traindata = model.trainset()`

Keyword	Type	Description	Default
`clean`	boolean	pass `inputs` and `data` to clean if true	`False`
`ConsoleOutput`	boolean	print [ind, ev] to console during FoKL model generation; will print percent completed of each Gibbs sampler call prior to [ind, ev] if large dataset (i.e., if less than 64-bit was requested in clean)	`True`

If clean=True, then any keywords documented for clean may be used here.

Output	Type	Description
`betas`	$draws \times terms$ ndarray	draws from the posterior distribution of coefficients, with rows corresponding to draws (i.e., a single set of coefficients) and columns corresponding to terms in the model (i.e., $\beta_0, \beta_1, \dots $)
`mtx`	$(terms-1) \times m$ ndarray	interaction matrix defining order of basis function for term/variable combinations in FoKL model, with rows corresponding to terms (i.e., columns of `betas` beyond the first column) and columns corresponding to input variables (i.e., columns of `model.inputs`)
`evs`	ndarray	vector of BIC values corresponding to each proposed model during training
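
The shapes of betas and mtx suggest how a prediction is assembled: a constant plus, for each row of mtx, a coefficient times a product of basis functions. The sketch below illustrates that structure under two assumptions that are not stated above: a zero entry in mtx means the input is absent from the term, and the basis function is a hypothetical stand-in (a monomial) rather than one built from model.phis.

```python
def basis(order, x):
    """Hypothetical stand-in basis function; FoKL uses its kernel's phis instead."""
    return x ** order


def predict(beta, mtx, x):
    """Evaluate one draw: beta[0] + sum_t beta[t + 1] * prod_j basis(mtx[t][j], x[j])."""
    y = beta[0]                                  # constant term, first column of betas
    for t, orders in enumerate(mtx):             # one row of mtx per remaining term
        term = 1.0
        for j, order in enumerate(orders):
            if order > 0:                        # assumption: 0 means input j is unused
                term *= basis(order, x[j])
        y += beta[t + 1] * term
    return y


mtx = [[1, 0], [0, 2], [1, 1]]                   # 3 terms, 2 inputs
betas = [[1.0, 2.0, 3.0, 4.0],                   # 2 draws x 4 coefficients
         [1.0, 2.0, 3.0, 6.0]]
x = [0.5, 2.0]
per_draw = [predict(beta, mtx, x) for beta in betas]
mean_prediction = sum(per_draw) / len(per_draw)
```

Evaluating every draw and then averaging, as in the last two lines, is one simple way to turn a matrix of posterior draws into a single prediction.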

clear

model.clear(keep=None, clear=None, all=False)

Delete all attributes from the FoKL class except for hyperparameters and settings, unless otherwise specified by the clear keyword. If an attribute is listed in both the clear and keep keywords, then the attribute is cleared.

Input	Type	Description	Default
`keep`	list of strings	attributes to keep in addition to hyperparameters and settings, e.g., `keep=['inputs', 'mtx']`	`model.keep`
`clear`	list of strings	hyperparameters to delete, e.g., `clear=['kernel', 'phis']`	`None`
`all`	boolean	if `True` then all attributes (including hyperparameters) get deleted regardless	`False`

Note when the FoKL class was initialized, model.keep got defined by default as a list of strings including the names of all hyperparameters and settings. These then get preserved here by default.

To remove all attributes from the class, simply call:

model.clear(all=True)
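
The precedence between keep, clear, and all can be mimicked with a toy class. This is a hypothetical stand-in that only follows the rules described above; the attribute names and the contents of keep are illustrative.

```python
class Model:
    """Toy stand-in for a class with hyperparameters and fitted attributes."""

    def __init__(self):
        self.kernel = "Cubic Splines"            # hyperparameter
        self.draws = 1000                        # hyperparameter
        self.keep = ["kernel", "draws", "keep"]  # names preserved by default
        self.betas = [[0.0]]                     # fitted attribute
        self.mtx = [[1]]                         # fitted attribute

    def clear(self, keep=None, clear=None, all=False):
        kept = set() if all else set(self.keep) | set(keep or [])
        kept -= set(clear or [])                 # clear wins over keep
        for name in list(vars(self)):
            if name not in kept:
                delattr(self, name)


model = Model()
model.clear(keep=["mtx"])
remaining = sorted(vars(model))
```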

to_pyomo

FoKL models can be converted to Pyomo expressions. To use this feature, either install Pyomo directly or install it as an optional dependency of FoKL:

pip install FoKL[pyomo]

m = model.to_pyomo(xvars, yvars, m=None, xfix=None, yfix=None, truescale=True, std=True, draws=None)

Pass arguments to fokl_to_pyomo. If embedding a single GP in Pyomo rather than multiple, it is recommended to use this method to avoid importing an additional module in the run script.

save

filepath = model.save(filename=None, directory=None)

Save a FoKL class as a file with extension '.fokl'. If not saving where expected relative to the run script, ensure the terminal directory is set to that of the run script.

Both inputs are optional. By default, filename is of the form 'model_yyyymmddhhmmss.fokl' and is saved to the current directory. To change the directory, embed within filename or assign to directory if using the default filename format.

Returned is filepath. Enter this as the argument to load to later reload the model. Explicitly, that is:

FoKLRoutines.load(filepath)

Input	Type	Description
`filename`	string	name of file to save model as (note '.fokl' extension can be automatically or manually appended)
`directory`	string	absolute or relative path to pre-existing folder in which to save `filename`

Output	Type	Description
`filepath`	string	absolute path to where the file was saved
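
The save-then-load round trip, including the default timestamped filename, can be sketched generically. The example below assumes pickle serialization, which is not confirmed above, and save_object and load_object are hypothetical helpers rather than FoKL functions.

```python
import os
import pickle
import tempfile
from datetime import datetime


def save_object(obj, filename=None, directory=None):
    """Pickle `obj` to a '.fokl' file and return the absolute path."""
    if filename is None:
        filename = "model_" + datetime.now().strftime("%Y%m%d%H%M%S")
    if not filename.endswith(".fokl"):
        filename += ".fokl"
    filepath = os.path.abspath(os.path.join(directory or os.getcwd(), filename))
    with open(filepath, "wb") as f:
        pickle.dump(obj, f)
    return filepath


def load_object(filepath):
    """Load an object previously written by save_object."""
    with open(filepath, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as folder:
    path = save_object({"kernel": "Cubic Splines"}, directory=folder)
    restored = load_object(path)
```

Returning the absolute path from the save step is what makes it safe to pass the result straight to the load step regardless of the current working directory.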

fokl_to_pyomo

from FoKL.fokl_to_pyomo import fokl_to_pyomo
m = fokl_to_pyomo(models, xvars, yvars, m=None, xfix=None, yfix=None, truescale=True, std=True, draws=None)

Embed GPs in Pyomo by automatically converting draws from FoKL models trained with or defined by the 'Bernoulli Polynomials' kernel to symbolic expressions in a Pyomo model.

Defining the Pyomo model's objective and any other constraints must be done outside of fokl_to_pyomo. The user must then define an appropriate solver for the Pyomo model. The following is known to work for global optimization problems, which will likely be required for a problem using a GP model.

import pyomo.environ as pyo

solver = pyo.SolverFactory('multistart')
solver.solve(m, solver='ipopt')

For documentation on the components of the Pyomo model automatically generated, see nomenclature_of_fokl_to_pyomo.ipynb. The function arguments are as follows.

Input	Type	Description	Default
`models`	list of FoKL class objects	multiple FoKL models to be embedded in single Pyomo model	-
`xvars`	list of lists of strings	strings are input variable names, and lists correspond to models	-
`yvars`	list of strings	strings are output variable names corresponding to models	-
`m`	Pyomo model	pre-existing Pyomo model if existing	`pyo.ConcreteModel()`
`xfix`	list of lists of floats	floats are input variable values (if known and to be fixed), and lists correspond to models	-
`yfix`	list of floats	floats are output variable values (if known and to be fixed) corresponding to models	-
`truescale`	list of lists of booleans	corresponding to the variables created by `xvars`, set `True` to use true scale (i.e., un-normalized) values and set `False` to use normalized values; unless `xvars` is to correspond to the normalized input variables, leave blank	`[[True, ..., True], ..., [True, ..., True]]`
`std`	list of booleans	set `False` if standard deviation of FoKL model (corresponding to position in list) is not needed so Pyomo model only defines mean	`[[True, ..., True], ..., [True, ..., True]]`
`draws`	int	number of most recent draws to embed in Pyomo	`model.draws`

getKernels

For internal use.

This module is used by FoKL during initialization to return the data structure phis, containing coefficients for the basis functions specified by kernel.

from FoKL import getKernels
phis = getKernels.sp500()  # kernel == 'Cubic Splines'
phis = getKernels.bernoulli()  # kernel == 'Bernoulli Polynomials'

GP_integrate

from FoKL.GP_Integrate import GP_Integrate
T, Y = GP_Integrate(betas, matrix, b, norms, phis, start, stop, y0, h, used_inputs)

Integrate FoKL models that were fitted to derivatives. Multiple models can be integrated simultaneously. Currently, only models trained on the 'Cubic Splines' basis functions are supported.

For example, training model1 on $\dot{x} = f(x, b_1)$ and model2 on $\dot{y} = f(y, b_2)$ is as usual. Then, to integrate the models with constants $(b_1, b_2)$ set to b and initial conditions $(x_0, y_0)$ set to y0,

betas1, mtx1, _ = model1.fit([x, b1], xdot)
betas2, mtx2, _ = model2.fit([y, b2], ydot)

T, Y = GP_Integrate([np.mean(betas1, axis=0), np.mean(betas2, axis=0)], 
                    [mtx1, mtx2], 
                    [b1, b2], 
                    ..., 
                    [x0, y0], 
                    ...)

Input	Description
`betas`	`betas` is a list of arrays in which each entry to the list contains a specific row of the betas matrix, or the mean of the betas matrix for each model being integrated.
`matrix`	`matrix` is a list of arrays containing the interaction matrix of each model.
`b`	`b` is an array of the values of all the other inputs to the model(s) (including any forcing functions) over the time period we integrate over. The length of `b` should be equal to the number of points in the final time series `(stop - start) / h`. All values in `b` need to be normalized with respect to the min and max values of their respective values in the training dataset.
`norms`	`norms` is a matrix of the min and max values of all the inputs being integrated (in the same order as `y0`). Min values are in the top row, max values in the bottom.
`phis`	`phis` is a data structure with coefficients for basis functions.
`start`	`start` is the time at which integration begins.
`stop`	`stop` is the time to end integration.
`y0`	`y0` is an array of the initial conditions for the models being integrated.
`h`	`h` is the step size with respect to time.
`used_inputs`	`used_inputs` is a list of arrays containing the information as to what inputs are used in what model. Each array should contain a vector corresponding to a different model. Inputs should be referred to as those being integrated first, followed by those contained in `b` (in the same order as they appear in `y0` and `b` respectively).

Output	Description
`T`	`T` is an array of the time steps the models are integrated at.
`Y`	`Y` is an array of the models that have been integrated, at the time steps contained in `T`.
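
Two requirements on b from the table above, its length and its normalization, can be checked before calling the integrator. The sketch below is illustrative only: the forcing values, the training range, and the helper normalize are hypothetical.

```python
def normalize(values, lo, hi):
    """Map values from [lo, hi] (training min/max) onto [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]


start, stop, h = 0.0, 2.0, 0.5
n_points = round((stop - start) / h)          # expected length of each series in b

# Hypothetical forcing input sampled once per step, with training range [10, 30].
forcing = [10.0, 15.0, 20.0, 30.0]
b_normalized = normalize(forcing, lo=10.0, hi=30.0)
length_matches = len(b_normalized) == n_points
```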

To demonstrate used_inputs, suppose two models were being integrated with 3 other inputs total. The 1st model uses the output of both models as inputs; and, the 1st and 3rd additional inputs. The 2nd model uses its own output as an input; and, the 2nd and 3rd additional inputs. This yields

used_inputs = [[1, 1, 1, 0, 1], [0, 1, 0, 1, 1]]

If the models created do not follow this ordering scheme for their inputs, the inputs can be rearranged based upon an alternate numbering scheme provided to used_inputs. E.g., if the inputs need to be reordered then the 1st input should have a '1' in its place in the used_inputs vector, the 2nd input should have a '2', and so on. Using the same example as before, if the 1st model's inputs need to be rearranged so that the 3rd additional input comes first, followed by the two model outputs in the same order as they are in y0, and ends with the 1st additional input, then the 1st list in used_inputs would be [2, 3, 4, 0, 1].
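
Both numbering schemes can be read the same way: zero means unused, and nonzero entries give the order in which inputs are passed to the model. The helper select_inputs and the input names below are hypothetical and only illustrate that reading.

```python
def select_inputs(used, names):
    """Return the names picked by one row of used_inputs, in evaluation order.

    Zero means unused. Nonzero entries are sorted by value, so an all-ones row
    keeps the default order and a row of 1, 2, 3, ... gives a custom order.
    """
    picked = [(rank, i) for i, rank in enumerate(used) if rank != 0]
    return [names[i] for rank, i in sorted(picked)]


names = ["y1", "y2", "b1", "b2", "b3"]   # integrated states first, then b
default_order = select_inputs([1, 1, 1, 0, 1], names)
custom_order = select_inputs([2, 3, 4, 0, 1], names)
```

With the reordered row, the 3rd additional input comes first, then the two model outputs, then the 1st additional input, matching the description above.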

Updating Models

BSS-ANOVA models can be updated as new data becomes available. To enable this, the following hyperparameters control the model-updating methods:

Hyperparameter	Description	Necessary to define?
update	Removes variable selection functionality to allow for future updates of models	Yes
sigsqd0	Initial guess for sigma squared	Yes
burn	Number of draws to remove from the prior betas model before new fitting	No (defaults to 500)
built	Boolean indicating whether the model has been previously built	Yes

Once the proper parameters are in place, models can be updated with each successive call to model.fit with redefined inputs and data. See the update Sigmoid Example Problem for an example.

Benchmarks and Papers

As mentioned in About FoKL, the primary advantage offered by FoKL in comparison to other machine learning packages is a significant decrease in computation time for training a model while not experiencing a significant decrease in accuracy. This holds true for most datasets but especially for those with an underlying static or dynamic relationship as is often the case in any physical science experiment.

The following paper outlines the methodology of FoKL and includes two example problems.

Fast variable selection makes Karhunen-Loève decomposed Gaussian process BSS-ANOVA a speedy and accurate choice for dynamic systems identification

The two example problems are:

‘Susceptible, Infected, Recovered’ (SIR) toy problem
‘Cascaded Tanks’ experimental dataset for a benchmark

Future Development

FoKL-GPy is actively in development. Current focus is on:

Pyomo
optimization of code and integration with faster C++ routines
adding examples for better comparisons and benchmarks
more robust tutorials

Please reach out via the information in the Contact Us section with any suggestions for development.

Contact Us

Topic	Point of Contact	Email
Installation, Troubleshooting, Development	Nam Khuu	ntk00002@mix.wvu.edu
Research, Theory, Other	David Mebane	david.mebane@mail.wvu.edu

License

FoKL-GPy has an MIT license. Please see the LICENSE file.

Citations

Please cite: K. Hayes, M.W. Fouts, A. Baheri and D.S. Mebane, "Forward variable selection enables fast and accurate dynamic system identification with Karhunen-Loève decomposed Gaussian processes", PLoS ONE 19(9): e0309661.

Credits: David Mebane (ideas and original code), Kyle Hayes (integrator), Derek Slack (Python porting), Jacob Krell (Python v3 dev.)

Funding provided by National Science Foundation, Award No. 2119688
