Multi-variable polynomial regression for curve fitting.
Overview
MVPR is available on PyPI, and can be installed via
pip install MVPR
This package fits a multi-variable polynomial to a set of data, using cross-validation to select the polynomial order. The solution is regularised using a truncated singular value decomposition in the Moore-Penrose pseudo-inverse, where the truncation point is found using a golden section search. This makes it suitable for ill-posed problems and helps prevent over-fitting to noise.
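To illustrate the regularisation idea, here is a minimal NumPy sketch. It is not MVPR's actual implementation: the helper names `tsvd_solve` and `golden_section_int` are hypothetical, and the search assumes the error is a unimodal function of the truncation point, which is what a golden section search requires.

```python
import math
import numpy as np

def tsvd_solve(A, b, k):
    """Least-squares solution of A x ~ b using only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = max(1, min(k, len(s)))
    # Truncated pseudo-inverse: V_k diag(1/s_k) U_k^T applied to b.
    return (Vt[:k].T / s[:k]) @ (U[:, :k].T @ b)

def golden_section_int(f, lo, hi):
    """Minimise f over the integers lo..hi, assuming f is unimodal there."""
    inv_phi = (math.sqrt(5) - 1) / 2
    cache = {}

    def g(i):
        # Cache evaluations so each candidate is only computed once.
        if i not in cache:
            cache[i] = f(i)
        return cache[i]

    while hi - lo > 2:
        # Two interior points placed by the golden ratio, rounded to integers.
        c = int(round(hi - inv_phi * (hi - lo)))
        d = int(round(lo + inv_phi * (hi - lo)))
        if c >= d:
            d = c + 1
        if g(c) <= g(d):
            hi = d  # the minimum lies in lo..d
        else:
            lo = c  # the minimum lies in c..hi
    # At most three candidates remain; pick the best directly.
    return min(range(lo, hi + 1), key=g)
```

In a cross-validated fit, `f(k)` would typically be the validation error of the coefficients returned by `tsvd_solve(A_train, b_train, k)`.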
Example
Consider a 3-D set of data, plotted as follows:
and another set:
We want to find a mapping function for each set from the same input data. This can be done with the MVPR code as follows.
First import the data:
import pandas as pd
df = pd.read_excel(r'C:\Users\filepath\data.xlsx')
data = df.to_numpy()
df = pd.read_excel(r'C:\Users\filepath\targets.xlsx')
targets = df.to_numpy()
Select the proportion of the data used for training; the remainder is used for validation:
proportion_training = 0.9
num_train_samples = round(len(data) * proportion_training)
num_val_samples = len(data) - num_train_samples
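Standardise the data, then split it into training and validation sets:
mean_dat = data.mean(axis=0)
std_dat = data.std(axis=0)
data -= mean_dat
# Only scale if no column has zero standard deviation, to avoid dividing by zero
if 0 not in std_dat:
    data /= std_dat
training_data = data[:num_train_samples, :]
training_targets = targets[:num_train_samples, :]
# Slice from num_train_samples so an empty validation set stays empty
validation_data = data[num_train_samples:, :]
validation_targets = targets[num_train_samples:, :]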
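Fit the model, then predict and save the validation outputs:
import MVPR as MVP  # import alias assumed here; the original example does not show it
M = MVP.MVPR_forward(training_data, training_targets, validation_data, validation_targets)
# Find the polynomial order by cross-validation, then fit at that order
optimum_order = M.find_order()
coefficient_matrix = M.compute_CM(optimum_order)
predicted_validation = M.compute(coefficient_matrix, optimum_order, validation_data)
df = pd.DataFrame(predicted_validation)
df.to_excel(r'C:\Users\filepath\predicted.xlsx')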
The fitted curves:
Functions and arguments
MVPR.find_order()
This function finds the optimal order of polynomial in the range 0 to 6, using cross validation.
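As an illustration of order selection by validation error, here is a minimal single-variable sketch. It is not MVPR's implementation: `select_order` is a hypothetical helper, it uses an ordinary least-squares fit rather than a regularised one, and it assumes a single training/validation split.

```python
import numpy as np
from numpy.polynomial.polynomial import polyvander

def select_order(x_train, y_train, x_val, y_val, max_order=6):
    """Return the order in 0..max_order with the lowest validation error."""
    best_order, best_err = 0, np.inf
    for order in range(max_order + 1):
        # Columns 1, x, x^2, ..., x^order for the training inputs.
        A_train = polyvander(x_train, order)
        coeffs, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
        # Score the fit on data it has not seen.
        A_val = polyvander(x_val, order)
        err = np.mean((A_val @ coeffs - y_val) ** 2)
        if err < best_err:
            best_order, best_err = order, err
    return best_order
```

Scoring on held-out data rather than on the training data is what keeps the highest order from always winning.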
MVPR.compute_CM(order)
This function computes the coefficient matrix which fits a polynomial to the measured data in a least squares sense. The fit is regularised using truncated singular value decomposition, which eliminates singular values under a certain threshold. Any order can be passed in by the user; it is not limited to the range searched by find_order().
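A minimal sketch of this kind of fit for several input variables is given below. The names `design_matrix` and `fit_coefficients` are hypothetical, and the choice of terms (all monomials up to the given total degree, including cross terms) and the fixed threshold `rcond` are assumptions, not a description of MVPR's internals. `np.linalg.pinv` discards singular values below `rcond` times the largest one, which is one simple form of truncation.

```python
import itertools
import numpy as np

def design_matrix(X, order):
    """Columns are all monomials of the input variables up to total degree `order`."""
    n_samples, n_vars = X.shape
    columns = [np.ones(n_samples)]  # degree-0 (constant) term
    for degree in range(1, order + 1):
        # Each combination picks `degree` variables, with repetition, to multiply.
        for combo in itertools.combinations_with_replacement(range(n_vars), degree):
            columns.append(np.prod(X[:, list(combo)], axis=1))
    return np.column_stack(columns)

def fit_coefficients(X, Y, order, rcond=1e-10):
    """Least-squares coefficient matrix with small singular values discarded."""
    A = design_matrix(X, order)
    return np.linalg.pinv(A, rcond=rcond) @ Y
```

Predictions for new inputs `X_new` would then be `design_matrix(X_new, order) @ coefficients`, with one column of coefficients per target variable.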
Theory
For the theory behind the code see [1].
References
[1] Hansen, P. C. (1997). Rank-deficient and Discrete Ill-posed Problems: Numerical Aspects of Linear Inversion.