Smooth data fitting in N dimensions
Project description
Given experimental data, it is often desirable to produce a function whose values match the data to some degree. This package implements a robust approach to data fitting based on the minimization problem
(A similar idea is used in for data smoothing in signal processing; see, e.g., section 8.3 in this document.)
Unlike polynomial regression or GaussNewton, smoothfit makes no assumptions about the function other than that it is smooth.
The generality of the approach makes it suitable for function whose domain is multidimensional, too.
Pics or it didn't happen
Runge's example
import matplotlib.pyplot as plt import numpy import smoothfit a = 1.5 b = +1.5 # plot original function x = numpy.linspace(a, b, 201) plt.plot(x, 1 / (1 + 25 * x ** 2), "", color="0.8", label="1 / (1 + 25 * x**2)") # 21 sample points x0 = numpy.linspace(1.0, 1.0, 21) y0 = 1 / (1 + 25 * x0 ** 2) plt.plot(x0, y0, "xk") u = smoothfit.fit1d(x0, y0, a, b, 1000, degree=1, lmbda=1.0e6) x = numpy.linspace(a, b, 201) vals = [u(xx) for xx in x] plt.plot(x, vals, "", label="smooth fit") plt.ylim(0.1) plt.grid() plt.show()
Runge's example function is a tough nut for classical polynomial regression.
If there is no noise in the input data, the parameter lmbda
can be chosen quite small
such that all data points are approximated well. Note that there are no oscillations
in the output function u
.
Runge's example with noise
lmbda = 0.001 
lmbda = 0.05 
lmbda = 0.2 
import matplotlib.pyplot as plt import numpy import smoothfit a = 1.5 b = +1.5 # plot original function x = numpy.linspace(a, b, 201) plt.plot(x, 1 / (1 + 25 * x ** 2), "", color="0.8", label="1 / (1 + 25 * x**2)") # 21 sample points numpy.random.seed(0) n = 51 x0 = numpy.linspace(1.0, 1.0, n) y0 = 1 / (1 + 25 * x0 ** 2) y0 += 1.0e1 * (2 * numpy.random.rand(n)  1) plt.plot(x0, y0, "xk") lmbda = 5.0e2 u = smoothfit.fit1d(x0, y0, a, b, 1000, degree=1, lmbda=lmbda) x = numpy.linspace(a, b, 201) vals = [u(xx) for xx in x] plt.plot(x, vals, "", label="smooth fit") plt.grid() plt.show()
If the data is noisy, lmbda
needs to be chosen more carefully. If too small, the
approximation tries to resolve all data points, resulting in many small oscillations.
If it's chosen too large, no details are resolved, not even those of the underlying
data.
Few samples
import numpy import smoothfit x0 = numpy.array([0.038, 0.194, 0.425, 0.626, 1.253, 2.500, 3.740]) y0 = numpy.array([0.050, 0.127, 0.094, 0.2122, 0.2729, 0.2665, 0.3317]) u = smoothfit.fit1d(x0, y0, 0, 4, 1000, degree=1, lmbda=1.0)
Some noisy example data taken from Wikipedia.
A twodimensional example
import meshzoo import numpy import smoothfit n = 200 numpy.random.seed(123) x0 = numpy.random.rand(n, 2)  0.5 y0 = numpy.cos(numpy.pi * numpy.sqrt(x0.T[0] ** 2 + x0.T[1] ** 2)) # create a triangle mesh for the square points, cells = meshzoo.rectangle(1.0, 1.0, 1.0, 1.0, 32, 32) u = smoothfit.fit2d(x0, y0, points, cells, lmbda=1.0e4, solver="densedirect") # Write the function to a file from dolfin import XDMFFile xdmf = XDMFFile("temp.xdmf") xdmf.write(u)
This example approximates a function from R^{2} to R (without noise in the
samples). Note that the absence of noise the data allows us to pick a rather small
lmbda
such that all sample points are approximated well.
Comparison with other approaches
Polynomial fitting/regression
The classical approach to data fitting is polynomial regression. Polynomials are chosen because they are very simple, can be evaluated quickly, and can be made to fit any function very closely.
There are, however, some fundamental problems:
 Your data might not actually fit a polynomial of low degree.
 Runge's phenomenon.
This above plot highlights the problem with oscillations.
Fourier smoothing
One approach to data fitting with smoothing is to create a function with all data points, and simply cut off the high frequencies after Fourier transformation.
This approach is fast, but only works for evenly spaced samples.
import matplotlib.pyplot as plt import numpy numpy.random.seed(0) # original function x0 = numpy.linspace(1.0, 1.0, 1000) y0 = 1 / (1 + 25 * x0 ** 2) plt.plot(x0, y0, color="k", alpha=0.2) # create sample points n = 51 x1 = numpy.linspace(1.0, 1.0, n) # only works if samples are evenly spaced y1 = 1 / (1 + 25 * x1 ** 2) + 1.0e1 * (2 * numpy.random.rand(x1.shape[0])  1) plt.plot(x1, y1, "xk") # Cut off the high frequencies in the transformed space and transform back X = numpy.fft.rfft(y1) X[5:] = 0.0 y2 = numpy.fft.irfft(X, n) # plt.plot(x1, y2, "", label="5 lowest frequencies") plt.grid() plt.show()
Testing
To run the smoothfit unit tests, check out this repository and type
pytest
License
smoothfit is published under the GPLv3+ license.
Project details
Release history Release notifications
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Filename, size  File type  Python version  Upload date  Hashes 

Filename, size smoothfit0.2.0py3noneany.whl (7.3 kB)  File type Wheel  Python version py3  Upload date  Hashes View 
Filename, size smoothfit0.2.0.tar.gz (9.2 kB)  File type Source  Python version None  Upload date  Hashes View 
Hashes for smoothfit0.2.0py3noneany.whl
Algorithm  Hash digest  

SHA256  505b67b5aa122744d36915903c3b4767365860caaac6d7f9d0e55ee52a617b01 

MD5  26afa3a95e97c4cbfecf515ff7c6191d 

BLAKE2256  c5451806d3b8a12251fdd37f4c3cd21eff170d275022f31925226bfb34bbbb67 