Solving Nonlinear Least Squares Problems Using the Gauß-Newton Method
The example in this notebook is based on a measurement series taken from [1].
[1] A. Croeze, L. Pittman, and W. Reynolds. "Solving nonlinear least-squares problems with the Gauss-Newton and Levenberg-Marquardt methods". In: (2012), p. 16.
Introduction
In this notebook we use the Gauß-Newton method for the following model fitting task, which serves as an example of a nonlinear least squares problem:
We use population data of the United States between 1815 and 1885 [1], as shown in the following table. The years are mapped to the interval \([1,8]\) to simplify further computations. The measured values \(y_i\) are the average population of the United States, in millions, in the corresponding year.
year | 1815 | 1825 | 1835 | 1845 | 1855 | 1865 | 1875 | 1885
---|---|---|---|---|---|---|---|---
t | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
y | 8.3 | 11.0 | 14.7 | 19.7 | 26.7 | 35.2 | 44.4 | 55.9
The following function is used to model the given data points:

\[
\varphi(x,t) = x_1\, e^{x_2 t},
\]

where \(\varphi(x,t)\) describes the modeled population of the United States at time \(t\). The time points \(t\) are given by mapping the year values as described above. The vector \(x\in\mathbb{R}^2\) holds the two parameters \(x_1\) and \(x_2\) that are used to fit the model.
import numpy as np

# measured values
time = [1, 2, 3, 4, 5, 6, 7, 8]
measured = [8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9]

# model and its derivative
def model(x, t):
    return x[0]*np.exp(x[1]*t)

def dmodel(x, t):
    derivation = np.array([np.exp(x[1]*t),
                           t*x[0]*np.exp(x[1]*t)])
    return derivation
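Before fitting, the analytical derivative can be sanity-checked against a central finite-difference approximation. This check is an addition for illustration and not part of the original example; the test point x_check and the step size h are chosen arbitrarily.

# optional sanity check: compare dmodel with a central finite-difference
# approximation of the model derivative (not part of the original example)
x_check = np.array([2.5, 0.25])
h = 1e-6  # step size chosen arbitrarily for this check
for t_check in time:
    fd = np.array([(model(x_check + h*e, t_check) - model(x_check - h*e, t_check)) / (2*h)
                   for e in np.eye(2)])
    assert np.allclose(fd, dmodel(x_check, t_check), rtol=1e-5)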
Model Fitting via oppy
Preparation: Building the Sum of Squared Residuals (SSR)
As explained in the notebook about least-squares problems, minimizing the sum of squared residuals (SSR) can be used to fit a model to given data points. In our case, the following SSR is set up:

\[
\min_{x\in\mathbb{R}^2}\; \sum_{i=1}^{8}\bigl(\varphi(x,t_i)-y_i\bigr)^2.
\]

The Gauß-Newton method does not take the complete SSR as an input, but rather the residual vector \(r(x)\) (and its derivative), whose components are the differences between the model values \(\varphi(x,t_i)\) and the measurements \(y_i\):
# define residuals (one can use the costfunctions data_least_square and
# gradient_data_least_square integrated in oppy to reduce coding lines)
from oppy.tests.costfunctions import data_least_square, gradient_data_least_square

def f_fun(x):
    return data_least_square(x, model, time, measured)

def df_fun(x):
    return gradient_data_least_square(x, dmodel, time, measured)
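To connect these wrapper functions with the formulas above, the residual vector and the SSR can also be written out by hand. The following sketch uses hypothetical helper names (residuals, ssr) and is for illustration only; the exact sign and scaling conventions used internally by data_least_square and gradient_data_least_square may differ.

# illustrative only: explicit residual vector and SSR for this example
# (conventions may differ from oppy's data_least_square)
def residuals(x):
    return np.array([model(x, t) - y for t, y in zip(time, measured)])

def ssr(x):
    r = residuals(x)
    return np.dot(r, r)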
Applying the Gauß-Newton Method
As a first step, a starting point for the iteration process is set. In addition, the imports needed to apply the Gauß-Newton method via oppy's nonlinear_least_square() function are performed.
# import Gauß-Newton method and the options file
from oppy.leastSquares import nonlinear_least_square
from oppy.options import Options
# starting point:
x0 = np.array([2.5, 0.25])
a) Using the Gauß-Newton Method with its Default Settings
In the following lines, the Gauss-Newton method is applied. All optional values are left at their defaults.
# apply the Gauß-Newton method with its default setting
# in the options the Gauss-Newton method can be chosen:
options = Options(method='gauss_newton')
result = nonlinear_least_square(f_fun, df_fun, x0, options)
# print the results
print(result)
iterations: 6
res: array([1.15448714e+04, 1.18918917e+04, 7.71959039e+02, 3.68152203e+00,
3.08907754e-02, 1.17316561e-03, 3.79220090e-05])
x: array([7.00015188, 0.26207664])
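To judge the quality of the fit, the returned parameters can be plugged back into the model and compared with the measurements. The following snippet assumes that the parameter vector is accessible as result.x, in line with the printed output above.

# compare the fitted model with the measured data
# (assumes the result object exposes the parameter vector as result.x)
x_fit = result.x
for t, y in zip(time, measured):
    print(f"t = {t}: model = {model(x_fit, t):.2f}, measured = {y}")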
b) Using the Gauß-Newton Method with Further Options
Alternatively, some options can be set (for a complete list of all options, see the method’s documentation):
options_gauss_newton = Options()
options_gauss_newton.method = 'gauss_newton'
options_gauss_newton.nmax = 100
options_gauss_newton.tol_rel = 1e-6
options_gauss_newton.tol_abs = 1e-6
# apply the Gauß-Newton method with further options
result_options = nonlinear_least_square(f_fun, df_fun, x0, options_gauss_newton)
# print the results
print(result_options)
iterations: 5
res: array([1.15448714e+04, 1.18918917e+04, 7.71959039e+02, 3.68152203e+00,
3.08907754e-02, 1.17316561e-03])
x: array([7.00015461, 0.26207658])
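As a quick cross-check, the parameters obtained with the default settings and with the custom tolerances can be compared directly; as the printed output suggests, they agree to several digits. Attribute access via .x is again assumed from the printed output.

# the two runs yield (nearly) the same fitted parameters
print(np.allclose(result.x, result_options.x, atol=1e-4))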
c) Using the Levenberg-Marquardt Variant of the Gauß-Newton Method
The nonlinear_least_square() function also implements a variant of the Gauss-Newton method that follows a simple version of the Levenberg-Marquardt algorithm. This variant can be selected via the options:
options_levenberg_marquardt = Options(method='levenberg_marquardt')
# apply the Levenberg-Marquardt variant of the Gauß-Newton method
result_lev = nonlinear_least_square(f_fun, df_fun, x0, options_levenberg_marquardt)
# print the results
print(result_lev)
iterations: 6
res: array([1.15448714e+04, 1.20201297e+04, 7.85136012e+02, 3.80880879e+00,
3.11472615e-02, 1.18009947e-03, 3.80636286e-05])
x: array([7.00015188, 0.26207664])
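To illustrate how the two variants differ, the following sketch implements a single parameter update by hand: Gauss-Newton solves the normal equations \(J^\top J\,\delta = -J^\top r\), while Levenberg-Marquardt adds a damping term \(\lambda I\) to the system matrix. This is a generic textbook formulation with an arbitrarily chosen damping parameter lam; it is not the exact update rule or damping strategy used inside oppy.

# illustrative single update step (not oppy's internal implementation)
def single_step(x, lam=0.0):
    r = np.array([model(x, t) - y for t, y in zip(time, measured)])
    J = np.array([dmodel(x, t) for t in time])   # Jacobian of the residual vector
    A = J.T @ J + lam*np.eye(2)                  # lam = 0: Gauss-Newton
    delta = np.linalg.solve(A, -J.T @ r)         # lam > 0: Levenberg-Marquardt-style damping
    return x + delta

print(single_step(x0))           # one Gauss-Newton step
print(single_step(x0, lam=1.0))  # one damped step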