Least Squares Optimization
leastSquares: Least Squares
Subpackage which provides methods for linear and nonlinear least squares problems, e.g.:
\[\min_x \Vert Ax - b \Vert_2\]
and
\[\min_{x \in \mathbb{R}^n} f(x)\]
for an objective function \(f\) having the special form \(f(x)=\frac{1}{2}\sum\limits_{j=1}^m r_j^2(x)=\frac{1}{2}\|r(x)\|_2^2\).
Currently, these kinds of problems can be solved with the following methods:
- Linear Least Squares
  - linear least squares (solving the normal equation)
- Nonlinear Least Squares
  - Gauss-Newton algorithm with several choices.
  - Levenberg-Marquardt with line-search or trust-region algorithm.
Documentation is available in the docstrings.
Methods
oppy.leastSquares.linear_least_square module
Linear Least Squares Algorithm.
This module contains the function linear_least_square() and all helper functions for finding a solution of the linear least squares problem, i.e. solving:
\[\Vert Ax' - b \Vert_2 = \min_x \Vert Ax - b \Vert_2.\]
oppy.leastSquares.linear_least_square.linear_least_square(A, b, options=None)
Linear Least Squares.
This function solves the normal equation, i.e. a linear optimization problem of the form
\[\Vert Ax' - b \Vert_2 = \min_x \Vert Ax - b \Vert_2,\]
by using either Cholesky decomposition, QR decomposition or singular value decomposition. Matrices in sparse format are not supported. If you need a least squares function for sparse matrices, scipy.sparse.linalg.lsqr may be the desired function.
Parameters:
- A (numpy.ndarray, shape (m,n)) – Matrix A of the least squares problem.
- b (numpy.ndarray, shape (m,)) – RHS b of the least squares problem.
- options (Options, optional) – Options for the algorithm, represented as an Options object. The default is None. Possible options are:
  - disp (bool, optional) – If disp is True, the method will show some results during the process. The default is False.
  - fargs (str, optional) – Fargs containing information about the decomposition method. If fargs=None, the QR decomposition is performed. Possible values are 'chol' (Cholesky decomposition), 'QR' (QR decomposition) and 'svd' (singular value decomposition). The default is None.

Returns: x – Solution of the least squares problem.
Return type: numpy.ndarray, shape (n,)
Examples
In this example, a problem with a large dense matrix is solved.
>>> import numpy as np
>>> from oppy.leastSquares import linear_least_square
>>> m = 500
>>> n = 50
>>> A = np.random.rand(m, n)
>>> b = np.random.rand(m)
>>> x = linear_least_square(A, b)
>>> print(np.linalg.norm(A.dot(x) - b))  # may vary
6.143064057239379
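A sketch of selecting the decomposition method via the fargs option. The import path of the Options class and the attribute-style configuration are assumptions based on the parameter description above; adjust them to your installation.

>>> import numpy as np
>>> from oppy.leastSquares import linear_least_square
>>> from oppy.options import Options  # assumed import path
>>> opts = Options()
>>> opts.fargs = 'chol'  # select the Cholesky decomposition
>>> A = np.random.rand(100, 10)
>>> b = np.random.rand(100)
>>> x = linear_least_square(A, b, options=opts)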
References
- W. Dahmen, A. Reusken. Numerik für Ingenieure und Naturwissenschaftler, see Dahmen and Reusken [DR06].
- Luik. “Numerik 1”. Lecture Notes, see Luik [Lui10].
oppy.leastSquares.linear_least_square.bidiagonal_decomposition(A)
Determine the bidiagonal decomposition of a matrix.
This function determines the bidiagonal decomposition of A s.t.
\[A = QL \cdot B \cdot QR.\]

Parameters: A (numpy.ndarray, shape (m,n)) – Matrix A.
Returns: res – Contains the factors [QL, B, QR] in a list.
Return type: list
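A minimal usage sketch, assuming the returned list unpacks in the documented factorization order [QL, B, QR]:

>>> import numpy as np
>>> from oppy.leastSquares.linear_least_square import bidiagonal_decomposition
>>> A = np.random.rand(6, 4)
>>> QL, B, QR = bidiagonal_decomposition(A)
>>> np.allclose(QL @ B @ QR, A)  # reconstruction check
True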
oppy.leastSquares.linear_least_square.singular_value_decomposition(A)
Calculate the singular value decomposition of a matrix A.
This function calculates the singular value decomposition of a matrix A such that
\[U^T A V = \Sigma,\]
where \(U\) and \(V\) are orthogonal matrices and \(\Sigma\) is a diagonal matrix containing the singular values of A.
Parameters: A (numpy.ndarray, shape (m,n)) – Matrix A.

Returns:
- V (numpy.ndarray, shape (n,n)) – Matrix V of the SVD.
- sigma (numpy.ndarray, shape (m,n)) – Singular values.
- U (numpy.ndarray, shape (m,m)) – Matrix U of the SVD.
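A short sketch of checking the factorization, assuming the documented return order (V, sigma, U):

>>> import numpy as np
>>> from oppy.leastSquares.linear_least_square import singular_value_decomposition
>>> A = np.random.rand(5, 3)
>>> V, sigma, U = singular_value_decomposition(A)
>>> np.allclose(U.T @ A @ V, sigma)  # U^T A V = Sigma
True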
oppy.leastSquares.linear_least_square.singular_value_solution(A, b)
Solve the least squares problem using the SVD.
This function solves the linear problem
\[\Vert Ax' - b \Vert = \min_x \Vert Ax - b \Vert\]
by using the singular value decomposition, similar to Chapter 4.7 in Dahmen and Reusken [DR06].
Parameters:
- A (numpy.ndarray, shape (m,n)) – Matrix of the least squares problem.
- b (numpy.ndarray, shape (m,)) – RHS of the least squares problem.

Returns: x – Solution of the least squares problem.
Return type: numpy.ndarray, shape (n,)
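For reference, the SVD-based solution can be sketched in plain numpy via the pseudoinverse formula \(x = V \Sigma^{+} U^T b\) (this is the underlying math, not oppy's implementation):

>>> import numpy as np
>>> A = np.random.rand(8, 3)
>>> b = np.random.rand(8)
>>> U, s, Vt = np.linalg.svd(A, full_matrices=False)
>>> x = Vt.T @ ((U.T @ b) / s)  # x = V * Sigma^+ * U^T * b
>>> np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
True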
References
- W. Dahmen, A. Reusken. Numerik für Ingenieure und Naturwissenschaftler, see Dahmen and Reusken [DR06].
oppy.leastSquares.linear_least_square.signum(x)
Change the signum command.
This function changes the sign command s.t. sign(0) = 1.

Parameters: x (numpy.ndarray, shape (n,)) – Input vector x.
Returns: sigma – Returns sign(x), with sign(0) = 1.
Return type: numpy.ndarray, shape (n,)
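An equivalent one-liner in plain numpy, shown only to illustrate the convention sign(0) = 1:

>>> import numpy as np
>>> x = np.array([-2.0, 0.0, 3.0])
>>> sigma = np.where(x >= 0, 1.0, -1.0)  # sign with sign(0) = 1
>>> sigma
array([-1.,  1.,  1.])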
oppy.leastSquares.linear_least_square.sort(x, A)
Sort entries of a vector in decreasing order.
This function sorts the entries of a vector x in decreasing order and reorders the columns of a further matrix correspondingly.
Parameters:
- x (numpy.ndarray, shape (n,)) – Input vector x.
- A (numpy.ndarray, shape (m,n)) – Input matrix A.

Returns:
- y (numpy.ndarray, shape (n,)) – Sorted vector y.
- B (numpy.ndarray, shape (m,n)) – Matrix B with correspondingly reordered columns.
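A plain-numpy sketch of the same operation, assuming nothing beyond the description above:

>>> import numpy as np
>>> x = np.array([1.0, 3.0, 2.0])
>>> A = np.random.rand(4, 3)
>>> order = np.argsort(-x)        # indices for decreasing order
>>> y, B = x[order], A[:, order]  # sorted vector and reordered columns
>>> y
array([3., 2., 1.])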
oppy.leastSquares.nonlinear_least_square module
Nonlinear Least-Squares Algorithm.
This module contains the function nonlinear_least_square() and all helper functions to solve the minimization problem
\[\min_{x \in \mathbb{R}^n} f(x)\]
for an objective function \(f\) having the special form \(f(x)=\frac{1}{2}\sum\limits_{j=1}^m r_j^2(x)=\frac{1}{2}\|r(x)\|_2^2\).
The residual vector \(r(x)=(r_1(x),\ldots,r_m(x))^T\) contains \(m\) smooth functions \(r_j:\mathbb{R}^n\rightarrow\mathbb{R}\). Furthermore, it is assumed that \(m \geq n\).
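To make this structure concrete, here is a minimal sketch (the names are illustrative, not part of oppy) of a residual vector and the induced objective for a small data-fitting problem with \(n = 2\) parameters and \(m = 5\) residuals:

>>> import numpy as np
>>> t = np.linspace(0, 1, 5)       # m = 5 sample points
>>> y = np.exp(-2 * t)             # data to fit
>>> def r(x):                      # residual vector r: R^2 -> R^5
...     return x[0] * np.exp(x[1] * t) - y
>>> f = lambda x: 0.5 * np.dot(r(x), r(x))  # f(x) = 1/2 ||r(x)||^2
>>> print(f(np.array([1.0, -2.0])))  # zero residual at the true parameters
0.0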
oppy.leastSquares.nonlinear_least_square.nonlinear_least_square(fhandle, dfhandle, x, options=None)
Nonlinear Least-Squares Algorithm.
Parameters:
- fhandle (callable) – The residual vector r. The objective function to be minimized is built from:
  fhandle(x, *fargs) -> array_like, shape (m,)
  where x is a 1-D array with shape (n,) and fargs is a tuple of the fixed parameters needed to completely specify the function.
- dfhandle (callable) – The Jacobian matrix of the residual vector r:
  dfhandle(x, *fargs) -> array_like, shape (m, n)
  where x is a 1-D array with shape (n,) and fargs is a tuple of the fixed parameters needed to completely specify the function.
- x (numpy.ndarray, shape (n,)) – Initial guess. Array of real elements of size (n,), where 'n' is the number of independent variables.
- options (Options, optional) – Options for the nonlinear least-squares algorithm, represented as an Options object. The default is None. Possible options are:
  - method (str) – Chooses the method used to solve the nonlinear least-squares problem. The following line-search based methods are available: gauss_newton and levenberg_marquardt. The first is the Gauss-Newton method and the latter a Levenberg-Marquardt variant of the Gauss-Newton method. The input levenberg_marquardt_trust_region applies a Levenberg-Marquardt method within a trust-region approach. The default is levenberg_marquardt_trust_region.
  - tol_abs (float) – Absolute tolerance for the termination condition, initialized with \(10^{-7}\).
  - tol_rel (float) – Relative tolerance for the termination condition, initialized with \(10^{-7}\).

    Note: The optimization algorithm stops if the norm of the current residual res[k] satisfies the inequality
    \[\mathrm{res}[k] < \mathrm{stopping\_criteria},\]
    where stopping_criteria is given by
    \[\mathrm{stopping\_criteria} = \mathrm{tol\_rel} \cdot \mathrm{res}[0] + \mathrm{tol\_abs},\]
    and res[0] is the norm of the first residual. If the value of stopping_criteria is bigger than \(0.1\), e.g. if the norm of the first residual is large and the relative tolerance is small, then this value is set to \(0.1\) (see the sketch after this parameter list).
  - fargs (tuple) – Tuple containing additional arguments for fhandle.
  - linesearch (callable) – Callable function that computes a step length. The default is None and therefore initialized to wolfe().
  - disp (bool) – If disp == True, some details about the progress will be shown during the optimization.
  - norm (callable) – Callable function that computes the norm of the gradient \(\nabla f\) at a vector \(x\). The default is None and therefore initialized to the standard norm in \(\mathbb{R}^n\), i.e.
    \[\mathrm{norm}: x \mapsto \|x\|_2.\]
    Note: If you want to change the norm, remember that you also have to change the inner_prod function in your line search algorithm, such that norm is computed by
    \[\mathrm{norm} : x \mapsto \sqrt{\mathrm{inner\_prod}(x,x)}.\]
  - solve_linear (str) – Algorithm chosen for solving the linear system
    \[J^T(x_k) J(x_k) p = -J(x_k)^T r(x_k)\]
    to get the search direction \(p_k\) (a numerical sketch follows the Examples section). It is initialized with qr (QR factorization); chol (Cholesky factorization) and svd (singular value decomposition) are also available.
  - delta (float) – Initial trust-region radius. The default is 100.
  - delta_max (float) – Upper bound for the trust-region radius. The default is \(10^{12}\).
  - eta (float) – Threshold which decides whether the step is taken. It has to be chosen in \((0,1)\). The default is 0.0001.
  - sigma (float) – Possible deviation from the trust-region constraint. The default is 0.1.
  - scaling (bool) – If set to True, the scaling of the problem is incorporated. The default is False.
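A minimal sketch of the stopping rule described in the note above (the variable names are illustrative):

>>> tol_rel, tol_abs = 1e-7, 1e-7
>>> res0 = 7.9e4  # norm of the first residual
>>> stopping_criteria = min(tol_rel * res0 + tol_abs, 0.1)  # cap at 0.1
>>> res_k = 2.2e-13
>>> res_k < stopping_criteria  # terminate?
True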
Returns: results – Instance of the class Results. It represents the optimization results. Attributes of the object depend on the choice of the options.results parameter. With the default choice the object has the following attributes:
- x (numpy.ndarray, shape (n,)) – Solution vector of the optimization problem.
- iterations (int) – Number of iterations the algorithm needed.
- res (list) – List with the norms of the residuals.

Return type: Results
Examples
Minimizing the Rosenbrock function.
>>> from oppy.tests.costfunctions import rosenbrock_least_square, \
...     gradient_rosenbrock_least_square
>>> import numpy as np

The nonlinear_least_square() function is chosen to minimize the function.

>>> from oppy.leastSquares.nonlinear_least_square import \
...     nonlinear_least_square

Additionally, a starting value has to be chosen.

>>> x0 = np.array([6, 3])

Now we can execute the nonlinear_least_square() function:

>>> result = nonlinear_least_square(rosenbrock_least_square,
...     gradient_rosenbrock_least_square, x0)
>>> print(result)
iterations: 2
res: array([7.94844897e+04, 1.11803399e+04, 2.24792062e-13])
x: array([1., 1.])
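To illustrate the normal-equations system behind the solve_linear option, here is a hedged sketch of a single Gauss-Newton step in plain numpy (not oppy's implementation), assuming the standard Rosenbrock residual \(r(x) = (10(x_2 - x_1^2),\ 1 - x_1)^T\):

>>> import numpy as np
>>> def r(x):  # Rosenbrock residual in least squares form
...     return np.array([10 * (x[1] - x[0]**2), 1 - x[0]])
>>> def J(x):  # its Jacobian
...     return np.array([[-20 * x[0], 10], [-1, 0]])
>>> x = np.array([6.0, 3.0])
>>> A, g = J(x).T @ J(x), -J(x).T @ r(x)
>>> p = np.linalg.solve(A, g)  # direction from J^T J p = -J^T r
>>> x + p  # full Gauss-Newton step; may vary in the last digits
array([  1., -24.])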
References
- J. Nocedal, S.J. Wright. Numerical Optimization, see Nocedal and Wright [NW06], pp. 245-257.
- Jorge J. Moré. “The Levenberg-Marquardt algorithm: Implementation and theory”, see Moré [Mor78].
oppy.leastSquares.nonlinear_least_square.givens_qr(R_I, n, m)
Calculate a QR decomposition of a special matrix.
This function calculates the \(QR\) decomposition of a given \((m+n, n)\) matrix which is already upper triangular except for \(n\) elements. The function is used in the implementation of the Levenberg-Marquardt method as a trust-region method. More precisely, the input matrix looks like
\[\begin{split}\left[ \begin{array}{c} R \\ 0 \\ \sqrt{\lambda}I \end{array} \right],\end{split}\]
where \(R\) is an upper triangular matrix. The algorithm performs \(n(n+1)/2\) Givens rotations to get its QR decomposition.
Parameters:
- R_I (numpy.ndarray, shape (m+n, n)) – The input matrix R_I is already upper triangular, except for the elements \((m+1, 1), (m+2, 2), \ldots, (m+n, n)\).
- n (int) – Number of columns of the input matrix.
- m (int) – Integer such that the number of rows of R_I is \(m+n\).

Returns:
- R_I (numpy.ndarray, shape (m+n, n)) – Rotated matrix R_I, which is now an upper triangular matrix.
- Q.T (numpy.ndarray, shape (m+n, m+n)) – The transpose of the product of all rotations which were used to rotate the input matrix R_I.
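A plain-numpy sketch of the matrix structure this routine handles; it is illustrative only and uses numpy's general dense QR rather than the specialized Givens sweep:

>>> import numpy as np
>>> n, m, lam = 3, 4, 0.5
>>> R = np.triu(np.random.rand(n, n))  # upper triangular block
>>> R_I = np.vstack([R, np.zeros((m - n, n)), np.sqrt(lam) * np.eye(n)])
>>> Q, R_new = np.linalg.qr(R_I, mode='complete')  # same factorization, dense QR
>>> R_I.shape
(7, 3)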
oppy.leastSquares.nonlinear_least_square.givens_rotation(A, i, j)
Calculate one Givens rotation.
This function calculates one Givens rotation such that a given matrix A is rotated and, as a consequence, the entry A[i, j] becomes zero.
Parameters:
- A (numpy.ndarray, shape (m+n, n)) – Input matrix which is rotated.
- i (int) – Row index of the entry which will be zero after the rotation.
- j (int) – Column index of the entry which will be zero after the rotation.

Returns:
- B (numpy.ndarray, shape (m+n, n)) – Givens-rotated matrix B.
- Q (numpy.ndarray, shape (m+n, m+n)) – Givens rotation matrix, such that \(Q \cdot A = B\).
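For intuition, here is a 2x2 sketch of the underlying rotation that zeroes the second component of a vector (the standard construction, not oppy's code):

>>> import numpy as np
>>> a, b = 3.0, 4.0
>>> r = np.hypot(a, b)               # length preserved by the rotation
>>> c, s = a / r, b / r
>>> G = np.array([[c, s], [-s, c]])  # Givens rotation
>>> G @ np.array([a, b])             # second entry rotated to zero
array([5., 0.])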