midterm: done 1, 2.1b, 2.1c, 2.1e, 2.2-2.5
parent
4af597e8c4
commit
814dcdeb13
2 changed files with 52 additions and 14 deletions
|
@ -148,30 +148,50 @@ monotonically decreases.
|
||||||
|
|
||||||
## Point 1
|
## Point 1
|
||||||
|
|
||||||
### (a) For which kind of minimization problems can the trust region method be used? What are the assumptions on the objective function?
|
### (a) For which kind of minimization problems can the trust region method be
|
||||||
|
used? What are the assumptions on the objective function?
|
||||||
|
|
||||||
|
**TBD**
|
||||||
|
|
||||||
### (b) Write down the quadratic model around a current iterate xk and explain the meaning of each term.
|
### (b) Write down the quadratic model around a current iterate xk and explain the meaning of each term.
|
||||||
|
|
||||||
$$m(p) = f + g^T p + \frac12 p^T B p \;\; \text{ s.t. } \|p\| < \Delta$$
|
$$m(p) = f + g^T p + \frac12 p^T B p \;\; \text{ s.t. } \|p\| < \Delta$$
|
||||||
|
|
||||||
$\Delta$ is the trust region radius.
|
Here's an explanation of the meaning of each term:
|
||||||
$p$ is the trust region step.
|
|
||||||
$g$ is the gradient at the current iterate $x_k$.
|
- $\Delta$ is the trust region radius, i.e. an upper bound on the step's norm
|
||||||
$B$ is the hessian at the current iterate $x_k$.
|
(length);
|
||||||
|
- $f$ is the energy function value at the current iterate, i.e. $f(x_k)$;
|
||||||
|
- $p$ is the trust region step, i.e. the minimizer $\arg\min_p m(p)$ subject to
|
||||||
|
$\|p\| < \Delta$, which is the step the method proposes to take;
|
||||||
|
- $g$ is the gradient at the current iterate $x_k$, i.e. $\nabla f(x_k)$;
|
||||||
|
- $B$ is the Hessian at the current iterate $x_k$, i.e. $\nabla^2 f(x_k)$.
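
As a minimal sketch of the terms listed above, the model can be evaluated with plain Python lists. The function name `quadratic_model` and the example objective $f(x, y) = x^2 + 2y^2$ are illustrative assumptions and do not come from the report.

```python
def quadratic_model(f, g, B, p):
    """Evaluate m(p) = f + g^T p + 1/2 p^T B p.

    f: objective value at x_k, g: gradient at x_k (list),
    B: Hessian at x_k (list of rows), p: candidate step (list).
    """
    Bp = [sum(b_ij * p_j for b_ij, p_j in zip(row, p)) for row in B]
    g_dot_p = sum(g_i * p_i for g_i, p_i in zip(g, p))
    p_dot_Bp = sum(p_i * bp_i for p_i, bp_i in zip(p, Bp))
    return f + g_dot_p + 0.5 * p_dot_Bp


# Hypothetical example: f(x, y) = x^2 + 2 y^2 around x_k = (1, 1),
# where f = 3, g = (2, 4) and B = diag(2, 4).
m_value = quadratic_model(3.0, [2.0, 4.0], [[2.0, 0.0], [0.0, 4.0]], [-1.0, -1.0])
```

Since the example objective is itself quadratic, the model is exact here: the step $p = (-1, -1)$ lands on the minimizer $(0, 0)$, where both $f$ and $m$ are zero.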
|
||||||
|
|
||||||
### (c) What is the role of the trust region radius?
|
### (c) What is the role of the trust region radius?
|
||||||
|
|
||||||
Limit confidence of model. I.e. it makes the model refrain from taking wide
|
The role of the trust region radius is to put an upper bound on the step length
|
||||||
quadratic steps when the quadratic model is considerably different from the real
|
in order to avoid "overly ambitious" steps, i.e. steps where the step length
|
||||||
objective function.
|
is large while the quadratic model of the objective is of low quality
|
||||||
|
(i.e. the quadratic model differs from the real objective by more than a
|
||||||
|
predetermined approximation threshold).
|
||||||
|
|
||||||
|
In layman's terms, the trust region radius makes the method lean towards more
|
||||||
|
gradient-based or more quadratic-based steps depending on the confidence in the
|
||||||
|
quadratic approximation.
|
||||||
|
|
||||||
### (d) Explain Cauchy point, sufficient decrease and Dogleg method, and the connection between them.
|
### (d) Explain Cauchy point, sufficient decrease and Dogleg method, and the connection between them.
|
||||||
|
|
||||||
Cauchy point provides sufficient decrease, but makes method like linear method.
|
**TBD**
|
||||||
|
|
||||||
Dogleg method allows for mixing purely linear iteration and purely quadratic one
|
**sufficient decrease TBD**
|
||||||
along the "dogleg" path picking the furthest point inside or on the edge of the
|
|
||||||
region.
|
The Cauchy point provides sufficient decrease, but using it alone makes the trust
|
||||||
|
region method behave essentially like a linear (steepest descent) method.
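
The report does not state a formula for the Cauchy point, so the sketch below assumes the usual textbook closed form: minimize the quadratic model along the negative gradient, restricted to the trust region. The function name `cauchy_point` and the sample values are illustrative only.

```python
import math


def cauchy_point(g, B, delta):
    """Return the minimizer of the quadratic model along -g with norm <= delta.

    Assumed closed form: p_C = -tau * (delta / ||g||) * g, where
    tau = 1 if g^T B g <= 0, else min(||g||^3 / (delta * g^T B g), 1).
    """
    norm_g = math.sqrt(sum(g_i * g_i for g_i in g))
    if norm_g == 0.0:
        # Stationary point: there is no descent direction, so do not move.
        return [0.0 for _ in g]
    Bg = [sum(b_ij * g_j for b_ij, g_j in zip(row, g)) for row in B]
    gBg = sum(g_i * bg_i for g_i, bg_i in zip(g, Bg))
    if gBg <= 0.0:
        # Non-positive curvature along -g: go all the way to the boundary.
        tau = 1.0
    else:
        tau = min(norm_g ** 3 / (delta * gBg), 1.0)
    scale = tau * delta / norm_g
    return [-scale * g_i for g_i in g]


# Hypothetical data: with a large radius the unconstrained minimizer along -g
# is used; with a small radius the step is cut at the boundary.
p_interior = cauchy_point([2.0, 0.0], [[2.0, 0.0], [0.0, 4.0]], 10.0)
p_boundary = cauchy_point([2.0, 0.0], [[2.0, 0.0], [0.0, 4.0]], 0.5)
```

Because the step always points along $-g$, a method that only ever takes Cauchy steps is a steepest descent method with a particular step length rule.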
|
||||||
|
|
||||||
|
The dogleg method allows for mixing purely linear iterations and purely quadratic
|
||||||
|
ones. Its "dog leg shaped" path is made of a segment along the negative gradient
|
||||||
|
followed by a segment directed towards the pure Newton step; the method picks
|
||||||
|
the furthest point along this path that is still inside the trust region.
|
||||||
|
|
||||||
Dogleg uses Cauchy point if the trust region does not allow for a proper dogleg
|
Dogleg uses Cauchy point if the trust region does not allow for a proper dogleg
|
||||||
step since it is too slow.
|
step since it is too slow.
|
||||||
|
@ -182,12 +202,26 @@ Cauchy provides linear convergence and dogleg superlinear.
|
||||||
|
|
||||||
$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$$
|
$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$$
|
||||||
|
|
||||||
Real decrease over predicted decrease
|
The trust region ratio measures the quality of the quadratic model built around
|
||||||
|
the current iterate $x_k$: it divides the actual energy decrease between the old
|
||||||
|
and the new iterate, measured with the real energy function, by the decrease
|
||||||
|
predicted by the quadratic model around $x_k$.
|
||||||
|
|
||||||
Test "goodness" of model.
|
The ratio is used to test the adequacy of the current trust region radius. For
|
||||||
|
an inaccurate quadratic model, the predicted energy decrease would be
|
||||||
|
considerably higher than the actual one and thus the ratio would be low. When
|
||||||
|
the ratio is lower than a predetermined threshold ($\frac14$ is the one chosen
|
||||||
|
by Nocedal) the trust region radius is divided by 4. Instead, a very accurate
|
||||||
|
quadratic model would result in little difference with the real energy function
|
||||||
|
and thus the ratio would be close to $1$. If the ratio is higher
|
||||||
|
than a certain predetermined threshold ($\frac34$ is the one chosen by Nocedal),
|
||||||
|
then the trust region radius is doubled in order to allow for longer steps,
|
||||||
|
since the model quality is good.
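
The update rule described above can be sketched in a few lines. The function names and default arguments are illustrative assumptions; only the thresholds $\frac14$ and $\frac34$ and the factors 4 and 2 come from the text.

```python
def trust_region_ratio(f_old, f_new, m_old, m_new):
    """rho_k = actual decrease / decrease predicted by the quadratic model."""
    return (f_old - f_new) / (m_old - m_new)


def update_radius(rho, delta, shrink_below=0.25, grow_above=0.75):
    """Shrink the radius for a poor model, grow it for a good one."""
    if rho < shrink_below:
        return delta / 4.0   # model is unreliable: allow only shorter steps
    if rho > grow_above:
        return 2.0 * delta   # model is accurate: allow longer steps
    return delta             # otherwise keep the current radius


# Hypothetical numbers: the model predicted a decrease of 2.5, the real one was 2.0.
rho = trust_region_ratio(3.0, 1.0, 3.0, 0.5)
new_delta = update_radius(rho, 1.0)
```

Textbook variants of this rule often add safeguards, such as an upper bound on the radius or growing it only when the step reaches the boundary; these are left out here to stay close to the description above.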
|
||||||
|
|
||||||
### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.
|
### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.
|
||||||
|
|
||||||
|
**TBD**
|
||||||
|
|
||||||
## Point 2
|
## Point 2
|
||||||
|
|
||||||
The trust region algorithm is the following:
|
The trust region algorithm is the following:
|
||||||
|
@ -335,3 +369,7 @@ Finally, we notice that TR is the only method to have neighbouring iterations
|
||||||
having the exact same norm: this is probably due to some proposed iterations
|
having the exact same norm: this is probably due to some proposed iterations
|
||||||
steps not being validated by the acceptance criteria, which makes the method not
|
steps not being validated by the acceptance criteria, which makes the method not
|
||||||
move for some iterations.
|
move for some iterations.
|
||||||
|
|
||||||
|
# Exercise 3
|
||||||
|
|
||||||
|
**TBD**
|
||||||
|
|
Binary file not shown.