midterm: done 1, 2.1b, 2.1c, 2.1e, 2.2-2.5
parent
4af597e8c4
commit
814dcdeb13
2 changed files with 52 additions and 14 deletions
|
@ -148,30 +148,50 @@ monotonically decreases.
|
|||
|
||||
## Point 1
|
||||
|
||||
### (a) For which kind of minimization problems can the trust region method be used? What are the assumptions on the objective function?
|
||||
### (a) For which kind of minimization problems can the trust region method be
|
||||
used? What are the assumptions on the objective function?
|
||||
|
||||
**TBD**
|
||||
|
||||
### (b) Write down the quadratic model around a current iterate xk and explain the meaning of each term.
|
||||
|
||||
$$m(p) = f + g^T p + \frac12 p^T B p \;\; \text{ s.t. } \|p\| \leq \Delta$$
|
||||
|
||||
$\Delta$ is the trust region radius.
|
||||
$p$ is the trust region step.
|
||||
$g$ is the gradient at the current iterate $x_k$.
|
||||
$B$ is the Hessian at the current iterate $x_k$.
|
||||
Here is an explanation of the meaning of each term:
|
||||
|
||||
- $\Delta$ is the trust region radius, i.e. an upper bound on the step's norm
|
||||
(length);
|
||||
- $f$ is the energy function value at the current iterate, i.e. $f(x_k)$;
|
||||
- $p$ is the trust region step: the solution of $\arg\min_p m(p)$ subject to
$\|p\| \leq \Delta$ is the optimal step to take;
|
||||
- $g$ is the gradient at the current iterate $x_k$, i.e. $\nabla f(x_k)$;
|
||||
- $B$ is the Hessian at the current iterate $x_k$, i.e. $\nabla^2 f(x_k)$.
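The terms listed above can be put together in a short sketch. This is only an illustration under stated assumptions: the function names `quadratic_model` and `in_trust_region` are hypothetical, vectors and matrices are plain Python lists, and the example objective $f(x) = x_1^2 + 2 x_2^2$ is chosen here for illustration and does not come from the exam.

```python
import math


def quadratic_model(f_k, g, B, p):
    """Evaluate m(p) = f + g^T p + 1/2 p^T B p (hypothetical helper)."""
    linear = sum(gi * pi for gi, pi in zip(g, p))
    # B p, computed row by row
    Bp = [sum(bij * pj for bij, pj in zip(row, p)) for row in B]
    quadratic = 0.5 * sum(pi * bpi for pi, bpi in zip(p, Bp))
    return f_k + linear + quadratic


def in_trust_region(p, delta):
    """Check the constraint ||p|| <= delta (Euclidean norm)."""
    return math.sqrt(sum(pi * pi for pi in p)) <= delta


# Assumed example: f(x) = x1^2 + 2*x2^2 at x_k = (1, 1)
f_k = 3.0                      # f(x_k)
g = [2.0, 4.0]                 # gradient at x_k
B = [[2.0, 0.0], [0.0, 4.0]]   # Hessian at x_k
p = [-1.0, -1.0]               # candidate step towards the minimizer (0, 0)

model_value = quadratic_model(f_k, g, B, p)
step_allowed = in_trust_region(p, 1.5)
```

Since the assumed objective is itself quadratic, the model is exact here, so `model_value` equals the true value $f(0, 0) = 0$.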
|
||||
|
||||
### (c) What is the role of the trust region radius?
|
||||
|
||||
Limit confidence of model. I.e. it makes the model refrain from taking wide
|
||||
quadratic steps when the quadratic model is considerably different from the real
|
||||
objective function.
|
||||
The role of the trust region radius is to put an upper bound on the step length
in order to avoid "overly ambitious" steps, i.e. long steps that reach points
where the quadratic model of the objective is low-quality (where the quadratic
model differs from the real objective by more than a predetermined
approximation threshold).
|
||||
|
||||
In layman's terms, the trust region radius makes the method take more
gradient-based or more quadratic-based steps depending on the confidence in the
quadratic approximation.
|
||||
|
||||
### (d) Explain Cauchy point, sufficient decrease and Dogleg method, and the connection between them.
|
||||
|
||||
Cauchy point provides sufficient decrease, but makes method like linear method.
|
||||
**TBD**
|
||||
|
||||
Dogleg method allows for mixing purely linear iteration and purely quadratic one
|
||||
along the "dogleg" path picking the furthest point inside or on the edge of the
|
||||
region.
|
||||
**sufficient decrease TBD**
|
||||
|
||||
The Cauchy point provides sufficient decrease, but using it at every iteration
makes the trust region method behave essentially like a linearly convergent
method.
|
||||
|
||||
The dogleg method allows for mixing purely linear iterations and purely quadratic
ones. It follows a "dog leg shaped" path made of two segments, the first along
the negative gradient and the second directed towards the pure Newton step, and
picks the furthest point on this path that is still inside the trust region.
|
||||
|
||||
The dogleg method uses the Cauchy point if the trust region does not allow for a
proper dogleg step because it is too small.
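The relation between the Cauchy point and the dogleg step can be sketched as follows. This is a minimal illustration, not the code used for the exercises: the helper names are hypothetical, $B$ is assumed to be positive definite in `dogleg_step`, the Newton step (the solution of $B p = -g$) is assumed to be supplied by the caller, and the Cauchy point uses the standard closed-form step length along the negative gradient.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(B, v):
    return [dot(row, v) for row in B]


def norm(v):
    return math.sqrt(dot(v, v))


def cauchy_point(g, B, delta):
    """Minimizer of the model along -g, restricted to ||p|| <= delta."""
    g_norm = norm(g)
    gBg = dot(g, matvec(B, g))
    # Non-positive curvature along -g: go all the way to the boundary
    tau = 1.0 if gBg <= 0 else min(1.0, g_norm ** 3 / (delta * gBg))
    return [-tau * delta / g_norm * gi for gi in g]


def dogleg_step(g, B, p_newton, delta):
    """Dogleg step, assuming B positive definite and B p_newton = -g."""
    # Newton step fits in the region: take the purely quadratic step
    if norm(p_newton) <= delta:
        return list(p_newton)
    # Unconstrained minimizer of the model along the negative gradient
    p_u = [-(dot(g, g) / dot(g, matvec(B, g))) * gi for gi in g]
    p_u_norm = norm(p_u)
    # Region ends on the first leg: scaled gradient step up to the boundary
    if p_u_norm >= delta:
        return [delta / p_u_norm * pi for pi in p_u]
    # Otherwise intersect the second leg p_u + s (p_newton - p_u) with the boundary
    d = [pn - pu for pn, pu in zip(p_newton, p_u)]
    a, b, c = dot(d, d), 2.0 * dot(p_u, d), dot(p_u, p_u) - delta ** 2
    s = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return [pu + s * di for pu, di in zip(p_u, d)]


# Assumed example data: g and B of f(x) = x1^2 + 2*x2^2 at (1, 1)
g = [2.0, 4.0]
B = [[2.0, 0.0], [0.0, 4.0]]
p_newton = [-1.0, -1.0]  # solves B p = -g for this example

small_region_step = dogleg_step(g, B, p_newton, 1.0)
large_region_step = dogleg_step(g, B, p_newton, 2.0)
```

With the small radius the dogleg step stays on the gradient segment and matches `cauchy_point(g, B, 1.0)` up to rounding, while with the large radius it is the full Newton step.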
|
||||
|
@ -182,12 +202,26 @@ Cauchy provides linear convergence and dogleg superlinear.
|
|||
|
||||
$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$$
|
||||
|
||||
Real decrease over predicted decrease
|
||||
The trust region ratio measures the quality of the quadratic model built around
the current iterate $x_k$: it is the ratio between the actual energy decrease
from the old to the new iterate, computed with the real energy function, and the
decrease predicted by the quadratic model around $x_k$.
|
||||
|
||||
Test "goodness" of model.
|
||||
The ratio is used to test the adequacy of the current trust region radius. For
an inaccurate quadratic model, the predicted energy decrease would be
considerably higher than the actual one and thus the ratio would be low. When
the ratio is lower than a predetermined threshold ($\frac14$ is the one chosen
by Nocedal) the trust region radius is divided by 4. Instead, a very accurate
quadratic model would result in little difference with the real energy function
and thus the ratio would be close to $1$. If the ratio is higher than a certain
predetermined threshold ($\frac34$ is the one chosen by Nocedal), then the trust
region radius is doubled in order to allow for longer steps, since the model
quality is good.
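The radius update driven by $\rho_k$ can be sketched as follows. The function names are hypothetical. Two details are assumptions that go beyond the text above: the radius is only doubled when the step actually reaches the trust region boundary, and the doubled radius is capped by a maximum radius `delta_max`.

```python
def trust_region_ratio(f_old, f_new, m_old, m_new):
    """rho_k: actual decrease divided by the decrease predicted by the model."""
    return (f_old - f_new) / (m_old - m_new)


def update_radius(rho, delta, step_norm, delta_max):
    """Shrink, grow, or keep the radius depending on the model quality rho."""
    if rho < 0.25:
        # Poor agreement between model and objective: shrink the region
        return 0.25 * delta
    on_boundary = abs(step_norm - delta) <= 1e-12 * delta
    if rho > 0.75 and on_boundary:
        # Good agreement and the region was the limiting factor (assumed
        # extra condition): allow longer steps, capped at delta_max
        return min(2.0 * delta, delta_max)
    # Otherwise keep the current radius
    return delta


# Example: the model predicted a decrease of 3.0 and the objective decreased by 3.0
rho = trust_region_ratio(3.0, 0.0, 3.0, 0.0)
new_delta = update_radius(rho, 1.0, 1.0, 10.0)
```

In this example the model is exact, so `rho` is `1.0` and the radius is doubled from `1.0` to `2.0`.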
|
||||
|
||||
### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.
|
||||
|
||||
**TBD**
|
||||
|
||||
## Point 2
|
||||
|
||||
The trust region algorithm is the following:
|
||||
|
@ -335,3 +369,7 @@ Finally, we notice that TR is the only method to have neighbouring iterations
|
|||
having the exact same norm: this is probably due to some proposed iteration
steps not being accepted by the acceptance criteria, which makes the method not
move for some iterations.
|
||||
|
||||
# Exercise 3
|
||||
|
||||
**TBD**
|
||||
|
|
Binary file not shown.