midterm: done 1, 2.1b, 2.1c, 2.1e, 2.2-2.5

Claudio Maggioni 2021-05-08 14:19:37 +02:00
parent 4af597e8c4
commit 814dcdeb13
2 changed files with 52 additions and 14 deletions


@@ -148,30 +148,50 @@ monotonically decreases.
## Point 1

### (a) For which kind of minimization problems can the trust region method be used? What are the assumptions on the objective function?

**TBD**

### (b) Write down the quadratic model around a current iterate $x_k$ and explain the meaning of each term.

$$m(p) = f + g^T p + \frac12 p^T B p \;\; \text{ s.t. } \|p\| \le \Delta$$

Here's an explanation of the meaning of each term:

- $\Delta$ is the trust region radius, i.e. an upper bound on the step's norm
  (length);
- $f$ is the energy function value at the current iterate, i.e. $f(x_k)$;
- $p$ is the trust region step; the solution of $\arg\min_p m(p)$ subject to
  $\|p\| \le \Delta$ is the optimal step to take;
- $g$ is the gradient at the current iterate $x_k$, i.e. $\nabla f(x_k)$;
- $B$ is the Hessian at the current iterate $x_k$, i.e. $\nabla^2 f(x_k)$.
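
As a minimal sketch of how the model above can be evaluated, the following pure Python snippet computes $m(p)$ term by term. The function name `quadratic_model` and the sample data (the quadratic $f(x) = x_1^2 + 2 x_2^2$ at $x_k = (1, 1)$) are assumptions made for illustration only and are not part of the assignment.

```python
def quadratic_model(f, g, B, p):
    """Evaluate m(p) = f + g^T p + 1/2 p^T B p using plain Python lists."""
    n = len(p)
    # Linear term g^T p
    linear = sum(g[i] * p[i] for i in range(n))
    # Matrix-vector product B p, then the quadratic term 1/2 p^T (B p)
    Bp = [sum(B[i][j] * p[j] for j in range(n)) for i in range(n)]
    quadratic = 0.5 * sum(p[i] * Bp[i] for i in range(n))
    return f + linear + quadratic


# Hypothetical data: f(x) = x1^2 + 2*x2^2 evaluated at x_k = (1, 1)
f_k = 3.0                        # f(x_k)
g_k = [2.0, 4.0]                 # gradient at x_k
B_k = [[2.0, 0.0], [0.0, 4.0]]   # Hessian at x_k
p = [-0.5, -0.5]                 # candidate step

m = quadratic_model(f_k, g_k, B_k, p)
```

Since the sample objective is itself quadratic, the model value `m` coincides with the true value of $f$ at $x_k + p$; for a general objective the two would differ.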

### (c) What is the role of the trust region radius?

The role of the trust region radius is to put an upper bound on the step length
in order to avoid "overly ambitious" steps, i.e. steps where the step length
is considerably long and the quadratic model of the objective is low-quality
(i.e. the quadratic model differs by a predetermined approximation threshold
from the real objective).

In layman's terms, the trust region radius makes the method switch between more
gradient-based and more quadratic-based steps depending on the confidence in the
quadratic approximation.

### (d) Explain Cauchy point, sufficient decrease and Dogleg method, and the connection between them.

**TBD**

**sufficient decrease TBD**

The Cauchy point provides sufficient decrease, but makes the trust region method
behave essentially like a linear method.

The dogleg method allows for mixing purely linear iterations and purely quadratic
ones. It follows a "dog leg shaped" path made out of a gradient component and a
component directed towards the pure Newton step, and picks the furthest point
along this path that is still inside the trust region.

Dogleg uses the Cauchy point if the trust region does not allow for a proper
dogleg step since it is too small.
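
The dogleg selection described above can be sketched as follows. This is only an illustrative pure Python version, restricted to a 2x2 positive definite $B$ so that the Newton step can be computed with Cramer's rule; the helper names (`dot`, `norm`, `matvec`, `solve_2x2`, `dogleg_step`) and the sample data are assumptions, not the code used in the assignment.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def matvec(B, v):
    return [dot(row, v) for row in B]


def solve_2x2(B, rhs):
    """Solve B x = rhs for a 2x2 matrix using Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(rhs[0] * B[1][1] - B[0][1] * rhs[1]) / det,
            (B[0][0] * rhs[1] - rhs[0] * B[1][0]) / det]


def dogleg_step(g, B, delta):
    """Dogleg step, assuming B is 2x2 and positive definite."""
    # Pure Newton step: taken whenever it fits inside the trust region
    p_newton = solve_2x2(B, [-gi for gi in g])
    if norm(p_newton) <= delta:
        return p_newton
    # Unconstrained minimizer of the model along the steepest descent direction
    alpha = dot(g, g) / dot(g, matvec(B, g))
    p_sd = [-alpha * gi for gi in g]
    if norm(p_sd) >= delta:
        # Region too small for a proper dogleg: gradient step cut at the
        # boundary, which coincides with the Cauchy point
        scale = delta / norm(g)
        return [-scale * gi for gi in g]
    # Otherwise walk from p_sd towards p_newton until the boundary is hit:
    # solve ||p_sd + tau * d||^2 = delta^2 for tau in [0, 1]
    d = [pn - ps for pn, ps in zip(p_newton, p_sd)]
    a = dot(d, d)
    b = 2.0 * dot(p_sd, d)
    c = dot(p_sd, p_sd) - delta ** 2
    tau = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return [ps + tau * di for ps, di in zip(p_sd, d)]


# Hypothetical data: gradient and Hessian of f(x) = x1^2 + 2*x2^2 at (1, 1)
g_k = [2.0, 4.0]
B_k = [[2.0, 0.0], [0.0, 4.0]]

step_large = dogleg_step(g_k, B_k, 2.0)   # Newton step fits in the region
step_small = dogleg_step(g_k, B_k, 0.5)   # falls back to the Cauchy point
step_mid = dogleg_step(g_k, B_k, 1.3)     # point on the second dogleg segment
```

With these sample values the three radii exercise the three branches: a large radius returns the Newton step, a small one returns the Cauchy point on the boundary, and an intermediate one returns a boundary point between the two.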
@@ -182,12 +202,26 @@ Cauchy provides linear convergence and dogleg superlinear.
$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$$

The trust region ratio measures the quality of the quadratic model built around
the current iterate $x_k$: it is the ratio between the energy decrease from the
old to the new iterate according to the real energy function and the decrease
predicted by the quadratic model around $x_k$.

The ratio is used to test the adequacy of the current trust region radius. For
an inaccurate quadratic model, the predicted energy decrease would be
considerably higher than the effective one and thus the ratio would be low. When
the ratio is lower than a predetermined threshold ($\frac14$ is the one chosen
by Nocedal) the trust region radius is divided by 4. Instead, a very accurate
quadratic model would result in little difference with the real energy function
and thus the ratio would be close to $1$. If the ratio is higher than a certain
predetermined threshold ($\frac34$ is the one chosen by Nocedal), then the trust
region radius is doubled in order to allow for longer steps, since the model
quality is good.
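
The radius update rule described above can be sketched in a few lines of Python. The function names and the cap `delta_max` on the radius are assumptions introduced for illustration; the thresholds $\frac14$ and $\frac34$ and the factors 4 and 2 are the ones quoted in the text.

```python
def trust_region_ratio(f_old, f_new, m_old, m_new):
    """rho_k: actual decrease divided by the decrease predicted by the model."""
    return (f_old - f_new) / (m_old - m_new)


def update_radius(rho, delta, delta_max=10.0):
    """Shrink, grow or keep the radius depending on the ratio rho.

    delta_max is a hypothetical upper bound on the radius (an assumption).
    """
    if rho < 0.25:
        # Poor model quality: divide the radius by 4
        return delta / 4.0
    if rho > 0.75:
        # Good model quality: double the radius, up to the assumed cap
        return min(2.0 * delta, delta_max)
    # Otherwise keep the current radius
    return delta


# Hypothetical values: the model predicted the decrease exactly, so rho = 1
rho = trust_region_ratio(3.0, 0.75, 3.0, 0.75)
new_delta = update_radius(rho, 1.0)
```

Textbook variants of this rule typically add further conditions before enlarging the radius, for example requiring that the step reaches the boundary of the region; this sketch follows only what is stated in the paragraph above.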

### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.

**TBD**

## Point 2

The trust region algorithm is the following:
@@ -335,3 +369,7 @@ Finally, we notice that TR is the only method to have neighbouring iterations
having the exact same norm: this is probably due to some proposed iteration
steps not being validated by the acceptance criteria, which makes the method not
move for some iterations.

# Exercise 3
**TBD**