midterm: done 1, 2.1b, 2.1c, 2.1e, 2.2-2.5

2021-05-08 14:19:37 +02:00 · 2021-05-08 14:19:37 +02:00 · 814dcdeb13
commit 814dcdeb13
parent 4af597e8c4
2 changed files with 52 additions and 14 deletions
--- a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
+++ b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
@@ -148,30 +148,50 @@ monotonically decreases.

 ## Point 1

-### (a) For which kind of minimization problems can the trust region method be used? What are the assumptions on the objective function?
+### (a) For which kind of minimization problems can the trust region method be
+used? What are the assumptions on the objective function?
+
+**TBD**

 ### (b) Write down the quadratic model around a current iterate xk and explain the meaning of each term.

 $$m(p) = f + g^T p + \frac12 p^T B p \;\; \text{ s.t. } \|p\| < \Delta$$

-$\Delta$ is the trust region radius.
-$p$ is the trust region step.
-$g$ is the gradient at the current iterate $x_k$.
-$B$ is the hessian at the current iterate $x_k$.
+Here's an explanation of the meaning of each term:
+
+- $\Delta$ is the trust region radius, i.e. an upper bound on the step's norm
+  (length);
+- $f$ is the energy function value at the current iterate, i.e. $f(x_k)$;
+- $p$ is the trust region step; the solution of $\arg\min_p m(p)$ with $\|p\| <
+  \Delta$ is the optimal step to take;
+- $g$ is the gradient at the current iterate $x_k$, i.e. $\nabla f(x_k)$;
+- $B$ is the Hessian at the current iterate $x_k$, i.e. $\nabla^2 f(x_k)$.

 ### (c) What is the role of the trust region radius?

-Limit confidence of model. I.e. it makes the model refrain from taking wide
-quadratic steps when the quadratic model is considerably different from the real
-objective function.
+The role of the trust region radius is to put an upper bound on the step length
+in order to avoid "overly ambitious" steps, i.e. steps where the step length
+is considerably long and the quadratic model of the objective is low-quality
+(i.e. the quadratic model differs from the real objective by more than a
+predetermined approximation threshold).
+
+In layman's terms, the trust region radius makes the method switch between more
+gradient-based and more quadratic-based steps depending on the confidence in the
+quadratic approximation.

 ### (d) Explain Cauchy point, sufficient decrease and Dogleg method, and the connection between them.

-Cauchy point provides sufficient decrease, but makes method like linear method.
+**TBD**

-Dogleg method allows for mixing purely linear iteration and purely quadratic one
-along the "dogleg" path picking the furthest point inside or on the edge of the
-region.
+**sufficient decrease TBD**
+
+The Cauchy point provides sufficient decrease, but makes the trust region method
+essentially behave like a linear method.
+
+The dogleg method allows for mixing purely linear iterations and purely quadratic
+ones. The dogleg method follows a "dog leg shaped" path, made out of a gradient
+component and a component directed towards a pure Newton step, and picks the
+furthest point along that path that is still inside the trust region.

 Dogleg uses the Cauchy point if the trust region does not allow for a proper dogleg
 step since it is too slow.
@@ -182,12 +202,26 @@ Cauchy provides linear convergence and dogleg superlinear.

 $$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$$

-Real decrease over predicted decrease
+The trust region ratio measures the quality of the quadratic model built around
+the current iterate $x_k$, as the ratio between the energy decrease from the old
+to the new iterate according to the real energy function and the decrease
+predicted by the quadratic model around $x_k$.

-Test "goodness" of model.
+The ratio is used to test the adequacy of the current trust region radius. For
+an inaccurate quadratic model, the predicted energy decrease would be
+considerably higher than the actual one and thus the ratio would be low. When
+the ratio is lower than a predetermined threshold ($\frac14$ is the one chosen
+by Nocedal) the trust region radius is divided by 4. Instead, a very accurate
+quadratic model would result in little difference with the real energy function
+and thus the ratio would be close to $1$. If the ratio is higher
+than a certain predetermined threshold ($\frac34$ is the one chosen by Nocedal),
+then the trust region radius is doubled in order to allow for longer steps,
+since the model quality is good.

 ### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.

+**TBD**
+
 ## Point 2

 The trust region algorithm is the following:
@@ -335,3 +369,7 @@ Finally, we notice that TR is the only method to have neighbouring iterations
 having the exact same norm: this is probably due to some proposed iteration
 steps not being validated by the acceptance criteria, which makes the method not
 move for some iterations.
+
+# Exercise 3
+
+**TBD**
--- a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf
+++ b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf