diff --git a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
index d10975b..620c680 100644
--- a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
+++ b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
@@ -148,30 +148,50 @@ monotonically decreases.
 
 ## Point 1
 
-### (a) For which kind of minimization problems can the trust region method be used? What are the assumptions on the objective function?
+### (a) For which kind of minimization problems can the trust region method be
+used? What are the assumptions on the objective function?
+
+**TBD**
 
 ### (b) Write down the quadratic model around a current iterate xk and explain the meaning of each term.
 
 $$m(p) = f + g^T p + \frac12 p^T B p \;\; \text{ s.t. } \|p\| < \Delta$$
 
-$\Delta$ is the trust region radius.
-$p$ is the trust region step.
-$g$ is the gradient at the current iterate $x_k$.
-$B$ is the hessian at the current iterate $x_k$.
+Here's an explanation of the meaning of each term:
+
+- $\Delta$ is the trust region radius, i.e. an upper bound on the step's norm
+  (length);
+- $f$ is the energy function value at the current iterate, i.e. $f(x_k)$;
+- $p$ is the trust region step; the minimizer of $m(p)$ subject to $\|p\| <
+  \Delta$ is the optimal step to take;
+- $g$ is the gradient at the current iterate $x_k$, i.e. $\nabla f(x_k)$;
+- $B$ is the Hessian at the current iterate $x_k$, i.e. $\nabla^2 f(x_k)$.
 
 ### (c) What is the role of the trust region radius?
 
-Limit confidence of model. I.e. it makes the model refrain from taking wide
-quadratic steps when the quadratic model is considerably different from the real
-objective function.
+The role of the trust region radius is to put an upper bound on the step length
+in order to avoid "overly ambitious" steps, i.e. steps where the step length is
+considerably long and the quadratic model of the objective is low-quality
+(i.e. the quadratic model differs from the real objective by more than a
+predetermined approximation threshold).
+
+In layman's terms, the trust region radius makes the method take more
+gradient-based or more quadratic-based steps depending on the confidence in the
+quadratic approximation.
 
 ### (d) Explain Cauchy point, sufficient decrease and Dogleg method, and the connection between them.
 
-Cauchy point provides sufficient decrease, but makes method like linear method.
+**TBD**
 
-Dogleg method allows for mixing purely linear iteration and purely quadratic one
-along the "dogleg" path picking the furthest point inside or on the edge of the
-region.
+**sufficient decrease TBD**
+
+The Cauchy point provides sufficient decrease, but makes the trust region method
+essentially behave like a linear method.
+
+The dogleg method allows for mixing purely linear iterations and purely quadratic
+ones. Along its "dog leg shaped" path, made out of a gradient component and a
+component directed towards the pure Newton step, the dogleg method picks the
+furthest point that is still inside the trust region.
 
 Dogleg uses cauchy point if the trust region does not allow for a proper dogleg
 step since it is too slow.
@@ -182,12 +202,26 @@ Cauchy provides linear convergence and dogleg superlinear.
 
 $$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$$
 
-Real decrease over predicted decrease
+The trust region ratio measures the quality of the quadratic model built around
+the current iterate $x_k$, by dividing the energy decrease between the old and
+the new iterate according to the real energy function by the decrease predicted
+by the quadratic model around $x_k$.
 
-Test "goodness" of model.
+The ratio is used to test the adequacy of the current trust region radius. For
+an inaccurate quadratic model, the predicted energy decrease would be
+considerably higher than the actual one and thus the ratio would be low. When
+the ratio is lower than a predetermined threshold ($\frac14$ is the one chosen
+by Nocedal) the trust region radius is divided by 4. Instead, a very accurate
+quadratic model would result in little difference with the real energy function
+and thus the ratio would be close to $1$. If the ratio is higher than a
+certain predetermined threshold ($\frac34$ is the one chosen by Nocedal),
+then the trust region radius is doubled in order to allow for longer steps,
+since the model quality is good.
 
 ### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.
 
+**TBD**
+
 ## Point 2
 
 The trust region algorithm is the following:
@@ -335,3 +369,7 @@ Finally, we notice that TR is the only method to have neighbouring iterations
 having the exact same norm: this is probably due to some proposed iterations
 steps not being validated by the acceptance criteria, which makes the method mot
 move for some iterations.
+
+# Exercise 3
+
+**TBD**
diff --git a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf
index db1d347..3205676 100644
Binary files a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf and b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf differ