<!-- vim: set ts=2 sw=2 et tw=80: -->

---
title: Midterm -- Optimization Methods
author: Claudio Maggioni
header-includes:
- \usepackage{amsmath}
- \usepackage{hyperref}
- \usepackage[utf8]{inputenc}
- \usepackage[margin=2.5cm]{geometry}
- \usepackage[ruled,vlined]{algorithm2e}
- \usepackage{float}
- \floatplacement{figure}{H}
---

\maketitle

# Exercise 1
Thanks to this we have proven that the difference $\|e_k\|_A - \|e_{k+1}\|_A$
is indeed positive, and thus the energy norm of the error decreases
monotonically as $k$ increases.
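
This monotone decrease can also be observed numerically. The sketch below is an
illustration only, not part of the report's code: it assumes the iteration
being analyzed is steepest descent with exact line search on a quadratic
$f(x) = \frac12 x^T A x - b^T x$, with an arbitrarily chosen SPD matrix $A$,
right-hand side $b$, and starting point:

```python
import math

# Arbitrary SPD system (an assumption for illustration)
A = ((3.0, 1.0), (1.0, 2.0))
b = (1.0, 1.0)

def matvec(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

# Exact solution x* = A^{-1} b via the closed-form 2x2 inverse
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x_star = ((A[1][1] * b[0] - A[0][1] * b[1]) / det,
          (A[0][0] * b[1] - A[1][0] * b[0]) / det)

def energy_norm(e):
    # ||e||_A = sqrt(e^T A e)
    Ae = matvec(A, e)
    return math.sqrt(e[0] * Ae[0] + e[1] * Ae[1])

x = (4.0, -3.0)
norms = []
for _ in range(10):
    norms.append(energy_norm((x[0] - x_star[0], x[1] - x_star[1])))
    Ax = matvec(A, x)
    r = (b[0] - Ax[0], b[1] - Ax[1])  # residual = -gradient
    alpha = ((r[0] ** 2 + r[1] ** 2)
             / (r[0] * matvec(A, r)[0] + r[1] * matvec(A, r)[1]))
    x = (x[0] + alpha * r[0], x[1] + alpha * r[1])  # exact line search step
```

Running this, `norms` is strictly decreasing, matching the result proven above.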

# Question 2

## Point 1

TBD

## Point 2

The trust region algorithm is the following:

\begin{algorithm}[H]
\SetAlgoLined
Given $\hat{\Delta} > 0$, $\Delta_0 \in (0,\hat{\Delta})$,
and $\eta \in [0, \frac14)$\;
\For{$k = 0, 1, 2, \ldots$}{%
Obtain $p_k$ by using the Cauchy or Dogleg method\;
$\rho_k \gets \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}$\;
\uIf{$\rho_k < \frac14$}{%
$\Delta_{k+1} \gets \frac14 \Delta_k$\;
}\Else{%
\uIf{$\rho_k > \frac34$ and $\|p_k\| = \Delta_k$}{%
$\Delta_{k+1} \gets \min(2\Delta_k, \hat{\Delta})$\;
}
\Else{%
$\Delta_{k+1} \gets \Delta_k$\;
}}
\uIf{$\rho_k > \eta$}{%
$x_{k+1} \gets x_k + p_k$\;
}
\Else{%
$x_{k+1} \gets x_k$\;
}
}
\caption{Trust region method}
\end{algorithm}
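
As a sanity check, the loop above can be sketched in Python (the report's
actual implementation is the MATLAB script `trust_region.m`; the function and
parameter names below are hypothetical). Here the subproblem is solved with a
Cauchy step, and the example at the end minimizes an arbitrary convex
quadratic:

```python
import math

def _quad(B, v):
    # v^T B v for a 2x2 matrix stored as nested tuples
    return (v[0] * (B[0][0] * v[0] + B[0][1] * v[1])
            + v[1] * (B[1][0] * v[0] + B[1][1] * v[1]))

def _cauchy(B, g, delta):
    # Cauchy step, used here as the model subproblem solver
    gnorm = math.hypot(*g)
    gBg = _quad(B, g)
    tau = 1.0 if gBg <= 0 else min(gnorm ** 3 / (delta * gBg), 1.0)
    s = tau * delta / gnorm
    return (-s * g[0], -s * g[1])

def trust_region(f, grad, hess, x0, delta_hat=2.0, delta0=1.0, eta=0.15,
                 tol=1e-8, max_iter=200):
    x, delta = tuple(x0), delta0
    for _ in range(max_iter):
        g = grad(x)
        if math.hypot(*g) < tol:
            break
        B = hess(x)
        p = _cauchy(B, g, delta)
        # rho compares the actual reduction to the model's predicted reduction
        pred = -(g[0] * p[0] + g[1] * p[1]) - 0.5 * _quad(B, p)
        rho = (f(x) - f((x[0] + p[0], x[1] + p[1]))) / pred
        if rho < 0.25:
            delta *= 0.25                      # poor model: shrink the region
        elif rho > 0.75 and abs(math.hypot(*p) - delta) < 1e-12:
            delta = min(2 * delta, delta_hat)  # step hit the boundary: expand
        if rho > eta:
            x = (x[0] + p[0], x[1] + p[1])     # accept the step
    return x

# Example: minimize an arbitrary convex quadratic starting from (0, 0)
f = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
grad = lambda x: (2 * (x[0] - 1), 2 * (x[1] + 2))
hess = lambda x: ((2.0, 0.0), (0.0, 2.0))
x_min = trust_region(f, grad, hess, (0.0, 0.0))
```

Since the test function is quadratic, the model agrees exactly with $f$, so
$\rho_k = 1$ and every step is accepted.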

The Cauchy point algorithm is the following:

\begin{algorithm}[H]
\SetAlgoLined
Input $B$ (quadratic term), $g$ (linear term), $\Delta_k$\;
\uIf{$g^T B g \leq 0$}{%
$\tau \gets 1$\;
}\Else{%
$\tau \gets \min(\frac{\|g\|^3}{\Delta_k \cdot g^T B g}, 1)$\;
}
$p_k \gets -\tau \cdot \frac{\Delta_k}{\|g\|} \cdot g$\;
\Return{$p_k$}
\caption{Cauchy point}
\end{algorithm}
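
A minimal Python sketch of the Cauchy point computation above (the report's
implementation is the MATLAB script `cauchy.m`; the 2x2 tuple representation
and the function name are illustrative assumptions):

```python
import math

def cauchy_point(B, g, delta):
    """Cauchy step p = -tau * (delta / ||g||) * g for the model
    m(p) = f + g^T p + (1/2) p^T B p, with B a 2x2 nested tuple."""
    gnorm = math.hypot(*g)
    # Curvature of the model along the steepest-descent direction
    gBg = (g[0] * (B[0][0] * g[0] + B[0][1] * g[1])
           + g[1] * (B[1][0] * g[0] + B[1][1] * g[1]))
    if gBg <= 0:
        tau = 1.0  # the model keeps decreasing: step to the boundary
    else:
        tau = min(gnorm ** 3 / (delta * gBg), 1.0)
    s = tau * delta / gnorm
    return (-s * g[0], -s * g[1])
```

For example, with $B = I$, $g = (1, 0)$ and $\Delta_k = 2$ the unconstrained
minimizer along $-g$ lies inside the region, so the step is simply $-g$.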

Finally, the Dogleg method algorithm is the following:

\begin{algorithm}[H]
\SetAlgoLined
Input $B$ (quadratic term), $g$ (linear term), $\Delta_k$\;
$p_N \gets -B^{-1} g$\;
\uIf{$\|p_N\| < \Delta_k$}{%
$p_k \gets p_N$\;
}\Else{%
$p_u \gets -\frac{g^T g}{g^T B g} g$\;
\uIf{$\|p_u\| > \Delta_k$}{%
compute $p_k$ with the Cauchy point algorithm\;
}\Else{%
solve for $\tau$ the equation $\|p_u + \tau \cdot (p_N - p_u)\|^2 =
\Delta_k^2$\;
$p_k \gets p_u + \tau \cdot (p_N - p_u)$\;
}
}
\caption{Dogleg method}
\end{algorithm}
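
Likewise, a hypothetical Python sketch of the Dogleg step (the report's
implementation is `dogleg.m`; the names and the closed-form 2x2 inverse are
assumptions for illustration):

```python
import math

def dogleg(B, g, delta):
    """Dogleg step for the model m(p) = f + g^T p + (1/2) p^T B p,
    with B a 2x2 nested tuple assumed positive definite."""
    # Newton point p_N = -B^{-1} g via the closed-form 2x2 inverse
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    pN = ((-B[1][1] * g[0] + B[0][1] * g[1]) / det,
          (B[1][0] * g[0] - B[0][0] * g[1]) / det)
    if math.hypot(*pN) < delta:
        return pN  # full Newton step fits inside the trust region
    gg = g[0] ** 2 + g[1] ** 2
    Bg = (B[0][0] * g[0] + B[0][1] * g[1], B[1][0] * g[0] + B[1][1] * g[1])
    gBg = g[0] * Bg[0] + g[1] * Bg[1]
    # Unconstrained minimizer along the steepest-descent direction
    pU = (-gg / gBg * g[0], -gg / gBg * g[1])
    if math.hypot(*pU) > delta:
        # Cauchy point fallback: here it clips -g to the region boundary
        s = delta / math.sqrt(gg)
        return (-s * g[0], -s * g[1])
    # Solve ||pU + tau (pN - pU)||^2 = delta^2 for tau in [0, 1]
    d = (pN[0] - pU[0], pN[1] - pU[1])
    a = d[0] ** 2 + d[1] ** 2
    b = 2 * (pU[0] * d[0] + pU[1] * d[1])
    c = pU[0] ** 2 + pU[1] ** 2 - delta ** 2
    tau = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return (pU[0] + tau * d[0], pU[1] + tau * d[1])
```

For an indefinite or singular $B$ the Newton point is not well defined, so a
real implementation would guard the $-B^{-1}g$ solve; the sketch assumes a
positive definite $B$ as in the convex case.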

## Point 3

The trust region, Dogleg, and Cauchy point algorithms were implemented
respectively in the files `trust_region.m`, `dogleg.m`, and `cauchy.m`.

## Point 4

### Taylor expansion

The Taylor expansion up to the second order of the function is the following:

$$f(x_0 + w) \approx f(x_0) + \langle\begin{bmatrix}48x^3 - 16xy + 2x - 2\\
2y - 8x^2\end{bmatrix}, w\rangle + \frac12 \langle\begin{bmatrix}
144x^2 - 16y + 2 & -16x \\ -16x & 2\end{bmatrix}w, w\rangle$$

where the gradient and the Hessian are evaluated at $x_0 = (x, y)$.
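
As a quick numerical cross-check of the gradient and Hessian above, the sketch
below compares them against finite differences. Note the objective `f` is not
restated in this section, so the expression below is an assumption: it is
reconstructed, up to an additive constant, by integrating the stated gradient.

```python
def f(x, y):
    # Reconstructed objective (assumption): antiderivative of the gradient
    return 12 * x ** 4 - 8 * x ** 2 * y + x ** 2 - 2 * x + y ** 2

def grad(x, y):
    # Gradient as stated in the expansion above
    return (48 * x ** 3 - 16 * x * y + 2 * x - 2, 2 * y - 8 * x ** 2)

def hess(x, y):
    # Hessian as in the expansion above, with -16x off-diagonal entries
    return ((144 * x ** 2 - 16 * y + 2, -16 * x), (-16 * x, 2))

def fd_check(x, y, h=1e-4):
    # Central finite differences for the gradient...
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    # ...and second differences for the Hessian entries
    hxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    hyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    hxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return (gx, gy), ((hxx, hxy), (hxy, hyy))
```

Evaluating `fd_check` at any sample point reproduces `grad` and `hess` to
finite-difference accuracy, which in particular confirms the $-16x$
off-diagonal terms.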

### Minimization

The code used to minimize the function can be found in the MATLAB script
`main.m` under section 2.4. The resulting minimizer (found in 10 iterations) is:

$$x_m = \begin{bmatrix}1\\4\end{bmatrix}$$

### Energy landscape

The following figure shows a `surf` plot of the objective function overlaid
with the iterates used to reach the minimizer:

![Energy landscape of the function overlaid with iterates and steps (the white
dot is $x_0$ while the black dot is $x_m$)](./2-4-energy.png)

The code used to generate this plot can be found in the MATLAB script `main.m`
under section 2.4c.

## Point 5

### Minimization

The code used to minimize the function can be found in the MATLAB script
`main.m` under section 2.5. The resulting minimizer (found in 25 iterations) is:

$$x_m = \begin{bmatrix}1\\5\end{bmatrix}$$

### Energy landscape

The following figure shows a `surf` plot of the objective function overlaid
with the iterates used to reach the minimizer:

![Energy landscape of the Rosenbrock function overlaid with iterates and steps
(the white dot is $x_0$ while the black dot is $x_m$)](./2-5-energy.png)

The code used to generate this plot can be found in the MATLAB script `main.m`
under section 2.5b.

### Gradient norms

The following figure shows the norm of the gradient (on a logarithmic scale)
w.r.t. the iteration number:

![Gradient norms (y-axis, log-scale) w.r.t. iteration number
(x-axis)](./2-5-gnorms.png)

The code used to generate this plot can be found in the MATLAB script `main.m`
under section 2.5c.

Comparing the behaviour shown above with the figures obtained in the previous
assignment for the Newton method with backtracking and for gradient descent
with backtracking, we notice that the trust-region method indeed behaves like a
compromise between the two. First of all, TR converges in 25 iterations, almost
double the number of iterations of regular Newton's method with backtracking.
The shape of the curve is somewhat similar to the Newton gradient-norm curve
with respect to the presence of spikes, which are however less evident in the
trust-region curve (probably because the trust-region method alternates
quadratic steps with linear or almost-linear steps while iterating). Finally,
we notice that TR is the only method where neighbouring iterations can have the
exact same norm: this is probably due to some proposed iteration steps not
being validated by the acceptance criterion, which makes the method not move
for some iterations.