diff --git a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
new file mode 100644
index 0000000..c7b3d7c
--- /dev/null
+++ b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.md
@@ -0,0 +1,58 @@
+
+
+---
+header-includes:
+- \usepackage{amsmath}
+- \usepackage{hyperref}
+- \usepackage[utf8]{inputenc}
+- \usepackage[margin=2.5cm]{geometry}
+---
+\title{Midterm -- Optimization Methods}
+\author{Claudio Maggioni}
+\maketitle
+
+# Exercise 1
+
+## Point 1
+
+### Question (a)
+
+As already covered in the course, the gradient of a standard quadratic form at a
+point $x_0$ is equal to:
+
+$$ \nabla f(x_0) = A x_0 - b $$
+
+Plugging in the definition of $x_0$ and knowing that $\nabla f(x_m) = A x_m - b = 0$
+(according to the first necessary condition for a minimizer), we obtain:
+
+$$ \nabla f(x_0) = A (x_m + v) - b = A x_m + A v - b = b + \lambda v - b =
+\lambda v $$
+
+### Question (b)
+
+The steepest descent method takes exactly one iteration to reach the exact
+minimizer $x_m$ starting from the point $x_0$. This can be proven by first
+noticing that $x_m$ lies on the line traced by the first descent direction,
+whose displacement from $x_0$ is equal to:
+
+$$g(\alpha) = - \alpha \cdot \nabla f(x_0) = - \alpha \lambda v$$
+
+For $\alpha = \frac{1}{\lambda}$, and plugging in the definition of $x_0 = x_m +
+v$, we would reach a new iterate $x_1$ equal to:
+
+$$x_1 = x_0 - \alpha \lambda v = x_0 - v = x_m + v - v = x_m $$
+
+The only question that we need to answer now is why the SD algorithm would
+indeed choose $\alpha = \frac{1}{\lambda}$. To answer this, we recall that the
+SD algorithm chooses $\alpha$ by solving a one-dimensional minimization problem
+along the step direction. Since we know $x_m$ is indeed the minimizer, $f(x_m)$
+is strictly less than any other $f(x_1 = x_0 - \alpha \lambda v)$ with
+$\alpha \neq \frac{1}{\lambda}$.
+
+Therefore, since $x_1 = x_m$, we have proven that SD
+converges to the minimizer in one iteration.
+
+## Point 2
+
+The right answer is choice (a), since the energy norm of the error indeed always
+decreases monotonically.
diff --git a/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf
new file mode 100644
index 0000000..b262bee
Binary files /dev/null and b/Claudio_Maggioni_midterm/Claudio_Maggioni_midterm.pdf differ
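
The one-step convergence argued in Question (b) of the added file can be sanity-checked numerically. The sketch below is illustrative and not part of the patch: it assumes the quadratic form $f(x) = \frac{1}{2} x^T A x - b^T x$ with a symmetric positive definite $A$, and the values of `A`, `b` and `v` are hypothetical example data chosen so that `v` is an eigenvector of `A`. It uses the standard exact line search step $\alpha = \frac{g^T g}{g^T A g}$, which for $g = \lambda v$ reduces to $\frac{\lambda^2 v^T v}{\lambda^3 v^T v} = \frac{1}{\lambda}$.

```python
from fractions import Fraction


def mat_vec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(u, w):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, w))


def steepest_descent_step(A, b, x):
    """One SD step with exact line search on f(x) = 1/2 x^T A x - b^T x.

    Returns the new iterate and the step length alpha = (g^T g) / (g^T A g).
    """
    g = [ax - bi for ax, bi in zip(mat_vec(A, x), b)]  # gradient A x - b
    alpha = dot(g, g) / dot(g, mat_vec(A, g))
    return [xi - alpha * gi for xi, gi in zip(x, g)], alpha


# Hypothetical example data (not from the midterm): A is symmetric positive
# definite, and v = (1, 1) is an eigenvector of A with eigenvalue lam = 3.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
b = [Fraction(1), Fraction(0)]
x_m = [Fraction(2, 3), Fraction(-1, 3)]  # solves A x_m = b
v = [Fraction(1), Fraction(1)]
lam = Fraction(3)

# Start one eigenvector away from the minimizer and take a single SD step.
x_0 = [m + vi for m, vi in zip(x_m, v)]
x_1, alpha = steepest_descent_step(A, b, x_0)
```

With exact rational arithmetic, `alpha` equals `1 / lam` and `x_1` equals `x_m` for this data, matching the claim that a single step reaches the minimizer when $x_0 - x_m$ is an eigenvector of $A$.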