---
header-includes:
  - \usepackage{amsmath}
  - \usepackage{hyperref}
  - \usepackage[utf8]{inputenc}
  - \usepackage[margin=2.5cm]{geometry}
---

\title{Midterm -- Optimization Methods}
\author{Claudio Maggioni}
\maketitle

# Exercise 1

## Point 1

### Question (a)

As already covered in the course, the gradient of a standard quadratic form at a point $x_0$ is:

$$ \nabla f(x_0) = A x_0 - b $$

Plugging in the definition of $x_0$ and using the fact that $\nabla f(x_m) = A x_m - b = 0$ (by the first-order necessary condition for a minimizer), we obtain:

$$ \nabla f(x_0) = A (x_m + v) - b = A x_m + A v - b = b + \lambda v - b = \lambda v $$

### Question (b)

The steepest descent (SD) method reaches the exact minimizer $x_m$ in exactly one iteration when started from the point $x_0$. To prove this, first note that $x_m$ lies on the line traced by the first descent direction, namely:

$$ g(\alpha) = x_0 - \alpha \cdot \nabla f(x_0) = x_0 - \alpha \lambda v $$

For $\alpha = \frac{1}{\lambda}$, and plugging in the definition $x_0 = x_m + v$, the new iterate $x_1$ is:

$$ x_1 = x_0 - \alpha \lambda v = x_0 - v = x_m + v - v = x_m $$

It remains to explain why the SD algorithm would indeed choose $\alpha = \frac{1}{\lambda}$. Recall that SD with exact line search chooses $\alpha$ by solving a one-dimensional minimization problem along the step direction. Since $x_m$ is the global minimizer of $f$ and lies on that line, $f(x_m)$ is strictly less than $f(x_0 - \alpha \lambda v)$ for any $\alpha \neq \frac{1}{\lambda}$, so the line search returns exactly $\alpha = \frac{1}{\lambda}$. Therefore $x_1 = x_m$, and SD converges to the minimizer in one iteration.

## Point 2

The right answer is choice (a), since the energy norm of the error indeed decreases monotonically at every iteration.
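
Both claims of Point 1 can also be checked numerically. Below is a minimal NumPy sketch; the matrix $A$, vector $b$, and eigenvector $v$ are assumed example data chosen for illustration, not part of the exam statement. It verifies that $\nabla f(x_0) = \lambda v$ and that one exact-line-search steepest descent step from $x_0 = x_m + v$ lands on $x_m$.

```python
import numpy as np

# Assumed example data (not from the exam text):
# A is SPD, e1 is an eigenvector of A with eigenvalue lambda = 3.
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
b = np.array([3.0, 2.0])
lam = 3.0
v = np.array([1.0, 0.0])             # eigenvector of A, A @ v = lam * v

x_m = np.linalg.solve(A, b)          # minimizer of f(x) = 1/2 x^T A x - b^T x
x0 = x_m + v

g = A @ x0 - b                       # gradient at x0
print(np.allclose(g, lam * v))       # True: gradient equals lambda * v

alpha = (g @ g) / (g @ (A @ g))      # exact line-search step for a quadratic
x1 = x0 - alpha * g                  # one steepest descent iteration

print(np.isclose(alpha, 1 / lam))    # True: alpha = 1/lambda
print(np.allclose(x1, x_m))          # True: the minimizer is reached in one step
```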