hw4: done 1

This commit is contained in:
Claudio Maggioni 2021-05-25 10:41:07 +02:00
parent abb4acaa1c
commit cf07f851a5
5 changed files with 145 additions and 5 deletions


@ -0,0 +1,100 @@
<!-- vim: set ts=2 sw=2 et tw=80: -->
---
title: Midterm -- Optimization Methods
author: Claudio Maggioni
header-includes:
- \usepackage{amsmath}
- \usepackage{hyperref}
- \usepackage[utf8]{inputenc}
- \usepackage[margin=2.5cm]{geometry}
- \usepackage[ruled,vlined]{algorithm2e}
- \usepackage{float}
- \floatplacement{figure}{H}
- \hypersetup{colorlinks=true,linkcolor=blue}
---
\maketitle
# Exercise 1
## Exercise 1.1
The Lagrangian of the problem, with objective $f(X) = -3x^2 + y^2 + 2z^2 + 2(x+y+z)$
and equality constraint $c(X) = x^2 + y^2 + z^2 - 1 = 0$, is the following:
$$L(X,\lambda) = f(X) - \lambda c(X) = -3x^2 + y^2 + 2z^2 +
2(x+y+z) - \lambda x^2 - \lambda y^2 -\lambda z^2 + \lambda =$$$$= (-3 -\lambda)x^2 + (1-
\lambda)y^2 + (2-\lambda)z^2 + 2 (x+y+z) + \lambda$$
The KKT conditions are the following.
First, we have the stationarity condition on the gradient of the Lagrangian
w.r.t. $X$:
$$\nabla_X L(X,\lambda) = \begin{bmatrix}2(-3-\lambda)x^* + 2\\2(1-\lambda)y^* +
2\\2(2-\lambda)z^* + 2\end{bmatrix} = 0 \Leftrightarrow
\begin{bmatrix}x^*\\y^*\\z^*\end{bmatrix} =
\begin{bmatrix}\frac1{3+\lambda}\\-\frac1{1-\lambda}\\-\frac{1}{2-\lambda}\end{bmatrix}$$
Then we have the conditions on the equality constraint:
$$c(X^*) = {x^*}^2 + {y^*}^2 + {z^*}^2 - 1 = 0 \Leftrightarrow \|X^*\| = 1$$
The condition $\lambda^* c(X^*) = 0$ is then automatically satisfied, since
$c(X^*) = 0$ whenever the condition above holds.
Since we have no inequality constraints, we don't need to apply the KKT
conditions related to inequality constraints.
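As a quick sanity check of this derivation, the following is a minimal MATLAB
sketch (not part of the original submission; the symbolic variable names are
mine) that builds the Lagrangian and solves the stationarity condition for $X$
as a function of $\lambda$:

```matlab
% Sketch: verify the stationarity condition symbolically.
% Assumes f(X) = -3x^2 + y^2 + 2z^2 + 2(x+y+z) and c(X) = x^2 + y^2 + z^2 - 1.
syms x y z l
f = -3*x^2 + y^2 + 2*z^2 + 2*(x + y + z);   % objective
c = x^2 + y^2 + z^2 - 1;                    % equality constraint
L = f - l*c;                                % Lagrangian
gradL = gradient(L, [x; y; z]);             % gradient of L w.r.t. X
sol = solve(gradL == 0, [x, y, z]);         % stationarity: grad_X L = 0
[sol.x; sol.y; sol.z]                       % expected: 1/(l+3), -1/(1-l), -1/(2-l)
```
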
## Exercise 1.2
To find feasible solutions to the problem, we apply the KKT conditions. Since we
have a way to derive $X^*$ from $\lambda^*$ thanks to the first KKT condition,
we look for the values of $\lambda$ that satisfy the second KKT condition:
$$c(X(\lambda)) = \left( \frac{1}{3+\lambda} \right)^2 + \left( -\frac{1}{1-\lambda} \right)^2 +
\left(-\frac{1}{2-\lambda}\right)^2 - 1 =
\frac{1}{(3+\lambda)^2} + \frac{1}{(1-\lambda)^2} + \frac{1}{(2-\lambda)^2} - 1 =$$$$=
\frac{(1-\lambda)^2(2-\lambda)^2 + (3+\lambda)^2(2-\lambda)^2 +
(3+\lambda)^2
(1-\lambda)^2 - (3+\lambda)^2 (1-\lambda)^2 (2-\lambda)^2}{(3+\lambda)^2
(1-\lambda)^2 (2-\lambda)^2} = 0
\Leftrightarrow$$$$\Leftrightarrow
(1-\lambda)^2(2-\lambda)^2 + (3+\lambda)^2(2-\lambda)^2 +
(3+\lambda)^2
(1-\lambda)^2 - (3+\lambda)^2 (1-\lambda)^2 (2-\lambda)^2 = 0
\Leftrightarrow$$$$\Leftrightarrow
(\lambda^4 - 6\lambda^3 + 13\lambda^2 - 12\lambda + 4) +
(\lambda^4 + 2\lambda^3 - 11\lambda^2 - 12\lambda + 36) +
(\lambda^4 + 4\lambda^3 - 2\lambda^2 - 12\lambda + 9)$$$$
+ (-\lambda^6 + 14\lambda^4 - 12\lambda^3 - 49\lambda^2 + 84\lambda - 36) = $$$$
= -\lambda^6 + 17\lambda^4 - 12\lambda^3 - 49\lambda^2 + 48\lambda + 13 = 0$$
This sixth-degree polynomial has no convenient closed-form roots, so its real
roots are computed numerically (see the accompanying MATLAB script):
$$\lambda = \lambda_1 \approx -0.224 \lor
\lambda = \lambda_2 \approx -1.892 \lor
\lambda = \lambda_3 \approx 3.149 \lor
\lambda = \lambda_4 \approx -4.035$$
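As a cross-check (this snippet is mine, using the polynomial coefficients
derived above), the same real roots can be obtained numerically with MATLAB's
`roots`:

```matlab
% Real roots of -l^6 + 17 l^4 - 12 l^3 - 49 l^2 + 48 l + 13 = 0
% (coefficients listed in descending powers of l).
r = roots([-1 0 17 -12 -49 48 13]);
real_roots = sort(real(r(abs(imag(r)) < 1e-10)))
% expected: approximately -4.035, -1.892, -0.224, 3.149
```
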
We then compute $X$ from each solution and evaluate the objective each time:
$$X = \begin{bmatrix}\frac1{3+\lambda}\\-\frac1{1-\lambda}\\
-\frac{1}{2-\lambda}\end{bmatrix}
\Leftrightarrow$$$$\Leftrightarrow
X = X_1 \approx \begin{bmatrix}0.360\\-0.817\\-0.450\end{bmatrix} \lor
X = X_2 \approx \begin{bmatrix}0.902\\-0.346\\-0.257\end{bmatrix} \lor
X = X_3 \approx \begin{bmatrix}0.163\\0.465\\0.870\end{bmatrix} \lor
X = X_4 \approx \begin{bmatrix}-0.966\\-0.199\\-0.166\end{bmatrix}$$
$$f(X_1) = -1.1304 \;\; f(X_2) = -1.59219 \;\; f(X_3) = 4.64728 \;\; f(X_4) =
-5.36549$$
We therefore choose $(\lambda_4, X_4)$, since $f(X_4)$ is the smallest objective
value among the candidate points. Since the feasible set (the unit sphere) is
compact and $f$ is continuous, a global minimizer exists and must satisfy the
KKT conditions, so it is one of the four candidates above. The solution to the
minimization problem is therefore:
$$X \approx \begin{bmatrix}-0.966\\-0.199\\-0.166\end{bmatrix}$$
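As a final numerical check (using the rounded components of $X_4$ above), the
chosen point satisfies the constraint and reproduces the reported objective
value:
$$\|X_4\|^2 \approx 0.966^2 + 0.199^2 + 0.166^2 \approx 1.000$$
$$f(X_4) \approx -3 \cdot 0.933 + 0.040 + 2 \cdot 0.028 + 2 \cdot (-1.331)
\approx -5.37$$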

Binary file not shown.


@ -0,0 +1,31 @@
clc
% Candidate solution X as a function of the Lagrange multiplier l
% (from the stationarity condition of the Lagrangian).
syms l
X = [1/(3+l); (-1)/(1-l); (-1)/(2-l)]
% Squared factors appearing in the constraint equation c(X(l)) = 0
% after clearing denominators.
a1 = (3+l)^2;
a2 = (1-l)^2;
a3 = (2-l)^2;
t1 = a2*a3
t2 = a1*a3
t3 = a1*a2
t4 = a1*a2*a3
% Polynomial coefficients (in descending powers of l) of each term and of
% the full numerator t1 + t2 + t3 - t4.
c1 = fliplr(coeffs(t1))
c2 = fliplr(coeffs(t2))
c3 = fliplr(coeffs(t3))
c4 = fliplr(coeffs(t4))
ctot = fliplr(coeffs(t1+t2+t3-t4))
% Real roots of the numerator, i.e. the values of l for which c(X(l)) = 0.
sol = double(solve(t1+t2+t3-t4==0, l, 'Real', true))
for i=1:size(sol, 1)
    Xi = double(subs(X, l, sol(i)))                  % candidate point for this root
    Li = sol(i);
    Ci = norm(Xi, 2)^2 - 1;                          % constraint residual (should be ~0)
    Yi = sum(Xi .* Xi .* [-3;1;2]) + sum(2 * Xi);    % objective value f(Xi)
    fprintf("lambda=%.03f ci=%g y=%g\n", Li, Ci, Yi);
end


@ -281,11 +281,20 @@ since the model quality is good.
### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.
When using the trust region method, the energy of the iterates never increases,
i.e. it decreases monotonically (in the non-increasing sense). By construction,
at every iteration the algorithm either rejects the step, when the performance
measure $\rho_k$ is below a given constant $\eta$, in which case the new
iterate is equal to the old one and the energy stays the same, or it accepts a
linear, quadratic, or blended descent step applied to the current iterate.
When a step is taken, the step by definition is a solution (or a close
approximation of such a solution) of the minimization of the model of the
energy inside the trust region, and it is only accepted if the actual energy
reduction it achieves is a sufficiently large fraction of the reduction
predicted by the model. Therefore, an accepted step cannot lead to a point with
higher energy than the one from the current iterate.
In conclusion, the energy either stays constant or decreases at every single
iteration, and therefore the energy decreases monotonically.
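The following is a minimal, self-contained MATLAB sketch of this acceptance
rule on a toy quadratic energy (my own illustrative example, not the
assignment's implementation); it shows that the energy of the iterates never
increases, because a step is only taken when it achieves a sufficient
reduction:

```matlab
% Toy trust-region loop: accept a step only if the measured reduction
% ratio rho exceeds eta; otherwise keep the current iterate unchanged.
f    = @(x) 0.5 * (x(1)^2 + 10*x(2)^2);       % example "energy"
grad = @(x) [x(1); 10*x(2)];
H    = [1 0; 0 10];                           % (constant) Hessian

x = [5; 1]; Delta = 0.1; eta = 0.1;
energies = f(x);
for k = 1:30
    g = grad(x);
    pN = -H \ g;                              % full model (Newton) step
    if norm(pN) <= Delta
        p = pN;                               % step fits inside the trust region
    else
        p = -Delta * g / norm(g);             % simplified fallback: steepest descent to the boundary
    end
    m = @(s) f(x) + g'*s + 0.5*s'*H*s;        % quadratic model of the energy
    rho = (f(x) - f(x + p)) / (m(zeros(2,1)) - m(p));
    if rho > eta
        x = x + p;                            % accepted: the energy decreases
    end                                       % otherwise rejected: iterate and energy stay the same
    if rho > 0.75, Delta = 2*Delta; elseif rho < 0.25, Delta = Delta/2; end
    energies(end+1) = f(x); %#ok<AGROW>
end
all(diff(energies) <= 0)                      % energy is non-increasing
```
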
## Point 2