hw4: done 1
parent abb4acaa1c
commit cf07f851a5
5 changed files with 145 additions and 5 deletions
100
Claudio_Maggioni_4/Claudio_Maggioni_4.md
Normal file
@@ -0,0 +1,100 @@
<!-- vim: set ts=2 sw=2 et tw=80: -->

---
title: Midterm -- Optimization Methods
author: Claudio Maggioni
header-includes:
- \usepackage{amsmath}
- \usepackage{hyperref}
- \usepackage[utf8]{inputenc}
- \usepackage[margin=2.5cm]{geometry}
- \usepackage[ruled,vlined]{algorithm2e}
- \usepackage{float}
- \floatplacement{figure}{H}
- \hypersetup{colorlinks=true,linkcolor=blue}

---
\maketitle

# Exercise 1

## Exercise 1.1

The Lagrangian is the following:

$$L(X,\lambda) = f(X) - \lambda \left(c(X) - 0\right) = -3x^2 + y^2 + 2z^2 +
2(x+y+z) - \lambda x^2 - \lambda y^2 -\lambda z^2 + \lambda =$$$$= (-3 -\lambda)x^2 + (1-
\lambda)y^2 + (2-\lambda)z^2 + 2 (x+y+z) + \lambda$$

The KKT conditions are the following:

First we have the condition on the partial derivatives of the Lagrangian w.r.t.
$X$:

$$\nabla_X L(X,\lambda) = 2\begin{bmatrix}(-3-\lambda)x^* + 1\\(1-\lambda)y^* +
1\\(2-\lambda)z^* + 1\end{bmatrix} = 0 \Leftrightarrow
\begin{bmatrix}x^*\\y^*\\z^*\end{bmatrix} =
\begin{bmatrix}\frac1{3+\lambda}\\-\frac1{1-\lambda}\\-\frac{1}{2-\lambda}\end{bmatrix}$$

Then we have the conditions on the equality constraint:

$$c(X^*) = {x^*}^2 + {y^*}^2 + {z^*}^2 - 1 = 0 \Leftrightarrow \|X^*\| = 1$$

$$\lambda^* c(X^*) = 0 \Leftarrow c(X^*) = 0 \text{ which is true if the above
condition is true.}$$

Since we have no inequality constraints, we don't need to apply the KKT
conditions related to inequality constraints.
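
The stationarity condition can be sanity-checked numerically. The Python sketch below approximates $\nabla_X L$ by central finite differences at the closed-form point and checks that it vanishes. It assumes $f(X) = -3x^2 + y^2 + 2z^2 + 2(x+y+z)$ and $c(X) = x^2 + y^2 + z^2 - 1$, as read off the Lagrangian. The function names and the test multiplier `lam = 0.5` are illustrative choices, not part of the original solution.

```python
def lagrangian(X, lam):
    """L(X, lam) = f(X) - lam * c(X), with f and c as read off the text."""
    x, y, z = X
    f = -3 * x**2 + y**2 + 2 * z**2 + 2 * (x + y + z)
    c = x**2 + y**2 + z**2 - 1
    return f - lam * c


def stationary_point(lam):
    """Closed-form solution of grad_X L = 0 (requires lam not in {-3, 1, 2})."""
    return [1 / (3 + lam), -1 / (1 - lam), -1 / (2 - lam)]


def numerical_gradient(X, lam, h=1e-6):
    """Central finite-difference approximation of grad_X L."""
    grad = []
    for i in range(3):
        forward = list(X)
        backward = list(X)
        forward[i] += h
        backward[i] -= h
        grad.append((lagrangian(forward, lam) - lagrangian(backward, lam)) / (2 * h))
    return grad


lam = 0.5  # arbitrary illustrative multiplier (an assumption, not a KKT multiplier)
X_star = stationary_point(lam)
grad = numerical_gradient(X_star, lam)
print("X* =", X_star)
print("grad_X L(X*) =", grad)
```

The gradient vanishes for any admissible multiplier, because stationarity alone does not enforce the constraint. Feasibility is imposed separately through $c(X^*) = 0$.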

## Exercise 1.2

To find candidate solutions to the problem, we apply the KKT conditions. Since we
have a way to derive $X^*$ from $\lambda^*$ thanks to the first KKT condition,
we try to find the values of $\lambda$ that satisfy the second KKT condition:

$$c(X) = \left( \frac{1}{3+\lambda} \right)^2 + \left( -\frac{1}{1-\lambda} \right)^2 +
\left(-\frac{1}{2-\lambda}\right)^2 - 1 =
\frac{1}{(3+\lambda)^2} + \frac{1}{(1-\lambda)^2} + \frac{1}{(2-\lambda)^2} - 1 =$$$$=
\frac{(1-\lambda)^2(2-\lambda)^2 + (3+\lambda)^2(2-\lambda)^2 +
(3+\lambda)^2
(1-\lambda)^2 - (3+\lambda)^2 (1-\lambda)^2 (2-\lambda)^2}{(3+\lambda)^2
(1-\lambda)^2 (2-\lambda)^2} = 0
\Leftrightarrow$$$$\Leftrightarrow
(1-\lambda)^2(2-\lambda)^2 + (3+\lambda)^2(2-\lambda)^2 +
(3+\lambda)^2
(1-\lambda)^2 - (3+\lambda)^2 (1-\lambda)^2 (2-\lambda)^2 = 0
\Leftrightarrow$$$$\Leftrightarrow
(\lambda^4 - 6\lambda^3 + 13\lambda^2 - 12\lambda + 4) +
(\lambda^4 + 2\lambda^3 - 11\lambda^2 - 12\lambda + 36) +
(\lambda^4 + 4\lambda^3 - 2\lambda^2 - 12\lambda + 9)$$$$
- (\lambda^6 - 14\lambda^4 + 12\lambda^3 + 49\lambda^2 - 84\lambda + 36) = $$$$
=-\lambda^6 +17\lambda^4 -12\lambda^3 -49\lambda^2 +48\lambda +13 = 0
\Leftrightarrow $$$$ \Leftrightarrow
\lambda = \lambda_1 \approx -0.224 \lor
\lambda = \lambda_2 \approx -1.892 \lor
\lambda = \lambda_3 \approx 3.149 \lor
\lambda = \lambda_4 \approx -4.035$$

The polynomial has degree six, but only these four roots are real.
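
The real roots can be reproduced without a symbolic toolbox. The Python sketch below evaluates the numerator of $c(X(\lambda))$ directly from its factored form, scans a grid for sign changes, and refines each bracket by bisection. The search interval $[-6, 6]$, the grid size, and the function names are assumptions made for illustration.

```python
def numerator(lam):
    """Numerator of c(X(lam)) once everything is put over a common denominator."""
    a1, a2, a3 = (3 + lam) ** 2, (1 - lam) ** 2, (2 - lam) ** 2
    return a2 * a3 + a1 * a3 + a1 * a2 - a1 * a2 * a3


def bisect(func, lo, hi, tol=1e-12):
    """Bisection on [lo, hi]; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


def real_roots(func, start=-6.0, stop=6.0, steps=1200):
    """Scan a uniform grid for sign changes and refine each bracket."""
    roots = []
    width = (stop - start) / steps
    for k in range(steps):
        lo = start + k * width
        hi = lo + width
        if (func(lo) < 0) != (func(hi) < 0):
            roots.append(bisect(func, lo, hi))
    return roots


roots = real_roots(numerator)
print([round(r, 3) for r in roots])
```

A grid scan only detects roots of odd multiplicity inside the chosen interval, which is sufficient here because all four real roots are simple.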

We then compute $X$ from each solution and evaluate the objective each time:

$$X = \begin{bmatrix}\frac1{3+\lambda}\\-\frac1{1-\lambda}\\
-\frac{1}{2-\lambda}\end{bmatrix}
\Leftrightarrow$$$$\Leftrightarrow
X = X_1 \approx \begin{bmatrix}0.360\\-0.817\\-0.450\end{bmatrix} \lor
X = X_2 \approx \begin{bmatrix}0.902\\-0.346\\-0.257\end{bmatrix} \lor
X = X_3 \approx \begin{bmatrix}0.163\\0.465\\0.870\end{bmatrix} \lor
X = X_4 \approx \begin{bmatrix}-0.966\\-0.199\\-0.166\end{bmatrix}$$

$$f(X_1) \approx -1.1304 \;\; f(X_2) \approx -1.59219 \;\; f(X_3) \approx 4.64728 \;\; f(X_4) \approx
-5.36549$$

Since the unit sphere is compact, $f$ attains its minimum on it, and that
minimizer must be one of the KKT points above. We therefore choose
$(\lambda_4, X_4)$ since $f(X_4)$ is the smallest objective value out of all
the KKT points. Therefore, the solution to the minimization problem is:

$$X \approx \begin{bmatrix}-0.966\\-0.199\\-0.166\end{bmatrix}$$
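
The selection step can be replayed in a few lines. The Python sketch below rebuilds each candidate point from the rounded multipliers quoted above, reports the constraint residual and the objective, and picks the smallest objective value. Because the multipliers are rounded to three decimals, the printed values match the quoted ones only to about two decimals. The function names are illustrative assumptions.

```python
def point_from_multiplier(lam):
    """Stationary point X(lam) obtained from the first KKT condition."""
    return (1 / (3 + lam), -1 / (1 - lam), -1 / (2 - lam))


def objective(X):
    x, y, z = X
    return -3 * x**2 + y**2 + 2 * z**2 + 2 * (x + y + z)


def constraint(X):
    x, y, z = X
    return x**2 + y**2 + z**2 - 1


# Rounded multipliers as quoted in the text (lambda_1 .. lambda_4).
multipliers = [-0.224, -1.892, 3.149, -4.035]

candidates = []
for lam in multipliers:
    X = point_from_multiplier(lam)
    candidates.append((objective(X), lam))
    print(f"lambda={lam:+.3f}  c(X)={constraint(X):+.1e}  f(X)={objective(X):+.4f}")

# Tuples compare by their first entry, so min() selects the lowest objective.
best_value, best_lambda = min(candidates)
best_point = point_from_multiplier(best_lambda)
print("minimizer:", [round(v, 3) for v in best_point])
```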
BIN
Claudio_Maggioni_4/Claudio_Maggioni_4.pdf
Normal file
Binary file not shown.
31
Claudio_Maggioni_4/gsppn.m
Normal file
@@ -0,0 +1,31 @@
clc

% Declare the symbolic multiplier before it is used to build X.
syms l

% Stationary point of the Lagrangian as a function of the multiplier l.
X = [1/(3+l); (-1)/(1-l); (-1)/(2-l)]

a1 = (3+l)^2;
a2 = (1-l)^2;
a3 = (2-l)^2;

% Numerator terms of c(X) = 1/a1 + 1/a2 + 1/a3 - 1 over the common denominator.
t1 = a2*a3
t2 = a1*a3
t3 = a1*a2
t4 = a1*a2*a3

% 'All' keeps zero coefficients and orders them from highest to lowest degree,
% so the vanishing lambda^5 term of t4 is not silently dropped.
c1 = coeffs(t1, l, 'All')
c2 = coeffs(t2, l, 'All')
c3 = coeffs(t3, l, 'All')
c4 = coeffs(t4, l, 'All')

ctot = coeffs(t1+t2+t3-t4, l, 'All')

% Real roots of the numerator are the admissible multipliers.
sol = double(solve(t1+t2+t3-t4==0, l, 'Real', true))

for i=1:size(sol, 1)
    Xi = double(subs(X,l,sol(i)))
    Li = sol(i);
    Ci = norm(Xi, 2)^2 - 1;                        % constraint residual
    Yi = sum(Xi .* Xi .* [-3;1;2]) + sum(2 * Xi);  % objective value
    fprintf("lambda=%.03f ci=%g y=%g\n", Li, Ci, Yi);
end
@@ -281,11 +281,20 @@ since the model quality is good.
 
 ### (f) Does the energy decrease monotonically when Trust Region method is employed? Justify your answer.
 
-In the trust region method the energy of the iterates does not always decrease
-monotonically. This is due to the fact that the algorithm could actively reject
-a step if the performance measure factor $\rho_k$ is less that a given constant
-$\eta$. In this case, the new iterate is equal to the old one, no step is taken
-and thus the energy norm does not decrease but stays the same.
+When using the trust region method, the energy of the iterates decreases
+monotonically (in the non-strict sense). By construction, the algorithm either
+keeps the next iterate equal to the current one (i.e. when the performance
+measure $\rho_k$ is too poor to accept a step) or applies a linear, quadratic,
+or blended descent step to the current iterate.
+
+When a step is taken, it is by definition a solution (or a close approximation
+of one) of the model minimization problem inside the trust region, and it is
+accepted only if $\rho_k$ exceeds the acceptance threshold, i.e. only if the
+energy actually decreases. Therefore, the step cannot lead to a point with
+higher energy than the current iterate.
+
+Therefore, the energy either stays constant or decreases at every single
+iteration, so it never increases.
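
The acceptance rule behind this argument can be illustrated with a small experiment. The Python sketch below runs a basic trust region loop and records the energy at every iterate, including rejected steps. It is a minimal sketch under stated assumptions, not the implementation used in the assignment: the energy is the Rosenbrock function, the step is the Cauchy point rather than the linear, quadratic, or blended step mentioned above, and the constants ($\eta = 0.1$, radius thresholds $1/4$ and $3/4$) are common textbook choices.

```python
import math


def energy(p):
    """Rosenbrock function, used here as a stand-in energy (assumption)."""
    x, y = p
    return 100 * (y - x * x) ** 2 + (1 - x) ** 2


def gradient(p):
    x, y = p
    return [-400 * x * (y - x * x) - 2 * (1 - x), 200 * (y - x * x)]


def hessian(p):
    x, y = p
    return [[1200 * x * x - 400 * y + 2, -400 * x], [-400 * x, 200]]


def mat_vec(B, v):
    return [B[0][0] * v[0] + B[0][1] * v[1], B[1][0] * v[0] + B[1][1] * v[1]]


def cauchy_step(g, B, radius):
    """Minimizer of the quadratic model along -g inside the trust region."""
    g_norm = math.hypot(g[0], g[1])
    Bg = mat_vec(B, g)
    gBg = g[0] * Bg[0] + g[1] * Bg[1]
    tau = 1.0 if gBg <= 0 else min(g_norm**3 / (radius * gBg), 1.0)
    scale = -tau * radius / g_norm
    return [scale * g[0], scale * g[1]]


def trust_region(p, radius=1.0, max_radius=2.0, eta=0.1, iterations=200):
    """Return the energy at every iterate, including rejected steps."""
    energies = [energy(p)]
    for _ in range(iterations):
        g, B = gradient(p), hessian(p)
        if math.hypot(g[0], g[1]) < 1e-10:
            break
        s = cauchy_step(g, B, radius)
        Bs = mat_vec(B, s)
        # Decrease predicted by the quadratic model; positive for a Cauchy step.
        predicted = -(g[0] * s[0] + g[1] * s[1]) - 0.5 * (s[0] * Bs[0] + s[1] * Bs[1])
        trial = [p[0] + s[0], p[1] + s[1]]
        actual = energy(p) - energy(trial)
        rho = actual / predicted
        if rho < 0.25:
            radius *= 0.25
        elif rho > 0.75 and math.hypot(s[0], s[1]) >= radius * (1 - 1e-12):
            radius = min(2 * radius, max_radius)
        if rho > eta:  # accepted: rho > 0 and predicted > 0 imply actual > 0
            p = trial
        energies.append(energy(p))  # unchanged when the step is rejected
    return energies


energies = trust_region([-1.2, 1.0])
rejected = sum(1 for a, b in zip(energies, energies[1:]) if a == b)
print(f"start={energies[0]:.4f} end={energies[-1]:.6f} rejected={rejected}")
```

Since a step is accepted only when $\rho_k > \eta \geq 0$ and the predicted decrease is positive, every accepted step lowers the energy and every rejected step leaves it unchanged, so the recorded sequence never increases.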
 
 ## Point 2
 
Binary file not shown.