hw1: ready for submission
This commit is contained in:
parent
2fd664a38e
commit
37d800cf67
7 changed files with 72 additions and 25 deletions
Binary file not shown.
81
hw1/main.tex → Claudio_Maggioni_1/Claudio_Maggioni_1.tex
Normal file → Executable file
|
@ -95,44 +95,34 @@ However, for if $A$ would be guaranteed to have full rank, the minimizer would b
|
|||
|
||||
$f(x,y)$ can be written in quadratic form in the following way:
|
||||
|
||||
\[f(v) = \frac12 \langle\begin{bmatrix}2 & 0\\0 & 2\mu\end{bmatrix}v, v\rangle + \langle\begin{bmatrix}0\\0\end{bmatrix}, v\rangle\]
|
||||
\[f(v) = v^T \begin{bmatrix}1 & 0\\0 & \mu\end{bmatrix} v + \begin{bmatrix}0\\0\end{bmatrix}^T v\]
|
||||
|
||||
where:
|
||||
|
||||
\[v = \begin{bmatrix}x\\y\end{bmatrix}\]
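As a quick check of the form above (a worked expansion added for illustration, assuming $\mu > 0$), multiplying out the product gives the explicit function, its gradient and its Hessian:

\[f(v) = \begin{bmatrix}x & y\end{bmatrix} \begin{bmatrix}1 & 0\\0 & \mu\end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} = x^2 + \mu y^2, \qquad \nabla f(v) = \begin{bmatrix}2x\\2\mu y\end{bmatrix}, \qquad \nabla^2 f(v) = \begin{bmatrix}2 & 0\\0 & 2\mu\end{bmatrix}\]

Under that assumption the Hessian is positive definite, so $v = \begin{bmatrix}0&0\end{bmatrix}^T$ is the unique minimizer.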
|
||||
|
||||
\subsection{Finding the optimal step length $\alpha$}
|
||||
|
||||
Taking the search direction $p$ to be the negative of the gradient (as dictated by the gradient method), finding the optimal step size $\alpha$ amounts to minimizing the objective function along the line through $x$ with direction $p$. This can be written as minimizing a function $l(\alpha)$, where:
|
||||
|
||||
\[l(\alpha) = \frac12 \langle A(x + \alpha p), x + \alpha p\rangle\]
|
||||
|
||||
To minimize, we compute the derivative of $l(\alpha)$ and set it to zero to find a stationary point, obtaining $\alpha$ as a function of $A$, $x$ and $p$.
|
||||
|
||||
\[l'(\alpha) = 2 \cdot \frac12 \langle A (x + \alpha p), p \rangle = \langle Ax, p \rangle + \alpha \langle Ap, p \rangle\]
|
||||
\[l'(\alpha) = 0 \Leftrightarrow \alpha = -\frac{\langle Ax, p \rangle}{\langle Ap, p \rangle}\]
|
||||
|
||||
Since $A$ is s.p.d., the second derivative $l''(\alpha) = \langle Ap, p \rangle$ is positive for every $p \neq 0$. The stationary point found above is therefore a minimizer of $l(\alpha)$, and the value of $\alpha$ given above is the optimal step length for the gradient method.
|
||||
|
||||
\subsection{Matlab implementation with \texttt{surf} and \texttt{contour} plots}
|
||||
\subsection{Matlab plotting with \texttt{surf} and \texttt{contour} plots}
|
||||
|
||||
The graphs generated by MATLAB are shown below:
|
||||
|
||||
\begin{figure}[h]
|
||||
\resizebox{\textwidth}{!}{\input{surf.tex}}
|
||||
\caption{Surf plots for different values of $\mu$}
|
||||
\label{fig:surf}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[h]
|
||||
\resizebox{\textwidth}{!}{\input{contour.tex}}
|
||||
\caption{Contour plots and iteration steps. Red has $x_0 = \begin{bmatrix}10&0\end{bmatrix}^T$,
|
||||
yellow has $x_0 = \begin{bmatrix}10&10\end{bmatrix}^T$, and blue has $x_0 = \begin{bmatrix}0&10\end{bmatrix}^T$}
|
||||
\label{fig:cont}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[h]
|
||||
\resizebox{\textwidth}{!}{\input{yseries.tex}}
|
||||
\caption{Iterations over values of the objective function. Red has $x_0 = \begin{bmatrix}10&0\end{bmatrix}^T$,
|
||||
yellow has $x_0 = \begin{bmatrix}10&10\end{bmatrix}^T$, and blue has $x_0 = \begin{bmatrix}0&10\end{bmatrix}^T.$}
|
||||
\label{fig:obj}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[h]
|
||||
|
@ -141,15 +131,72 @@ The graphs generated by MATLAB are shown below:
|
|||
exact minimizer no matter the value of $x_0$, so no gradient norm other than the very first one is recorded. Again,
|
||||
Red has $x_0 = \begin{bmatrix}10&0\end{bmatrix}^T$,
|
||||
yellow has $x_0 = \begin{bmatrix}10&10\end{bmatrix}^T$, and blue has $x_0 = \begin{bmatrix}0&10\end{bmatrix}^T.$}
|
||||
\label{fig:norm}
|
||||
\end{figure}
|
||||
|
||||
Isolines get stretched along the $y$ axis as $\mu$ increases. For $\mu \neq 1$, points far away from the axes are a
|
||||
Isolines (found in Figure \ref{fig:cont}) get stretched along the $y$ axis as $\mu$ increases. For $\mu \neq 1$, points far away from the axes are a
|
||||
problem, since the gradient method iterations zig-zag
|
||||
towards the minimizer, reaching it slowly.
|
||||
|
||||
Additionally, from the \texttt{surf} plots, we can see that the behaviour of isolines is explained by a ``stretching'' of sorts
|
||||
From the \texttt{surf} plots (Figure \ref{fig:surf}), we can see that the behaviour of isolines is explained by a ``stretching'' of sorts
|
||||
of the function that makes it steeper along the $y$ axis as $\mu$ increases.
|
||||
|
||||
\subsection{Finding the optimal step length $\alpha$}
|
||||
|
||||
Taking the search direction $p$ to be the negative of the gradient (as dictated by the gradient method), finding the optimal step size $\alpha$ amounts to minimizing the objective function along the line through $x$ with direction $p$. This can be written as minimizing a function $l(\alpha)$, where:
|
||||
|
||||
\[l(\alpha) = \langle A(x + \alpha p), x + \alpha p\rangle\]
|
||||
|
||||
To minimize, we compute the derivative of $l(\alpha)$ and set it to zero to find a stationary point, obtaining $\alpha$ as a function of $A$, $x$ and $p$.
|
||||
|
||||
\[l'(\alpha) = 2 \cdot \langle A (x + \alpha p), p \rangle = 2 \cdot \left( \langle Ax, p \rangle + \alpha \langle Ap, p \rangle \right)\]
|
||||
\[l'(\alpha) = 0 \Leftrightarrow \alpha = -\frac{\langle Ax, p \rangle}{\langle Ap, p \rangle}\]
|
||||
|
||||
Since $A$ is s.p.d., the second derivative $l''(\alpha) = 2 \langle Ap, p \rangle$ is positive for every $p \neq 0$. The stationary point found above is therefore a minimizer of $l(\alpha)$, and the value of $\alpha$ given above is the optimal step length for the gradient method.
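
As an illustrative worked example (a sketch, assuming the objective is $f(x) = \langle Ax, x \rangle$, so that $\nabla f(x) = 2Ax$ and $p = -2Ax$), setting $l'(\alpha) = 0$ in the expression for $l'(\alpha)$ gives $\langle Ax, p \rangle + \alpha \langle Ap, p \rangle = 0$. Substituting $Ax = -\frac12 p$ yields:

\[\alpha = -\frac{\langle Ax, p \rangle}{\langle Ap, p \rangle} = \frac{\langle p, p \rangle}{2 \langle Ap, p \rangle}\]

For instance, with $A = \begin{bmatrix}1 & 0\\0 & \mu\end{bmatrix}$ and $x = \begin{bmatrix}10&0\end{bmatrix}^T$ we get $p = \begin{bmatrix}-20&0\end{bmatrix}^T$, $\langle p, p \rangle = 400$ and $\langle Ap, p \rangle = 400$, hence $\alpha = \frac12$ and $x + \alpha p = \begin{bmatrix}0&0\end{bmatrix}^T$. Under these assumptions, a starting point lying on an axis reaches the minimizer in a single step, whatever the value of $\mu$.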
|
||||
|
||||
\subsection{Matlab code for the gradient method and convergence results}
|
||||
|
||||
The main MATLAB file to run the gradient method is \texttt{ex3.m}. Convergence results and iteration counts are reported below as verbatim program output:
|
||||
|
||||
\begin{verbatim}
|
||||
u= 1 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 1 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 1 x0=[10,10] it= 2 x=[0,0]
|
||||
u= 2 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 2 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 2 x0=[10,10] it=18 x=[4.028537e-09,-1.007134e-09]
|
||||
u= 3 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 3 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 3 x0=[10,10] it=22 x=[1.281583e-09,-1.423981e-10]
|
||||
u= 4 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 4 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 4 x0=[10,10] it=22 x=[2.053616e-09,-1.283510e-10]
|
||||
u= 5 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 5 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 5 x0=[10,10] it=22 x=[1.397370e-09,-5.589479e-11]
|
||||
u= 6 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 6 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 6 x0=[10,10] it=22 x=[7.313877e-10,-2.031632e-11]
|
||||
u= 7 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 7 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 7 x0=[10,10] it=20 x=[3.868636e-09,-7.895176e-11]
|
||||
u= 8 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 8 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 8 x0=[10,10] it=20 x=[2.002149e-09,-3.128358e-11]
|
||||
u= 9 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u= 9 x0=[10, 0] it= 2 x=[0,0]
|
||||
u= 9 x0=[10,10] it=20 x=[1.052322e-09,-1.299163e-11]
|
||||
u=10 x0=[ 0,10] it= 2 x=[0,0]
|
||||
u=10 x0=[10, 0] it= 2 x=[0,0]
|
||||
u=10 x0=[10,10] it=20 x=[5.671954e-10,-5.671954e-12]
|
||||
\end{verbatim}
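
For reference, the iteration analysed above can be sketched in a few lines of MATLAB. This is only a minimal illustration and not the contents of \texttt{ex3.m}: the variable names, the stopping tolerance \texttt{tol}, the iteration cap \texttt{maxit} and the way iterations are counted are all assumptions, so the printed iteration count need not match the output listed above.

\begin{verbatim}
% Sketch only: NOT the code in ex3.m. Names, tolerance and
% iteration cap are assumptions made for illustration.
mu = 2;                  % example value of mu
A = [1 0; 0 mu];         % f(v) = v' * A * v
x = [10; 10];            % starting point x_0
tol = 1e-8;              % assumed tolerance on the gradient norm
maxit = 1000;            % assumed iteration cap

for it = 1:maxit
    g = 2 * A * x;       % gradient of f at x
    if norm(g) <= tol
        break            % close enough to the minimizer
    end
    p = -g;              % search direction: negative gradient
    alpha = -(x' * A * p) / (p' * A * p);  % exact step length
    x = x + alpha * p;   % move along p
end

fprintf('it=%d x=[%e,%e]\n', it, x(1), x(2));
\end{verbatim}

The step length is the stationary point of $l(\alpha)$ derived earlier. Since $A$ is symmetric, $\langle Ax, p \rangle$ can be computed as \texttt{x' * A * p}.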
|
||||
|
||||
|
||||
|
||||
\subsection{Comments on the various plots}
|
||||
|
||||
The objective function plots and the gradient norm plots can be found in Figures \ref{fig:obj} and \ref{fig:norm}, respectively.
|
||||
|
||||
What has been said before about the convergence of the gradient method is also shown in the last two sets of plots.
|
||||
From the objective function plot we can see that iterations starting from $\begin{bmatrix}10&10\end{bmatrix}^T$ (depicted in yellow) take the largest number of iterations to reach the minimizer (or an acceptable approximation of it). The zig-zag behaviour described before can also be observed in the contour plots, which show the iteration steps taken for each $\mu$ and each starting point $x_0$.
|
||||
|
|
@ -47,8 +47,8 @@ end
|
|||
sgtitle("Surf plots");
|
||||
|
||||
% comment these lines on submission
|
||||
addpath /home/claudio/git/matlab2tikz/src
|
||||
matlab2tikz('showInfo', false, './surf.tex')
|
||||
% addpath /home/claudio/git/matlab2tikz/src
|
||||
% matlab2tikz('showInfo', false, './surf.tex')
|
||||
|
||||
figure
|
||||
|
||||
|
@ -121,8 +121,8 @@ end
|
|||
sgtitle("Contour plots and iteration steps");
|
||||
|
||||
% comment these lines on submission
|
||||
addpath /home/claudio/git/matlab2tikz/src
|
||||
matlab2tikz('showInfo', false, './contour.tex')
|
||||
% addpath /home/claudio/git/matlab2tikz/src
|
||||
% matlab2tikz('showInfo', false, './contour.tex')
|
||||
|
||||
figure
|
||||
|
||||
|
@ -142,8 +142,8 @@ end
|
|||
sgtitle("Iterations over values of objective function");
|
||||
|
||||
% comment these lines on submission
|
||||
addpath /home/claudio/git/matlab2tikz/src
|
||||
matlab2tikz('showInfo', false, './yseries.tex')
|
||||
% addpath /home/claudio/git/matlab2tikz/src
|
||||
% matlab2tikz('showInfo', false, './yseries.tex')
|
||||
|
||||
figure
|
||||
|
||||
|
@ -163,6 +163,6 @@ end
|
|||
sgtitle("Iterations over log10 of gradient norms");
|
||||
|
||||
% comment these lines on submission
|
||||
addpath /home/claudio/git/matlab2tikz/src
|
||||
matlab2tikz('showInfo', false, './norms.tex')
|
||||
% addpath /home/claudio/git/matlab2tikz/src
|
||||
% matlab2tikz('showInfo', false, './norms.tex')
|
||||
|