diff --git a/hw1/main.pdf b/Claudio_Maggioni_1/Claudio_Maggioni_1.pdf similarity index 91% rename from hw1/main.pdf rename to Claudio_Maggioni_1/Claudio_Maggioni_1.pdf index 2bee882..2d2655c 100644 Binary files a/hw1/main.pdf and b/Claudio_Maggioni_1/Claudio_Maggioni_1.pdf differ diff --git a/hw1/main.tex b/Claudio_Maggioni_1/Claudio_Maggioni_1.tex old mode 100644 new mode 100755 similarity index 73% rename from hw1/main.tex rename to Claudio_Maggioni_1/Claudio_Maggioni_1.tex index c4e78e7..fa0cc82 --- a/hw1/main.tex +++ b/Claudio_Maggioni_1/Claudio_Maggioni_1.tex @@ -95,44 +95,34 @@ However, for if $A$ would be guaranteed to have full rank, the minimizer would b $f(x,y)$ can be written in quadratic form in the following way: -\[f(v) = \frac12 \langle\begin{bmatrix}2 & 0\\0 & 2\mu\end{bmatrix}v, v\rangle + \langle\begin{bmatrix}0\\0\end{bmatrix}, x\rangle\] +\[f(v) = v^T \begin{bmatrix}1 & 0\\0 & \mu\end{bmatrix} v + \begin{bmatrix}0\\0\end{bmatrix}^T v\] where: \[v = \begin{bmatrix}x\\y\end{bmatrix}\] -\subsection{Finding the optimal step length $\alpha$} - -Considering $p$, our search direction, as the negative of the gradient (as dictated by the gradient method), we can rewrite the problem of finding an optimal step size $\alpha$ as the problem of minimizing the objective function along the line where $p$ belongs. This can be written as minimizing a function $l(\alpha)$, where: - -\[l(\alpha) = \frac12 \langle A(x + \alpha p), x + \alpha p\rangle\] - -To minimize we compute the gradient of $l(\alpha)$ and fix it to zero to find a stationary point, finding a value for $\alpha$ in function of $A$, $x$ and $p$. - -\[l'(\alpha) = 2 \cdot \frac12 \langle A (x + \alpha p), p \rangle = \langle Ax, p \rangle + \alpha \langle Ap, p \rangle\] -\[l'(\alpha) = 0 \Leftrightarrow \alpha = \frac{\langle Ax, p \rangle}{\langle Ap, p \rangle}\] - -Since $A$ is s.p.d. by definition the hessian of function $l(\alpha)$ will always be positive, the stationary point found above is a minimizer of $l(\alpha)$ and thus the definition of $\alpha$ given above gives the optimal search step for the gradient method. - -\subsection{Matlab implementation with \texttt{surf} and \texttt{contour} plots} +\subsection{Matlab plotting with \texttt{surf} and \texttt{contour} plots} The graphs generated by MATLAB are shown below: \begin{figure}[h] \resizebox{\textwidth}{!}{\input{surf.tex}} \caption{Surf plots for different values of $\mu$} +\label{fig:surf} \end{figure} \begin{figure}[h] \resizebox{\textwidth}{!}{\input{contour.tex}} \caption{Contour plots and iteration steps. Red has $x_0 = \begin{bmatrix}10&0\end{bmatrix}^T$, yellow has $x_0 = \begin{bmatrix}10&10\end{bmatrix}^T$, and blue has $x_0 = \begin{bmatrix}0&10\end{bmatrix}^T$} +\label{fig:cont} \end{figure} \begin{figure}[h] \resizebox{\textwidth}{!}{\input{yseries.tex}} \caption{Iterations over values of the objective function. Red has $x_0 = \begin{bmatrix}10&0\end{bmatrix}^T$, yellow has $x_0 = \begin{bmatrix}10&10\end{bmatrix}^T$, and blue has $x_0 = \begin{bmatrix}0&10\end{bmatrix}^T.$} +\label{fig:obj} \end{figure} \begin{figure}[h] @@ -141,15 +131,72 @@ The graphs generated by MATLAB are shown below: exact minimizer no matter the value of $x_0$, so no gradient norm other than the very first one is recorded. 
Again, red has $x_0 = \begin{bmatrix}10&0\end{bmatrix}^T$, yellow has $x_0 = \begin{bmatrix}10&10\end{bmatrix}^T$, and blue has $x_0 = \begin{bmatrix}0&10\end{bmatrix}^T.$} +\label{fig:norm} \end{figure} -Isolines get stretched along the y axis as $\mu$ increases. For $\mu \neq 1$, points well far away from the axes are a +Isolines (found in Figure \ref{fig:cont}) get stretched along the y axis as $\mu$ increases. For $\mu \neq 1$, points far away from the axes are a problem: when search directions and steps are picked with the gradient method, the iterations zig-zag towards the minimizer and reach it slowly. -Additionally, from the \texttt{surf} plots, we can see that the behaviour of isolines is justified by a "stretching" of sorts +From the \texttt{surf} plots (Figure \ref{fig:surf}), we can see that this behaviour of the isolines is explained by a ``stretching'' of sorts of the function, which makes it steeper along the y axis as $\mu$ increases. +\subsection{Finding the optimal step length $\alpha$} + +Taking $p$, our search direction, to be the negative of the gradient (as dictated by the gradient method), we can rewrite the problem of finding an optimal step size $\alpha$ as the problem of minimizing the objective function along the line through $x$ with direction $p$. This can be written as minimizing a function $l(\alpha)$, where: + +\[l(\alpha) = \langle A(x + \alpha p), x + \alpha p\rangle\] + +To minimize, we compute the derivative of $l(\alpha)$ and set it to zero to find a stationary point, obtaining a value for $\alpha$ as a function of $A$, $x$ and $p$: + +\[l'(\alpha) = 2 \cdot \langle A (x + \alpha p), p \rangle = 2 \cdot \left( \langle Ax, p \rangle + \alpha \langle Ap, p \rangle \right)\] +\[l'(\alpha) = 0 \Leftrightarrow \alpha = -\frac{\langle Ax, p \rangle}{\langle Ap, p \rangle}\] + +Since $A$ is s.p.d., the second derivative $l''(\alpha) = 2 \langle Ap, p \rangle$ is always positive, so the stationary point found above is a minimizer of $l(\alpha)$ and the expression for $\alpha$ given above is the optimal step length for the gradient method. + +\subsection{Matlab code for the gradient method and convergence results} + +The main MATLAB file to run in order to execute the gradient method is \texttt{ex3.m}.
Convergence results and iteration counts are shown below as verbatim program output: + +\begin{verbatim}
+u= 1 x0=[ 0,10] it= 2 x=[0,0]
+u= 1 x0=[10, 0] it= 2 x=[0,0]
+u= 1 x0=[10,10] it= 2 x=[0,0]
+u= 2 x0=[ 0,10] it= 2 x=[0,0]
+u= 2 x0=[10, 0] it= 2 x=[0,0]
+u= 2 x0=[10,10] it=18 x=[4.028537e-09,-1.007134e-09]
+u= 3 x0=[ 0,10] it= 2 x=[0,0]
+u= 3 x0=[10, 0] it= 2 x=[0,0]
+u= 3 x0=[10,10] it=22 x=[1.281583e-09,-1.423981e-10]
+u= 4 x0=[ 0,10] it= 2 x=[0,0]
+u= 4 x0=[10, 0] it= 2 x=[0,0]
+u= 4 x0=[10,10] it=22 x=[2.053616e-09,-1.283510e-10]
+u= 5 x0=[ 0,10] it= 2 x=[0,0]
+u= 5 x0=[10, 0] it= 2 x=[0,0]
+u= 5 x0=[10,10] it=22 x=[1.397370e-09,-5.589479e-11]
+u= 6 x0=[ 0,10] it= 2 x=[0,0]
+u= 6 x0=[10, 0] it= 2 x=[0,0]
+u= 6 x0=[10,10] it=22 x=[7.313877e-10,-2.031632e-11]
+u= 7 x0=[ 0,10] it= 2 x=[0,0]
+u= 7 x0=[10, 0] it= 2 x=[0,0]
+u= 7 x0=[10,10] it=20 x=[3.868636e-09,-7.895176e-11]
+u= 8 x0=[ 0,10] it= 2 x=[0,0]
+u= 8 x0=[10, 0] it= 2 x=[0,0]
+u= 8 x0=[10,10] it=20 x=[2.002149e-09,-3.128358e-11]
+u= 9 x0=[ 0,10] it= 2 x=[0,0]
+u= 9 x0=[10, 0] it= 2 x=[0,0]
+u= 9 x0=[10,10] it=20 x=[1.052322e-09,-1.299163e-11]
+u=10 x0=[ 0,10] it= 2 x=[0,0]
+u=10 x0=[10, 0] it= 2 x=[0,0]
+u=10 x0=[10,10] it=20 x=[5.671954e-10,-5.671954e-12]
+\end{verbatim} + + + +\subsection{Comments on the various plots} + +The objective function plots and the gradient norm plots can be found in Figures \ref{fig:obj} and \ref{fig:norm} respectively. + What has been said before about the convergence of the gradient method is also shown in the last two sets of plots. From the objective function plot we can see that the runs starting from $\begin{bmatrix}10&10\end{bmatrix}^T$ (depicted in yellow) take the largest number of iterations to reach the minimizer (or an acceptable approximation of it). The zig-zag behaviour described before can also be observed in the contour plots, which show the iteration steps taken for each $\mu$ and each starting point $x_0$.
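For reference, the \texttt{it= 2} rows in the output above can be checked by hand: when $x_0$ lies on a coordinate axis, the steepest-descent direction is an eigenvector of $A = \begin{bmatrix}1 & 0\\0 & \mu\end{bmatrix}$, so a single exact step reaches the minimizer (the counter presumably also includes the final convergence check). As a worked instance, using $f(v) = v^T A v$, $\nabla f(v) = 2Av$ and the step length derived earlier, for $x_0 = \begin{bmatrix}10 & 0\end{bmatrix}^T$:

\[p = -\nabla f(x_0) = \begin{bmatrix}-20\\0\end{bmatrix}, \qquad \alpha = -\frac{\langle A x_0, p\rangle}{\langle A p, p\rangle} = -\frac{-200}{400} = \frac12, \qquad x_1 = x_0 + \alpha p = \begin{bmatrix}0\\0\end{bmatrix},\]

independently of $\mu$; the same computation for $x_0 = \begin{bmatrix}0 & 10\end{bmatrix}^T$ gives $\alpha = \frac{1}{2\mu}$ and again $x_1 = 0$.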
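The step length derived above plugs directly into the steepest-descent loop that \texttt{ex3.m} runs. Below is a minimal MATLAB sketch of that loop; the function name \texttt{grad\_sketch}, the tolerance on the gradient norm and the iteration cap are illustrative assumptions, not the exact structure of the submitted file.

\begin{verbatim}
% Minimal sketch of the gradient method with exact line search for
% f(v) = v' * A * v with A = diag(1, mu). Names and stopping rule are
% illustrative assumptions, not the exact contents of ex3.m.
function [x, it] = grad_sketch(mu, x0, tol, maxit)
    A = [1 0; 0 mu];
    x = x0;
    for it = 1:maxit
        g = 2 * A * x;                            % gradient of v' * A * v
        if norm(g) < tol                          % stop on a small gradient
            break
        end
        p = -g;                                   % steepest-descent direction
        alpha = -((A * x)' * p) / ((A * p)' * p); % optimal step from the derivation above
        x = x + alpha * p;                        % move to the next iterate
    end
end
\end{verbatim}

Called for instance as \texttt{[x, it] = grad\_sketch(2, [10; 10], 1e-8, 100)}, such a sketch should reproduce the qualitative behaviour reported above: starting points on a coordinate axis converge in a single step, while $\begin{bmatrix}10 & 10\end{bmatrix}^T$ zig-zags and needs on the order of twenty iterations for $\mu > 1$.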
diff --git a/hw1/contour.tex b/Claudio_Maggioni_1/contour.tex similarity index 100% rename from hw1/contour.tex rename to Claudio_Maggioni_1/contour.tex diff --git a/hw1/ex3.m b/Claudio_Maggioni_1/ex3.m similarity index 88% rename from hw1/ex3.m rename to Claudio_Maggioni_1/ex3.m index c6f71f8..9f9312a 100644 --- a/hw1/ex3.m +++ b/Claudio_Maggioni_1/ex3.m @@ -47,8 +47,8 @@ end sgtitle("Surf plots"); % comment these lines on submission -addpath /home/claudio/git/matlab2tikz/src -matlab2tikz('showInfo', false, './surf.tex') +% addpath /home/claudio/git/matlab2tikz/src +% matlab2tikz('showInfo', false, './surf.tex') figure @@ -121,8 +121,8 @@ end sgtitle("Contour plots and iteration steps"); % comment these lines on submission -addpath /home/claudio/git/matlab2tikz/src -matlab2tikz('showInfo', false, './contour.tex') +% addpath /home/claudio/git/matlab2tikz/src +% matlab2tikz('showInfo', false, './contour.tex') figure @@ -142,8 +142,8 @@ end sgtitle("Iterations over values of objective function"); % comment these lines on submission -addpath /home/claudio/git/matlab2tikz/src -matlab2tikz('showInfo', false, './yseries.tex') +% addpath /home/claudio/git/matlab2tikz/src +% matlab2tikz('showInfo', false, './yseries.tex') figure @@ -163,6 +163,6 @@ end sgtitle("Iterations over log10 of gradient norms"); % comment these lines on submission -addpath /home/claudio/git/matlab2tikz/src -matlab2tikz('showInfo', false, './norms.tex') +% addpath /home/claudio/git/matlab2tikz/src +% matlab2tikz('showInfo', false, './norms.tex') diff --git a/hw1/norms.tex b/Claudio_Maggioni_1/norms.tex similarity index 100% rename from hw1/norms.tex rename to Claudio_Maggioni_1/norms.tex diff --git a/hw1/surf.tex b/Claudio_Maggioni_1/surf.tex similarity index 100% rename from hw1/surf.tex rename to Claudio_Maggioni_1/surf.tex diff --git a/hw1/yseries.tex b/Claudio_Maggioni_1/yseries.tex similarity index 100% rename from hw1/yseries.tex rename to Claudio_Maggioni_1/yseries.tex