diff --git a/Claudio_Maggioni_2/Claudio_Maggioni_2.pdf b/Claudio_Maggioni_2/Claudio_Maggioni_2.pdf
index 0d02e9e..a709578 100644
Binary files a/Claudio_Maggioni_2/Claudio_Maggioni_2.pdf and b/Claudio_Maggioni_2/Claudio_Maggioni_2.pdf differ
diff --git a/Claudio_Maggioni_2/Claudio_Maggioni_2.tex b/Claudio_Maggioni_2/Claudio_Maggioni_2.tex
index 8a2184a..024e451 100755
--- a/Claudio_Maggioni_2/Claudio_Maggioni_2.tex
+++ b/Claudio_Maggioni_2/Claudio_Maggioni_2.tex
@@ -2,6 +2,7 @@
 \usepackage[utf8]{inputenc}
 \usepackage{float}
 \usepackage{graphicx}
+\usepackage[ruled,vlined]{algorithm2e}
 \usepackage{subcaption}
 \usepackage{amsmath}
 \usepackage{pgfplots}
@@ -72,7 +73,7 @@ Here $\phi(x) = 0$ would mean that $x$ is an exact solution of the system $Ax =
 \[x = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\;\;\]
 
-which is indeed the minimizer and solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function and we can say an energy function for the problem exists.
+which is indeed the minimizer and solution of $Ax = b$. Although this $\phi(x)$ is not written in the standard quadratic form $\frac12 x^T A x - b^T x$, it is a valid energy function, and an analogous $\phi(x)$ can be derived for any other value of $N$. Therefore, the problem has an energy function.
 
 \subsection{Once the new matrix has been derived, write the energy function related to the new problem
 and the corresponding gradient and Hessian.}
@@ -87,18 +88,33 @@ And since the objective is a standard quadratic form, the gradient is $\overline
 \subsection{Write the Conjugate Gradient algorithm in the pdf and implement it Matlab code in a function
 called \texttt{CGSolve}.}
 
-See page 112 (133 for pdf) for the algorithm implementation
+The Conjugate Gradient algorithm is the following:
 
- The solution of this task can be found in Section 1.3 of the script \texttt{main.m} under the function \texttt{CGSolve}.
+\begin{algorithm}[H]
+\SetAlgoLined
+Set $r_0 \gets Ax_0 - b$, $p_0 \gets -r_0$, $k \gets 0$\;
+\While{$r_k \neq 0$}{%
+  $\alpha_k \gets \frac{r^T_kr_k}{p^T_kAp_k}$\;
+  $x_{k+1} \gets x_k + \alpha_k p_k$\;
+  $r_{k+1} \gets r_k + \alpha_k A p_k$\;
+  $\beta_{k+1} \gets \frac{r^T_{k+1}r_{k+1}}{r^T_kr_k}$\;
+  $p_{k+1} \gets -r_{k+1} + \beta_{k+1}p_k$\;
+  $k \gets k + 1$\;
+}
+ \caption{Conjugate Gradient algorithm}
+\end{algorithm}
+
+The MATLAB solution of this task can be found in Section 1.3 of the script \texttt{main.m} under the function \texttt{CGSolve}.
 
 \subsection{Solve the Poisson problem.}
 
 The solution of this task can be found in Section 1.4 of the script \texttt{main.m}.
+Due to space constraints, the solution vector $x \in R^{1000}$ is not reproduced here; it can be inspected by running the script and examining the variable \texttt{x} after execution.
 
 \subsection{Plot the value of energy function and the norm of the gradient (here,
 use semilogy) as functions of the iterations.}
 
-The solution of this task can be found in Section 1.5 of the script \texttt{main.m}.
+The code to generate the plots below can be found in Section 1.5 of the script \texttt{main.m}.
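+
+For reference, a minimal MATLAB sketch of a CG solver in the spirit of \texttt{CGSolve} is shown below; the function name, its signature and the bookkeeping of the objective values and gradient norms used for the plots are illustrative assumptions and may differ from the actual implementation in \texttt{main.m}.
+
+\begin{verbatim}
+function [x, obvalues, gnorms] = CGSolveSketch(A, b, x0, tol, maxit)
+% Conjugate Gradient sketch (cf. Nocedal & Wright, Algorithm 5.2) that also
+% records phi(x) = 0.5*x'*A*x - b'*x and ||grad phi(x)|| = ||A*x - b|| at
+% every iteration, so that they can be plotted afterwards.
+x = x0;
+r = A * x - b;                     % gradient of phi at the current iterate
+p = -r;                            % first search direction
+obvalues = 0.5 * x' * A * x - b' * x;
+gnorms = norm(r);
+for k = 1:maxit
+    if norm(r) <= tol              % residual small enough: stop
+        break
+    end
+    Ap = A * p;
+    alpha = (r' * r) / (p' * Ap);  % exact line search step length
+    x = x + alpha * p;
+    rnew = r + alpha * Ap;         % updated residual (= new gradient)
+    beta = (rnew' * rnew) / (r' * r);
+    p = -rnew + beta * p;          % next A-conjugate search direction
+    r = rnew;
+    obvalues(end + 1) = 0.5 * x' * A * x - b' * x;
+    gnorms(end + 1) = norm(r);
+end
+end
+
+% Possible usage for the plots of Section 1.5:
+%   [x, obvalues, gnorms] = CGSolveSketch(A, b, zeros(size(b)), 1e-8, length(b));
+%   plot(obvalues); figure; semilogy(gnorms);
+\end{verbatim}
+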
 \begin{figure}[H]
   \begin{subfigure}{0.5\textwidth}
@@ -123,10 +139,10 @@ Because theorem 5.3 holds, which itself holds mainly because of this (5.10, page
 Consider the linear system $Ax = b$, where the matrix $A$ is constructed in three different ways:
 
 \begin{itemize}
- \item $A =$ diag([1:10])
- \item $A =$ diag(ones(1,10))
- \item $A =$ diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10])
- \item $A =$ diag([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0])
+ \item $A_1 =$ diag([1:10])
+ \item $A_2 =$ diag(ones(1,10))
+ \item $A_3 =$ diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10])
+ \item $A_4 =$ diag([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0])
 \end{itemize}
 
 \subsection{How many distinct eigenvalues has each matrix?}
@@ -138,11 +154,60 @@ elements on its diagonal. So, in order, each A has respectively 10, 1, 5, and 10
 Conjugate Gradient method to solve the system for each $A$.}
 
 The solution of this task can be found in section 2.2 of the \texttt{main.m} MATLAB script.
+Below are the chosen vector $b$ (which is the same across executions, since the random number generator seed is fixed in the script) and the solutions $x_1, \ldots, x_4$ obtained for each matrix.
+
+\[b = \begin{bmatrix}0.814723686393179\\
+0.905791937075619\\
+0.126986816293506\\
+0.913375856139019\\
+0.632359246225410\\
+0.0975404049994095\\
+0.278498218867048\\
+0.546881519204984\\
+0.957506835434298\\
+0.964888535199277\end{bmatrix}\;
+x_1 = \begin{bmatrix}0.814723686393179\\
+0.452895968537810\\
+0.0423289387645020\\
+0.228343964034755\\
+0.126471849245082\\
+0.0162567341665680\\
+0.0397854598381500\\
+0.0683601899006230\\
+0.106389648381589\\
+0.0964888535199280\end{bmatrix}\;x_2 = \begin{bmatrix}0.814723686393179\\
+0.905791937075619\\
+0.126986816293506\\
+0.913375856139019\\
+0.632359246225410\\
+0.0975404049994095\\
+0.278498218867048\\
+0.546881519204984\\
+0.957506835434298\\
+0.964888535199277\end{bmatrix}\]\[x_3 = \begin{bmatrix}0.814723686393179\\
+0.905791937075619\\
+0.126986816293506\\
+0.304458618713007\\
+0.158089811556352\\
+0.0195080809998819\\
+0.0556996437734097\\
+0.109376303840997\\
+0.0957506835434296\\
+0.0964888535199280\end{bmatrix}\;x_4 = \begin{bmatrix}0.740657896716011\\
+0.754826614267239\\
+0.0976821653904972\\
+0.652411326111541\\
+0.421572830214428\\
+0.0609627567866090\\
+0.163822480881755\\
+0.303823066390868\\
+0.503950965995613\\
+0.482444267601989\end{bmatrix}\]
 
 \subsection{Compute the logarithm energy norm of the error for each matrix
 and plot it with respect to the number of iteration.}
 
-The solution of this task can be found in section 2.3 of the \texttt{main.m} MATLAB script.
+The code to generate the plots below and to compute the logarithm energy norm of the error can be found in section 2.3 of the \texttt{main.m} MATLAB script.
 
 \begin{figure}[H]
   \begin{subfigure}{0.5\textwidth}
diff --git a/Claudio_Maggioni_2/main.m b/Claudio_Maggioni_2/main.m
index 1436180..441f466 100644
--- a/Claudio_Maggioni_2/main.m
+++ b/Claudio_Maggioni_2/main.m
@@ -3,20 +3,10 @@
 %
 % Note: exercises are not in the right order due to matlab constraints of
 % functions inside of scripts.
-
-A = [1 0 0 0;
- -1 2 -1 0;
- 0 -1 2 1;
- 0 0 0 1];
-
-b = [0; 1/9; 1/9; 0];
-
-A \ b
-
 clear
 clc
 close all
-plots = 0;
+plots = 1;
 
 %% 1.4 - Solution for 1D Poisson for N=1000 using CG
diff --git a/Claudio_Maggioni_2/smarthut.md b/Claudio_Maggioni_2/smarthut.md
new file mode 100644
index 0000000..03572f8
--- /dev/null
+++ b/Claudio_Maggioni_2/smarthut.md
@@ -0,0 +1,247 @@
+\documentclass{scrartcl}
+\usepackage[utf8]{inputenc}
+\usepackage{float}
+\usepackage{graphicx}
+\usepackage[ruled,vlined]{algorithm2e}
+\usepackage{subcaption}
+\usepackage{amsmath}
+\usepackage{pgfplots}
+\pgfplotsset{compat=newest}
+\usetikzlibrary{plotmarks}
+\usetikzlibrary{arrows.meta}
+\usepgfplotslibrary{patchplots}
+\usepackage{grffile}
+\usepackage{amsmath}
+\usepackage{subcaption}
+\usepgfplotslibrary{external}
+\tikzexternalize
+\usepackage[margin=2.5cm]{geometry}
+
+% To compile:
+% sed -i 's#title style={font=\\bfseries#title style={yshift=1ex, font=\\tiny\\bfseries#' *.tex
+% luatex -enable-write18 -shellescape main.tex
+
+\pgfplotsset{every x tick label/.append style={font=\tiny, yshift=0.5ex}}
+\pgfplotsset{every title/.append style={font=\tiny, align=center}}
+\pgfplotsset{every y tick label/.append style={font=\tiny, xshift=0.5ex}}
+\pgfplotsset{every z tick label/.append style={font=\tiny, xshift=0.5ex}}
+
+\setlength{\parindent}{0cm}
+\setlength{\parskip}{0.5\baselineskip}
+
+\title{Optimization methods -- Homework 2}
+\author{Claudio Maggioni}
+
+\begin{document}
+
+\maketitle
+
+\section{Exercise 1}
+
+\subsection{Implement the matrix $A$ and the vector $b$, for the moment, without taking into consideration the
+boundary conditions. As you can see, the matrix $A$ is not symmetric. Does an energy function of
+the problem exist? Consider $N = 4$ and show your answer, explaining why it can or cannot exist.}
+
+The implementation of a function that generates $A$ w.r.t. the parameter $N$ can be found in the MATLAB script
+\texttt{main.m} under Section 1.1.
+
+The matrix $A$ and the vector $b$ appear in the following form:
+
+\[A = \begin{bmatrix}
+  1 & 0 & 0 & 0 & \ldots & 0 \\
+  -1 & 2 & -1 & 0 & \ldots & 0 \\
+  0 & -1 & 2 & -1 & \ldots & 0 \\
+  \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
+  0 & 0 & 0 & 0 & \ldots & 1 \\
+ \end{bmatrix}\;\;\;b = \begin{bmatrix}0\\h^2\\h^2\\\vdots\\0\end{bmatrix}\]
+
+For $N = 4$, we can attempt to build an energy function whose minimizer solves the system $Ax = b$. In order to find an $x$ such that $Ax = b$, we can define $\phi(x)$ such that:
+
+\[\phi(x) = \|b - Ax\|^2\]
+
+Here $\phi(x) = 0$ would mean that $x$ is an exact solution of the system $Ax = b$. We can then write this function explicitly for $N = 4$:
+
+\[\phi(x) = \|b - Ax\|^2 = \left\|\begin{bmatrix}0 - x_1\\\frac19 -x_1 + 2x_2 -x_3\\
+\frac19 -x_2 +2x_3 -x_4\\0 - x_4\end{bmatrix}\right\|^2 =\]\[= x_1^2 + \left(\frac19 - x_1 + 2x_2 - x_3\right)^2 + \left(\frac19 - x_2 + 2x_3 - x_4\right)^2 + x_4^2\]
+
+\[\nabla \phi(x) = \begin{bmatrix}4x_1 - 4x_2 + 2x_3 -\frac29\\
+-4x_1 +10x_2 -8x_3 + 2x_4 +\frac29\\
+2x_1 -8x_2 +10x_3 -4x_4 +\frac29\\
+2x_2 - 4x_3 + 4x_4 -\frac29\end{bmatrix}\;\;\;\nabla^2 \phi(x) = \begin{bmatrix}4&-4&2&0\\-4&10&-8&2\\2&-8&10&-4\\0&2&-4&4\end{bmatrix}\]
+
+As can be seen from this calculation, the Hessian is constant and positive definite for all $x$. This means, by the second-order sufficient conditions, that we can find the minimizer by solving $\nabla \phi(x) = 0$ (i.e. by finding the stationary points of $\phi(x)$).
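+
+This claim can be verified numerically with a short side computation (not part of \texttt{main.m}; the variable names are illustrative):
+
+\begin{verbatim}
+A = [1 0 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 0 1];  % the N = 4 matrix above
+b = [0; 1/9; 1/9; 0];
+H = 2 * (A' * A);     % Hessian of phi(x) = ||b - A*x||^2, as computed above
+eig(H)                % all eigenvalues are strictly positive, so H is SPD
+x = H \ (2 * A' * b)  % stationary point: solves grad phi(x) = 2*A'*(A*x - b) = 0
+\end{verbatim}
+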
+Solving $\nabla \phi(x) = 0$, we find:
+
+\[x = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\;\;\]
+
+which is indeed the minimizer and solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function, and an energy function for the problem exists.
+
+\subsection{Once the new matrix has been derived, write the energy function related to the new problem
+and the corresponding gradient and Hessian.}
+
+By the definitions of $A$ and $b$ given in the assignment, we already enforce $x_1 = x_n = 0$: the first and the last entry of the matrix-vector product $Ax$ are $x_1$ and $x_n$ respectively, so the system of equations represented by $Ax = b$ includes the equalities $x_1 = b_1 = 0$ and $x_n = b_n = 0$. Therefore, we can simply alter the matrix $A$ without performing any other transformation of the system.
+
+We therefore define $\overline{A}$ as a copy of $A$ where $\overline{A}_{2,1} = \overline{A}_{n-1, n} = 0$.
+
+The objective function then becomes $\phi(x) = \frac12 x^T \overline{A} x - b^T x$.
+And since the objective is a standard quadratic form, the gradient is $\overline{A}x - b$ while the Hessian is $\overline{A}$.
+
+\subsection{Write the Conjugate Gradient algorithm in the pdf and implement it Matlab code in a function
+called \texttt{CGSolve}.}
+
+The Conjugate Gradient algorithm is the following:
+
+\begin{algorithm}[H]
+\SetAlgoLined
+Set $r_0 \gets Ax_0 - b$, $p_0 \gets -r_0$, $k \gets 0$\;
+\While{$r_k \neq 0$}{%
+  $\alpha_k \gets \frac{r^T_kr_k}{p^T_kAp_k}$\;
+  $x_{k+1} \gets x_k + \alpha_k p_k$\;
+  $r_{k+1} \gets r_k + \alpha_k A p_k$\;
+  $\beta_{k+1} \gets \frac{r^T_{k+1}r_{k+1}}{r^T_kr_k}$\;
+  $p_{k+1} \gets -r_{k+1} + \beta_{k+1}p_k$\;
+  $k \gets k + 1$\;
+}
+\caption{Conjugate Gradient algorithm}
+\end{algorithm}
+
+The MATLAB solution of this task can be found in Section 1.3 of the script \texttt{main.m} under the function \texttt{CGSolve}.
+
+\subsection{Solve the Poisson problem.}
+
+The solution of this task can be found in Section 1.4 of the script \texttt{main.m}.
+Due to space constraints, the solution vector $x \in R^{1000}$ is not reproduced here; it can be inspected by running the script and examining the variable \texttt{x} after execution.
+
+\subsection{Plot the value of energy function and the norm of the gradient (here,
+use semilogy) as functions of the iterations.}
+
+The code to generate the plots below can be found in Section 1.5 of the script \texttt{main.m}.
+
+\begin{figure}[H]
+  \begin{subfigure}{0.5\textwidth}
+    \resizebox{\textwidth}{\textwidth}{\input{obvalues}}
+    \caption{Objective function values w.r.t. iteration number}
+  \end{subfigure}
+  \begin{subfigure}{0.5\textwidth}
+    \resizebox{\textwidth}{\textwidth}{\input{gnorms}}
+    \caption{Norm of the gradient w.r.t.
+iteration number \\ (y-axis is log scaled)}
+  \end{subfigure}
+  \caption{Plots for Exercise 1.4.}
+\end{figure}
+
+\subsection{Finally, explain why the Conjugate Gradient method is a Krylov subspace method.}
+
+The Conjugate Gradient method is a Krylov subspace method because Theorem 5.3 holds. The theorem follows mainly from the residual update formula (5.10, page 106 [127]):
+
+\[r_{k+1} = r_k + \alpha_k A p_k\]
+
+which, together with the update of the search directions, implies that $r_k$ and $p_k$ lie in the Krylov subspace $\mathrm{span}\{r_0, Ar_0, \ldots, A^k r_0\}$, and hence that $x_k \in x_0 + \mathrm{span}\{r_0, Ar_0, \ldots, A^{k-1} r_0\}$.
+
+\section{Exercise 2}
+
+Consider the linear system $Ax = b$, where the matrix $A$ is constructed in four different ways:
+
+\begin{itemize}
+ \item $A_1 =$ diag([1:10])
+ \item $A_2 =$ diag(ones(1,10))
+ \item $A_3 =$ diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10])
+ \item $A_4 =$ diag([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0])
+\end{itemize}
+
+\subsection{How many distinct eigenvalues has each matrix?}
+
+Each matrix has a number of distinct eigenvalues equal to the number of distinct
+elements on its diagonal. So, in order, the matrices have respectively 10, 1, 5, and 10 distinct eigenvalues.
+
+\subsection{Construct a right-hand side $b=$rand(10,1) and apply the
+Conjugate Gradient method to solve the system for each $A$.}
+
+The solution of this task can be found in section 2.2 of the \texttt{main.m} MATLAB script.
+Below are the chosen vector $b$ (which is the same across executions, since the random number generator seed is fixed in the script) and the solutions $x_1, \ldots, x_4$ obtained for each matrix.
+
+\[b = \begin{bmatrix}0.814723686393179\\
+0.905791937075619\\
+0.126986816293506\\
+0.913375856139019\\
+0.632359246225410\\
+0.0975404049994095\\
+0.278498218867048\\
+0.546881519204984\\
+0.957506835434298\\
+0.964888535199277\end{bmatrix}\;
+x_1 = \begin{bmatrix}0.814723686393179\\
+0.452895968537810\\
+0.0423289387645020\\
+0.228343964034755\\
+0.126471849245082\\
+0.0162567341665680\\
+0.0397854598381500\\
+0.0683601899006230\\
+0.106389648381589\\
+0.0964888535199280\end{bmatrix}\;x_2 = \begin{bmatrix}0.814723686393179\\
+0.905791937075619\\
+0.126986816293506\\
+0.913375856139019\\
+0.632359246225410\\
+0.0975404049994095\\
+0.278498218867048\\
+0.546881519204984\\
+0.957506835434298\\
+0.964888535199277\end{bmatrix}\]\[x_3 = \begin{bmatrix}0.814723686393179\\
+0.905791937075619\\
+0.126986816293506\\
+0.304458618713007\\
+0.158089811556352\\
+0.0195080809998819\\
+0.0556996437734097\\
+0.109376303840997\\
+0.0957506835434296\\
+0.0964888535199280\end{bmatrix}\;x_4 = \begin{bmatrix}0.740657896716011\\
+0.754826614267239\\
+0.0976821653904972\\
+0.652411326111541\\
+0.421572830214428\\
+0.0609627567866090\\
+0.163822480881755\\
+0.303823066390868\\
+0.503950965995613\\
+0.482444267601989\end{bmatrix}\]
+
+\subsection{Compute the logarithm energy norm of the error for each matrix
+and plot it with respect to the number of iteration.}
+
+The code to generate the plots below and to compute the logarithm energy norm of the error can be found in section 2.3 of the \texttt{main.m} MATLAB script.
+
+\begin{figure}[H]
+  \begin{subfigure}{0.5\textwidth}
+    \resizebox{\textwidth}{!}{\input{a1}}
+    \caption{First matrix}
+  \end{subfigure}
+  \begin{subfigure}{0.5\textwidth}
+    \resizebox{\textwidth}{!}{\input{a2}}
+    \caption{Second matrix}
+  \end{subfigure}
+  \begin{subfigure}{0.5\textwidth}
+    \resizebox{\textwidth}{!}{\input{a3}}
+    \caption{Third matrix}
+  \end{subfigure}
+  \begin{subfigure}{0.5\textwidth}
+    \resizebox{\textwidth}{!}{\input{a4}}
+    \caption{Fourth matrix}
+  \end{subfigure}
+  \caption{Plots of the logarithm of the energy norm of the error per iteration.
+Iterations where the error is exactly zero (logarithm equal to minus infinity) are not shown in the plots.}
+\end{figure}
+
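+For reference, the quantity plotted above could be computed with a sketch along the following lines (the use of \texttt{rng(0)}, the tolerance and the backslash-based reference solution are illustrative assumptions; the actual code in \texttt{main.m} may differ):
+
+\begin{verbatim}
+A3 = diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10]);    % third test matrix
+rng(0);                                         % fixed seed; should reproduce b above
+b  = rand(10, 1);
+xs = A3 \ b;                                    % reference (exact) solution
+x  = zeros(10, 1);
+r  = A3 * x - b;
+p  = -r;
+logerr = [];
+while norm(r) > 1e-8
+    Ap = A3 * p;
+    alpha = (r' * r) / (p' * Ap);
+    x = x + alpha * p;
+    rnew = r + alpha * Ap;
+    p = -rnew + ((rnew' * rnew) / (r' * r)) * p;
+    r = rnew;
+    e = x - xs;                                 % error of the current iterate
+    logerr(end + 1) = log10(sqrt(e' * A3 * e)); % logarithm of the energy norm
+end
+plot(logerr);                                   % one curve of the figure above
+\end{verbatim}
+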
+\subsection{Comment on the convergence of the method for the different matrices. What can you say observing
+the number of iterations obtained and the number of clusters of the eigenvalues of the related
+matrix?}
+
+The method converges quickly for each matrix. The fastest convergence happens for $A_2$, which is
+the identity matrix and therefore makes the problem $Ax = b$ trivial: a single iteration suffices.
+
+For all the other matrices, we observe the energy norm of the error decreasing exponentially as the iterations
+increase, eventually reaching $0$ in the cases where the method converges exactly (namely for matrices $A_1$ and $A_3$).
+
+For all but the fourth matrix, the number of iterations is exactly equal
+to the number of distinct eigenvalues of the matrix. The exception for the fourth matrix is simply due to the
+tolerance-based termination condition being satisfied at an earlier iteration, i.e. we terminate early because we find an
+approximation of $x$ with residual norm below $10^{-8}$. This is consistent with the clustering of the eigenvalues of $A_4$,
+which all lie in the small interval $[1.1, 2.0]$: the Conjugate Gradient method tends to need roughly as many iterations as
+there are clusters of eigenvalues.
+
+\end{document}