hw2: done all but 1.6

This commit is contained in:
Claudio Maggioni (maggicl) 2021-04-10 13:20:09 +02:00
parent fe6d660e32
commit c7f4f98737
4 changed files with 322 additions and 20 deletions

View file

@ -2,6 +2,7 @@
\usepackage[utf8]{inputenc}
\usepackage{float}
\usepackage{graphicx}
\usepackage[ruled,vlined]{algorithm2e}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{pgfplots}
@ -72,7 +73,7 @@ Here $\phi(x) = 0$ would mean that $x$ is an exact solution of the system $Ax =
\[x = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\;\;\]
which is indeed the minimizer and solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function and we can say an energy function for the problem exists.
which is indeed the minimizer and solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function. Although the $\phi(x)$ given here is not strictly in a matrix vector product quadratic form, it is indeed a valid energy function and for different values of $N$ similar fashioned $\phi(x)$ could be derived. Therefore, we can say that the problem has an energy function.
\subsection{Once the new matrix has been derived, write the energy function related to the new problem
and the corresponding gradient and Hessian.}
@ -87,18 +88,33 @@ And since the objective is a standard quadratic form, the gradient is $\overline
\subsection{Write the Conjugate Gradient algorithm in the pdf and implement it Matlab code in a function
called \texttt{CGSolve}.}
See page 112 (133 for pdf) for the algorithm implementation
The Conjugate Gradient algorithm is the following:
The solution of this task can be found in Section 1.3 of the script \texttt{main.m} under the function \texttt{CGSolve}.
\begin{algorithm}[H]
\SetAlgoLined
Set $r_0 \gets Ax_0 - b$, $p_0 \gets r_0$, $k \gets 0$\;
\While{$r_k \neq 0$}{%
$\alpha_k \gets \frac{r^T_kr_k}{p^T_kAp_k}$\;
$x_{k+1} \gets x_k + \alpha_k p_k$\;
$r_{k+1} \gets r_k + \alpha_k A p_k$\;
$\beta_{k+1} \gets \frac{r^T_{k+1}r_{k+1}}{r^T_kr_k}$\;
$p_{k+1} \gets -r_{k+1} + \beta_{k+1}p_k$\;
$k \gets k + 1$\;
}
\caption{Conjugate Gradient algorithm}
\end{algorithm}
The MATLAB solution of this task can be found in Section 1.3 of the script \texttt{main.m} under the function \texttt{CGSolve}.
\subsection{Solve the Poisson problem.}
The solution of this task can be found in Section 1.4 of the script \texttt{main.m}.
Due to space constraints, the $R^{1000}$ column vector for the solution of $x$ is not shown here but can be easily derived by running the script and checking the variable \texttt{x} after the script execution.
\subsection{Plot the value of energy function and the norm of the gradient (here,
use semilogy) as functions of the iterations.}
The solution of this task can be found in Section 1.5 of the script \texttt{main.m}.
The code to generate the plots below can be found in Section 1.5 of the script \texttt{main.m}.
\begin{figure}[H]
\begin{subfigure}{0.5\textwidth}
@ -123,10 +139,10 @@ Because theorem 5.3 holds, which itself holds mainly because of this (5.10, page
Consider the linear system $Ax = b$, where the matrix $A$ is constructed in three different ways:
\begin{itemize}
\item $A =$ diag([1:10])
\item $A =$ diag(ones(1,10))
\item $A =$ diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10])
\item $A =$ diag([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0])
\item $A_1 =$ diag([1:10])
\item $A_2 =$ diag(ones(1,10))
\item $A_3 =$ diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10])
\item $A_4 =$ diag([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0])
\end{itemize}
\subsection{How many distinct eigenvalues has each matrix?}
@ -138,11 +154,60 @@ elements on its diagonal. So, in order, each A has respectively 10, 1, 5, and 10
Conjugate Gradient method to solve the system for each $A$.}
The solution of this task can be found in section 2.2 of the \texttt{main.m} MATLAB script.
Below are the chosen $b$ vector (which is unchanged between executions due to fixing the random generator seed in the script) and the solutions of $x$ for each matrix respectively.
\[b = \begin{bmatrix}0.814723686393179\\
0.905791937075619\\
0.126986816293506\\
0.913375856139019\\
0.632359246225410\\
0.0975404049994095\\
0.278498218867048\\
0.546881519204984\\
0.957506835434298\\
0.964888535199277\end{bmatrix}\;
x_1 = \begin{bmatrix}0.814723686393179\\
0.452895968537810\\
0.0423289387645020\\
0.228343964034755\\
0.126471849245082\\
0.0162567341665680\\
0.0397854598381500\\
0.0683601899006230\\
0.106389648381589\\
0.0964888535199280\end{bmatrix}\;x_2 = \begin{bmatrix}0.814723686393179\\
0.905791937075619\\
0.126986816293506\\
0.913375856139019\\
0.632359246225410\\
0.0975404049994095\\
0.278498218867048\\
0.546881519204984\\
0.957506835434298\\
0.964888535199277\end{bmatrix}\]\[x_3 = \begin{bmatrix}0.814723686393179\\
0.905791937075619\\
0.126986816293506\\
0.304458618713007\\
0.158089811556352\\
0.0195080809998819\\
0.0556996437734097\\
0.109376303840997\\
0.0957506835434296\\
0.0964888535199280\end{bmatrix}\;x_4 = \begin{bmatrix}0.740657896716011\\
0.754826614267239\\
0.0976821653904972\\
0.652411326111541\\
0.421572830214428\\
0.0609627567866090\\
0.163822480881755\\
0.303823066390868\\
0.503950965995613\\
0.482444267601989\end{bmatrix}\]
\subsection{Compute the logarithm energy norm of the error for each matrix
and plot it with respect to the number of iteration.}
The solution of this task can be found in section 2.3 of the \texttt{main.m} MATLAB script.
The code to generate the plots below and to compute the logarithm energy norm of the error can be found in section 2.3 of the \texttt{main.m} MATLAB script.
\begin{figure}[H]
\begin{subfigure}{0.5\textwidth}

View file

@ -3,20 +3,10 @@
%
% Note: exercises are not in the right order due to matlab constraints of
% functions inside of scripts.
A = [1 0 0 0;
-1 2 -1 0;
0 -1 2 1;
0 0 0 1];
b = [0; 1/9; 1/9; 0];
A \ b
clear
clc
close all
plots = 0;
plots = 1;
%% 1.4 - Solution for 1D Poisson for N=1000 using CG

View file

@ -0,0 +1,247 @@
\documentclass{scrartcl}
\usepackage[utf8]{inputenc}
\usepackage{float}
\usepackage{graphicx}
\usepackage[ruled,vlined]{algorithm2e}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}
\usetikzlibrary{plotmarks}
\usetikzlibrary{arrows.meta}
\usepgfplotslibrary{patchplots}
\usepackage{grffile}
\usepackage{amsmath}
\usepackage{subcaption}
\usepgfplotslibrary{external}
\tikzexternalize
\usepackage[margin=2.5cm]{geometry}
% To compile:
% sed -i 's#title style={font=\\bfseries#title style={yshift=1ex, font=\\tiny\\bfseries#' *.tex
% luatex -enable-write18 -shellescape main.tex
\pgfplotsset{every x tick label/.append style={font=\tiny, yshift=0.5ex}}
\pgfplotsset{every title/.append style={font=\tiny, align=center}}
\pgfplotsset{every y tick label/.append style={font=\tiny, xshift=0.5ex}}
\pgfplotsset{every z tick label/.append style={font=\tiny, xshift=0.5ex}}
\setlength{\parindent}{0cm}
\setlength{\parskip}{0.5\baselineskip}
\title{Optimization methods -- Homework 2}
\author{Claudio Maggioni}
\begin{document}
\maketitle
\section{Exercise 1}
\subsection{Implement the matrix $A$ and the vector $b$, for the moment, without taking into consideration the
boundary conditions. As you can see, the matrix $A$ is not symmetric. Does an energy function of
the problem exist? Consider $N = 4$ and show your answer, explaining why it can or cannot exist.}
The implementation of a function that generates $A$ w.r.t. the parameter $N$ can be found in the MATLAB script
\texttt{main.m} under Section 1.1.
The matrix $A$ and the vector $b$ appear in the following form:
\[A = \begin{bmatrix}
1 & 0 & 0 & 0 & \ldots & 0 \\
-1 & 2 & -1 & 0 & \ldots & 0 \\
0 & -1 & 2 & -1 & \ldots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \ldots & 1 \\
\end{bmatrix}\;\;\;b = \begin{bmatrix}0\\h^2\\h^2\\\vdots\\0\end{bmatrix}\]
For $N = 4$, we can attempt to build a minimizer to the solve the system $Ax = b$. In order to find an $x$ such that $Ax = b$, we could define a minimizer $\phi(x)$ such that:
\[\phi(x) = \|b - Ax\|^2\]
Here $\phi(x) = 0$ would mean that $x$ is an exact solution of the system $Ax = b$. We can then attempt to write such minimizer for $N = 4$:
\[\phi(x) = \|b - Ax\|^2 = \left|\begin{bmatrix}0 - x_1\\\frac19 -x_1 + 2x_2 -x_3\\
\frac19 -x_2 +2x_3 -x_4\\0 - x_4\end{bmatrix}\right|^2 =\]\[= x_1^2 + \left(\frac19 - x_1 + 2x_2 - x_3\right)^2 + \left(\frac19 - x_2 + 2x_3 - x_4\right)^2 + x_4^2\]
\[\Delta \phi(x) = \begin{bmatrix}4x_1 - 4x_2 + 2x_3 -\frac29\\
-4x_1 +10x_2 -8x_3 + 2x_4 +\frac29\\
2x_1 -8x_2 +10x_3 + -4x_4 +\frac29\\
2x_2 - 4x_3 + 4x_4 -\frac29\end{bmatrix}\;\;\;\Delta^2 \phi(x) = \begin{bmatrix}4&-4&2&0\\-4&10&-8&2\\2&-8&10&-4\\0&2&-4&4\end{bmatrix}\]
As it can be seen from the Hessian calculation, the Hessian is positive definite forall $x$s. This means, by the sufficient condition of minimizers, that we can find a minimizer by solving $\Delta \phi(x) = 0$ (i.e. finding stationary points in the hyperspace defined by $\phi(x)$. Solving that, we find:
\[x = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\;\;\]
which is indeed the minimizer and solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function and we can say an energy function for the problem exists.
\subsection{Once the new matrix has been derived, write the energy function related to the new problem
and the corresponding gradient and Hessian.}
As by the definitions of $A$ and $b$ given in the assignment, we already enforce $x_1 = x_n = 0$, since the first and the last term of the matrix-vector product $Ax$ are $x_1$ and $x_n$ respectively and the system of equations represented by $Ax = b$ would indeed include the equalities $x_1 = b_1 = 0$ and $x_n = b_n = 0$. Therefore, we can simply alter the matrix $A$ without any need to perform any other transformation in the system.
We therefore define $\overline{A}$ as a copy of $A$ where $\overline{A}_{2,1} = \overline{A}_{n-1, n} = 0$.
The objective function then becomes $\phi(x) = \frac12 x^T \overline{A} x - b^T x$.
And since the objective is a standard quadratic form, the gradient is $\overline{A}x - b$ while the Hessian is $\overline{A}$.
\subsection{Write the Conjugate Gradient algorithm in the pdf and implement it Matlab code in a function
called \texttt{CGSolve}.}
The Conjugate Gradient algorithm is the following:
\begin{algorithm}[H]
\SetAlgoLined
\KwResult{Conjugate Gradient Algorithm}
Set $r_0 \gets Ax_0 - b$, $p_0 \gets r_0$, $k \gets 0$\;
\While{$r_k \neq 0$}{%
$\alpha_k \gets \frac{r^T_kr_k}{p^T_kAp_k}$\;
$x_{k+1} \gets x_k + \alpha_k p_k$\;
$r_{k+1} \gets r_k + \alpha_k A p_k$\;
$\beta_{k+1} \gets \frac{r^T_{k+1}r_{k+1}}{r^T_kr_k}$\;
$p_{k+1} \gets -r_{k+1} + \beta_{k+1}p_k$\;
$k \gets k + 1$\;
}
\end{algorithm}
The MATLAB solution of this task can be found in Section 1.3 of the script \texttt{main.m} under the function \texttt{CGSolve}.
\subsection{Solve the Poisson problem.}
The solution of this task can be found in Section 1.4 of the script \texttt{main.m}.
Due to space constraints, the $R^{1000}$ column vector for the solution of $x$ is not shown here but can be easily derived by running the script and checking the variable \texttt{x} after the script execution.
\subsection{Plot the value of energy function and the norm of the gradient (here,
use semilogy) as functions of the iterations.}
The code to generate the plots below can be found in Section 1.5 of the script \texttt{main.m}.
\begin{figure}[H]
\begin{subfigure}{0.5\textwidth}
\resizebox{\textwidth}{\textwidth}{\input{obvalues}}
\caption{Objective function values w.r.t. iteration number}
\end{subfigure}
\begin{subfigure}{0.5\textwidth}
\resizebox{\textwidth}{\textwidth}{\input{gnorms}}
\caption{Norm of the gradient w.r.t. iteration number \\ (y-axis is log scaled)}
\end{subfigure}
\caption{Plots for Exercise 1.4.}
\end{figure}
\subsection{Finally, explain why the Conjugate Gradient method is a Krylov subspace method.}
Because theorem 5.3 holds, which itself holds mainly because of this (5.10, page 106 [127]):
\[r_{k+1} = r_k + a_k * A * p_k\]
\section{Exercise 2}
Consider the linear system $Ax = b$, where the matrix $A$ is constructed in three different ways:
\begin{itemize}
\item $A_1 =$ diag([1:10])
\item $A_2 =$ diag(ones(1,10))
\item $A_3 =$ diag([1, 1, 1, 3, 4, 5, 5, 5, 10, 10])
\item $A_4 =$ diag([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0])
\end{itemize}
\subsection{How many distinct eigenvalues has each matrix?}
Each matrix has a distinct number of eigenvalues equal to the number of distinct
elements on its diagonal. So, in order, each A has respectively 10, 1, 5, and 10 distinct eigenvalues.
\subsection{Construct a right-hand side $b=$rand(10,1) and apply the
Conjugate Gradient method to solve the system for each $A$.}
The solution of this task can be found in section 2.2 of the \texttt{main.m} MATLAB script.
Below are the chosen $b$ vector (which is unchanged between executions due to fixing the random generator seed in the script) and the solutions of $x$ for each matrix respectively.
\[b = \begin{bmatrix}0.814723686393179\\
0.905791937075619\\
0.126986816293506\\
0.913375856139019\\
0.632359246225410\\
0.0975404049994095\\
0.278498218867048\\
0.546881519204984\\
0.957506835434298\\
0.964888535199277\end{bmatrix}\;
x_1 = \begin{bmatrix}0.814723686393179\\
0.452895968537810\\
0.0423289387645020\\
0.228343964034755\\
0.126471849245082\\
0.0162567341665680\\
0.0397854598381500\\
0.0683601899006230\\
0.106389648381589\\
0.0964888535199280\end{bmatrix}\;x_2 = \begin{bmatrix}0.814723686393179\\
0.905791937075619\\
0.126986816293506\\
0.913375856139019\\
0.632359246225410\\
0.0975404049994095\\
0.278498218867048\\
0.546881519204984\\
0.957506835434298\\
0.964888535199277\end{bmatrix}\]\[x_3 = \begin{bmatrix}0.814723686393179\\
0.905791937075619\\
0.126986816293506\\
0.304458618713007\\
0.158089811556352\\
0.0195080809998819\\
0.0556996437734097\\
0.109376303840997\\
0.0957506835434296\\
0.0964888535199280\end{bmatrix}\;x_4 = \begin{bmatrix}0.740657896716011\\
0.754826614267239\\
0.0976821653904972\\
0.652411326111541\\
0.421572830214428\\
0.0609627567866090\\
0.163822480881755\\
0.303823066390868\\
0.503950965995613\\
0.482444267601989\end{bmatrix}\]
\subsection{Compute the logarithm energy norm of the error for each matrix
and plot it with respect to the number of iteration.}
The code to generate the plots below and to compute the logarithm energy norm of the error can be found in section 2.3 of the \texttt{main.m} MATLAB script.
\begin{figure}[H]
\begin{subfigure}{0.5\textwidth}
\resizebox{\textwidth}{!}{\input{a1}}
\caption{First matrix}
\end{subfigure}
\begin{subfigure}{0.5\textwidth}
\resizebox{\textwidth}{!}{\input{a2}}
\caption{Second matrix}
\end{subfigure}
\begin{subfigure}{0.5\textwidth}
\resizebox{\textwidth}{!}{\input{a3}}
\caption{Third matrix}
\end{subfigure}
\begin{subfigure}{0.5\textwidth}
\resizebox{\textwidth}{!}{\input{a4}}
\caption{Fourth matrix}
\end{subfigure}
\caption{Plots of logarithm energy norm of the error per iteration. Minus infinity logarithms not shown in the plot.}
\end{figure}
\subsection{Comment on the convergence of the method for the different matrices. What can you say observing
the number of iterations obtained and the number of clusters of the eigenvalues of the related
matrix?}
The method converges quickly for each matrix. The fastest convergence surely happens for $A2$, which is
the identity matrix and therefore makes the $Ax = b$ problem trivial.
For all the other matrices, we observe the energy norm of the error decreasing exponentially as the iterations
increase, eventually reaching $0$ for the cases where the method converges exactly (namely on matrices $A1$ and $A3$).
Other than for the fourth matrix, the number of iterations is exactly equal
to the number of distinct eigenvalues for the matrix. That exception on the fourth matrix is simply due to the
tolerance termination condition holding true for an earlier iteration, i.e. we terminate early since we find an
approximation of $x$ with residual norm below $10^{-8}$.
\end{document}