NC/mp5/Project_5_Maggioni_Claudio.tex

\documentclass[unicode,11pt,a4paper,oneside,numbers=endperiod,openany]{scrartcl}

\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{amsmath}
\input{assignment.sty}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}
\usetikzlibrary{plotmarks}
\usetikzlibrary{arrows.meta}
\usepgfplotslibrary{patchplots}
\usepackage{grffile}
\usepackage{amsmath}
\usepackage{subcaption}
\usepgfplotslibrary{external}
\tikzexternalize

\begin{document}

\setassignment
\setduedate{Wednesday, December 02, 2020, 11:59 PM}

\serieheader{Numerical Computing}{2020}{Student: Claudio Maggioni}{Discussed with: Gianmarco De Vita (3.3)}{Solution for Project 5}{}
\newline

\assignmentpolicy

The purpose of this assignment is to gain insight on the theoretical and
numerical properties of the Conjugate Gradient method. Here we use this method
in an image processing application with the goal of deblur an image given the
exact (noise-free) blurred image and the original transformation matrix. Note
that the ``noise-free" simplification is essential for us to solve this problem
in the scope of this assignment.

\section{General Questions [10 points]}
\subsection{What is the size of the matrix $A$?}
$A$ is an $n^2$ by $n^2$ matrix, where $n$ is the width and height in pixels
of the image to transform.

\subsection{How many diagonals bands does $A$ have?}
$A$ as $d^2$ diagonal bands, where $d$ is strictly an order of magnitude below
$n$.

\subsection{What is the length of the vectorized blur image $b$?}
$b$ is the row-vectorized form of the image pixel matrix, and thus has
dimensions 1 by $n^2$.

\section{Properties of A [10 points]}
\subsection{ If $A$ is not symmetric, how would this affect $\tilde{A}$?}
If $A$ were not symmetric, then $\tilde{A}$ would not be positive definite since
by definition $\tilde{A} = A A^T$, thus not satifying the assumptions taken when
solving the system.

\subsection{ Explain why solving $Ax = b$ for $x$ is equivalent to
minimizing $x^T A x - b^T x$ over $x$.}

First, we can say that:

\[f(x) = \frac12 x^T A x - b^T x = \frac12\langle Ax,x \rangle -
\langle b,x \rangle\]

Then by taking the derivative of $f(x)$ w.r.t. $x$ we have (assuming $A$ is
\textit{spd}, which it is):

\[f'(x) = \frac12 A^T x + \frac12Ax - b = Ax - b\]

which for $f'(x) = 0$ will be equivalent to solving $Ax = b$. By taking the
second derivative we have:

\[f''(x) = A > 0\]

since $A$ is positive definite. Therefore, we can say that the absolute minima
of $f(x)$ is the solution for $Ax = b$.

\section{Conjugate Gradient [40 points]}
\subsection{ Write a function for the conjugate gradient solver
\texttt{[x,rvec]=myCG(A,b,x0,max\_itr,tol)}, where \texttt{x}
and \texttt{rvec} are, respectively, the solution value and a
vector containing the residual at every iteration.}

The implementation can be found in file \texttt{myCG.m} in the source directory.
The test code for the function \texttt{myCG} can be found in the \texttt{test.m} file.

\subsection{ In order to validate your implementation, solve the system
defined by \texttt{A\_test.mat} and \texttt{b\_test.mat}. Plot
the convergence (residual vs iteration).}

The plot of the squared residual 2-norms over all iterations can be found in Figure
\ref{fig:plot1}.

\begin{figure}[h]
	\centering
	\resizebox{0.6\textwidth}{!}{\input{test_semilogy}}
	\caption{Semilog plot of the plot of the squared residual 2-norms over all iterations}\label{fig:plot1}
\end{figure}

\subsection{ Plot the eigenvalues of \texttt{A\_test.mat} and comment on the
condition number and convergence rate.}

The eigenvalues of A can be found in figure
\ref{fig:plot2}.

\begin{figure}[h]
	\centering
	\resizebox{0.6\textwidth}{!}{\input{A_eig}}
	\caption{Semilog plot of the eigenvalues of A}\label{fig:plot2}
\end{figure}

The condition number for matrix $A$ according to \texttt{cond(...)} is $\approx 1.3700 \cdot 10^6$,
which is rather ill conditioned. The eigenvalue plot agrees with the condition number, by showing that there are significant differences in the magnitude of the eigenvalues of \texttt{A\_test}.

The two methods agree due to the fact that the condition number is computed by using singular values,
which in turn are derived by eigenvalues. This fact was demonstrated practically by Dr. Edoardo Vecchi, \textit{future PhD}
using this MATLAB snippet.

\begin{verbatim}
X = A_test'*A_test;
singular_values = sqrt(eig(X));
norm(sort(singular_values,'descend') - svd(A_test))/numel(A_test)
\end{verbatim}

The final norm is less than $10^{12}$, showning that the definition of
\texttt{singular\_values}, which is a generalization of the computation of
eigenvalues for rectangular matrices, is equivalent to MATLAB's \texttt{svd(...)} other than approximation errors.


\section{Debluring problem [40 points]}
\subsection{ Solve the debluring problem for the blurred image matrix
\texttt{B.mat} and transformation matrix \texttt{A.mat} using
your routine \texttt{myCG} and Matlab's preconditioned conjugate
gradient \texttt{pcg}. As a preconditioner, use \texttt{ichol}
to get the incomplete Cholesky factors and set routine type to
\texttt{nofill} with $\alpha=0.01$ for the diagonal shift (see
Matlab documentation). Solve the system with both solvers using
$max\_iter=200$ $tol= 10^{-6}$. Plot the convergence (residual
vs iteration) of each solver and display the original and final
deblurred image.}

My implementation is in the file \texttt{deblurring.m}. Plots for the original image, the deblurred image from
\texttt{myCG}, the deblurred image from \texttt{pcg}, and a semi-logarithmic plot on the y-axis of the residuals
from the two conjugate gradient functions over the iteration count can be found respectively in figure \ref{fig:orig},
\ref{fig:mycg}, \ref{fig:pcg}, and \ref{fig:rvec}.

As a terminating condition for the \texttt{myCG} implementation, I chose to check if the residual divided by the 2-norm of
$b$ is less than the given tolerance. This mimicks the behaviour of MATLAB's \texttt{pcg} to allow for a better comparison.

\begin{figure}[h]
	\begin{subfigure}{0.33\textwidth}
	\centering
		\resizebox{0.8\textwidth}{!}{\input{img_orig}}
	\caption{Original image grayscale matrix}\label{fig:orig}
	\end{subfigure}
	\begin{subfigure}{0.33\textwidth}
	\centering
		\resizebox{0.8\textwidth}{!}{\input{img_my}}
	\caption{Deblurred image using \texttt{myCG}}\label{fig:mycg}
	\end{subfigure}
	\begin{subfigure}{0.33\textwidth}
	\centering
		\resizebox{0.8\textwidth}{!}{\input{img_rcg}}
	\caption{Deblurred image using \texttt{rcg}}\label{fig:pcg}
	\end{subfigure}
	\caption{Blurred and deblurred images}
\end{figure}
\begin{figure}[h]
	\centering
	\resizebox{0.6\textwidth}{!}{\input{res_log}}
	\caption{Residuals of \texttt{myCG} (in blue) and \texttt{rcg} (in orange) over iteration count (y axis is a log scale)}\label{fig:rvec}
\end{figure}
\subsection{ When would \texttt{pcg} be worth the added computational cost?
What about if you are debluring lots of images with the same
blur operator?}

The \textit{pcg} algorithm provided by MATLAB would be worth for many deblurring operations useing the same blur operator, since the cost of computing the incomplete cholesky decomposition (i.e. \texttt{ichol}) can be payed only once and thus amortized. \texttt{myCG} is better for few iterations thanks to not needing any seed that is expensive to compute.


\end{document}
mp5: done 1-3 2020-11-25 14:20:13 +00:00			`\documentclass[unicode,11pt,a4paper,oneside,numbers=endperiod,openany]{scrartcl}`

			`\usepackage{graphicx}`
			`\usepackage{subcaption}`
			`\usepackage{amsmath}`
			`\input{assignment.sty}`
			`\usepackage{pgfplots}`
			`\pgfplotsset{compat=newest}`
			`\usetikzlibrary{plotmarks}`
			`\usetikzlibrary{arrows.meta}`
			`\usepgfplotslibrary{patchplots}`
			`\usepackage{grffile}`
			`\usepackage{amsmath}`
			`\usepackage{subcaption}`
			`\usepgfplotslibrary{external}`
			`\tikzexternalize`

			`\begin{document}`

			`\setassignment`
			`\setduedate{Wednesday, December 02, 2020, 11:59 PM}`

mp5: 3.3 correction 2020-12-02 12:27:51 +00:00			`\serieheader{Numerical Computing}{2020}{Student: Claudio Maggioni}{Discussed with: Gianmarco De Vita (3.3)}{Solution for Project 5}{}`
mp5: done 1-3 2020-11-25 14:20:13 +00:00			`\newline`

			`\assignmentpolicy`

			`The purpose of this assignment is to gain insight on the theoretical and`
			`numerical properties of the Conjugate Gradient method. Here we use this method`
			`in an image processing application with the goal of deblur an image given the`
			`exact (noise-free) blurred image and the original transformation matrix. Note`
			that the ``noise-free" simplification is essential for us to solve this problem
			`in the scope of this assignment.`

			`\section{General Questions [10 points]}`
			`\subsection{What is the size of the matrix $A$?}`
			`$A$ is an $n^2$ by $n^2$ matrix, where $n$ is the width and height in pixels`
			`of the image to transform.`

			`\subsection{How many diagonals bands does $A$ have?}`
			`$A$ as $d^2$ diagonal bands, where $d$ is strictly an order of magnitude below`
			`$n$.`

			`\subsection{What is the length of the vectorized blur image $b$?}`
			`$b$ is the row-vectorized form of the image pixel matrix, and thus has`
			`dimensions 1 by $n^2$.`

			`\section{Properties of A [10 points]}`
			`\subsection{ If $A$ is not symmetric, how would this affect $\tilde{A}$?}`
			`If $A$ were not symmetric, then $\tilde{A}$ would not be positive definite since`
			`by definition $\tilde{A} = A A^T$, thus not satifying the assumptions taken when`
			`solving the system.`

			`\subsection{ Explain why solving $Ax = b$ for $x$ is equivalent to`
			`minimizing $x^T A x - b^T x$ over $x$.}`

			`First, we can say that:`

			`\[f(x) = \frac12 x^T A x - b^T x = \frac12\langle Ax,x \rangle -`
			`\langle b,x \rangle\]`

			`Then by taking the derivative of $f(x)$ w.r.t. $x$ we have (assuming $A$ is`
			`\textit{spd}, which it is):`

			`\[f'(x) = \frac12 A^T x + \frac12Ax - b = Ax - b\]`

			`which for $f'(x) = 0$ will be equivalent to solving $Ax = b$. By taking the`
			`second derivative we have:`

			`\[f''(x) = A > 0\]`

			`since $A$ is positive definite. Therefore, we can say that the absolute minima`
			`of $f(x)$ is the solution for $Ax = b$.`

			`\section{Conjugate Gradient [40 points]}`
			`\subsection{ Write a function for the conjugate gradient solver`
			`\texttt{[x,rvec]=myCG(A,b,x0,max\_itr,tol)}, where \texttt{x}`
			`and \texttt{rvec} are, respectively, the solution value and a`
			`vector containing the residual at every iteration.}`

			`The implementation can be found in file \texttt{myCG.m} in the source directory.`
			`The test code for the function \texttt{myCG} can be found in the \texttt{test.m} file.`

			`\subsection{ In order to validate your implementation, solve the system`
			`defined by \texttt{A\_test.mat} and \texttt{b\_test.mat}. Plot`
			`the convergence (residual vs iteration).}`

			`The plot of the squared residual 2-norms over all iterations can be found in Figure`
			`\ref{fig:plot1}.`

			`\begin{figure}[h]`
mp5: done 2020-11-30 20:02:20 +00:00			`\centering`
			`\resizebox{0.6\textwidth}{!}{\input{test_semilogy}}`
mp5: done 1-3 2020-11-25 14:20:13 +00:00			`\caption{Semilog plot of the plot of the squared residual 2-norms over all iterations}\label{fig:plot1}`
			`\end{figure}`

			`\subsection{ Plot the eigenvalues of \texttt{A\_test.mat} and comment on the`
			`condition number and convergence rate.}`

			`The eigenvalues of A can be found in figure`
mp5: 3.3 correction 2020-12-02 12:27:51 +00:00			`\ref{fig:plot2}.`
mp5: done 1-3 2020-11-25 14:20:13 +00:00
			`\begin{figure}[h]`
mp5: done 2020-11-30 20:02:20 +00:00			`\centering`
			`\resizebox{0.6\textwidth}{!}{\input{A_eig}}`
mp5: done 1-3 2020-11-25 14:20:13 +00:00			`\caption{Semilog plot of the eigenvalues of A}\label{fig:plot2}`
			`\end{figure}`

mp5: 3.3 correction 2020-12-02 12:27:51 +00:00			`The condition number for matrix $A$ according to \texttt{cond(...)} is $\approx 1.3700 \cdot 10^6$,`
			`which is rather ill conditioned. The eigenvalue plot agrees with the condition number, by showing that there are significant differences in the magnitude of the eigenvalues of \texttt{A\_test}.`

			`The two methods agree due to the fact that the condition number is computed by using singular values,`
			`which in turn are derived by eigenvalues. This fact was demonstrated practically by Dr. Edoardo Vecchi, \textit{future PhD}`
			`using this MATLAB snippet.`

			`\begin{verbatim}`
			`X = A_test'*A_test;`
			`singular_values = sqrt(eig(X));`
			`norm(sort(singular_values,'descend') - svd(A_test))/numel(A_test)`
			`\end{verbatim}`

			`The final norm is less than $10^{12}$, showning that the definition of`
			`\texttt{singular\_values}, which is a generalization of the computation of`
			`eigenvalues for rectangular matrices, is equivalent to MATLAB's \texttt{svd(...)} other than approximation errors.`


mp5: done 1-3 2020-11-25 14:20:13 +00:00			`\section{Debluring problem [40 points]}`
			`\subsection{ Solve the debluring problem for the blurred image matrix`
			`\texttt{B.mat} and transformation matrix \texttt{A.mat} using`
			`your routine \texttt{myCG} and Matlab's preconditioned conjugate`
			`gradient \texttt{pcg}. As a preconditioner, use \texttt{ichol}`
			`to get the incomplete Cholesky factors and set routine type to`
			`\texttt{nofill} with $\alpha=0.01$ for the diagonal shift (see`
			`Matlab documentation). Solve the system with both solvers using`
			`$max\_iter=200$ $tol= 10^{-6}$. Plot the convergence (residual`
			`vs iteration) of each solver and display the original and final`
			`deblurred image.}`
mp5: annotated 4.2 solution 2020-11-29 22:11:28 +00:00
mp5: done 2020-11-30 20:02:20 +00:00			`My implementation is in the file \texttt{deblurring.m}. Plots for the original image, the deblurred image from`
			`\texttt{myCG}, the deblurred image from \texttt{pcg}, and a semi-logarithmic plot on the y-axis of the residuals`
			`from the two conjugate gradient functions over the iteration count can be found respectively in figure \ref{fig:orig},`
			`\ref{fig:mycg}, \ref{fig:pcg}, and \ref{fig:rvec}.`
mp5: annotated 4.2 solution 2020-11-29 22:11:28 +00:00
mp5: correction to terminating condition of myCG 2020-12-01 12:05:29 +00:00			`As a terminating condition for the \texttt{myCG} implementation, I chose to check if the residual divided by the 2-norm of`
			`$b$ is less than the given tolerance. This mimicks the behaviour of MATLAB's \texttt{pcg} to allow for a better comparison.`

mp5: done 2020-11-30 20:02:20 +00:00			`\begin{figure}[h]`
			`\begin{subfigure}{0.33\textwidth}`
			`\centering`
			`\resizebox{0.8\textwidth}{!}{\input{img_orig}}`
			`\caption{Original image grayscale matrix}\label{fig:orig}`
			`\end{subfigure}`
			`\begin{subfigure}{0.33\textwidth}`
			`\centering`
			`\resizebox{0.8\textwidth}{!}{\input{img_my}}`
			`\caption{Deblurred image using \texttt{myCG}}\label{fig:mycg}`
			`\end{subfigure}`
			`\begin{subfigure}{0.33\textwidth}`
			`\centering`
			`\resizebox{0.8\textwidth}{!}{\input{img_rcg}}`
			`\caption{Deblurred image using \texttt{rcg}}\label{fig:pcg}`
			`\end{subfigure}`
			`\caption{Blurred and deblurred images}`
			`\end{figure}`
			`\begin{figure}[h]`
			`\centering`
			`\resizebox{0.6\textwidth}{!}{\input{res_log}}`
			`\caption{Residuals of \texttt{myCG} (in blue) and \texttt{rcg} (in orange) over iteration count (y axis is a log scale)}\label{fig:rvec}`
			`\end{figure}`
mp5: done 1-3 2020-11-25 14:20:13 +00:00			`\subsection{ When would \texttt{pcg} be worth the added computational cost?`
			`What about if you are debluring lots of images with the same`
			`blur operator?}`

mp5: done 2020-11-30 20:02:20 +00:00			`The \textit{pcg} algorithm provided by MATLAB would be worth for many deblurring operations useing the same blur operator, since the cost of computing the incomplete cholesky decomposition (i.e. \texttt{ichol}) can be payed only once and thus amortized. \texttt{myCG} is better for few iterations thanks to not needing any seed that is expensive to compute.`
mp5: annotated 4.2 solution 2020-11-29 22:11:28 +00:00
mp5: done 1-3 2020-11-25 14:20:13 +00:00
			`\end{document}`