hw2: paraculation

Claudio Maggioni (maggicl) 2021-04-11 16:36:51 +02:00
parent 854dcc8d4a
commit f29953dfac
2 changed files with 12 additions and 33 deletions

@ -55,25 +55,24 @@ The matrix $A$ and the vector $b$ appear in the following form:
0 & 0 & 0 & 0 & \ldots & 1 \\
\end{bmatrix}\;\;\;b = \begin{bmatrix}0\\h^2\\h^2\\\vdots\\0\end{bmatrix}\]
For $N = 4$, we can attempt to build an objective function whose minimizer solves the system $Ax = b$. In order to find an $x$ such that $Ax = b$, we could define a function $\phi(x)$ such that:
For $N = 4$, we have the following $A$ and $b$:
\[\phi(x) = \|b - Ax\|^2\]
\[A = \begin{bmatrix}
1 & 0 & 0 & 0 \\
-1 & 2 & -1 & 0 \\
0 & -1 & 2 & -1 \\
0 & 0 & 0 & 1 \\
\end{bmatrix}\;\;\;b = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\]
Here $\phi(x) = 0$ means that $x$ is an exact solution of the system $Ax = b$. We can then write this function explicitly for $N = 4$:
In order to solve the minimization problem we need to minimize the energy function:
\[\phi(x) = \|b - Ax\|^2 = \left\|\begin{bmatrix}0 - x_1\\\frac19 + x_1 - 2x_2 + x_3\\
\frac19 + x_2 - 2x_3 + x_4\\0 - x_4\end{bmatrix}\right\|^2 =\]\[= x_1^2 + \left(\frac19 + x_1 - 2x_2 + x_3\right)^2 + \left(\frac19 + x_2 - 2x_3 + x_4\right)^2 + x_4^2\]
\[\phi(x) = \frac12 x^T A x - b^T x\]
\[\Delta \phi(x) = \begin{bmatrix}4x_1 - 4x_2 + 2x_3 +\frac29\\
-4x_1 +10x_2 -8x_3 + 2x_4 -\frac29\\
2x_1 -8x_2 +10x_3 -4x_4 -\frac29\\
2x_2 - 4x_3 + 4x_4 +\frac29\end{bmatrix}\;\;\;\Delta^2 \phi(x) = \begin{bmatrix}4&-4&2&0\\-4&10&-8&2\\2&-8&10&-4\\0&2&-4&4\end{bmatrix}\]
Computing the gradient of the energy function, and considering that $A$ is clearly not symmetric as shown above, we find:
As can be seen from the Hessian calculation, the Hessian is constant and positive definite for all $x$. This means, by the sufficient condition for minimizers, that we can find a minimizer by solving $\Delta \phi(x) = 0$ (i.e. finding the stationary points of $\phi(x)$). Solving that, we find:
\[\Delta \phi(x) = \frac12A^T x + \frac12Ax - b\]
\[x = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\;\;\]
which is indeed the minimizer and the solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function. Although the $\phi(x)$ given here is not strictly in the matrix-vector quadratic form, a similar $\phi(x)$ could be derived for other values of $N$. Therefore, we can say that the problem has an energy function.
If $A$ were symmetric, $\Delta \phi(x)$ would be equal to $Ax - b$, and the equivalence between minimizing this energy function and solving $Ax = b$ would be straightforward. However, since $A$ is not symmetric, this does not hold, and an energy function of this form therefore does not exist.
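As a quick illustration, assuming the $N = 4$ matrix $A$ and vector $b$ shown above, setting the gradient $\Delta \phi(x) = \frac12 A^T x + \frac12 Ax - b$ to zero gives the system $\frac12(A + A^T)x = b$, whose matrix is the symmetric part of $A$:
\[\frac12(A + A^T) = \begin{bmatrix}
1 & -\frac12 & 0 & 0 \\
-\frac12 & 2 & -1 & 0 \\
0 & -1 & 2 & -\frac12 \\
0 & 0 & -\frac12 & 1 \\
\end{bmatrix} \neq A\]
The first row of this system reads $x_1 - \frac12 x_2 = 0$, while the first row of $Ax = b$ reads $x_1 = 0$. For example, $x = \begin{bmatrix}0 & \frac19 & \frac19 & 0\end{bmatrix}^T$ satisfies $Ax = b$ but gives $x_1 - \frac12 x_2 = -\frac{1}{18} \neq 0$, so it is not a stationary point of this $\phi(x)$.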
\subsection{Once the new matrix has been derived, write the energy function related to the new problem
and the corresponding gradient and Hessian.}
@ -138,26 +137,6 @@ To sum up our claim, we can say CG is indeed a Krylov subspace method because:
These statements have been already proven in class and the proof can be found in Theorem 5.3 of Nocedal's book.
% These statements can be proven by induction over $k$. The base case holds trivially for $k=0$, while we have by the induction hypothesis for any $k$ that:
% \[r_k \in \text{span}\{r_0, A r_0, \ldots, A^k r_0\}\;\;\; p_k \in \text{span}\{r_0, A r_0, \ldots, A^k r_0\}\]
% We'd like to prove that the two properties hold from $k+1$ starting from the hypothesis on $k$. We first multiply the first hypothesis by $A$ from the left:
% \[A r_k \in \text{span}\{A r_0, A^2 r_0, \ldots, A^{k+1} r_0\}\]
% By the alternative definition of the residual for the CG method (i.e. $r_{k+1} = r_k + \alpha_k A p_k$), we find that:
% \[r_{k+1} \in \text{span}\{r_0, A r_0, A^2 r_0, \ldots, A^{k+1} r_0\}\]
% We need to add $r_0$ in the span again since one of the components that defines $r_1$ is indeed $r_0$. We don't need to add other terms in the span since $r_1$ to $r_k$ are in the span already by the induction hypothesis.
% Combining this expression with the induction hypothesis we prove induction of the first statement by having:
% \[\text{span}\{r_0, r_1, \ldots, r_n, r_{n+1}\} \subseteq \text{span}\{r_0, A r_0, \ldots, A^k r_0, A^{k+1} r_0\}\]
% To prove $\supseteq$ as well to achieve equality, we use the induction hypothesis for the second statement to find that $A^kr_0 \in \text{span}\{A$:
\section{Exercise 2}
Consider the linear system $Ax = b$, where the matrix $A$ is constructed in three different ways: