hw2: paraculation

Claudio Maggioni (maggicl) 2021-04-11 16:36:51 +02:00
parent 854dcc8d4a
commit f29953dfac
2 changed files with 12 additions and 33 deletions

@@ -55,25 +55,24 @@ The matrix $A$ and the vector $b$ appear in the following form:
 0 & 0 & 0 & 0 & \ldots & 1 \\
 \end{bmatrix}\;\;\;b = \begin{bmatrix}0\\h^2\\h^2\\\vdots\\0\end{bmatrix}\]
-For $N = 4$, we can attempt to build a minimizer to solve the system $Ax = b$. In order to find an $x$ such that $Ax = b$, we could define a minimizer $\phi(x)$ such that:
-\[\phi(x) = \|b - Ax\|^2\]
-Here $\phi(x) = 0$ would mean that $x$ is an exact solution of the system $Ax = b$. We can then attempt to write such a minimizer for $N = 4$:
-\[\phi(x) = \|b - Ax\|^2 = \left\|\begin{bmatrix}0 - x_1\\\frac19 - x_1 + 2x_2 - x_3\\
-\frac19 - x_2 + 2x_3 - x_4\\0 - x_4\end{bmatrix}\right\|^2 =\]\[= x_1^2 + \left(\frac19 - x_1 + 2x_2 - x_3\right)^2 + \left(\frac19 - x_2 + 2x_3 - x_4\right)^2 + x_4^2\]
-\[\nabla \phi(x) = \begin{bmatrix}4x_1 - 4x_2 + 2x_3 - \frac29\\
--4x_1 + 10x_2 - 8x_3 + 2x_4 + \frac29\\
-2x_1 - 8x_2 + 10x_3 - 4x_4 + \frac29\\
-2x_2 - 4x_3 + 4x_4 - \frac29\end{bmatrix}\;\;\;\nabla^2 \phi(x) = \begin{bmatrix}4&-4&2&0\\-4&10&-8&2\\2&-8&10&-4\\0&2&-4&4\end{bmatrix}\]
-As can be seen from the calculation, the Hessian is positive definite for all $x$. This means, by the sufficient condition for minimizers, that we can find a minimizer by solving $\nabla \phi(x) = 0$ (i.e. by finding the stationary points of $\phi(x)$). Solving that, we find:
-\[x = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\]
-which is indeed the minimizer and the solution of $Ax = b$. Therefore, $\phi(x)$ is a valid energy function. Although the $\phi(x)$ given here is not strictly a matrix-vector quadratic form, it is a valid energy function, and for other values of $N$ a similarly constructed $\phi(x)$ could be derived. Therefore, we can say that the problem has an energy function.
+For $N = 4$, we have the following $A$ and $b$:
+\[A = \begin{bmatrix}
+1 & 0 & 0 & 0 \\
+-1 & 2 & -1 & 0 \\
+0 & -1 & 2 & -1 \\
+0 & 0 & 0 & 1 \\
+\end{bmatrix}\;\;\;b = \begin{bmatrix}0\\\frac19\\\frac19\\0\end{bmatrix}\]
+In order to solve the minimization problem, we need to minimize the energy function:
+\[\phi(x) = \frac12 x^T A x - b^T x\]
+Computing the gradient of this energy function, and considering that $A$ is clearly not symmetric as shown above, we find:
+\[\nabla \phi(x) = \frac12 A^T x + \frac12 A x - b\]
+If $A$ were symmetric, $\nabla \phi(x)$ would equal $Ax - b$, and the equivalence between minimizing this energy function and solving $Ax = b$ would be immediate. However, since $A$ is not symmetric, this does not hold, and an energy function of this form therefore does not exist.
 \subsection{Once the new matrix has been derived, write the energy function related to the new problem
 and the corresponding gradient and Hessian.}
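As a side note, the two formulations compared in this hunk can be checked numerically. The following sketch (using NumPy; the variable names and the check itself are illustrative additions, not code from this repository) verifies that the least-squares energy $\phi(x) = \|b - Ax\|^2$ is minimized by $x = (0, \frac19, \frac19, 0)$, and that the gradient of $\frac12 x^T A x - b^T x$ matches $Ax - b$ only when $A$ is symmetric:

# Numerical sanity check for the energy-function discussion in the hunk
# above (an illustrative sketch, not code from this repository).
import numpy as np

# The N = 4 matrix and right-hand side shown above
A = np.array([[ 1.,  0.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0.,  0.,  1.]])
b = np.array([0., 1/9, 1/9, 0.])

# Old formulation: phi(x) = ||b - Ax||^2 is minimized where the normal
# equations A^T A x = A^T b hold; since A is invertible, this is also
# the solution of Ax = b, namely (0, 1/9, 1/9, 0).
x_min = np.linalg.solve(A.T @ A, A.T @ b)
assert np.allclose(x_min, [0., 1/9, 1/9, 0.])

# New formulation: the gradient of 1/2 x^T A x - b^T x is
# 1/2 (A + A^T) x - b, which coincides with Ax - b only for symmetric A.
x = np.random.default_rng(0).standard_normal(4)
grad = 0.5 * (A + A.T) @ x - b
print(np.allclose(grad, A @ x - b))  # False here, since A is not symmetric

Running the script prints False, matching the hunk's point that the quadratic-form gradient recovers $Ax - b$ only in the symmetric case.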
@@ -138,26 +137,6 @@ To sum up our claim, we can say CG is indeed a Krylov subspace method because:
 These statements have already been proven in class, and the proof can be found in Theorem 5.3 of Nocedal's book.
-% These statements can be proven by induction over $k$. The base case holds trivially for $k=0$, while we have by the induction hypothesis for any $k$ that:
-% \[r_k \in \text{span}\{r_0, A r_0, \ldots, A^k r_0\}\;\;\; p_k \in \text{span}\{r_0, A r_0, \ldots, A^k r_0\}\]
-% We'd like to prove that the two properties hold for $k+1$ starting from the hypothesis on $k$. We first multiply the first hypothesis by $A$ from the left:
-% \[A r_k \in \text{span}\{A r_0, A^2 r_0, \ldots, A^{k+1} r_0\}\]
-% By the alternative definition of the residual for the CG method (i.e. $r_{k+1} = r_k + \alpha_k A p_k$), we find that:
-% \[r_{k+1} \in \text{span}\{r_0, A r_0, A^2 r_0, \ldots, A^{k+1} r_0\}\]
-% We need to add $r_0$ to the span again since one of the components that defines $r_{k+1}$ is indeed $r_k$. We don't need to add other terms to the span since $r_1$ to $r_k$ are in the span already by the induction hypothesis.
-% Combining this expression with the induction hypothesis, we prove the first statement by induction, since:
-% \[\text{span}\{r_0, r_1, \ldots, r_k, r_{k+1}\} \subseteq \text{span}\{r_0, A r_0, \ldots, A^k r_0, A^{k+1} r_0\}\]
-% To prove $\supseteq$ as well to achieve equality, we use the induction hypothesis for the second statement to find that $A^kr_0 \in \text{span}\{A$:
 \section{Exercise 2}
 Consider the linear system $Ax = b$, where the matrix $A$ is constructed in three different ways:
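The commented-out proof attempt deleted in this hunk concerns the Krylov subspace property of CG, i.e. that $r_k \in \text{span}\{r_0, A r_0, \ldots, A^k r_0\}$ (Theorem 5.3 in Nocedal and Wright). As an illustrative complement, a minimal NumPy sketch of the residual statement follows; the test matrix, seed, and tolerance are arbitrary choices for the example, not part of the homework:

# Illustrative check (not from this repository) of the Krylov subspace
# property discussed in the deleted comments above: every CG residual
# r_k lies in span{r0, A r0, ..., A^k r0}.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6.0 * np.eye(6)   # small SPD test matrix
b = rng.standard_normal(6)

# Plain CG iteration, recording every residual
x = np.zeros(6)
r = b - A @ x
p = r.copy()
residuals = [r.copy()]
for _ in range(5):
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    x = x + alpha * p
    r_new = r - alpha * Ap
    beta = (r_new @ r_new) / (r @ r)
    p = r_new + beta * p
    r = r_new
    residuals.append(r.copy())

# r_k should be (numerically) a linear combination of r0, A r0, ..., A^k r0
r0 = residuals[0]
basis = [r0]
for k, rk in enumerate(residuals):
    K = np.column_stack([v / np.linalg.norm(v) for v in basis])
    coef, *_ = np.linalg.lstsq(K, rk, rcond=None)
    assert np.allclose(K @ coef, rk, atol=1e-6)
    basis.append(A @ basis[-1])   # extend the basis to A^{k+1} r0

The least-squares fit against the (normalized) Krylov vectors reproduces each residual to rounding error, which is exactly the containment the deleted induction argument set out to prove.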