diff --git a/mp1/template.pdf b/mp1/template.pdf
index 623d90f..2571f25 100644
Binary files a/mp1/template.pdf and b/mp1/template.pdf differ
diff --git a/mp1/template.tex b/mp1/template.tex
index b753bee..985b58e 100644
--- a/mp1/template.tex
+++ b/mp1/template.tex
@@ -22,11 +22,67 @@ The purpose of this assignment\footnote{This document is originally based on a S
 \subsection{Theory [20 points]}
 
+\subsubsection{Show that the order of convergence of the power method is linear,
+and state what the asymptotic error constant is.}
+
+First of all, we show that the sequence of vectors computed by power iteration
+indeed converges to $x_1$, the eigenvector associated with the dominant
+eigenvalue $\lambda_1$ (we assume the eigenvalues are named in decreasing order
+of magnitude, with $|\lambda_1| > |\lambda_i|$ for $i \in \{2..n\}$).
+
+We can express the seed of the iteration (i.e. the initial vector $v_0$) as a
+linear combination of the eigenvectors:
+
+\[v_0 = \sum_{i=1}^n a_i x_i\]
+
+We can then express the $k$-th iterate of the power method as
+
+\[v_k = \gamma_k A^k v_0 = \gamma_k \sum_{i=1}^n a_i \lambda_i^k x_i =
+\gamma_k \lambda_1^k \left( a_1 x_1 + \sum_{i=2}^n a_i
+\left(\frac{\lambda_i}{\lambda_1}\right)^k x_i \right)\]
+
+Here, $\gamma_k$ is just the accumulated normalization factor that keeps
+$\|v_k\| = 1$. The sequence clearly converges, up to scaling, to $x_1$: every
+term in the $\sum_{i=2}^n$ contains $\frac{\lambda_i}{\lambda_1}$, whose
+absolute value is strictly less than 1 for $i > 1$ by the ordering of the
+eigenvalues, so these terms raised to the power $k$ converge to 0, while
+$\gamma_k$ cancels the factor $a_1 \lambda_1^k$ through the normalization.
+
+To see whether the sequence converges linearly we use the definition of the
+rate of convergence: a sequence $s_k \to L$ converges with order 1 (linearly)
+and asymptotic error constant $\mu$ if
+
+\[\lim_{k \to \infty}\frac{\|s_{k+1} - L\|}{\|s_k - L\|^1} = \mu,
+\qquad 0 < \mu < 1\]
+
+To simplify calculations, we exploit the fact that the normalization factor can
+be chosen freely and set $\gamma_k = \frac{1}{a_1 \lambda_1^k}$; with this
+choice the sequence converges to $x_1$ itself and the leading term cancels
+exactly in the error. Applying the definition with $s_k = v_k$ and $L = x_1$
+gives
+
+\[\lim_{k \to \infty}
+\frac{\left\| \sum_{i=2}^n \frac{a_i}{a_1}
+\left(\frac{\lambda_i}{\lambda_1}\right)^{k+1} x_i \right\|}
+{\left\| \sum_{i=2}^n \frac{a_i}{a_1}
+\left(\frac{\lambda_i}{\lambda_1}\right)^{k} x_i \right\|} = \mu\]
+
+Now, assuming in addition to the sorting that $|\lambda_2| > |\lambda_i|$ for
+all $i \in \{3..n\}$, the terms
+$\left(\frac{\lambda_i}{\lambda_1}\right)^k$ with $i > 2$ converge to 0 faster
+than $\left(\frac{\lambda_2}{\lambda_1}\right)^k$, so all terms other than
+$i = 2$ can be ignored in the limit computation (provided $a_2 \neq 0$).
+Therefore the limit exists and is finite, the order of convergence is 1
+(linear), and the asymptotic error constant is
+
+\[\mu = \left|\frac{\lambda_2}{\lambda_1}\right|\]
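+
+As a concrete illustration of the iteration analysed above, a minimal sketch in
+Python/NumPy could look as follows (the function name, tolerance, and test
+matrix are illustrative choices only, not part of the assignment template):
+
+\begin{verbatim}
+import numpy as np
+
+def power_iteration(A, tol=1e-10, max_iter=1000):
+    # Approximate the dominant eigenvalue and eigenvector of A.
+    v = np.random.rand(A.shape[0])      # random seed: a_1 != 0 almost surely
+    v /= np.linalg.norm(v)
+    lam = 0.0
+    for _ in range(max_iter):
+        w = A @ v                        # apply A
+        v_new = w / np.linalg.norm(w)    # normalization (the gamma_k factor)
+        lam_new = v_new @ A @ v_new      # Rayleigh quotient estimate of lambda_1
+        if abs(lam_new - lam) < tol:
+            break
+        v, lam = v_new, lam_new
+    return lam_new, v_new
+
+# Example: eigenvalues 5, 2, 1, so the error should shrink by roughly
+# |lambda_2 / lambda_1| = 0.4 per iteration, matching the constant above.
+lam1, x1 = power_iteration(np.diag([5.0, 2.0, 1.0]))
+\end{verbatim}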
+
 \subsubsection{What assumptions should be made to guarantee convergence of the power method?}
 
 The first assumption to make is that the eigenvalue of largest absolute value
 (let us name it $\lambda_1$) must be strictly greater in magnitude than all
 other eigenvalues, so:
-$$|\lambda_1| < |\Lambda_i| \forall i \in \{2..n\}$$
+\[|\lambda_1| > |\lambda_i| \quad \forall i \in \{2..n\}\]
 
 Also, the eigenvector \textit{guess} from which the power iteration starts must
 have a nonzero component in the direction of $x_1$, the eigenvector associated
 with the eigenvalue $\lambda_1$ from before.
@@ -38,11 +94,11 @@ The shift and invert approach is a variant of the power method that may signific
 
 where $\alpha$ is an arbitrary constant that must be chosen wisely in order to
 increase the rate of convergence. Since the eigenvalues $u_i$ of $B$ can be
 derived from the eigenvalues $\lambda_i$ of $A$, namely:
-$$u_i = \frac{1}{\lambda_i - \alpha}$$
+\[u_i = \frac{1}{\lambda_i - \alpha}\]
 
 the rate of convergence of the power method on $B$ is:
-$$\left|\frac{u_2}{u_1}\right| = \left|\frac{\frac1{\lambda_2 - \alpha}}{\frac1{\lambda_1 - \alpha}}\right| = \left|\frac{\lambda_1 - \alpha}{\lambda_2 - \alpha}\right|$$
+\[\left|\frac{u_2}{u_1}\right| = \left|\frac{\frac1{\lambda_2 - \alpha}}{\frac1{\lambda_1 - \alpha}}\right| = \left|\frac{\lambda_1 - \alpha}{\lambda_2 - \alpha}\right|\]
 
 By choosing $\alpha$ close to $\lambda_1$, the convergence is sped up. To
 further increase the rate of convergence (up to a cubic rate), a new $\alpha$,
 and thus a new $B$, may be chosen for every iteration.
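+
+To make this concrete, a sketch of the shift and invert variant under the same
+assumptions as before (Python/NumPy, illustrative names and tolerances) is
+given below. Since the eigenvalues of $B$ are $u_i = \frac{1}{\lambda_i - \alpha}$,
+applying the power method to $B = (A - \alpha I)^{-1}$ only requires solving a
+linear system with $A - \alpha I$ at each step, rather than forming the inverse
+explicitly:
+
+\begin{verbatim}
+import numpy as np
+
+def shift_invert_iteration(A, alpha, tol=1e-10, max_iter=1000):
+    # Power iteration on B = (A - alpha*I)^{-1}, implemented via linear solves.
+    n = A.shape[0]
+    v = np.random.rand(n)
+    v /= np.linalg.norm(v)
+    lam = 0.0
+    for _ in range(max_iter):
+        w = np.linalg.solve(A - alpha * np.eye(n), v)  # w = B v
+        v_new = w / np.linalg.norm(w)
+        lam_new = v_new @ A @ v_new     # Rayleigh quotient w.r.t. the original A
+        if abs(lam_new - lam) < tol:
+            break
+        v, lam = v_new, lam_new
+    return lam_new, v_new
+
+# Example: with eigenvalues 5, 2, 1 and alpha = 4.8 the rate improves from
+# |2/5| = 0.4 to |(5 - 4.8)/(2 - 4.8)| ~= 0.07.
+lam1, x1 = shift_invert_iteration(np.diag([5.0, 2.0, 1.0]), alpha=4.8)
+\end{verbatim}
+
+Replacing the fixed \texttt{alpha} inside the loop with the current Rayleigh
+quotient \texttt{lam\_new} corresponds to choosing a new $\alpha$ (and thus a
+new $B$) for every iteration, as described above.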