Random projections

Let $A\in\R\nm$, where $m\gg n$. We wish to find a matrix $C$ whose range is approximately the range of $A$, i.e. $A\approx P_{\range(C)}(A)$.
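Recall that if $Q$ is a matrix whose columns form an orthonormal basis for a subspace $S$, then $P_S(A)=QQ'A$; the quantity $HH'A$ in the theorem below has exactly this form.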

Theorem. $~$Suppose $A\in\R\nm$ and let $C$ be as in the algorithm. Let $H$ be the matrix whose columns are the top $k$ left singular vectors of $C$. Then $$\norm{A-HH'A}{}_F^2\leq\norm{A-A_k}{}_F^2+2\sqrt k\norm{AA'-CC'}{}_F$$ where $A_k$ is the best rank-$k$ approximation to $A$.

Proof. \begin{align*}
\norm{A-HH'A}{}_F^2&=\tr\big((A-HH'A)'(A-HH'A)\big)\\
&=\tr\big(A'A-2A'HH'A+A'HH'HH'A\big)\\
&=\norm A{}_F^2-\norm{A'H}{}_F^2\qquad\text{(since $H'H=I$)}\\
&=\Big(\norm{A-A_k}{}_F^2+\sum_{j=1}^k\sigma_j^2(A)\Big)-\norm{A'H}{}_F^2\\
&=\norm{A-A_k}{}_F^2+\underbrace{\sum_{j=1}^k\Big[\sigma_j^2(A)-\sigma_j^2(C)\Big]}_{{\LARGE①}}+\underbrace{\sum_{j=1}^k\sigma_j^2(C)-\norm{A'H}{}_F^2}_{{\LARGE②}}
\end{align*} \begin{align*}
{\Large①}\,&\leq\sqrt k\sqrt{\sum_{j=1}^k\Big(\sigma_j^2(A)-\sigma_j^2(C)\Big)^2}\\
&=\sqrt k\sqrt{\sum_{j=1}^k\Big(\sigma_j(AA')-\sigma_j(CC')\Big)^2}\\
&=\sqrt k\sqrt{\sum_{j=1}^k\Big(\sigma_j(CC'+AA'-CC')-\sigma_j(CC')\Big)^2}\\
&\leq\sqrt k\norm{AA'-CC'}{}_F\qquad\text{(Hoffman--Wielandt inequality)}
\end{align*} Next, let $H_1,..,H_k$ denote the columns of $H$.
\begin{align*}
{\Large②}\,&=\sum_{j=1}^k\sigma_j^2(C)-\sum_{j=1}^k\norm{A'H_j}{}_2^2\\
&\leq\sqrt k\sqrt{\sum_{j=1}^k\Big(\sigma_j(CC')-H_j'AA'H_j\Big)^2}\\
&=\sqrt k\sqrt{\sum_{j=1}^k\Big(H_j'CC'H_j-H_j'AA'H_j\Big)^2}\qquad\text{(since $H_j'CC'H_j=\sigma_j^2(C)$)}\\
&\leq\sqrt k\norm{AA'-CC'}{}_F\qquad\text{(by the homework below, with $Z=CC'-AA'$)}
\end{align*}

HOMEWORK $~$Let $Q$ be orthogonal and let $q_1,..,q_n$ be its columns. Let $Z\in\R\nn$ be symmetric. Then, $\sum_{i=1}^n(q_i'Zq_i)^2\leq\norm Z{}_F^2$. $$~$$
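
The theorem can be sanity-checked numerically. The algorithm that produces $C$ is referenced above but not reproduced in this section, so the NumPy sketch below assumes one standard construction: sample $c$ columns of $A$ with probability proportional to their squared norms and rescale, so that $CC'$ has expectation $AA'$. The sizes and the test matrix are an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, c = 50, 2000, 5, 200              # A is n x m with m >> n

# Test matrix: approximately rank k plus a little noise.
A = rng.standard_normal((n, k)) @ rng.standard_normal((k, m)) \
    + 0.1 * rng.standard_normal((n, m))

# Assumed construction of C (a standard choice, not necessarily the one from
# the notes): sample c columns with probability proportional to their squared
# norms and rescale, so that CC' has expectation AA'.
p = (A**2).sum(axis=0) / (A**2).sum()
idx = rng.choice(m, size=c, p=p)
C = A[:, idx] / np.sqrt(c * p[idx])

# H = top-k left singular vectors of C; A_k = best rank-k approximation of A.
H = np.linalg.svd(C, full_matrices=False)[0][:, :k]
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

lhs = np.linalg.norm(A - H @ H.T @ A, 'fro')**2
rhs = np.linalg.norm(A - A_k, 'fro')**2 \
      + 2 * np.sqrt(k) * np.linalg.norm(A @ A.T - C @ C.T, 'fro')
print(lhs <= rhs, lhs, rhs)                # the bound should hold
```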

Iterative methods

References

  1. "Matrix Computations" , Golub & Van Loan, 1996
  2. "Iterative Methods for Sparse Linear Systems" , Saad, 2000
  3. "An Introduction to the Conjugate Gradient Method Without the Agonizing Pain" , Shewchuk, 1994
  4. "Randomized Iterative Methods for Linear Systems" , Gower & Richtárik, 2015

Splitting methods

Let $x^c$ and $r^c=b-Ax^c$ denote the current iterate and residual, and $x^+$ and $r^+=b-Ax^+$ denote the next iterate and residual.

Write $A=D-E-F$, where $D$ is the diagonal part of $A$, and $-E$ and $-F$ are the strictly lower and strictly upper triangular parts of $A$ respectively (so $E$ and $F$ have zero diagonals).

Jacobi Method
Basic idea: start with a guess for $x$; then, for each $i$, solve the $i$th equation for $x_i$ in terms of the other variables, plug in the current guess to obtain the next guess, and repeat. Let $\br{e_i}$ be the standard basis in $\R^d$. In other words, set $r_i^+$ to $0$, keeping $x_k=x_k^c$ for $k\neq i$: $$0=r_i^+=b_i-\!\!\!\sum_{k=1,\,k\neq i}^dA_{ik}x_{k}^c-A_{ii}x_i^+$$ $$\implies~~x_i^+=\frac1{A_{ii}}\pr{b_i-\!\!\!\sum_{k=1,\,k\neq i}^dA_{ik}x_{k}^c}$$ $$\implies~~x^+=D\inv\Big(b+(E+F)x^c\Big)$$
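
As an illustration (not from the notes), a minimal NumPy sketch of this update could look as follows; the test system and the function name are placeholders.

```python
import numpy as np

def jacobi(A, b, x0, num_iters=100):
    """Jacobi iteration x^+ = D^{-1}(b + (E + F) x^c) for A = D - E - F."""
    D = np.diag(np.diag(A))            # diagonal part of A
    EF = D - A                         # equals E + F
    Dinv = np.diag(1.0 / np.diag(A))   # assumes A_ii != 0 (see homework below)
    x = x0
    for _ in range(num_iters):
        x = Dinv @ (b + EF @ x)
    return x

# Example: a strictly diagonally dominant system, for which Jacobi converges.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(jacobi(A, b, np.zeros(3)))       # approaches np.linalg.solve(A, b)
```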

Gauss-Seidel
Basic idea: unlike Jacobi, where the new guess is computed entirely from the previous iterate, Gauss-Seidel computes each entry of the new guess using both the previous iterate and the entries already updated in the current sweep. $$x_i^+=A_{ii}^{-1}\pr{b_i-\sum_{k=1}^{i-1}A_{ik}x_k^+-\sum_{k=i+1}^dA_{ik}x_k^c}$$ $$\implies x^+=D\inv(b+Ex^++Fx^c)=(D-E)\inv(b+Fx^c)$$
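
A corresponding componentwise sketch (again purely illustrative) updates the entries in place, so each $x_i^+$ sees the values already computed in the current sweep.

```python
import numpy as np

def gauss_seidel(A, b, x0, num_iters=100):
    """Gauss-Seidel sweep: x[:i] already holds new values, x[i+1:] holds old ones."""
    d = A.shape[0]
    x = x0.copy()
    for _ in range(num_iters):
        for i in range(d):
            x[i] = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(gauss_seidel(A, b, np.zeros(3)))   # approaches np.linalg.solve(A, b)
```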

Successive Over Relaxation \begin{align*}
&{\small\text{Forwards:}}&&&&(D-wE)x^+=\Big(wF+(1-w)D\Big)x^c+wb\\
&{\small\text{Backwards:}}&&&&(D-wF)x^+=\Big(wE+(1-w)D\Big)x^c+wb\\
&{\small\text{Symmetric:}}&&&&(D-wE)z\,\,\,\:=\Big(wF+(1-w)D\Big)x^c+wb\\
&&&&&(D-wF)x^+=\Big(wE+(1-w)D\Big)z\,\,\,+wb
\end{align*}
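
A sketch of the forward variant, written directly from the update above; the relaxation parameter $w=1.25$ and the test system are arbitrary, and in practice the lower-triangular system would be solved by forward substitution rather than a generic solve.

```python
import numpy as np

def sor_forward(A, b, x0, w=1.25, num_iters=100):
    """Forward SOR: (D - w E) x^+ = (w F + (1 - w) D) x^c + w b."""
    D = np.diag(np.diag(A))
    E = -np.tril(A, k=-1)          # A = D - E - F
    F = -np.triu(A, k=1)
    M = D - w * E                  # lower triangular
    N = w * F + (1 - w) * D
    x = x0
    for _ in range(num_iters):
        x = np.linalg.solve(M, N @ x + w * b)
    return x

# w = 1 recovers Gauss-Seidel.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(sor_forward(A, b, np.zeros(3)))
```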

HOMEWORK $~$ What do we need from $A$ so that these iteration schemes are well-defined? $$~$$

Convergence $~~x^+=Gx^c+f$

One can check that in each case, $f=(I-G)x^*$ for any $x^*$ such that $Ax^*=b$.
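For example, for Jacobi, $G=D\inv(E+F)$ and $f=D\inv b$, so $(I-G)x^*=D\inv\big(D-E-F\big)x^*=D\inv Ax^*=D\inv b=f$.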

Lemma. $~$ For Jacobi, GS, SOR, if $\exists x^*$ s.t. $Ax^*=b$, then $x^+\!-x^*=G(x^c\!-x^*)$.
Proof. $~~x^+\!=Gx^c\!+f=Gx^c\!+(I-G)x^*=Gx^c\!+x^*\!-Gx^*$, so $x^+\!-x^*=G(x^c\!-x^*)$.

Theorem. $~$Suppose $\exists x^*$ s.t. $Ax^*=b$. Let $x_0$ be arbitrary, and define $\pr{x_k:k\in\N}$ by $x_k=Gx_{k-1}+f$. If $\rho(G)<1$, then $x^*$ is unique and $x_k\to x^*$, where $\rho(G)$ is the spectral radius of $G$, i.e. the largest absolute value of its eigenvalues.
Proof. $~$ HOMEWORK
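
As a numerical illustration (not the homework proof), the sketch below forms the Jacobi iteration matrix $G=D\inv(E+F)$ and $f=D\inv b$ for a small system, checks that $\rho(G)<1$, and runs the fixed-point iteration $x_k=Gx_{k-1}+f$.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])

D = np.diag(np.diag(A))
G = np.linalg.solve(D, D - A)              # Jacobi: G = D^{-1}(E + F)
f = np.linalg.solve(D, b)                  # f = D^{-1} b
rho = np.max(np.abs(np.linalg.eigvals(G)))
print("spectral radius:", rho)             # < 1 here, so the iteration converges

x_star = np.linalg.solve(A, b)
x = np.zeros(3)
for _ in range(25):
    x = G @ x + f                          # by the lemma, the error is multiplied by G each step
print(np.linalg.norm(x - x_star))          # roughly rho^25 * ||x_0 - x*||
```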