Singular Value Decomposition

Problems

  1. Minimum norm linear system solution. Let $A\in\R\nm$ with $\,\rank(A)<m\leq n$ and let $b\in\range(A)$. Find $x$ s.t. $$x\in\argmin_{y\in\R^m}\br{\norm y2:Ay=b}$$
  2. Matrix $2$-norm. Let $A\in\R\nm$. Find $$\norm A2:=\sup_{v\in\R^m\backslash\lbrace0\rbrace}\frac{\norm{Av}2}{\norm{v}2}$$
  3. Low rank matrix approximation (data compression). Let $A\in\R\nm$. Find $X$ s.t. $$X\in\argmin_{\rank(Y)\leq k}\norm{A-Y}F$$ where $\norm{\cdot}F$ denotes the Frobenius norm, i.e. the square root of the sum of the squares of the matrix entries. $$~$$

Theorem. Suppose $A\in\R\nm$ and $n\geq m$. Then $\exists U\in\R\nn$, $V\in\R\mm$ orthogonal, and diagonal $\Sigma\in\R\nm$, $$\Sigma=\left[\begin{matrix}\begin{matrix}\sigma_1\\&\ddots\\&&\sigma_m\end{matrix}\\{\Huge 0\vphantom{\Bigg|^|_|}}\end{matrix}\right]$$ with $\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_m\geq0$, s.t. $A=U\Sigma V'$.
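As a quick sanity check of the theorem's shape and orthogonality claims, here is a minimal NumPy sketch (the matrix $A$ below is assumed random example data, not anything from the notes):

```python
# Verify the SVD shapes, orthogonality, and ordering claims numerically.
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4                        # n >= m, as in the theorem
A = rng.standard_normal((n, m))

# full_matrices=True returns U (n x n), V' (m x m), and the m singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

Sigma = np.zeros((n, m))
Sigma[:m, :m] = np.diag(s)         # sigma_1 >= ... >= sigma_m >= 0 on the diagonal

print(np.allclose(U @ Sigma @ Vt, A))           # A = U Sigma V'
print(np.allclose(U.T @ U, np.eye(n)),          # U, V orthogonal
      np.allclose(Vt @ Vt.T, np.eye(m)))
print(np.all(np.diff(s) <= 0), np.all(s >= 0))  # decreasing, nonnegative
```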

HOMEWORK

  1. What happens if $m>n$?
  2. If $u,v$ are left and right singular vectors corresponding to the singular value $\sigma$, show that $Av=\sigma u$ and $A'u=\sigma v$
    Corollary. $~\rank(A)$ is equal to the number of nonzero singular values. $$~$$

$$~$$

Solutions

  1. We can solve $Ay=b$ using the SVD $A=U\Sigma V'$: define $$U\Sigma\underbrace{V'y}_{z}=b~~~~~\text{and let}~~~~U'b=c$$ $$\text{then, with }r=\rank(A),~~\Sigma z=c~~\implies~~\left[\begin{matrix}\begin{matrix}\sigma_1\\&\ddots\\&&\sigma_r\\&&&0\\&&&&\ddots\\&&&&&0\end{matrix}\\{\Huge 0\vphantom{\Bigg|^|_|}}\end{matrix}\right]\mat{\begin{matrix}~\\z_1\in\R^r\\~\\~\\z_2\in\R^{m-r}\\~\\~\\{\vphantom{\Huge 0\Bigg|^|_|}}\end{matrix}}=\mat{~\\c_1\\~\\0\\~}$$
    where the leading $r\times r$ block forces $\sigma_iz_{1,i}=c_{1,i}$ for $i=1,\ldots,r$, while $z_2$ is unconstrained. Since $V$ is orthogonal, $\norm y2=\norm z2$ and $$\norm z2=\sqrt{\vphantom{\norm{z_1}{}_2^2+\norm{z_2}{}_2^2}\smash{\norm{z_1}{}_2^2+\underbrace{\norm{z_2}{}_2^2}_{0}}}$$ which is minimized by taking $z_2=0$,
    so the minimum norm solution is $$y~=~V\,\mat{\begin{matrix}\begin{matrix}\mat{\sigma_1^{-1}\\&\ddots\\&&\sigma_r^{-1}}&c_1\end{matrix}\\{\Huge 0\vphantom{\Bigg|^|_|}}\end{matrix}}\space{10pt}$$

  2. First, a lemma $\space{12pt}$
    Lemma. $~$ Let $D\in\R\nm$, $n\geq m$, be a nonzero diagonal matrix. Then $\norm D2=\max_i\br{|D_{ii}|}$
    Proof. $~$ Note $\norm D2=\max_{\norm v2=1}\br{\norm{Dv}2}$. For any unit $v$, $$\norm{Dv}{}_2^2=\sum_{j=1}^m(D_{jj}v_j)^2\leq\max_{i}\br{D_{ii}^2}\sum_{j=1}^mv_j^2=\max_{i}\br{|D_{ii}|^2}$$ Now, let $i\in\argmax_j|D_{jj}|$ and let $z$ be such that $z_j=\delta_{i,j}$. Then, $\norm{Dz}{}_2^2=\sum_j(D_{jj}z_j)^2=D_{ii}^2\,$, $~$so we have $|D_{ii}|=\norm{Dz}2\leq\norm{D}2\leq|D_{ii}|$, which implies $\norm D2=\max_i\br{|D_{ii}|}$ as desired. $\space{12pt}$
    Now, we can use SVD to compute $\norm A2$.
    \begin{align*}
\norm A2&=\max_{\norm v2=1}\Big\|U\Sigma\underbrace{V'v}_{z}\Big\|_2\\
&=\max_{\norm z2=1}\norm{\Sigma z}2&&\text{since $U$ is orthogonal and $\norm z2=\norm{V'v}2=\norm v2$}\\
&=\sigma_1&&\text{by the lemma}
    \end{align*}

  3. Low rank matrix approximation: $A=\sum_{i=1}^r\sigma_iu_iv_i'=U\Sigma V'\space{6pt}$
    HOMEWORK $~$Show that $\norm A{}_F^2=\sum_{i=1}^n\sum_{j=1}^mA_{ij}^2$ is orthogonally invariant, i.e. $\norm{QA}F=\norm AF$ and $\norm{AP}F=\norm AF$ for any orthogonal $Q\in\R\nn$, $P\in\R\mm\space{6pt}$
    We wish to find $$\min_{\rank(Y)\leq k}\norm{A-Y}F$$
    Case 1: $~~\rank(A)\leq k$: take $Y=A$, giving error $0\space{6pt}$
    Case 2: $~~\rank(A)>k$: the minimizer is the truncated SVD $Y=\sum_{i=1}^k\sigma_iu_iv_i'$, with error $\norm{A-Y}{}_F^2=\sum_{i=k+1}^r\sigma_i^2$ (Eckart–Young theorem). A numerical sketch of all three solutions follows this list.
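Here is a minimal NumPy sketch of the three solutions on assumed random data; `np.linalg.lstsq` and `np.linalg.norm` serve as independent references for problems 1 and 2, and the Eckart–Young error formula is checked for problem 3:

```python
# Solve all three problems above via the SVD and cross-check each answer.
import numpy as np

rng = np.random.default_rng(1)
n, m, r, k = 8, 5, 3, 2
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # rank r < m
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# 1. Minimum norm solution of A y = b for b in range(A):
#    z1 = Sigma_r^{-1} c1 with c = U'b, z2 = 0, y = V z.
b = A @ rng.standard_normal(m)                 # guarantees b in range(A)
c1 = U[:, :r].T @ b
y = Vt[:r].T @ (c1 / s[:r])                    # y = V_r diag(1/sigma_i) c1
y_ref = np.linalg.lstsq(A, b, rcond=None)[0]   # lstsq also returns the min-norm solution
print(np.allclose(A @ y, b), np.allclose(y, y_ref))

# 2. The matrix 2-norm is the largest singular value.
print(np.isclose(s[0], np.linalg.norm(A, 2)))

# 3. Eckart-Young: the best rank-k approximation in Frobenius norm is the
#    truncated SVD, with squared error equal to the sum of discarded sigma_i^2.
Y = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
print(np.isclose(np.linalg.norm(A - Y, 'fro')**2, np.sum(s[k:]**2)))
```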

$$~$$

Existence

Theorem. Let $A\in\R\nm$, $n\geq m$, then there exist
- Orthogonal $U\in\R\nn$
- Orthogonal $V\in\R\mm$
- Diagonal $\Sigma\in\R\nm$, $\Sigma_{ij}=\sigma_i$ if $i=j$ and $0$ otherwise (i.e. nonzero only on the main diagonal), with $\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_m\geq0$ s.t. $$A=U\Sigma V'=\sum_{i=1}^m\sigma_iu_iv_i'$$

Recall that $\norm A2=\sup_{v\neq0}\frac{\norm{Av}2}{\norm v2}=\max_{\norm v2=1}\norm{Av}2\,$. Let $v_1$ with $\norm{v_1}2=1$ be s.t. $\norm A2=\norm{Av_1}2$, let $u_1$ be s.t. $\norm A2\,u_1=Av_1$ (so $\norm{u_1}2=1$), and let $\sigma_1=\norm A2\,$.

Now, extend $u_1$ and $v_1$ to two orthonormal bases $\tilde{U}_1=\mat{u_1&U_1}\in\R\nn\,$ and $\,\tilde{V}_1=\mat{v_1&V_1}\in\R\mm\,$, then we have
\begin{align*}
\tilde{U}_1'A\tilde{V}_1&=\mat{u_1'\\[6pt]U_1'}\,A\,\mat{v_1&\!\!V_1}\\[6pt]
&=\mat{u_1'\\[6pt]U_1'}\,\mat{\sigma_1u_1&\!\!AV_1}\\[6pt]
&=\mat{\sigma_1&w'\\0&\tilde{A}}
\end{align*}

where $w'$ is a $1\times(m\!-\!1)$ row vector, and $\tilde{A}$ is an $(n\!-\!1)\!\times\!(m\!-\!1)$ matrix. Now, we will use this relation to derive an inequality proving $w'=0$. First, recall that

\begin{align*}
\sigma_1^2=\norm A{}_2^2&=\norm{\tilde{U}_1'A\tilde{V}_1}{}_2^2\\[4pt]
&=\norm{\mat{\,\sigma_1&w'\\0&\tilde{A}}\vphantom{\mat{\\~}_|^|}}{}_2^2\\[4pt]
&\geq\,\frac{\norm{\mat{\,\sigma_1&w'\\0&\tilde{A}}z\,\vphantom{\mat{\\~}_|^|}}{}_2^2}{\norm z{}_2^2}
\end{align*}

where the last inequality holds for any $z\neq0$, since the $2$-norm is the supremum of this ratio. Let $z=\mat{\sigma_1\\w}$ and note that

\begin{align*}
\sigma_1^2\,&\geq\,\frac{\norm{\mat{\,\sigma_1&w'\\0&\tilde{A}}\mat{\sigma_1\\w}\vphantom{\mat{\\~}_|^|}}{}_2^2}{\sigma_1^2+\norm w{}_2^2}\\[6pt]
&=\frac{{\Big(\sigma_1^2+\norm w{}_2^2\Big)\!}^2+\norm{\tilde{A}w}{}_2^2}{{\sigma_1^2+\norm w{}_2^2}}\\[6pt]
&\geq\,\sigma_1^2+\norm w{}_2^2
\end{align*}

Hence $\sigma_1^2\geq\sigma_1^2+\norm w{}_2^2$, so $\norm w{}_2^2=0$, which implies $w=0$ as desired. The theorem then follows by applying the same argument inductively to $\tilde{A}$.$\space{12pt}$
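The key step of the proof, $w'=0$, can also be sanity-checked numerically. A minimal sketch, assuming a random $A$ and building the orthonormal extensions $\tilde U_1,\tilde V_1$ via QR (whose sign conventions may flip the leading entry):

```python
# Check numerically that the off-diagonal block w' in the proof vanishes.
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 4
A = rng.standard_normal((n, m))

U, s, Vt = np.linalg.svd(A)
sigma1, u1, v1 = s[0], U[:, 0], Vt[0]   # v1 attains the norm, A v1 = sigma1 u1

# Extend u1 and v1 to orthonormal bases via QR of [u1 | random], [v1 | random];
# the first column of each Q is +/- u1 (resp. +/- v1).
Utilde, _ = np.linalg.qr(np.column_stack([u1, rng.standard_normal((n, n - 1))]))
Vtilde, _ = np.linalg.qr(np.column_stack([v1, rng.standard_normal((m, m - 1))]))

B = Utilde.T @ A @ Vtilde                # should be [[sigma1, w'], [0, Atilde]]
print(np.isclose(abs(B[0, 0]), sigma1))  # (1,1) entry is +/- sigma1
print(np.abs(B[0, 1:]).max() < 1e-10)    # w' = 0
print(np.abs(B[1:, 0]).max() < 1e-10)    # first column below sigma1 is 0
```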

Properties

Note that this is only an existence proof: it does not tell us how to compute $v_1$.

Corollary. $~\rank(A)=\text{# of nonzero singular values}$
Proof. $~\rank(A)=\rank(U\Sigma V')=\rank(\Sigma)$, since $U$ and $V$ are invertible, and $\rank(\Sigma)$ equals the number of nonzero $\sigma_i$.
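A minimal numerical illustration, assuming a random rank-$3$ example and the usual floating-point tolerance for "nonzero":

```python
# rank(A) = number of singular values above a floating-point tolerance.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 5))  # rank 3 by construction

s = np.linalg.svd(A, compute_uv=False)
tol = max(A.shape) * np.finfo(float).eps * s[0]   # the standard SVD-based rank tolerance
print(np.sum(s > tol), np.linalg.matrix_rank(A))  # both report 3
```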

Corollary. $~$Let $A,E\in\R\nm$, and let $\sigma_\max(\cdot)$, $\sigma_\min(\cdot)$ denote the largest/smallest singular value. Then
\begin{align*}
\sigma_\max(A+E)&\leq\sigma_\max(A)+\norm E2\\
\sigma_\min(A+E)&\geq\sigma_\min(A)-\norm E2
\end{align*} Proof. $~$ HOMEWORK

Corollary. $~$(Hoffman-Wielandt Inequality)
Let $A,E\in\R\nm$, let $\sigma_k(\cdot)$ denote the $k$-th largest singular value function, and let $p=\min(m,n)$. Then, $$\sum_{k=1}^p\Big(\sigma_k(A+E)-\sigma_k(A)\Big)^2\!\leq\norm E{}_F^2$$ Proof. $~$ HOMEWORK
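Since both proofs are left as homework, here is only a numerical spot-check (not a proof) of the two corollaries, on assumed random $A$ and $E$:

```python
# Spot-check the singular value perturbation bounds on random data.
import numpy as np

rng = np.random.default_rng(4)
n, m = 6, 4
A = rng.standard_normal((n, m))
E = 0.1 * rng.standard_normal((n, m))

sA = np.linalg.svd(A, compute_uv=False)
sAE = np.linalg.svd(A + E, compute_uv=False)
norm_E2 = np.linalg.norm(E, 2)

# Bounds on the extreme singular values
print(sAE[0] <= sA[0] + norm_E2)
print(sAE[-1] >= sA[-1] - norm_E2)

# Hoffman-Wielandt: sum of squared singular-value shifts <= ||E||_F^2
print(np.sum((sAE - sA) ** 2) <= np.linalg.norm(E, 'fro') ** 2)
```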

Corollary. $~$Let $r=\rank(A)$, and let $P_\text{space}$ denote the orthogonal projection onto a space. Then,
$$\begin{matrix}\begin{aligned}
\range(A)&=\span(u_1,....,u_r)\\
\row(A)&=\span(v_1,....,v_r)\\
\nul(A)&=\span(v_{r+1},..,v_m)\\
\nul(A')&=\span(u_{r+1},..,u_n)
\end{aligned}
&&
\begin{aligned}{}
[u_1\,\cdots\cdots\,u_r]\,[u_1\,\cdots\cdots\,u_r]'&=P_{\range(A)}\\
[u_{r+1}\,\cdots\,u_n]\,\,[u_{r+1}\,\cdots\,\,u_n]'&=P_{\nul(A')}\\
[v_1\,\cdots\cdots\,v_r]\,\,[v_1\,\cdots\cdots\,v_r]'&=P_{\row(A)}\\
[v_{r+1}\,\cdots\,v_m]\,\,[v_{r+1}\,\cdots\,v_m]'&=P_{\nul(A)}\\
\end{aligned}
\end{matrix}~~\\[30pt]$$
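A minimal NumPy sketch of these subspace and projection identities, on an assumed rank-$r$ example:

```python
# U_r U_r' projects onto range(A), V_n V_n' onto nul(A), and U_n spans nul(A').
import numpy as np

rng = np.random.default_rng(5)
n, m, r = 7, 5, 3
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # rank r

U, s, Vt = np.linalg.svd(A)
Ur, Un = U[:, :r], U[:, r:]
Vn = Vt[r:].T

P_range = Ur @ Ur.T
print(np.allclose(P_range @ A, A))          # range(A) is fixed by its projection
x = rng.standard_normal(m)
print(np.allclose(A @ (Vn @ Vn.T @ x), 0))  # V_n V_n' projects onto nul(A)
print(np.allclose(Un.T @ A, 0))             # columns of U_n span nul(A')
```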

Pseudo-inverse

Definition. $~$For a matrix $A\in\R\nm$, a pseudo-inverse (also Moore-Penrose inverse) is a matrix $A\dag\in\R\mn$ s.t.
\begin{align*}
&AA\dag A=A&&(AA\dag)'=AA\dag\\
&A\dag AA\dag=A\dag&&(A\dag A)'=A\dag A
\end{align*}

Theorem. $~$For any matrix $A$, the pseudo-inverse (if it exists) is unique.
Proof. $~$Suppose $B,C$ are both pseudo-inverses of $A$. Then, we have $${\small BA=B(ACA)=BACA=(BA)'(CA)'\!=A'B'A'C'\!=(ABA)'C'\!=A'C'\!=(CA)'\!=CA}\\{\small AB=(ACA)B=ACAB=(AC)'(AB)'\!=C'A'B'A'\!=C'(ABA)'\!=C'A'\!=(AC)'\!=AC}$$ which imply that $$B=BAB=(CA)B=C(AC)=C$$

Theorem. $~$For any matrix $A$, the pseudo-inverse exists.
Proof. $~$This follows from the SVD of $A$. Let $A=U\Sigma V'$, and let $\Sigma\dag\in\R\mn$ be the transpose of $\Sigma$ with its nonzero entries inverted, i.e. $\Sigma\dag_{ij}=\Sigma_{ji}\inv$ if $\Sigma_{ji}\neq0$ and $0$ otherwise. It is easy to check that $\Sigma\dag$ is the pseudo-inverse of $\Sigma$. Then one can verify that $A\dag\!:=V\Sigma\dag U'$ satisfies all four conditions, e.g. $AA\dag A=U\Sigma V'V\Sigma\dag U'U\Sigma V'=U\Sigma\Sigma\dag\Sigma V'=U\Sigma V'=A$.
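A sketch of this construction in NumPy, on assumed random data; the tolerance used to decide which $\sigma_i$ count as nonzero is a floating-point judgment call:

```python
# Build Sigma^+ by transposing and inverting the nonzero singular values,
# then check the four Penrose conditions for A^+ = V Sigma^+ U'.
import numpy as np

rng = np.random.default_rng(6)
n, m, r = 6, 4, 2
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # rank r

U, s, Vt = np.linalg.svd(A)
tol = max(n, m) * np.finfo(float).eps * s[0]
inv = np.array([1.0 / x if x > tol else 0.0 for x in s])  # invert only nonzero sigma_i
Sigma_pinv = np.zeros((m, n))
Sigma_pinv[:len(s), :len(s)] = np.diag(inv)

A_pinv = Vt.T @ Sigma_pinv @ U.T
print(np.allclose(A_pinv, np.linalg.pinv(A)))         # matches NumPy's pinv
print(np.allclose(A @ A_pinv @ A, A),                 # the four Penrose conditions
      np.allclose(A_pinv @ A @ A_pinv, A_pinv),
      np.allclose((A @ A_pinv).T, A @ A_pinv),
      np.allclose((A_pinv @ A).T, A_pinv @ A))
```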

HOMEWORK $~$What is the solution to $\min\norm{Ax-b}{}_2^2$ in terms of the Moore-Penrose inverse? (Hint: show it's $A\dag b$)
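As a numerical hint (not the requested proof), one can check the claim against `np.linalg.lstsq` on assumed random data:

```python
# The least-squares minimizer of ||Ax - b||_2 coincides with A^+ b.
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)          # generic b, not in range(A)

x_pinv = np.linalg.pinv(A) @ b
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_pinv, x_lstsq))
```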