3

BACKGROUND: I have recently found (probably well known, but I had never seen this before) that a matrix can be written as a linear combination of the outer products of its eigenvectors where the coefficients are the corresponding eigenvalues. Specifically, let $\boldsymbol{u}_k=(u_{1,k},\dots,u_{K,k})^T$ be the $k$-th eigenvector and $\lambda_k$ the $k$-th eigenvalue of a symmetric matrix $M$ with $k\in [1,K]$. Then $M$ can be written as

$$M=\sum_{k=1}^K \lambda_k \boldsymbol{u}_k \boldsymbol{u}_k^T \tag{1} \label{eq1}$$

Which, is just another way of writing the standard eigendecomposition

$$M=U \Lambda U^{-1} \tag{2} \label{eq2}$$

where $U$ is the matrix whose $k$-th column is $\boldsymbol{u}_k$ and $\Lambda$ is the diagonal matrix whose element $\Lambda_{kk}=\lambda_k$. What is interesting to me is that the series expression is reminiscent of Fourier series expansions with $\lambda_k$ being analogous to the Fourier coefficient and the matrix $\boldsymbol{u}_k \boldsymbol{u}_k^T$ being analogous to a basis function.

HERE'S THE QUESTIONS:

(1) Can someone point me to a good reference (or references) about expressing matrices as a series expansion of "basis" matrices?

(2) Can two different matrices, say $A$ and $B$ be expressed as a series expansion using the same set of basis matrices and just different coefficients? (Eq. \ref{eq1} above could be used for both $A$ and $B$, but the set of basis matrices $\boldsymbol{a}_k \boldsymbol{a}_k^T$ and $\boldsymbol{b}_k \boldsymbol{b}_k^T$ would be different, and I would like to instead write them in the same basis). If so, under what conditions?

okj
  • 2,499

3 Answers3

3
  • Your first question: I would advise the different books named "Linear Algebra" with different subtitles of Gilbert Strang (MIT Press).

  • Your second question; yes, there is such a theorem that you will find in A property of positive definite matrices

Edit: As underlined by @icurays1, it is in the SVD framework

$$M=USV^T=\sum_{k=1}^K \sigma_k \boldsymbol{u}_k \boldsymbol{v}_k^T $$

where $K=min(n,p)$ (if $M \in \mathbb{R}^{n \times p}$), with $\sigma_k \geq 0$ the singular values that this kind of decomposition (as a cumulative sum of the $N$ first "components" closer and closer to the initial matrix $M$ as $N$ increses) is important. Let us take the following notation:

$$M_N:=\sum_{k=1}^N \sigma_k \boldsymbol{u}_k \boldsymbol{v}_k^T $$

(we assume that the singular values are ranked by decreasing amplitude).

with the fundamental property that $M_N$ is, among all rank-$N$ approximations of $M$, the closest to $M$ in the sense of Frobenius norm.

I will take two examples of applications:

  • Application 1: Image compression (even if it is not efficient as jpeg compression!). Consider the image in 3 parts belowenter image description here

Left: The original image; Center: the reconstructed image taking only the $N=10$ first components ; Right: the same as the center image but with $N=30$ first components.

One can see that with 30 components, the reconstruction is almost perfect. On the point of view of the compression ratio, as it is a $629 \times 590$ image, instead of sending $629 \times 590 \approx 371 K$ real numbers (grey values of pixels) for the whole image, we have only to send $(629 + 590) \times 30 \approx 36.6 K$ reals, which results in a 1:10 ratio compression.

  • Application 2: in probability, the very useful PCA (Principal Component Analysis) studies the gap between a joint probability distribution and the closest product of independent 1D probability distributions and iterates on the remaining. Here is a toy-example:

$$\begin{bmatrix}0.02&0.08&0.02\\0.06&0.24&0.08\\0.06&0.28&0.04\\ 0.02&0.10&0.00\end{bmatrix}\approx 0.4089\begin{bmatrix}0.2074\\0.6321\\0.7056\\ 0.2442\end{bmatrix}\times\begin{bmatrix}0.2184&0.9545&0.2029\end{bmatrix} \ \ (*)$$

with a very good agreement between the LHS and the RHS because of a quasi independence between the $X$ and $Y$ components; this explains that the joint distribution $f_{X,Y}$ can almost be "reconstructed" as the product $f_X \times f_Y$ of its marginal distributions. An interesting thing is that in (*), column vector $\boldsymbol{u}_1$ and row vector $\boldsymbol{v}_1^T$ contain, (up to a renormalization, because the sum of their components have to be equal to 1) the "closest distributions" $\phi_X$ and $\phi_Y$ such that $f_{X,Y}\approx\phi_X\times \phi_Y$.

Note: the very good agreement between matrix $M$ and its approximation $M_1$ is reflected in the rapid decay of singular values : $\sigma_1=0.4089, \sigma_2=0.0404, \sigma_3=0.0015$.

Jean Marie
  • 81,803
  • Thanks for catching that, I had symmetric matrices in mind. I have edited the question accordingly. – okj May 04 '16 at 22:36
  • OK. Then I edit accordingly my text. – Jean Marie May 04 '16 at 22:41
  • I have added an important "Edit" to my initial answer – Jean Marie May 05 '16 at 10:14
  • This is a great answer and the examples are VERY helpful. However, the question of how two different matrices $A$ and $B$ can be written using this expansion with the same basis vectors $\boldsymbol{u}k$ and $\boldsymbol{v}_k$ does not seem to be addressed, or perhaps I am missing something? The examples highlight the utility of this kind of series expansion, but can you explain how (or whether it is possible) to write $A_N = \sum{k=1}^{N} a_k \boldsymbol{u}k \boldsymbol{v}_k^T$ and $B_N = \sum{k=1}^{N} b_k \boldsymbol{u}_k \boldsymbol{v}_k^T$ using the same basis? – okj May 05 '16 at 19:27
  • You are right to ask this question: I consider my edit as a better answer to your question (1) than a simple reference to books: i.e., if you have to expand a matrix into a linear combination of rank-one matrices, it is the singular value decomposition (SVD) that should be considered, preferably to the decomposition coming from the ordinary eigendecomposition. Moreover, the singular values (ore more exactly their squares) can be considered as holding a certain part of the "energy" of the matrix, exactly the same vocabulary that is used in Fourier transforms by people doing signal processing. – Jean Marie May 05 '16 at 19:44
  • (ctd) In reference to the end of your comment, I don't know if there is a similar theorem as you express it for SVD as for EVD (ordinary Eigenvalues decomposition) (with two basis instead of one). I will try to see if there is something similar... – Jean Marie May 05 '16 at 19:49
  • That last part is exactly what I meant to address with my answer: As it's so generic it includes as answer what is the very same structure of vector field for matrices.. – MASL May 06 '16 at 03:02
1

"Basis expansions" for matrices - that is, writing a matrix as a linear combination of other matrices like $A = \sum c_jB_j$ is typically less useful than looking for a way to write a matrix as a product of matrices like $A = BC$. The latter is done all the time in numerical linear algebra - see for instance the classic book Matrix Computations. The reason the eigenvalue decomposition (and more generally the SVD) is useful is that it expresses a matrix as a sum of rank 1 matrices, which are particularly trivial to understand.

As for your second question, you're looking for the concept of simultaneous diagonalization. Basically, the answer is yes - you can expand two (diagonalizable) matrices $A$ and $B$ in the same set of eigenvectors if and only if $A$ and $B$ commute, that is $AB = BA$.

icurays1
  • 17,161
1

The vectorial space of matrices $n\times m$ has a canonical basis given by all matrices $n\times m$ of the type $$(e_{ij}) = 1 \;\mbox{at (ij) and 0 the rest.} $$ Any nxm matrix $M$ can then be written as a linear combination of these basis matrices $$M = \sum_{ij}\,M_{ij}\,e_{ij}.$$

On the other hand it is $e_{ij}={\mathbf e}_i\otimes {\mathbf e}_j \equiv {\mathbf e}_i\,{\mathbf e}_j^T$, where ${\mathbf e}_i$ is the unit vector $i$. This is a general linear combination valid for al such matrices. Each will be given by its coefficients $M_{ij}$.

MASL
  • 918
  • I don't see very well the connection with the initial question – Jean Marie May 04 '16 at 22:49
  • Well, I just wrote any matrix as a linear combination of basis matrices, such that any two matrices M and N will differ only on the coefficients of the expansion. That fits what the OP was looking for. – MASL May 04 '16 at 23:33