12

I am confused about the following.

When you diagonalize an $n\times n$ matrix $A$, you write $A$ as $PDP^{-1}$ with $P$ being orthogonal. Because if $P$ wasn't orthogonal, it wouldn't be invertible.

Then why don't we call this "orthogonal diagonalization"?

When you diagonalize an $n\times n$ symmetric matrix $A$ (so $A = A^T$), you write $A$ as $PDP^T$, because $P^{-1}= P^T$.

But if $P^{-1}= P^T$, doesn't that imply that $P^TP=I$ and thus that $P$ is orthonormal? Then why don't we call this "orthonormal diagonalization"?

Edward Stumperd
  • "... if P wasn't orthogonal, it wouldn't be invertible." This is not correct. There are non-orthogonal matrices that are invertible. – Learning Math Jul 16 '20 at 13:26

3 Answers

9

If $A$ is diagonalizable, we can write $A=S \Lambda S^{-1}$, where $\Lambda$ is diagonal. Note that $S$ need not be orthogonal. Orthogonal means that the inverse is equal to the transpose, i.e. $S^{-1} = S^T$. A matrix can very well be invertible and still not be orthogonal, but every orthogonal matrix is invertible. Now every real symmetric matrix is orthogonally diagonalizable, i.e. there exists an orthogonal matrix $O$ such that $A=O \Lambda O^T$. It might help to think of the set of orthogonally diagonalizable matrices as a proper subset of the set of diagonalizable matrices.
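As a quick numerical illustration of the symmetric case, here is a minimal sketch in plain Python. The matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ and the helper names `matmul`, `transpose` and `close` are my own choices for illustration, not something taken from the question.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small floating-point tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical symmetric example with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
s = 1 / math.sqrt(2)
O = [[s, s], [s, -s]]            # columns: unit eigenvectors (1,1)/sqrt(2), (1,-1)/sqrt(2)
Lam = [[3.0, 0.0], [0.0, 1.0]]   # eigenvalues in the same order as the columns of O
I2 = [[1.0, 0.0], [0.0, 1.0]]

is_orthogonal = close(matmul(transpose(O), O), I2)               # O^T O = I
reconstructs = close(matmul(matmul(O, Lam), transpose(O)), A)    # O Lam O^T = A
print(is_orthogonal, reconstructs)
```

Both checks should come out true: the eigenvector matrix satisfies $O^T O = I$, so $O^{-1}$ can be replaced by $O^T$ in the factorization.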

Manos
3

A matrix being diagonalizable does not imply that it can be diagonalized with an orthogonal matrix.

The relevant result is: A matrix is unitarily diagonalizable iff it is normal (i.e., $A^* A = A A^*$).

For example, take $A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}$. It is straightforward to check that $A$ is not normal, has two distinct eigenvalues, and the eigenspaces are $\operatorname{sp} \{ (1,0)^T \}$ ($\lambda=1$) and $\operatorname{sp} \{ (1,1)^T \}$ ($\lambda=2$) respectively.

It is easy to see that the eigenspaces are not orthogonal and that $A$ can be diagonalized by taking one non-zero vector from each of the two eigenspaces, say $p_1,p_2$, and forming the matrix $P = \begin{bmatrix} p_1 & p_2 \end{bmatrix}$.

Then you will have $A P = P \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$, and $P$ is invertible (but not orthogonal) because $p_1,p_2$ are linearly independent.
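The claims in this example can be checked by hand or with a few lines of plain Python. The sketch below takes the specific choice $p_1 = (1,0)^T$, $p_2 = (1,1)^T$; the helper names and the hand-computed inverse `P_inv` are my own additions for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

A = [[1, 1], [0, 2]]
P = [[1, 1], [0, 1]]         # columns p1 = (1,0)^T and p2 = (1,1)^T
D = [[1, 0], [0, 2]]
P_inv = [[1, -1], [0, 1]]    # inverse of P worked out by hand (det P = 1)
I2 = [[1, 0], [0, 1]]

diagonalizes = matmul(A, P) == matmul(P, D)                     # A P = P D
p_invertible = matmul(P, P_inv) == I2                           # P has an inverse
p_orthogonal = matmul(transpose(P), P) == I2                    # would need P^T P = I
a_normal = matmul(transpose(A), A) == matmul(A, transpose(A))   # A^T A = A A^T ?
print(diagonalizes, p_invertible, p_orthogonal, a_normal)
# prints: True True False False
```

Here $P^T P = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \neq I$, so $P$ diagonalizes $A$ without being orthogonal, and $A^T A \neq A A^T$ confirms that $A$ is not normal.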

Note: Hermitian matrices (or symmetric in the real case) are 'automatically' normal and can always be unitarily (orthogonally) diagonalized.

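Note: Any orthogonal matrix $U$ (here meaning only that its columns are non-zero and mutually orthogonal) can be 'turned into' an orthonormal matrix $\tilde{U}$ in the following way: Let $\Lambda = U^* U$, then $\Lambda$ is diagonal with positive entries on the diagonal. Hence we can define the square root $\sqrt{\Lambda}$ as the diagonal matrix of corresponding square roots, which is invertible. Then $\tilde{U} = U \sqrt{\Lambda}^{-1}$ is orthonormal, since $\tilde{U}^* \tilde{U} = \sqrt{\Lambda}^{-1} \Lambda \sqrt{\Lambda}^{-1} = I$.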
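To make the normalization concrete: rescaling means dividing each column by its own length, which is the same as multiplying $U$ on the right by the inverse of the diagonal matrix of column norms. A minimal sketch in plain Python, where the matrix `U` and the helper names are assumptions chosen for illustration:

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small floating-point tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical real matrix whose columns (1,1) and (-2,2) are orthogonal
# but have lengths sqrt(2) and sqrt(8), so U itself is not orthonormal.
U = [[1.0, -2.0], [1.0, 2.0]]

Lam = matmul(transpose(U), U)    # diagonal matrix of squared column norms
inv_sqrt_Lam = [[1 / math.sqrt(Lam[0][0]), 0.0],
                [0.0, 1 / math.sqrt(Lam[1][1])]]

U_tilde = matmul(U, inv_sqrt_Lam)             # divide each column by its norm
gram = matmul(transpose(U_tilde), U_tilde)    # should be the identity
print(Lam, close(gram, [[1.0, 0.0], [0.0, 1.0]]))
```

In the complex case the transpose would be replaced by the conjugate transpose; the real case is used here only to keep the sketch short.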

copper.hat
1

A matrix $P$ is called orthogonal if $P^{-1} = P^T$. Thus the first statement is just diagonalization while the one with $PDP^T$ is actually the exact same statement as the first one, but in the second case the matrix $P$ happens to be orthogonal, hence the term "orthogonal diagonalization".

It's all a matter of getting the definitions to coincide. If you wonder why the latter matrix is called orthogonal and not orthonormal, then that is a slightly deeper question.

Hope that helps,

  • So it is true that the latter matrix will always be orthonormal? – Edward Stumperd Oct 27 '12 at 17:26
  • @Edward : There is no standard definition of an "orthonormal matrix". If you mean that the columns form an orthonormal basis of $\mathbb R^n$, then yes, when $P^{-1} = P^T$. – Patrick Da Silva Oct 27 '12 at 19:33
  • Yes, sorry, that was what I meant. I commented in a hurry and made a bad choice of words. – Edward Stumperd Oct 27 '12 at 19:46
  • Orthogonal would normally refer to a scalar product. Is there something similar for orthogonally diagonalizing a matrix w.r.t. a given bilinear form $M$, for example $\mathrm{diag}(1,-1)$? Then the eigenvectors of a matrix, $Ae_i=\lambda_i e_i$, would be orthogonal if $e_i^TMe_j=\delta_{ij}$? – QuantumPotatoïd Jul 20 '22 at 19:26
  • @Cretin2: Diagonalizing refers to decomposing a vector over a basis. As long as the bilinear form $M(v,w) = v^{\top} Aw$ (where $A$ is the matrix associated to the bilinear form $M$) induces an inner product via $\langle v, w \rangle = v^{\top} Aw$, you can orthogonally diagonalize with respect to $M$. Orthogonal diagonalization makes sense in any finite-dimensional Hilbert space.

    But if your matrix $M$ does not induce an inner product (such as $\mathrm{diag}(1,0)$), then you cannot diagonalize because the decomposition stops making sense.

    – Patrick Da Silva Jul 20 '22 at 23:38