5

In section 9.4 (Idempotent Matrices), the book says that :

if $A$ is idempotent, which means that $AA = A$, then

$f(sI + tA) = (I-A)f(s) + Af(s+t)$

but I don't understand the meaning of this formula, can anyone tell me where does it come from or show me how to proof this formula? Thanks.

Git Gud
  • 31,356
  • Don't they say what $f$ is? – Git Gud Mar 21 '14 at 12:46
  • They doesn't mention that. That's what I am confused with too. – Monkey D Bear Mar 21 '14 at 12:49
  • This is very weird. If $f$ takes matrices as inputs, then why would they write $sI$ instead of $s$? And if $s,t$ are actually scalars, then what is $f(sI+tA)$? – Git Gud Mar 21 '14 at 12:53
  • It looks like a Taylor expansion, no? – Chinny84 Mar 21 '14 at 12:55
  • @GitGud - I guess $f$ is a polynomial. If, for example, $f(x)=x^2$ then $f(3)=9$ and for matrices $f(I)=I\cdot I=I$ – Belgi Mar 21 '14 at 12:55
  • @Belgi That interpretation is looking good, despite the abuse of notation. – Git Gud Mar 21 '14 at 12:56
  • @GitGud - I agree there is a problem, for example if $f(x)\equiv c$ we get that the LHS is $c$ while the RHS is $cI$ – Belgi Mar 21 '14 at 12:59
  • Reference: http://orion.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf – Casteels Mar 21 '14 at 12:59
  • 2
    Martin Agerami's nice answer explains everything, including the case $f(x)=c$ (which should be understood as $f(x)=cI$ in the matrix case). The following may make the formula more intuitive: note that $(I−A)A=A−A^2=A−A=0.$ This shows that $I-A$ is idempotent as well, and suggests that decomposing $sI+tA$ as $s(I−A)+(s+t)A$ will be useful. In fact, this observation immediately leads to $$(sI+tA)^n=(s(I−A)+(s+t)A)^n=s^n(I−A)^n+(s+t)^nA^n=s^n(I-A)+(s+t)^nA,$$ for $n>0.$ So the formula holds for all polynomial functions $f.$ – Will Orrick Mar 21 '14 at 17:10

2 Answers2

4

For a diagonalizable matrix $A=SDS^{-1}$ and a function $f$, it is standard to define $f(A)=Sf(D)S^{-1}$, and $f(D) $ is the diagonal matrix with diagonal $(f(D_{jj}) )$. This is coherent with polynomial evaluation, since $(SDS^{-1})^n=SD^nS^{-1} $. It is also coherent with writing continuous functions as limits of polynomials.

Now here $A$ is idempotent, and thus diagonalizable. So the formula certainly makes sense. The question is whether it is true. By the first paragraph, it is enough to check the formula on a diagonal matrix $D$ with its diagonal entries consisting of zeroes and ones.

So the diagonal of $f(sI+tD) $ consists of a number of $f(s) $ (when $D_{jj}=0$) and $f(s+t) $ (when $D_{jj}=1$). In both cases the formula $$ f(s+tD_{jj})=(1-D_{jj})f(s)+D_{jj}f(s+t) $$ holds. So $$ f(sI+tA)=Sf(sI+tD)S^{-1}=S((I-D)f(s)+Df(s+t))S^{-1}\\=(I-A)f(s)+Af(s+t). $$

Ricky
  • 3,148
Martin Argerami
  • 205,756
3

I'm guessing here that $f \in k[x]$, $s,t \in k$ and evaluating $f$ at a matrix $A$ is done as follows:

$$f(A) = \sum_{i=0}^n a_iA^i$$

where $f(x) = \sum_{i=0}^n a_ix^i$ and $A^0 = I$.

I imagine you should be able to prove this fairly easily by induction on the degree of $f$. It certainly very clearly holds when $f$ is linear.

To show it in the case when $f = a_1x +a_0$, first expand the RHS:

\begin{align*}(I-A)f(s) + Af(s+t) &= (I-A)(a_1s+a_0) + A(a_1(s+t)+a_0)\\ &=a_1(sI+tA)+a_0I = f(sI+tA)\end{align*}

Now, for the inductive step, say $f=a_nx^n + \ldots +a_1x +a_0$, we can write $f$ as $f=x\cdot a_nx^{n-1} + g$, where $g = a_{n-1}x^{n-1} + \ldots +a_1x+a_0$ has degree strictly smaller than $f$.

Now apply your inductive step and this result,

\begin{align*}f(sI+tA) &= (sI+tA)a_n(sI+tA)^{n-1} + g(sI+tA)\\ &=(sI+tA)a_n[(I-A)s^{n-1} +A(s+t)^{n-1}] + (I-A)g(s) + Ag(s+t)\end{align*}

Now expand, use that $A$ is idempotent to cancel some terms, and you get $Aa_n(s+t)^n + Ag(s+t) + (I-A)a_ns^n + g(s)$, which is precisely the result if you think how $g$ was defined.

ah11950
  • 2,670
  • 12
  • 19