My question follows up on an additional remark in Spivak's proof of the Inverse Function Theorem. My problem is with the statement that immediately follows "If the theorem is true for $λ^{−1}∘f$, it is clearly true for $f$" (from the link I've posted): there Spivak assumes "at the outset" that $λ$ is the identity function, i.e. $λ=I$, even though $λ$ was clearly defined as $λ=Df(a)$.

How can he even assume this without loss of generality? He's basically limiting himself to functions $f$ such that $Df(a)=I$. Did I get this wrong?

Alex D.
  • A different way to phrase this argument would be to first prove the theorem in the special case that $Df(a) = I$, and then prove the general case as an easy corollary of the special case. – littleO Dec 21 '20 at 15:47
  • This seems sensible to me, and I also considered it. The only problem is that I didn't see (in the proof) a deviation from the particular restriction. – Alex D. Dec 21 '20 at 15:49
  • The way I see the statement is like a "let's see if we can derive some universal property by making this restriction", but I don't see where he drops it... – Alex D. Dec 21 '20 at 15:50
  • I think Spivak only proves the theorem in the special case that $Df(a) = I$, and (if I remember correctly) doesn't bother to explain why the general case follows from the special case, perhaps because he thinks that part is easy. Calculus on Manifolds is a very concise book. – littleO Dec 21 '20 at 15:55
  • Oh, thank you so much for the answer! I'm not really versed in this subject, so I can't clearly see the easy part about the generalization; nonetheless, this saves me a lot of time. – Alex D. Dec 21 '20 at 16:00

1 Answer

If $\det Df(a)\neq 0$, then the linear transformation $\lambda:=Df(a):\mathbb R^n\to \mathbb R^n$ is invertible. Note that $D\lambda(x)=\lambda$ for every $x$, since $\lambda$ is a linear transformation. The same is true, of course, for $\lambda^{-1}.$

Now consider $g:=\lambda^{-1}\circ f.$ We have then by the chain rule,

$Dg(a)=D\lambda^{-1}(f(a))\circ Df(a)=\lambda^{-1}\circ Df(a)=I.$

If the theorem is true for $g$, then $g$ is invertible (in some neighborhood of $a$), and so $f=\lambda\circ g$ is also invertible there. Indeed, $g^{-1}=f^{-1}\circ\lambda\Rightarrow f^{-1}=g^{-1}\circ\lambda^{-1}.$

So we may as well assume that $Df(a)=I$ in the first place.
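
To spell out why the conclusion for $g$ transfers to $f$, here is a sketch of the step the argument leaves implicit. The names $V$ and $W$ are introduced here for illustration and are not taken from the book. Suppose the theorem holds for $g$: there are open sets $V\ni a$ and $W\ni g(a)$ such that $g:V\to W$ is a bijection with differentiable inverse and $D(g^{-1})(z)=[Dg(g^{-1}(z))]^{-1}$ for $z\in W$. Since $\lambda$ is a linear bijection, $\lambda(W)$ is an open set containing $f(a)$, and $f=\lambda\circ g$ maps $V$ bijectively onto $\lambda(W)$ with

$f^{-1}=g^{-1}\circ\lambda^{-1}:\lambda(W)\to V,$

which is differentiable as a composition of differentiable maps. By the chain rule, for $y\in\lambda(W)$,

$D(f^{-1})(y)=D(g^{-1})(\lambda^{-1}y)\circ\lambda^{-1}=[Dg(f^{-1}(y))]^{-1}\circ\lambda^{-1}=[\lambda\circ Dg(f^{-1}(y))]^{-1}=[Df(f^{-1}(y))]^{-1},$

using $g^{-1}(\lambda^{-1}y)=f^{-1}(y)$ and $Df(x)=\lambda\circ Dg(x)$. This is the same derivative formula the theorem asserts, now for $f$.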

Matematleta
  • Sorry, I can't quite put my finger on your last remark... if $Df(a)=I$, then $f^{-1}=g^{-1}$, which implies that $f=g$. Isn't this circular? We assume the theorem is true for $g$ w.l.o.g., but then we find an $f$, and since $f=g$ we assume that the theorem is true for $f$ w.l.o.g., which in turn means we don't have to prove the theorem for $f$? Sorry for the confusion... – Alex D. Dec 21 '20 at 15:44
  • $f^{-1}$ is not equal to $g^{-1}.$ My answer shows that if the claim is true for $g$, then it must also be true for $f$. So we may as well prove it for $g$. Then you get it for $f$ for free. – Matematleta Dec 21 '20 at 17:39
  • Thank you! So $λ$ is actually $λ=Dg(a)=I$. – Alex D. Dec 21 '20 at 18:28
  • No. $\lambda=Df(a)$, right? Are you confused about the derivatives? The chain rule? Or the fact that the derivative of a linear transformation is itself? – Matematleta Dec 21 '20 at 18:30
  • No. I understand your answer completely. I understand that proving the theorem for $g$ indeed proves it for $f$: $g$ is a composition of two functions, and since $g$ satisfies the hypotheses of the theorem, so does $f$, therefore the theorem applies to $f$ as well... What I'm really trying to understand is why he was so inclined to choose $Df(a)=I$. What I'm trying to say is that you lose generality by doing that. – Alex D. Dec 21 '20 at 18:40
  • It's because if you can prove the claim for a function whose derivative at $a$ is the identity, then you can prove it for any other function whose derivative at $a$ is some other (invertible) linear transformation. How about looking at it like this: if $f$ is a real-valued function of one variable, then $Df(a)$ just maps $x$ to $cx$ for some $c\neq 0.$ But you can always adjust it so that $c=1$ by taking $g=\frac{1}{c}f$. It works the same way for functions on any $\mathbb R^n.$ – Matematleta Dec 21 '20 at 18:44
  • You say: let $f$ be any function, then let $g=λ^{-1} ∘ f$. We can assume at the outset that $λ^{-1}=I$... well, if you do that, then $f$ can't be any function. $f$ needs to be a function such that $Df(a)=I$. – Alex D. Dec 21 '20 at 18:45
  • Oh, so you mean, this is a particular case, and you can easily extend it to the general case? – Alex D. Dec 21 '20 at 18:46
  • $\lambda^{-1}$ is not $I$. It is the inverse of $Df(a).$ What we're saying is simply that $\lambda^{-1}\circ f$ is a function whose derivative at $a$ is $I$ and if we can prove Spivak's claim for it, then the claim follows for $f$ as well. So now forget $f$ and prove the claim for $\lambda^{-1}\circ f$ and this is what Spivak does. – Matematleta Dec 21 '20 at 18:51
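
For anyone who wants to see the reduction numerically, here is a small Python sketch. Everything in it is an assumption made for illustration: the map `f`, the point `A`, and the helper names are hypothetical and do not come from Spivak or from the comments above. It checks that $g=\lambda^{-1}\circ f$ has derivative approximately $I$ at $a$, then inverts $g$ near $a$ and recovers $f^{-1}=g^{-1}\circ\lambda^{-1}$. The fixed-point iteration used to invert $g$ is just a convenient numerical device that works because $Dg(a)=I$; it is not the argument in Spivak's proof.

```python
import math

# Hypothetical example map f: R^2 -> R^2 with a = (0, 0) and f(a) = (0, 0).
def f(p):
    x, y = p
    return (2 * x + y + x * x, x + 3 * y + math.sin(y))

A = (0.0, 0.0)

# lambda = Df(a), computed by hand from [[2 + 2x, 1], [1, 3 + cos y]] at (0, 0).
LAM = ((2.0, 1.0), (1.0, 4.0))

def inv2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return ((s / det, -q / det), (-r / det, p / det))

def apply(m, v):
    """Product of a 2x2 matrix with a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

LAM_INV = inv2(LAM)

def g(p):
    """The normalised map g = lambda^{-1} o f."""
    return apply(LAM_INV, f(p))

def jacobian(func, p, h=1e-6):
    """Central-difference approximation of the Jacobian of func at p."""
    cols = []
    for i in range(2):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        fh, fl = func(hi), func(lo)
        cols.append(((fh[0] - fl[0]) / (2 * h), (fh[1] - fl[1]) / (2 * h)))
    # cols[i] holds the partial derivatives with respect to variable i.
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

def g_inverse(z, steps=60):
    """Solve g(x) = z near a by iterating x <- x - (g(x) - z)."""
    x = A
    for _ in range(steps):
        gx = g(x)
        x = (x[0] - (gx[0] - z[0]), x[1] - (gx[1] - z[1]))
    return x

def f_inverse(y):
    """f^{-1} = g^{-1} o lambda^{-1}, valid for y close to f(a)."""
    return g_inverse(apply(LAM_INV, y))

if __name__ == "__main__":
    print("Dg(a) ~", jacobian(g, A))
    y = (0.1, -0.05)
    x = f_inverse(y)
    print("f(f_inverse(y)) ~", f(x), "for y =", y)
```

Note that $f$ itself is never assumed to have $Df(a)=I$ here; only the auxiliary map `g` does, which is exactly the point of the reduction.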