
I have been reading this:

https://thenumb.at/Autodiff/

And I am stuck at the Chain rule part.

The definitions:

enter image description here

I have highlighted in red the terms I don't understand below:

enter image description here

If you scroll up just a few lines it says this:

enter image description here

"h" and "f" are exactly the same functions.

They map 2 values to 2 output values. Then why is the derivative of "f" a 2 by 1 vector and the derivative of "h" a 2 by 2 matrix?

I assume that $x = (x_{1}, x_{2})$, so is it just a matter of doing the same for f but with $(x_{1}, x_{2})$ instead of $(x, y)$?

ATest
  • 1
    How can $h,f$ be "exactly the same function"? One of them is a function of two variables, the other is a function of one variable. – lulu Aug 10 '22 at 11:42
  • then why is $f$ defined as a function that takes 2 variables as input ? – ATest Aug 10 '22 at 11:43
  • $f$ isn't defined at all, or at least it isn't defined anywhere in your post. I suggest editing your post to add critical information, such as the domains and codomains of each of your functions and, if there is meant to be some connection between the functions you are thinking of, actually state what it is. – lulu Aug 10 '22 at 11:46
  • Ok added the definitions of the functions – ATest Aug 10 '22 at 12:03
  • The web page is interesting and the presentation is nice, but I think the mathematical part of that page is pretty slapdash. If you are curious about the maths here you should use a better reference. This particular field is called "multivariable calculus", so for example you could try reading "Calculus of several variables" by Serge Lang. I'm sure there are more reference suggestions on this site. – Suzu Hirose Aug 10 '22 at 12:24
  • @SuzuHirose Thanks!! – ATest Aug 10 '22 at 12:41

1 Answer


The website you took this from is just wrong. There isn't even a $g_1$ and a $g_2$, as $g$ maps to $\mathbb{R}$. The correct formula for $h = g \circ f$ is
$$\begin{aligned}
J_h(x,y) &= J_g(f(x,y))\,J_f(x,y) \\
&= \begin{pmatrix} \frac{\partial g}{\partial x}(f(x,y)) & \frac{\partial g}{\partial y}(f(x,y)) \end{pmatrix}\begin{pmatrix} \frac{\partial f_1}{\partial x}(x,y) & \frac{\partial f_1}{\partial y}(x,y) \\ \frac{\partial f_2}{\partial x}(x,y) & \frac{\partial f_2}{\partial y}(x,y) \end{pmatrix} \\
&= \begin{pmatrix} \frac{\partial g}{\partial x}(f(x,y)) \frac{\partial f_1}{\partial x}(x,y) + \frac{\partial g}{\partial y}(f(x,y)) \frac{\partial f_2}{\partial x}(x,y) & \frac{\partial g}{\partial x}(f(x,y)) \frac{\partial f_1}{\partial y}(x,y) + \frac{\partial g}{\partial y}(f(x,y)) \frac{\partial f_2}{\partial y}(x,y)\end{pmatrix},
\end{aligned}$$
where $J$ denotes the Jacobian. I'd advise staying away from this site, as they also seem to confuse gradients with Jacobians and to encourage the bad habit of using $f(x)$ as the name of a function instead of the function value at $x$.
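As a sanity check, the formula $J_h(x,y) = J_g(f(x,y))J_f(x,y)$ can be verified numerically. The sketch below is only an illustration: the concrete functions `f` and `g`, the evaluation point, and all helper names are assumptions chosen for this example and do not come from the website or the question. It computes the $1 \times 2$ Jacobian of $h = g \circ f$ via the matrix product and compares it with central finite differences.

```python
import math

# Hypothetical example functions (not from the original post):
# f: R^2 -> R^2 and g: R^2 -> R, so h = g o f maps R^2 -> R.
def f(x, y):
    return (x * y, x + y ** 2)

def g(u, v):
    return u ** 2 + math.sin(v)

def h(x, y):
    return g(*f(x, y))

def jacobian_f(x, y):
    # 2x2 matrix: row i holds the partials of f_i with respect to x and y.
    return [[y, x],
            [1.0, 2.0 * y]]

def jacobian_g(u, v):
    # 1x2 row vector: partials of g with respect to its two arguments.
    return [2.0 * u, math.cos(v)]

def jacobian_h(x, y):
    # Chain rule: J_h(x, y) = J_g(f(x, y)) J_f(x, y), a (1x2)(2x2) product.
    jg = jacobian_g(*f(x, y))
    jf = jacobian_f(x, y)
    return [jg[0] * jf[0][j] + jg[1] * jf[1][j] for j in range(2)]

def finite_difference_h(x, y, eps=1e-6):
    # Central differences as an independent numerical check.
    dx = (h(x + eps, y) - h(x - eps, y)) / (2 * eps)
    dy = (h(x, y + eps) - h(x, y - eps)) / (2 * eps)
    return [dx, dy]

if __name__ == "__main__":
    print(jacobian_h(1.5, -0.5))
    print(finite_difference_h(1.5, -0.5))
```

The two printed rows should agree up to the finite-difference error. Note that the result has two entries (a $1 \times 2$ row), not four, which is consistent with $g$ mapping to $\mathbb{R}$.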

Klaus
  • I also think that the mentioned site contains errors, but I suggest considering $g:\mathbb{R} \to \mathbb{R}^2$ and $f:\mathbb{R}^2 \to \mathbb{R}$; then $g\circ f : \mathbb{R}^2 \to \mathbb{R}^2$ and the given formula can hold. – zkutch Aug 10 '22 at 12:25
  • @zkutch: That does not fit with the diagram, where $f$ clearly maps $\mathbb R^2\to\mathbb R^2$ and then $g$ maps down to one dimension. – String Aug 10 '22 at 12:34
  • fuck me man I spent hours on this :( Thanks!! Do you suggest any other resource to learn partial differential equations (for computing if possible ?) @Klaus – ATest Aug 10 '22 at 12:36
  • @zkutch I think what they actually intended was the formula for the gradient, which transforms covariantly, i.e. $\nabla h = J_f^T \nabla g$ but then somehow got confused. – Klaus Aug 10 '22 at 12:42
  • 1
    @ATest There are likely people that can give better advice on that. It would therefore be better to ask a new question, specifying exactly what you need regarding level and accessibility (e.g. book, free website, whatever). – Klaus Aug 10 '22 at 12:45
  • Yes, @String. You are right about the diagram, but my suggestion, as I wrote, makes the formula correct. – zkutch Aug 10 '22 at 12:55
  • @zkutch It would need to be $f: \mathbb{R} \to \mathbb{R}^2$, $g: \mathbb{R}^2 \to \mathbb{R}^2$ to be correct. And even then, using $\nabla$ makes it confusing. – Klaus Aug 10 '22 at 12:58
  • I may be wrong, @Klaus, but if it were so, then the gradient of $f$ would consist of $f_1, f_2$ and not of $f_{x_{1}}, f_{x_{2}}$. And what do you dislike about the option I suggested? – zkutch Aug 10 '22 at 13:04
  • @zkutch You are right. What I proposed in the previous comment does not work either. The problem with your suggestion is that $g$ only depends on one variable, so $g_{x_1}$ and $g_{x_2}$ do not make sense. In any case, I don't think there is a way to make sense of what they wrote on that website. – Klaus Aug 10 '22 at 13:10
  • @ATest There is an underlying issue: how could you have anticipated/avoided the Math confusion. The site, which represents a subtle trap, is motivated by Computer programming considerations. So, the site's focus is at least partially split between Math and Computer programming. The tip off would be that after you invest no more than 60 minutes trying to diagnose the site's posted Math, you are then supposed to take a step back and ask yourself : $\color{red}{\text{What's up ?}}$ – user2661923 Aug 10 '22 at 13:10
  • 1
    Yes, @Klaus, I agree about $g_{x_1}$. I wanted to look at it as $(g\circ f)_{x_1}$, but, as I see, we cannot bring the dead back to life anyway. – zkutch Aug 10 '22 at 13:18
  • @user2661923 yes that is true. What sources do you suggest to get a solid understanding ? books ? – ATest Sep 11 '22 at 11:35
  • @ATest Definitely, some Math textbook that contains many, many exercises. As others have commented, the specific topic is multivariable Calculus. Unfortunately, what might be the right textbook for someone else, might not be the right textbook for you. So, you will have to explore, to find the right textbook for you. – user2661923 Sep 11 '22 at 14:53