I was looking at backpropagating a gradient through a computational graph, and it all makes sense except when a node has multiple outputs. Take the following function:
$$f(x) = (x+3)(x+2)$$
Its derivative is:
$$f'(x) = 2x+5$$
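For reference, this follows from expanding the product (the product rule gives the same result), and it also gives the value at $x = 2$ that I use further down:
$$f(x) = x^2 + 5x + 6 \quad\Rightarrow\quad f'(x) = 2x + 5, \qquad f'(2) = 9$$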
Now, take the following segment of a computational graph:
The number above each line is the value from the forward pass; the number beneath it is the backpropagated gradient.
When moving the gradient backwards through the $f(x)$ node, I took the sum of $0.34$ and $-0.2$, then multiplied this sum by the derivative of $f(x)$ evaluated at the input $2$ (taken from the forward pass). So: $(0.34-0.2)\times f'(2) = 0.14 \times 9 = 1.26$. I understand why I had to multiply by $f'(x)$ (that follows from the chain rule), but I do not understand why I had to sum the $0.34$ and $-0.2$. I only did so because I know that's what I'm meant to do.
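To check the number, here is a minimal Python sketch of the computation above. The rest of the graph is not reproduced here, so `total` is a hypothetical stand-in: it assumes the two branches that consume $f(x)$ are linear, with slopes $0.34$ and $-0.2$, so that their local gradients match the values I was given. It compares the sum-then-multiply result against a finite-difference estimate.

```python
def f(x):
    return (x + 3) * (x + 2)

def f_prime(x):
    return 2 * x + 5

def total(x):
    # Hypothetical downstream: both branches consume y = f(x).
    # They are assumed linear so their gradients w.r.t. y are 0.34 and -0.2.
    y = f(x)
    return 0.34 * y + (-0.2) * y

x = 2.0
upstream = [0.34, -0.2]  # gradients arriving at the f(x) node

# What I did by hand: sum the incoming gradients, then apply the chain rule.
backprop_grad = sum(upstream) * f_prime(x)

# Central finite difference on the whole (assumed) downstream computation.
eps = 1e-6
numeric_grad = (total(x + eps) - total(x - eps)) / (2 * eps)

print(backprop_grad, numeric_grad)  # both come out at roughly 1.26
```

Under that assumption the two values agree, so summing does seem to give the right gradient; what I am missing is the reasoning behind it.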
Any help is greatly appreciated.
