11

The way I understand it, the total differential and the directional derivative are both linear approximations of the change in a function at a certain point.

So if I know the change in $x$ and $y$ from the initial point, then I plug those into the total differential to find the approximate change in $z$.

But isn't this the same as finding the directional derivative in the direction of

$$ v = (\text{change in } x, \text{change in } y)? $$

user251257
  • 9,229
Fgilan
  • 393

4 Answers

9

I stumbled onto this question while looking into a related one. Your question is old, but I believe my answer can help others with a similar question. There are essentially two types of derivatives in single-variable calculus and, analogously, two types in multivariate calculus. This is long, but stay with me: the length is a cost, but it buys clarity and organization (hopefully). The total derivative comes at the end; I have to go through some other things before I get there.

Case 1: How a function changes by changing the function directly

In single variable calculus, consider the function $f(x)$. How does $f$ change as we directly change the function through the variable $x$? This is the derivative $df(x)/dx$. It determines how $f$ changes for every unit change in the direct or domain variable $x$.

In multivariate calculus, consider the function $f(x,y)$. Again we ask, how does $f$ change as we directly change the function through the variables $x$ and $y$? We can consider how $f$ changes only by changing $x$, which is $\partial f/\partial x$. This gives the rate of change of $f$ per unit change in the $x$ direction. Likewise for the $y$ variable, holding $x$ fixed. But we can generalize this partial derivative along either $x$ or $y$ to any straight-line direction. This is called the directional derivative. Again, just to repeat myself, we are asking how $f$ changes directly by changing its domain variables. Just to be complete, the directional derivative takes the form $\nabla f \cdot \vec{v}$ (with $\vec{v}$ a unit vector if you want the rate per unit distance). We can generalize the directional derivative even further. Instead of asking how $f$ changes along a straight-line path, we ask how $f$ changes tangent to an arbitrary path in our domain. The only difference between an arbitrary path and a straight line is that the tangent vectors on an arbitrary path change along the path, while the tangent vectors along a straight line do not. Therefore, the derivative still takes the form $\nabla f \cdot \vec{v}$, but $\vec{v}$ changes as you move from point to point on your path.
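To make the formula $\nabla f \cdot \vec{v}$ concrete, here is a minimal numerical sketch. The function `f`, the point `p`, and the direction `v` are hypothetical choices made only for illustration; the check compares the gradient formula with a central difference quotient taken along the straight line through `p` in the direction `v`.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Analytic gradient of the example function above.
    return (2 * x * y, x**2 + math.cos(y))

def directional_derivative(x, y, v):
    # Gradient formula: grad f . v
    gx, gy = grad_f(x, y)
    return gx * v[0] + gy * v[1]

def difference_quotient(x, y, v, h=1e-6):
    # Central difference of f along the straight line p + s*v, at s = 0.
    forward = f(x + h * v[0], y + h * v[1])
    backward = f(x - h * v[0], y - h * v[1])
    return (forward - backward) / (2 * h)

p = (1.0, 2.0)
v = (3.0, 4.0)  # not normalized, so this is the rate per unit of the line parameter s
exact = directional_derivative(p[0], p[1], v)
approx = difference_quotient(p[0], p[1], v)
print(exact, approx)
```

The two printed numbers should agree to several decimal places. Replacing `v` by `v / |v|` would give the rate per unit distance instead.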

To recap: This section is all about finding how $f$ changes directly as we change its direct variables (the domain variables). In the multivariate setting, we can ask how $f$ changes per unit change along its $x$ and $y$ axes, which we generalize to arbitrary straight lines (the directional derivative), which we can in turn generalize to arbitrary paths. In every case, it's always how $f$ changes per unit change in its direct or domain variables.

Case 2: How a function changes by changing the function indirectly

In single variable calculus, consider $f(x)$. But what if $f(x) = f(g(t))$? That is, $f$ is a composite function. You can change $f$ directly by changing $x$, or indirectly by changing $t$. Turning the $t$ parameter knob changes $x$, and consequently changes $f$. Therefore, what is $df(g(t))/dt$? We are asking how the function changes indirectly through $t$. The derivative won't have units of $f$ per unit of the domain variable $x$. It will have units of $f$ per unit of the indirect variable $t$. To be complete, the chain rule gives $df(g(t))/dt = f'(g(t))g'(t)$, that is, the derivative of the outside with respect to the inside, times the derivative of the inside. In case 1 above, we only considered how $f$ changed directly with respect to its domain variable. To be clear, if you write out $f(g(t))$, all you'd see would be $x$'s. Applying $d/dt$ to the function asks how $f$ changes indirectly through the parameter $t$.

In the multivariate case, consider $f(x,y)$. Is there such a thing as a 'composite' multivariate function? Yes, and it looks like $f(x,y) = f(g(t), h(t))$. This is just the multivariate version of a composite function. Likewise, we have something called the multivariate chain rule. Again, case 1 above asks how $f$ changes directly with its domain variables. However, we can ask how $f$ changes through $t$, which is outside its domain. Changing $t$ changes $x,y$ and consequently changes $f$. The multivariate chain rule looks similar to the single variable chain rule. It gives $df(g(t),h(t))/dt = \nabla f \cdot \langle g'(t), h'(t) \rangle$, which is just the derivative of the outside times the derivative of the inside for each variable, summed. This is also called the chain rule for paths, but I prefer to call it the multivariate chain rule. Also notice that this derivative happens to have the same form as the directional derivative and the arbitrary path derivative above. The reason is that straight lines and arbitrary paths in your domain require a parameterization $x = g(t)$ and $y = h(t)$. Then you can apply the correct limit definition of your derivative and see that its form...(do not get me started on limits and directional derivatives. I think it's poorly taught...the limit definitions for how $f$ changes directly through domain variables and for how $f$ changes indirectly through parameters are different. Yet this difference is never brought to light). Nonetheless, realize that case 1 derivatives give the change in $f$ per unit change in domain variables, while case 2 derivatives give the change in $f$ per unit change in indirect variable(s), which can have different units. These are two types of derivatives that you can tell apart simply by looking at units, or just by determining whether $f$ is changing directly or indirectly.
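As a rough check of the multivariate chain rule, the sketch below compares $\nabla f \cdot \langle g'(t), h'(t) \rangle$ with a finite difference taken in the indirect variable $t$. The function `f` and the path `(g, h)` are hypothetical choices for illustration, not anything prescribed above.

```python
import math

def f(x, y):
    # Hypothetical example function.
    return x**2 + x * y

def grad_f(x, y):
    # Analytic gradient of the example function.
    return (2 * x + y, x)

# Hypothetical path: the unit circle, x = g(t), y = h(t).
def g(t):
    return math.cos(t)

def h(t):
    return math.sin(t)

def g_prime(t):
    return -math.sin(t)

def h_prime(t):
    return math.cos(t)

def chain_rule(t):
    # grad f evaluated on the path, dotted with the path velocity.
    gx, gy = grad_f(g(t), h(t))
    return gx * g_prime(t) + gy * h_prime(t)

def finite_difference(t, dt=1e-6):
    # Central difference of the composite t -> f(g(t), h(t)).
    return (f(g(t + dt), h(t + dt)) - f(g(t - dt), h(t - dt))) / (2 * dt)

t0 = 0.7
print(chain_rule(t0), finite_difference(t0))
```

Note that the finite difference only ever perturbs `t`, never `x` or `y` directly, which is the "indirect" change described in this section.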

Total Derivative

So which camp does the total derivative fall into: direct or indirect? The answer is both. Consider a 'multivariate composite function' $f(g(t), h(t))$. If I wrote out $f$, you wouldn't see any $t$'s. As an example, take $f = x^2 + y$ where $x = g(t)$ and $y = h(t)$. The point: there are no $t$'s. Now what if I consider $f(t, g(t), h(t))$? Now you do see $t$'s in the equation of $f$, such as $f = tx^2 + y - t$. So if I asked for the derivative of $f$ with respect to $t$, $\frac{d}{dt} f(t, g(t), h(t))$, will $f$ change directly or indirectly? Both, because as I change $t$ I'm changing a direct (domain) variable, and I'm also changing a parameter outside the domain that indirectly affects $x$ and $y$. However, we can do a 'trick'. Although $f(t, g(t), h(t))$ is not 'completely' composite, it becomes so if we consider $f(t, g(t), h(t)) = f(t(t), g(t), h(t))$. What I'm doing is letting the direct variable $t$ be determined by an indirect variable $t$. In other words, $t_{direct} = t_{indirect}$. Previously, I had my hands turning knobs on both a direct variable $t$ and an indirect parameter $t$ living outside the domain. Now, with $t = t$, I have a completely composite function and can therefore use the multivariate chain rule, which gives

$$\frac{d}{dt}f(t,g(t),h(t)) = \nabla f \cdot \langle \frac{d}{dt}t, g'(t), h'(t) \rangle$$

where, since $f = f(t, g(t), h(t))$, I ''redefined'' the gradient to include $\partial f/\partial t$ in the first slot. I'm only doing this to give a new perspective. Or you can do what textbooks and Wikipedia do and keep the gradient as is (that is, containing only the partials with respect to $x$ and $y$) and pull the $\partial f/\partial t$ out front. Anyway, think about it like this. $t$ is a direct domain variable, but by setting $t_{direct} = t_{indirect}$ it's like we have two gears perfectly linked. Rotate one gear and the other rotates in unison. Change $t_{indirect}$ and $t_{direct}$ changes identically. Therefore, even though $f$ changes both directly and indirectly, our hands are only on the indirect parameter knob and we have a completely composite function. And we now know how to find the derivative via case 2 above.
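Here is a small sketch of that formula applied to the example $f = tx^2 + y - t$. The path functions $g(t) = t^2$ and $h(t) = 3t$ are hypothetical choices; with them the composite collapses to $t^5 + 2t$, so the result can be compared with the ordinary single-variable derivative $5t^4 + 2$.

```python
def g(t):
    # Hypothetical choice of x = g(t).
    return t**2

def h(t):
    # Hypothetical choice of y = h(t).
    return 3 * t

def g_prime(t):
    return 2 * t

def h_prime(t):
    return 3.0

def total_derivative(t):
    # f(t, x, y) = t*x**2 + y - t, with the "extended gradient"
    # (df/dt, df/dx, df/dy) dotted with (dt/dt, g'(t), h'(t)).
    x, y = g(t), h(t)
    df_dt = x**2 - 1      # direct dependence on t
    df_dx = 2 * t * x     # indirect, through x = g(t)
    df_dy = 1.0           # indirect, through y = h(t)
    return df_dt * 1.0 + df_dx * g_prime(t) + df_dy * h_prime(t)

def closed_form(t):
    # d/dt of t*(t**2)**2 + 3*t - t = t**5 + 2*t
    return 5 * t**4 + 2

print(total_derivative(2.0), closed_form(2.0))
```

The first term, `df_dt * 1.0`, is the piece that textbooks usually write separately as $\partial f/\partial t$ in front of the sum.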

DWade64
  • 1,308
  • I think you are thinking about this too philosophically. The idea was just to let you know that one is a linear map and the other is a scalar. The two answer these questions respectively. Total derivative: what is the immediate direction $f(p)$ will go, given that $p$ moves in some direction $u$? Directional derivative: what is the rate at which $f$ is changing in the direction of $u$? The latter can only be answered if $f$ is $\mathbb{R}$-valued. This derivative is encoded in the total derivative, approximating each $f^j$ where $f = (f^1,...,f^n)$, allowing $f: \mathbb{R}^m \to \mathbb{R}^n$. – Faraad Armwood Aug 12 '17 at 15:24
  • 1
    $Df(p) = (\nabla f^1(p) \cdots \nabla f^n(p))^t$ – Faraad Armwood Aug 12 '17 at 15:26
  • @FaraadArmwood You're probably right that I'm thinking too hard. Thanks for your answer on my other question as well. For this question at least, I think linear maps are too confusing. I think breaking down a derivative into direct and indirect changes is useful, as there are analogies in 1D calc. Also, the units of a direct change derivative and an indirect one could be completely different. The Wikipedia page seems to say that total derivatives are a combination of direct and indirect changes, so I tried to play out that distinction. I think it works well and it's simple. – DWade64 Aug 12 '17 at 15:37
  • @FaraadArmwood Actually I didn't quite answer his question, but just made a general point that could help him. Also, I only had one course in linear algebra, so I'm not that well versed in tensor notation or mappings between spaces. And just as an aside, I think all these derivatives (total, directional, or whatever) are scalars. – DWade64 Aug 12 '17 at 15:39
  • 1
    Start reading some linear algebra. The later parts of multivariable calculus, i.e. once we start talking about gradients, I refer to as differential geometry. Here we use linear objects to study surfaces, i.e. planes, spheres, paraboloids, ellipsoids etc. The linear object we use to study these things is the tangent space. This tangent space is just the image of some total differential $D\psi(q)$. A great book is Shifrin's multivariable calculus or Pressley's differential geometry. – Faraad Armwood Aug 12 '17 at 15:52
  • 3
    The total derivative is only a scalar when $f: \mathbb{R}^m \to \mathbb{R}$, otherwise it is a linear map. – Faraad Armwood Aug 12 '17 at 15:53
  • @FaraadArmwood Thank you for the book recommendation. I'd definitely like to further my studies one day – DWade64 Aug 12 '17 at 16:09
5

No, no and no: they are very different things. The derivative (also called differential) is the best linear approximation at a point. The directional derivative is a one-dimensional object that describes the "infinitesimal" variation of a function at a point only along a prescribed direction. I will not write down the definitions here.

So to speak, the directional derivative gives you information about the local behavior of a function restricted to a straight line. The derivative gives you information about the local behavior of a function in a whole neighborhood of some point.

There are classical theorems describing the interplay between the two objects. In particular, a differentiable function possesses all the directional derivatives (which you compute by applying the derivative to the directional vector). On the contrary, a function can possess all the directional derivatives, but nevertheless it need not be differentiable.
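A standard counterexample for the last claim (not taken from this answer, but a well-known one) is $f(x,y) = x^2 y/(x^4+y^2)$ with $f(0,0)=0$. The sketch below evaluates the difference quotients at the origin along a few hypothetical directions $(a,b)$, which settle down as $t \to 0$, and then evaluates $f$ along the parabola $y = x^2$, where it stays at $1/2$ however close to the origin you get. So $f$ is not even continuous there, let alone differentiable.

```python
def f(x, y):
    # Classic counterexample: all directional derivatives exist at the
    # origin, yet f is not continuous (hence not differentiable) there.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**2 * y / (x**4 + y**2)

def directional_quotient(a, b, t):
    # (f(0 + t*a, 0 + t*b) - f(0, 0)) / t, the quotient whose limit as
    # t -> 0 is the directional derivative at the origin along (a, b).
    return f(t * a, t * b) / t

for direction in [(1.0, 0.0), (1.0, 1.0), (2.0, -1.0)]:
    a, b = direction
    quotients = [directional_quotient(a, b, t) for t in (1e-2, 1e-4, 1e-6)]
    print(direction, quotients)

# Along the parabola y = x**2 the function does not approach f(0, 0) = 0.
parabola_values = [f(x, x**2) for x in (1e-1, 1e-3, 1e-5)]
print(parabola_values)
```

For $b \neq 0$ the quotient equals $a^2 b/(t^2 a^4 + b^2)$, which tends to $a^2/b$; for $b = 0$ it is identically zero.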

In your case it seems to me that you are applying the first result: the directional derivative of a function $z=z(x,y)$ along a vector $\vec{v}$ is simply $$ \frac{\partial z}{\partial \vec{v}} = \nabla z \cdot \vec{v}, $$ where $\nabla z$ is the gradient, i.e. the vector that represents the (total) derivative.
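For the situation in the question, the relationship can be sketched numerically. With $\vec{v} = (\Delta x, \Delta y)$ left unnormalized, the total differential and $\nabla z \cdot \vec{v}$ are the same number; if the direction is normalized to a unit vector first, the directional derivative is a rate per unit distance, and multiplying it by $|\vec{v}|$ recovers the total differential. The function `z`, the base point, and the increments below are hypothetical choices for illustration.

```python
import math

def grad_z(x, y):
    # Gradient of the hypothetical example z = x**2 + 3*x*y.
    return (2 * x + 3 * y, 3 * x)

x0, y0 = 1.0, 2.0
dx, dy = 0.03, 0.04          # the changes in x and y from the question

zx, zy = grad_z(x0, y0)

# Total differential: plug the changes straight in.
dz = zx * dx + zy * dy

# Directional derivative along the unit vector in the same direction.
length = math.hypot(dx, dy)
ux, uy = dx / length, dy / length
rate_per_unit_distance = zx * ux + zy * uy

print(dz, rate_per_unit_distance, rate_per_unit_distance * length)
```

So whether the two "are the same" comes down to whether the direction vector is required to have unit length, which varies between textbooks.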

Siminore
  • 35,136
  • 2
    It's confusing to say the derivative is the same as the differential. My Finney text at least says a differential is just 'dx' or 'dy' by itself, distinct from the derivative. Also the "total derivative" and "total differential" have different definitions according to the Wikipedia page on the former. – Joseph Garvin Feb 04 '18 at 20:35
  • Does the vector $\vec{v}$ need to be a unit vector? – john Aug 12 '20 at 06:30
2

Total differential

Let's say you have a function $z=f(x,y)$. The total differential is defined as: $$\Delta z \approx \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y$$ In words: when $x$ increases from $x_0$ by $\Delta x$ and $y$ increases from $y_0$ by $\Delta y$, the total differential gives the (approximate) increase in the value of your function $f(x,y)$.
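How good is this approximation? A quick sketch, using a hypothetical function $f(x,y) = e^x y$ and base point $(0,1)$ chosen only for illustration, compares the total differential with the exact change for shrinking steps. For a smooth function the error should fall roughly by a factor of four each time the step is halved.

```python
import math

def f(x, y):
    # Hypothetical example function.
    return math.exp(x) * y

def total_differential(x0, y0, dx, dy):
    # Partial derivatives of the example function at (x0, y0).
    fx = math.exp(x0) * y0
    fy = math.exp(x0)
    return fx * dx + fy * dy

x0, y0 = 0.0, 1.0
errors = []
for step in (0.1, 0.05, 0.025):
    exact = f(x0 + step, y0 + step) - f(x0, y0)
    approx = total_differential(x0, y0, step, step)
    errors.append(abs(exact - approx))
    print(step, exact, approx, errors[-1])
```

The shrinking error is what "linear approximation" means here: the part of the change that the total differential misses is of second order in the step.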

For the directional derivative, you'll have to understand the gradient of a function. The gradient of a function is a vector that points in the direction in which the increase per unit of distance is at its maximum.

Gradient

The gradient of a function $f(x,y)$ at the point $(x_0,y_0)$ is a vector defined as: $$\operatorname{grad}(f) = \overrightarrow{\nabla f} = \frac{\partial f}{\partial x}\overrightarrow{e_x} + \frac{\partial f}{\partial y}\overrightarrow{e_y}$$ where $\overrightarrow{e_i}$ denotes the $i$-th unit vector of the standard basis.

The directional derivative

The directional derivative can be defined as the increase of $f$ per unit of distance in a given direction, where $\alpha$ is the angle between that direction and the gradient.

$$\frac{df}{ds}=|\overrightarrow{\nabla f}|\cos(\alpha)$$
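As a sanity check, the sketch below confirms that $|\overrightarrow{\nabla f}|\cos(\alpha)$ agrees with the dot product of the gradient and a unit direction vector, taking $\alpha$ to be the angle between the direction and the gradient. The function $f = x^2 + y^2$ and the point $(1,2)$ are hypothetical choices for illustration.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical example f = x**2 + y**2.
    return (2 * x, 2 * y)

gx, gy = grad_f(1.0, 2.0)
grad_norm = math.hypot(gx, gy)
grad_angle = math.atan2(gy, gx)   # angle of the gradient from the x-axis

def via_dot(theta):
    # grad f dotted with the unit vector (cos theta, sin theta).
    return gx * math.cos(theta) + gy * math.sin(theta)

def via_cosine(theta):
    # |grad f| * cos(alpha), alpha = angle between direction and gradient.
    alpha = theta - grad_angle
    return grad_norm * math.cos(alpha)

thetas = (0.0, math.pi / 4, grad_angle, math.pi)
for theta in thetas:
    print(theta, via_dot(theta), via_cosine(theta))
```

At `theta = grad_angle` the angle $\alpha$ is zero and the value equals $|\overrightarrow{\nabla f}|$, the largest possible rate of increase.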

gdm
  • 137
1

One way to look at it is that $\langle \partial f/\partial x, \partial f/\partial y \rangle$ gives the direction of maximum change of your function. In any other direction the change is smaller. Here $f$ is assumed to be differentiable.

The total differential and the directional derivative are a bit different. For a scalar function, the total derivative in the strict sense is a scalar value, whereas the directional derivative involves vectors.

Let's give an example:

say $f(x,y) = xy$

The total derivative is $df=\frac{\partial f}{\partial x}\,dx+ \frac {\partial f}{\partial y}\,dy = y\,dx+x\,dy$

Whereas the directional derivative in some arbitrary direction $\vec n$ is defined as $\langle \partial f/\partial x, \partial f/\partial y \rangle \cdot \vec n$. I am assuming the $x,y$ plane here.
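To put numbers on this example, take the product $xy$ at the (arbitrarily chosen) point $(2,3)$ with $dx = 0.1$ and $dy = 0.2$. The total differential is

$$df = y\,dx + x\,dy = 3(0.1) + 2(0.2) = 0.7,$$

while the exact change is

$$\Delta f = (2.1)(3.2) - (2)(3) = 0.72 .$$

The gap of $0.02$ is exactly $dx\,dy$, since $(x+dx)(y+dy) - xy = y\,dx + x\,dy + dx\,dy$. For the directional derivative along the unit vector $\vec n = (0.1, 0.2)/\sqrt{0.05}$ in the same direction,

$$\langle 3, 2 \rangle \cdot \vec n = \frac{0.7}{\sqrt{0.05}} \approx 3.13,$$

which is the rate of change per unit distance. Multiplying it by the distance moved, $\sqrt{0.05}$, gives back $df = 0.7$.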