I stumbled onto this question while looking into something related. The question is old, but I believe my answer can help others with a similar question. There are essentially two types of derivatives in single-variable calculus and, analogously, two types of derivatives in multivariate calculus. This answer is long, but stay with me: the length is a cost that buys clarity and organization (hopefully). The total derivative comes at the end; I have to go through some other things before I get there.
Case 1: How a function changes by changing the function directly
In single variable calculus, consider the function $f(x)$. How does $f$ change as we directly change the function through the variable $x$? This is the derivative $df(x)/dx$. It determines how $f$ changes for every unit change in the direct or domain variable $x$.
In multivariate calculus, consider the function $f(x,y)$. Again we ask, how does $f$ change as we directly change the function through the variables $x$ and $y$? We can consider how $f$ changes only by changing $x$, which is $\partial f/\partial x$. This gives the rate of change of $f$ per unit change in the $x$ direction. Likewise for the $y$ variable, holding $x$ fixed. But we can generalize this partial derivative along either $x$ or $y$ to any straight-line direction. This is called the directional derivative. Again, just to repeat myself, we are asking how $f$ changes directly by changing its domain variables. Just to be complete, the directional derivative takes the form $\nabla f \cdot \vec{v}$. We can generalize the directional derivative even further. Instead of asking how $f$ changes along a straight-line path, we ask how $f$ changes tangent to an arbitrary path in our domain. The only difference between an arbitrary path and a straight line is that the tangent vectors on an arbitrary path change along the path, while the tangent vectors along a straight line do not. Therefore, the derivative still takes the form $\nabla f \cdot \vec{v}$, but $\vec{v}$ changes as you move from point to point on your path.
To recap: This section is all about finding how $f$ changes directly as we change its direct variables (the domain variables). In the multivariate setting, we can ask how $f$ changes per unit change along its $x$ and $y$ axes, which we generalize to arbitrary straight lines (the directional derivative), which we can in turn generalize to arbitrary paths. In every case, it's always how $f$ changes per unit change in its direct or domain variables.
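If you want to see the formula $\nabla f \cdot \vec{v}$ in action, here is a minimal numerical sketch. The function $f = x^2 + y$, the point $(1, 2)$, and the direction $\vec{v} = \langle 0.6, 0.8 \rangle$ are assumptions picked only for illustration, and the helper names are hypothetical. It compares the gradient formula against a finite-difference estimate taken along the straight line through the point.

```python
def f(x, y):
    # Example function (an assumption chosen for illustration)
    return x**2 + y

def grad_f(x, y):
    # Hand-computed gradient of f: (df/dx, df/dy) = (2x, 1)
    return (2.0 * x, 1.0)

def directional_derivative(x, y, v):
    # Gradient formula: grad f . v
    gx, gy = grad_f(x, y)
    return gx * v[0] + gy * v[1]

def finite_difference(x, y, v, h=1e-6):
    # Central difference along the straight line through (x, y) in direction v
    forward = f(x + h * v[0], y + h * v[1])
    backward = f(x - h * v[0], y - h * v[1])
    return (forward - backward) / (2.0 * h)

point = (1.0, 2.0)
v = (0.6, 0.8)  # a unit vector, so the result is "f per unit distance"

exact = directional_derivative(point[0], point[1], v)
approx = finite_difference(point[0], point[1], v)
print(exact, approx)  # the two values should agree closely
```

Notice that nothing outside the domain appears here: we only ever move the point $(x, y)$ itself.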
Case 2: How a function changes by changing the function indirectly
In single variable calculus, consider $f(x)$. But what if $f(x) = f(g(t))$? That is, $f$ is a composite function. You can change $f$ directly by changing $x$, or indirectly by changing $t$. Turning the $t$ parameter knob changes $x$, which in turn changes $f$. So what is $df(g(t))/dt$? We are asking how the function changes indirectly through $t$. The derivative won't have units of $f$ per unit of the domain variable $x$. It will have units of $f$ per unit of the indirect variable $t$. To be complete, the chain rule gives $df(g(t))/dt = f'(g(t))g'(t)$, that is, the derivative of the outside with respect to the inside, times the derivative of the inside. In case 1 above, we only considered how $f$ changed directly with respect to its domain variable. To be clear, if you write out $f(g(t))$ all you'd see would be $x$'s. Applying $d/dt$ to the function asks how $f$ changes indirectly through the parameter $t$.
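As a small worked example (the specific functions are assumptions, picked only for illustration), take $f(x) = x^2$ with $x = g(t) = 3t$. Then $f'(x) = 2x$ and $g'(t) = 3$, so

$$\frac{d}{dt}f(g(t)) = f'(g(t))\,g'(t) = 2(3t)\cdot 3 = 18t$$

As a check, substituting first gives $f(g(t)) = 9t^2$, whose derivative with respect to $t$ is also $18t$. Compare this with the case 1 derivative $df/dx = 2x$: that one is in units of $f$ per unit $x$, while $18t$ is in units of $f$ per unit $t$.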
In the multivariate case, consider $f(x,y)$. Is there such a thing as a 'composite' multivariate function? Yes, and it looks like $f(x,y) = f(g(t), h(t))$. This is just the multivariate version of a composite function. Likewise, we have something called the multivariate chain rule. Again, case 1 above asks how $f$ changes directly with its domain variables. However, we can ask how $f$ changes through $t$, which is outside its domain. Changing $t$ changes $x,y$ and consequently changes $f$. The multivariate chain rule looks similar to the single variable chain rule. It gives $df(g(t),h(t))/dt = \nabla f \cdot \langle g'(t), h'(t) \rangle$, which is just the derivative of the outside times the derivative of the inside for each variable, summed together. This is also called the chain rule for paths, but I prefer to call it the multivariate chain rule. Also notice that this derivative happens to have the same form as the directional derivative and the arbitrary path derivative above. The reason is that straight lines and arbitrary paths in your domain require a parameterization $x = g(t)$ and $y = h(t)$. Then you can apply the correct limit definition of your derivative and see its form... (do not get me started on limits and directional derivatives. I think the topic is poorly taught: the limit definitions for how $f$ changes directly through domain variables and for how $f$ changes indirectly through parameters are different, yet this difference is never brought to light). Nonetheless, realize that case 1 derivatives give the change in $f$ per unit change in domain variables, while case 2 derivatives give the change in $f$ per unit change in the indirect variable(s), which can have different units. These are two types of derivatives that you can tell apart simply by looking at units, or just by determining whether $f$ is changing directly or indirectly.
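Here is a minimal numerical sketch of the multivariate chain rule. The choices $f = x^2 + y$, $g(t) = \cos t$ and $h(t) = \sin t$ are assumptions made only for illustration, and the function names are hypothetical. The sketch turns the $t$ knob slightly, measures how $f$ responds, and compares that with $\nabla f \cdot \langle g'(t), h'(t) \rangle$.

```python
import math

def f(x, y):
    # Outer function: written purely in x and y, no t in sight
    return x**2 + y

def g(t):
    # Assumed parameterization x = g(t)
    return math.cos(t)

def h(t):
    # Assumed parameterization y = h(t)
    return math.sin(t)

def chain_rule(t):
    # grad f evaluated at (g(t), h(t)), dotted with <g'(t), h'(t)>
    x = g(t)
    grad = (2.0 * x, 1.0)
    velocity = (-math.sin(t), math.cos(t))
    return grad[0] * velocity[0] + grad[1] * velocity[1]

def finite_difference(t, dt=1e-6):
    # Change in f per unit change in t (central difference in t)
    forward = f(g(t + dt), h(t + dt))
    backward = f(g(t - dt), h(t - dt))
    return (forward - backward) / (2.0 * dt)

t0 = 0.5
print(chain_rule(t0), finite_difference(t0))  # should agree closely
```

The finite difference is taken in $t$, not in $x$ or $y$, which is exactly the 'indirect' change of case 2.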
Total Derivative
So which camp does the total derivative fall into: direct or indirect? The answer is both. Consider a 'multivariate composite function' $f(g(t), h(t))$. If I wrote out $f$, you wouldn't see any $t$'s. Say, as an example, $f = x^2 + y$ where $x = g(t)$ and $y = h(t)$. The point: there are no $t$'s. Now what if I consider $f(t, g(t), h(t))$? Now you do see $t$'s in the equation of $f$, such as $f = tx^2 + y - t$. So if I asked for the derivative of $f$ with respect to $t$, $\frac{d}{dt} f(t, g(t), h(t))$, will $f$ change directly or indirectly? The answer is both: as I change $t$, I'm changing a direct (domain) variable, and I'm also changing a parameter outside the domain that affects $f$ indirectly through $x$ and $y$. However, we can do a 'trick'. Although $f(t, g(t), h(t))$ is not 'completely' composite, it can be if we consider $f(t, g(t), h(t)) = f(t(t), g(t), h(t))$. What I'm doing is letting the direct variable $t$ be determined by an indirect variable $t$. In other words, $t_{direct} = t_{indirect}$. Previously, I had my hands turning knobs on both a direct variable $t$ and an indirect parameter $t$ living outside the domain. Now, with $t = t$, I have a completely composite function and therefore I can use the multivariate chain rule, which gives,
$$\frac{d}{dt}f(t,g(t),h(t)) = \nabla f \cdot \langle \frac{d}{dt}t, g'(t), h'(t) \rangle$$
where, since $f = f(t, g(t), h(t))$, I 'redefined' the gradient to include $\partial f/\partial t$ in the first slot. Since $\frac{d}{dt}t = 1$, that first slot contributes exactly $\partial f/\partial t$. I'm only doing this to give a new perspective. Or you can do what textbooks and Wikipedia do and keep the gradient as is (that is, containing only the partials with respect to $x$ and $y$) and pull the $\partial f/\partial t$ out front. Anyway, think about it like this. $t$ is a direct domain variable, but by setting $t_{direct} = t_{indirect}$ it's like we have two gears perfectly linked. Rotate one gear and the other rotates in unison. Change $t_{indirect}$ and $t_{direct}$ changes identically. Therefore, even though $f$ changes both directly and indirectly, our hands are only on the indirect parameter knob and we have a completely composite function. And we now know how to find that derivative via case 2 above.
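To make this concrete, here is a minimal numerical sketch using the example $f = tx^2 + y - t$ from above. The parameterizations $g(t) = \cos t$ and $h(t) = \sin t$ are assumptions added only for illustration, and the function names are hypothetical. The 'redefined' gradient is $\langle x^2 - 1,\ 2tx,\ 1 \rangle$, and it gets dotted with $\langle 1, g'(t), h'(t) \rangle$.

```python
import math

def f(t, x, y):
    # Example from the text: t appears directly in f
    return t * x**2 + y - t

def g(t):
    # Assumed parameterization x = g(t)
    return math.cos(t)

def h(t):
    # Assumed parameterization y = h(t)
    return math.sin(t)

def partial_t(t):
    # Direct change only: df/dt with x and y held fixed
    x = g(t)
    return x**2 - 1.0

def total_derivative(t):
    # 'Redefined' gradient <df/dt, df/dx, df/dy> dotted with <dt/dt, g'(t), h'(t)>
    x = g(t)
    grad = (x**2 - 1.0, 2.0 * t * x, 1.0)
    rates = (1.0, -math.sin(t), math.cos(t))
    return sum(p * r for p, r in zip(grad, rates))

def finite_difference(t, dt=1e-6):
    # Turn the single t knob: the direct slot and x, y all move together
    forward = f(t + dt, g(t + dt), h(t + dt))
    backward = f(t - dt, g(t - dt), h(t - dt))
    return (forward - backward) / (2.0 * dt)

t0 = 0.5
print(partial_t(t0), total_derivative(t0), finite_difference(t0))
```

The total derivative should match the finite difference, while the partial derivative alone should not, because it ignores the indirect change through $x$ and $y$.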