4

So I was reading the proof on the shortest distance between two points being a line (https://en.wikipedia.org/wiki/Calculus_of_variations#Example) but one line of the proof is baffling me. The statement is as follows:

"$\frac{\partial L}{\partial f} -\frac{d}{dx} \frac{\partial L}{\partial f'}=0$ with L = $\sqrt{1 + [ f'(x) ]^2}$.
Since $f$ does not appear explicitly in $L$, we have $\frac{\partial L}{\partial f}=0$"

I don't see how this statement follows, as, in my mind, $f'$ could very well still depend on $f$. For example, if we take $f=x^2$ we get $L=\sqrt{1+4x^2}=\sqrt{1+4f}$, a function that is clearly dependent on $f$. If someone could point out the flaw in my logic/explain why this must always be the case it would be much appreciated!

Sam Spiro
  • This is the most confused notation that is used in this subject. You have to ask yourself what $\frac{\partial L}{\partial f}$ is. –  Oct 02 '15 at 03:09

2 Answers

5

In this case $L \colon \mathbb{R} \times \mathbb{R} \times\mathbb{R} \to \mathbb{R}$ is defined via $L(x,f,g) = \sqrt{1 + g^2}$.

When writing $\frac{\partial L}{\partial f}$ the author is referring to the derivative of $L$ with respect to its second variable, which in this case is $0$ since $f$ does not appear in the expression for $L$. (To put it in a different way, $L$ is constant with respect to the variable $f$)

To expand a little bit more on the notation, $L = \sqrt{1 + [f'(x)]^2}$ actually means $$L(x,f(x),f'(x)) = \sqrt{1 + [f'(x)]^2}.$$
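The point about slots can be checked numerically. The following is only a minimal sketch: the helper names `partial` and `on_curve` are made up for illustration, and the curve $f(x) = x^2$ is taken from the question. It differentiates $L(x,f,g) = \sqrt{1+g^2}$ in one slot at a time by a central difference, and only then evaluates along the curve, where $g = f'(x) = 2x$.

```python
import math

def L(x, f, g):
    """Integrand as a function of three independent slots; the f slot is unused."""
    return math.sqrt(1.0 + g * g)

def partial(func, slot, point, h=1e-6):
    """Central-difference partial derivative of func in one slot (0-based).

    Hypothetical helper: only the chosen slot is perturbed, the others are held fixed.
    """
    up = list(point)
    down = list(point)
    up[slot] += h
    down[slot] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

def on_curve(x):
    """Slot values (x, f(x), f'(x)) along the curve f(x) = x**2."""
    return (x, x * x, 2.0 * x)

x = 1.5
dL_df = partial(L, 1, on_curve(x))   # derivative in the second slot (f)
dL_dg = partial(L, 2, on_curve(x))   # derivative in the third slot (g, later f')

# Analytic value of dL/dg = g / sqrt(1 + g^2), evaluated at g = 2x.
expected_dL_dg = 2.0 * x / math.sqrt(1.0 + (2.0 * x) ** 2)

print(dL_df, dL_dg, expected_dL_dg)
```

Perturbing the second slot leaves $L$ unchanged even though the evaluation point lies on a curve where $f' $ can be rewritten in terms of $f$; the dependence between the plugged-in values never enters the partial derivative.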

Giovanni
  • But shouldn't $f$ and $g$ be independent to do this? Aren't $f(x)$ and $f'(x)$ dependent on each other? – Courage Oct 02 '15 at 03:26
  • @Vishwaas: consider $f(x,y,z) = xz$. What is $\partial_y f(x,y,z)?$ Does this value change if you evaluate $\partial_y f$ at $(x,x^2,x^3)$? – Giovanni Oct 02 '15 at 03:35
  • @Vishwaas Yes, they should be independent, and that's exactly what is achieved in the first line of Giovanni's answer. $L$ is to be considered as a function of three independent variables ($x,f,g$ in Giovanni's notation) in order to compute the partial derivatives in the Euler-Lagrange equation. Only afterward do you evaluate these partial derivatives by substituting $f'$ for $g$. – Andreas Blass Oct 02 '15 at 09:58
  • @Andreas Blass, okay, but can we always consider $f$ and $f'$ as independent? Or is that only so when we take partial derivatives? There is certainly a relation between them. Won't the chain rule apply when differentiating? – Courage Oct 02 '15 at 10:07
  • @Vishwaas The issue is not whether they are "really" independent or not; rather, the issue is the meaning of the partial derivatives in the Euler-Lagrange equation. Those partial derivatives are to be understood as treating the three variables ($x,f,g$) as independent, regardless of any dependencies that will arise when you later replace $g$ by $f'$. (This is exactly like what Giovanni wrote in his comment above: There $\partial_y$ means to treat $x,y,z$ as independent, regardless of any dependencies that arise when you evaluate at $x,x^2,x^3$.) – Andreas Blass Oct 02 '15 at 11:23
  • @Vishwaas In other words, when taking partial derivatives of $L$, you don't consider $f$ or $f'$ at all! Those are variables you plug in only after taking the partial derivative, so whether they are independent or not doesn't matter when you take the derivative. As you might see and as was said earlier, the notation $\frac{\partial L}{\partial f'}$ is indeed very confusing. – JiK Oct 02 '15 at 12:40
4

I think that to fully clarify what is going on it is best to solve this problem from scratch. Suppose $y_0(t)$ is the stationary path (the path minimizing $J[y]$) for $$J[y] = \int_{t_0}^{t_1}dt\, \sqrt{1+(\dot{y})^2}:= \int_{t_0}^{t_1}dt\, L(t,y,\dot{y}) $$ and $y(t) = y_0(t) +\epsilon\eta(t)$ is a variation such that $\eta(t_0)=\eta(t_1) = 0$. Then, assuming $\epsilon$ is small and keeping only terms of first order in $\epsilon$, $$ \sqrt{1+(\dot{y})^2} \approx \sqrt{1+(\dot{y}_0)^2 + 2\epsilon \dot{\eta}\dot{y}_0}\approx \sqrt{1+(\dot{y}_0)^2}\left(1+\frac{\epsilon \dot{\eta}\dot{y}_0}{1+(\dot{y}_0)^2}\right) $$

Therefore, to first order, $$\delta J=J[y] - J[y_0] = \epsilon \int_{t_0}^{t_1} dt \frac{\dot{y}_0}{\sqrt{1+\dot{y}_0^2}}\dot{\eta}=-\epsilon \int_{t_0}^{t_1} dt \frac{d}{dt}\left[\frac{\dot{y}_0}{\sqrt{1+\dot{y}_0^2}}\right]\eta $$ The last equality follows from integration by parts; the boundary term vanishes because $\eta(t_0)=\eta(t_1)=0$. Now the action ($J[y]$) is stationary if $\delta J = 0$ for any variation $\eta$. This means $$\frac{d}{dt}\left[\frac{\dot{y}_0}{\sqrt{1+\dot{y}_0^2}}\right]=0$$ which, as it turns out, is exactly $\frac{d}{dt}(\partial L/\partial \dot{y})=0$ (the Euler-Lagrange equation with $\partial L/\partial y=0$).

If you go back to the proof of the Euler-Lagrange equation you will see that at no point do we care about how $y$ actually depends on $t$. That is the whole point, since we want to explore all possible paths and find the extremum. In that sense you have to treat $y$ and $\dot{y}$ as independent variables, since we do not know anything about their functional form and we don't even want to know.

In other words, when the paths themselves are treated as the variables, we have $y=y_0 + \epsilon \eta$ and $\dot{y}=\dot{y}_0+\epsilon \dot{\eta}$. Even though it is quite possible that $y_0$ and $\dot{y}_0$ are related to each other, since $\eta$ is completely arbitrary, $y$ and $\dot{y}$ are independent.
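The vanishing of the first-order variation can also be seen numerically. This is a rough sketch under assumptions that are not in the answer itself: the endpoints are taken as $(0,0)$ and $(1,1)$, the variation is $\eta(t) = \sin(\pi t)$, and the integral is approximated by the length of a polyline. If $\delta J$ has no term linear in $\epsilon$ around the straight line, then $(J[y_0+\epsilon\eta]-J[y_0])/\epsilon^2$ should stay roughly constant as $\epsilon$ shrinks.

```python
import math

def arc_length(path, n=2000):
    """Approximate J[y] = integral of sqrt(1 + y'^2) dt on [0, 1] by a polyline sum."""
    total = 0.0
    for i in range(n):
        t_left, t_right = i / n, (i + 1) / n
        total += math.hypot(t_right - t_left, path(t_right) - path(t_left))
    return total

def y0(t):
    """Candidate stationary path: the straight line from (0, 0) to (1, 1)."""
    return t

def eta(t):
    """Illustrative variation (an assumption); it vanishes at both endpoints."""
    return math.sin(math.pi * t)

def varied(eps):
    """The perturbed path y = y0 + eps * eta."""
    return lambda t: y0(t) + eps * eta(t)

J0 = arc_length(y0)
ratios = [(arc_length(varied(eps)) - J0) / (eps * eps) for eps in (0.1, 0.05, 0.025)]

# Second-order prediction: (1/2) * integral of eta'^2 / (1 + y0'^2)^(3/2) dt
# = pi^2 / (8 * sqrt(2)) for this particular eta and y0.
predicted = math.pi ** 2 / (8.0 * math.sqrt(2.0))

print(J0, ratios, predicted)
```

A linear term in $\epsilon$ would make these ratios blow up like $1/\epsilon$; instead they are positive and settle near the second-order prediction, consistent with the straight line being a minimum for this family of variations.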

Hamed
  • Thank you for the insightful comment! Though I am a bit confused on your intermediate steps in the second line of equations. For the first equality, should you not have a (ϵη)^2 term? I hypothesize you used the smallness of ϵ to get around this but I'm not certain; and I'm guessing you also did something like that to get the second equality but that one I don't see at all – Sam Spiro Oct 03 '15 at 14:08
  • Yes I'm assuming $\epsilon$ is small. – Hamed Oct 04 '15 at 01:15
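
For readers with the same question, the two steps in the expansion of $\sqrt{1+(\dot{y})^2}$ can be spelled out as follows. This is a sketch of the standard first-order argument, not necessarily the exact route the answer's author had in mind. Squaring $\dot{y} = \dot{y}_0 + \epsilon\dot{\eta}$ gives $$(\dot{y})^2 = (\dot{y}_0)^2 + 2\epsilon\dot{\eta}\dot{y}_0 + \epsilon^2\dot{\eta}^2,$$ and the last term is dropped because it is of second order in $\epsilon$. For the second step, factor out $1+(\dot{y}_0)^2$ and use $\sqrt{1+u}\approx 1+\frac{u}{2}$ for small $u$: $$\sqrt{1+(\dot{y}_0)^2 + 2\epsilon\dot{\eta}\dot{y}_0} = \sqrt{1+(\dot{y}_0)^2}\,\sqrt{1+\frac{2\epsilon\dot{\eta}\dot{y}_0}{1+(\dot{y}_0)^2}} \approx \sqrt{1+(\dot{y}_0)^2}\left(1+\frac{\epsilon\dot{\eta}\dot{y}_0}{1+(\dot{y}_0)^2}\right).$$ Both approximations discard only terms of order $\epsilon^2$, which do not contribute to the first variation $\delta J$.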