6

I am curious to know whether there is a way to prove that the maximum of the dot product occurs when two vectors are parallel to each other using derivatives. In particular, given:

$c = \textbf{a}\cdot\textbf{b}$ with $\textbf{a},\textbf{b}\in\mathbb{R}^3$

Assuming that $\textbf{b}$ is fixed, and I can only change $\textbf{a} = (x,y,z)^T$, how would one go about it? I would not even know how to properly take the derivatives in this case.

This is not homework, so the question as I put it may be ill-posed; I apologise in advance.

maruko
  • 373

3 Answers

5

Firstly, note that the map $a\mapsto a\cdot b$ is linear and, for non-zero $b$, not identically zero, so it has no maximum over all of $\mathbb{R}^3$: scaling $a$ scales $a\cdot b$ without bound. It does have a maximum if we restrict $a$ to a sphere, and that maximum tells us how large the dot product can be without making $a$ longer. So, we ask:

Where are the extrema of $a\cdot b$ when $b$ is fixed and $a$ lies on the unit sphere?

This can be done without reference to a coordinate system, which is always a plus. In particular, we can define $$f(a)=a\cdot b.$$ We then define a sort of directional derivative of $f$: take the function $$a(t)=a + ta'$$ which models an infinitesimal change in $a$. That is, we reduce this to a single-variable problem in which $a$ changes with time $t$. Basically, we want vectors $a$ such that $f(a(t))$ has a critical point at $t=0$, regardless of the choice of $a'$. This says that, no matter how we perturb $a$ by another vector $a'$, the value of $f$ does not change to first order.

However, since $a$ is supposed to lie on a sphere, we can't simply let $a'$ point in any direction. Rather, it must be tangent to the sphere, since otherwise we are increasing or decreasing the magnitude of $a$ to change its dot product with $b$, which is cheating. Owing to the nature of spheres, this means that $a\cdot a'=0$. So, the question is: how can we make $t=0$ a critical point of $$f(a(t))=a\cdot b + ta'\cdot b$$ for every such tangent $a'$?

Well, the derivative of the above with respect to $t$ is $$a'\cdot b$$ and we want this to be $0$ at a critical point. Thus, $a'$ and $b$ must be perpendicular. So, we can ask, simply, "For what vectors $a$ does it hold that every $a'$ perpendicular to $a$ is also perpendicular to $b$?" Or, in algebraic terms, "For what vectors $a$ does $a\cdot a' = 0$ imply $b\cdot a' = 0$ for every $a'$?"
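As a quick numerical sanity check of the derivative above, here is a minimal Python sketch. The helper names `dot`, `f_along`, and `central_difference` and the sample vectors are illustrative assumptions, not part of the argument. It compares a central finite difference of $f(a(t))$ at $t=0$ with $a'\cdot b$ for one tangent direction $a'$.

```python
def dot(u, v):
    """Euclidean dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def f_along(a, a_prime, b, t):
    """Evaluate f(a(t)) = (a + t a') . b."""
    a_t = [ai + t * pi for ai, pi in zip(a, a_prime)]
    return dot(a_t, b)


def central_difference(a, a_prime, b, h=1e-6):
    """Approximate d/dt f(a(t)) at t = 0 by a central difference."""
    return (f_along(a, a_prime, b, h) - f_along(a, a_prime, b, -h)) / (2 * h)


# Illustrative values (assumptions): a is a unit vector, a' is tangent at a.
a = [1.0, 0.0, 0.0]
b = [2.0, 3.0, 0.0]
a_prime = [0.0, 1.0, 0.0]

print("a . a'            =", dot(a, a_prime))
print("finite difference =", central_difference(a, a_prime, b))
print("a' . b            =", dot(a_prime, b))
```

Because $f(a(t))$ is linear in $t$, the finite difference agrees with $a'\cdot b$ up to rounding. Here $a$ is not parallel to $b$, and the derivative along this tangent direction is non-zero, so this $a$ is not a critical point.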

Intuitively, this can only hold if $a$ and $b$ are parallel, so the only critical points of the function occur when $a$ and $b$ are parallel. To see this algebraically, note first that if $a$ and $b$ are parallel, then $a'\cdot a=0$ forces $a'\cdot b = 0$, since $b$ is a multiple of $a$.

Otherwise, we could choose any tangent vector in the plane spanned by $a$ and $b$; explicitly, we could choose the component of $b$ perpendicular to $a$: $$a'=b-\frac{a\cdot b}{a\cdot a}a.$$ Clearly, $a\cdot a'=a\cdot b-\frac{a\cdot b}{a\cdot a}a\cdot a = 0$, so $a'$ is tangent. If $a$ is a critical point, we also need $$a'\cdot b=b\cdot b - \frac{(a\cdot b)^2}{a\cdot a}=0,$$ which holds only when $$(a\cdot a)(b\cdot b)=(a\cdot b)^2.$$ Noting that $b=ca+a'$ for the scalar $c=\frac{a\cdot b}{a\cdot a}$, we can substitute: $$(a\cdot a)((ca+a')\cdot (ca+a'))=(a\cdot (ca+a'))^2.$$ Expanding, and using $a\cdot a'=0$: $$c^2(a\cdot a)^2 + (a\cdot a)(a'\cdot a')=c^2(a\cdot a)^2.$$ Canceling terms and dividing out $(a\cdot a)$, which is non-zero, yields $$a'\cdot a' = 0$$ and because the dot product is positive-definite, this means $$a'=\overrightarrow{0}.$$

This happens only when $a$ and $b$ are parallel, since then $b=ca+a'=ca$. In every other case, there is some tangent $a'$ along which the derivative is non-zero. This verifies that the only critical points are the vectors $a$ parallel to $b$. Of the two such unit vectors, $a=b/\sqrt{b\cdot b}$ gives the maximum value $\sqrt{b\cdot b}$ and $a=-b/\sqrt{b\cdot b}$ gives the minimum.
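The algebra above can be checked numerically with a short Python sketch. The function names `tangent_component` and `derivative_along` and the sample vectors are hypothetical, chosen only for illustration. It builds $a'=b-\frac{a\cdot b}{a\cdot a}a$ and evaluates $a'\cdot b$, which should vanish only when $a$ is parallel to $b$.

```python
def dot(u, v):
    """Euclidean dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def tangent_component(a, b):
    """Return a' = b - ((a.b)/(a.a)) a, the part of b perpendicular to a."""
    c = dot(a, b) / dot(a, a)
    return [bi - c * ai for ai, bi in zip(a, b)]


def derivative_along(a, b):
    """Derivative of t -> (a + t a') . b at t = 0 for the a' built above."""
    return dot(tangent_component(a, b), b)


# Illustrative values (assumptions). The formula for a' does not depend on
# the length of a, so the parallel example need not be a unit vector.
b = [2.0, 3.0, 0.0]
not_parallel = [1.0, 0.0, 0.0]
parallel = [4.0, 6.0, 0.0]

for a in (not_parallel, parallel):
    a_prime = tangent_component(a, b)
    print(a, "a . a' =", dot(a, a_prime), " a' . b =", derivative_along(a, b))
```

In this sketch $a'\cdot b$ coincides with $b\cdot b-\frac{(a\cdot b)^2}{a\cdot a}$, and it is zero for the parallel example and positive for the non-parallel one.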

Milo Brandt
  • 60,888
  • This is a really cool argument, thank you very much. In the same way I guess one could prove that for $a \cdot b$ to have a minimum, then $a'$ and $b$ must be parallel. It is a bit cyclical maybe since you need the notion that $a' \cdot b = 0$ for perpendicular vectors here, and $a' \cdot b = 1$ in the parallel case, but I really like the reasoning. One question though, why did you say in the beginning that the dot product is non-zero for non-zero $b$? Wouldn't it be zero for perpendicular vectors? – maruko Sep 26 '14 at 05:42
  • 1
    I meant that if $b$ isn't $0$ then there is an $a$ such that $a\cdot b$ is non-zero. I edited my answer to clarify. I also added a purely algebraic proof of this, which follows only from linearity and positive definiteness ($a\cdot a > 0$ for non-zero $a$), without any geometric interpretation. – Milo Brandt Sep 26 '14 at 15:41
3

An alternative definition of the dot product is $\textbf{a}\cdot\textbf b = ||\textbf a||\, ||\textbf b||\, \cos \theta$, where $\theta$ is the (smaller) angle between the two vectors $\textbf a$ and $\textbf b$.

If the lengths $||\textbf a||$ and $||\textbf b||$ are held constant, the dot product is maximal when $\cos \theta$ is maximal, subject to $0 \le \theta \le \pi$.

You can then show that the maximum of $\cos \theta$ on this interval occurs at $\theta = 0$, either by a simple trigonometric argument or by using derivatives: $\frac{d}{d\theta}\cos\theta = -\sin\theta \le 0$ on $[0,\pi]$, so $\cos\theta$ is non-increasing there and is largest at the left endpoint $\theta = 0$.
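A rough numerical illustration of this, as a Python sketch: the names `dot_from_angle` and `best_theta`, the grid size, and the lengths 2 and 3 are assumptions made for the example. It samples $\theta$ on $[0,\pi]$ and picks the angle that gives the largest value of $||\textbf a||\,||\textbf b||\cos\theta$.

```python
import math


def dot_from_angle(norm_a, norm_b, theta):
    """Dot product written as ||a|| ||b|| cos(theta)."""
    return norm_a * norm_b * math.cos(theta)


def best_theta(norm_a, norm_b, samples=1001):
    """Grid search over [0, pi] for the angle with the largest dot product."""
    thetas = [math.pi * k / (samples - 1) for k in range(samples)]
    return max(thetas, key=lambda t: dot_from_angle(norm_a, norm_b, t))


# Illustrative lengths (assumptions): ||a|| = 2, ||b|| = 3.
theta_star = best_theta(2.0, 3.0)
print("best theta:", theta_star)
print("dot product there:", dot_from_angle(2.0, 3.0, theta_star))
```

A grid search is only a sanity check, not a proof; the monotonicity of $\cos\theta$ on $[0,\pi]$ is what actually settles the question.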

taninamdar
  • 2,618
  • 14
  • 24
1

Given: Constant vector $\textbf b$, and a vector $\textbf a$ whose length $||\textbf a||$ is constant but whose direction $\hat{\textbf a}$ (a unit vector) is variable.

Goal: Find vector $\textbf a$ which maximizes the dot product ($\textbf{a}\cdot\textbf{b}$).

Consider the following, involving the unit vectors $\hat{\textbf a} = \textbf a / ||\textbf a||$ and $\hat{\textbf b} = \textbf b / ||\textbf b||$:

$|| \hat{\textbf a} - \hat{\textbf b}||^2$ = $(\hat{\textbf a} - \hat{\textbf b}) \cdot (\hat{\textbf a} - \hat{\textbf b})$

= $\hat{\textbf a}\cdot\hat{\textbf a} - \hat{\textbf a}\cdot\hat{\textbf b} - \hat{\textbf b}\cdot\hat{\textbf a} + \hat{\textbf b}\cdot\hat{\textbf b}$

= $\hat{\textbf a}\cdot\hat{\textbf a} - 2 \hat{\textbf a}\cdot\hat{\textbf b} + \hat{\textbf b}\cdot\hat{\textbf b}$

= $\hat{\textbf a}\cdot\hat{\textbf a} + \hat{\textbf b}\cdot\hat{\textbf b} - 2 \hat{\textbf a}\cdot\hat{\textbf b}$

= $1 + 1 - 2 \hat{\textbf a}\cdot\hat{\textbf b}$

= $2 - 2 \hat{\textbf a}\cdot\hat{\textbf b}$

So:

$\hat{\textbf a}\cdot\hat{\textbf b}$ = $1 - \frac{1}{2} || \hat{\textbf a} - \hat{\textbf b}||^2$

Multiplying both sides by $(|| \textbf a || ||\textbf b ||)$:

$(|| \textbf a || ||\textbf b ||) \hat{\textbf a}\cdot\hat{\textbf b}$ = $(|| \textbf a || ||\textbf b ||)(1 - \frac{1}{2} || \hat{\textbf a} - \hat{\textbf b}||^2)$

$\textbf a\cdot\textbf b$ = $(|| \textbf a || ||\textbf b ||)(1 - \frac{1}{2} || \hat{\textbf a} - \hat{\textbf b}||^2)$

Note that $\textbf b$ (and hence $||\textbf b||$ and $\hat{\textbf b}$) and $||\textbf a||$ are constant, and that $||\textbf a||\,||\textbf b|| > 0$ for non-zero vectors. Thus, $\textbf a\cdot\textbf b$ is maximized when $|| \hat{\textbf a} - \hat{\textbf b}||^2$ is minimized. The value $\hat{\textbf a} = \hat{\textbf b}$ is allowed, and $|| \textbf 0 ||^2 = 0$ is the smallest value that $|| \textbf v ||^2$ can attain for any vector $\textbf v$. Therefore, $\textbf a\cdot\textbf b$ is maximized when $\textbf{a} = ||\textbf{a}|| \hat{\textbf{b}}$ (i.e., when the direction of $\textbf{a}$ is the same as the direction of $\textbf{b}$).
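The identity $\textbf a\cdot\textbf b = ||\textbf a||\,||\textbf b||\,(1 - \frac{1}{2} || \hat{\textbf a} - \hat{\textbf b}||^2)$ can be spot-checked on random vectors. The following Python sketch does so; the helper names, the random seed, and the choice $||\textbf a|| = 2$ are assumptions for illustration only.

```python
import math
import random


def dot(u, v):
    """Euclidean dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def norm(v):
    """Euclidean length of v."""
    return math.sqrt(dot(v, v))


def unit(v):
    """Unit vector in the direction of a non-zero v."""
    n = norm(v)
    return [x / n for x in v]


def rhs(a, b):
    """Right-hand side: ||a|| ||b|| (1 - 0.5 ||a_hat - b_hat||^2)."""
    diff = [x - y for x, y in zip(unit(a), unit(b))]
    return norm(a) * norm(b) * (1.0 - 0.5 * dot(diff, diff))


random.seed(0)  # fixed seed so the sample vectors are reproducible
b = [random.uniform(-1, 1) for _ in range(3)]
samples = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]

# Largest discrepancy between a.b and the right-hand side over the samples.
max_gap = max(abs(dot(a, b) - rhs(a, b)) for a in samples)
print("largest |a.b - rhs|:", max_gap)

# Candidate maximizer with ||a|| = 2: a = ||a|| * b_hat.
a_best = [2.0 * x for x in unit(b)]
print("a_best . b  =", dot(a_best, b))
print("||a|| ||b|| =", 2.0 * norm(b))
```

The discrepancy should be at the level of floating-point rounding, and the candidate $\textbf a = ||\textbf a||\hat{\textbf b}$ should attain the value $||\textbf a||\,||\textbf b||$.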

Colin
  • 11