4


I really am having trouble understanding the statement and the proof. Why does the theorem pick $v_1 \neq 0 $? Why not $v_2$? Also in proving (a), why do we consider the largest $j$?

I do not understand the statement "Not all of $a_2, a_3, \dots, a_m$ can be $0$ (because $v_1 = 0$)". Can someone teach me by example?

Lemon

2 Answers

5

First, there's nothing special about $v_1$ beyond its being first in the list. We're putting an arbitrary order on our list of vectors: the theorem would work the same if we switched the places of $v_1$ and $v_2$ (and changed the statement accordingly, so that we require $v_2 \ne 0$ and can potentially express $v_1$ in terms of $v_2$).

We have to require that $v_1 \ne 0$ because otherwise the one-vector list $(v_1)$ is linearly dependent, yet there are no previous vectors in terms of which to express $v_1$ (unless you count $v_1 = 0$ as an empty linear combination).
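To see this on a concrete list (the vectors here are my own illustration, not taken from the book), take $(v_1, v_2)$ in $\mathbb{R}^2$ with $v_1 = (0,0)$ and $v_2 = (1,0)$. The list is linearly dependent, since

$$1 \cdot v_1 + 0 \cdot v_2 = 0$$

is a relation whose coefficients are not all zero. But $v_2$ is not in the span of $v_1$, which is just $\{0\}$, and there is nothing before $v_1$ to express it with. So without the hypothesis $v_1 \ne 0$, no $j \ge 2$ would satisfy the conclusion.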

We consider the largest $j$ with nonzero coefficient $a_j$ because we want to express $v_j$ in terms of the previous vectors only. If the coefficient of some $v_k$ is $0$, then the relation says nothing about $v_k$, so we cannot solve for it. Taking the largest $j$ with $a_j \ne 0$ means every later vector has coefficient $0$ and drops out of the relation, and we can divide by $a_j$ to solve for $v_j$.
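Here is a small worked example, with vectors chosen purely for illustration. In $\mathbb{R}^2$ let $v_1 = (1,0)$, $v_2 = (2,0)$, $v_3 = (0,1)$. One relation among them is

$$2 v_1 - v_2 + 0 \cdot v_3 = 0,$$

so $a_1 = 2$, $a_2 = -1$, $a_3 = 0$. The largest $j$ with $a_j \ne 0$ is $j = 2$, and dividing by $a_2$ gives

$$v_2 = -\frac{a_1}{a_2} v_1 = 2 v_1.$$

We could not have solved for $v_3$, because its coefficient is $0$. And indeed $v_3$ is not in the span of $v_1, v_2$.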

Now to answer your last question, suppose all of $a_2, \dots, a_m$ are $0$. Since the coefficients are not all zero, we must then have $a_1 \ne 0$, and $v_1 \ne 0$ by hypothesis, so $a_1 v_1 \ne 0$. But $0 = a_1 v_1 + \dots + a_m v_m = a_1 v_1$ if the other coefficients are all $0$. This is a contradiction.

Hope that helped.

A.S
  • Great answer, +1. Just to add to this to tie up the last query, we note that since $a_{1}v_{1} = -a_{2}v_{2} - a_{3}v_{3} - \cdots - a_{m}v_{m}$, and $v_{1} \neq 0$, we can't have all $a_{2}, a_{3}, \ldots, a_{m}$ be $0$ unless $a_{1} = 0$; but our coefficients were assumed to not all be $0$, so this can't be. – Alex Wertheim May 27 '13 at 03:52
  • @AWertheim Thanks. I have added that in. – A.S May 27 '13 at 03:57
  • @sizz $\{ 0 \}$ is linearly dependent because $1 \cdot 0 = 0$ with nonzero coefficients. (Thanks, @PeteL.Clark) – A.S May 27 '13 at 04:09
  • @Andrew: that the span of the empty set is $\{0\}$ is a convention. That $\{0\}$ is linearly dependent doesn't seem like a convention to me: there is a nontrivial linear combination which is zero, namely $1 \cdot 0 = 0$. I see no choice in the matter... – Pete L. Clark May 27 '13 at 04:12
  • $v_k$ is just a general vector in the list. It applies to the cases when $k > j$, but the statement that follows also applies to cases when $k < j$ as long as $a_k = 0$. – A.S May 27 '13 at 04:22
  • @AndrewSalmon, can you explain the "ordering" part a bit? – Lemon Jun 28 '13 at 23:50
  • @sidht Sorry this is so late. What we are doing is taking the vectors and giving them an order $v_1, \dots, v_n$. In this situation, it is helpful to think of $v_1$ as coming before $v_2$, which is, in turn, before $v_3$, and so on. When I said that the order was arbitrary, I meant that the result would still hold if you put $v_3$ first, $v_5$ second, etc., as long as all the vectors in the list were used in the order. Does this make the answer a bit clearer? – A.S Jul 07 '13 at 18:43
3

First, this is an important and basic result. Mildly rephrased, it says: if you have a finite linearly dependent list of vectors, then by looking from left to right you can always find one vector which lies in the span of the vectors to its left in your list. Compare this with the statement that a finite list of vectors is linearly dependent iff you can find one vector in the list which is a linear combination of the other vectors. The latter statement is a more immediate consequence of the definition: if you have a nontrivial linear combination $a_1 v_1 + \ldots + a_n v_n = 0$, then by definition we must have $a_i \neq 0$ for at least one $i$, and then

$$v_i = \frac{-a_1}{a_i} v_1 + \ldots + \frac{-a_{i-1}}{a_i}v_{i-1} + \frac{-a_{i+1}}{a_i} v_{i+1} + \ldots + \frac{-a_n}{a_i} v_n.$$

Axler's statement is stronger in that it involves the ordering of the vectors. This is surprisingly useful, and you should look ahead to see applications if you haven't already.
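To make the difference concrete, here is an example of my own choosing (not from Axler). In $\mathbb{R}^2$ take $v_1 = (1,0)$, $v_2 = (0,1)$, $v_3 = (1,1)$, which satisfy

$$v_1 + v_2 - v_3 = 0.$$

Solving for $v_1$, which is allowed since $a_1 = 1 \neq 0$, gives

$$v_1 = -v_2 + v_3,$$

a combination of the *other* vectors, but one that uses vectors appearing later in the list. Solving instead for $v_3$, the last vector with nonzero coefficient, gives

$$v_3 = v_1 + v_2,$$

which uses only vectors to its left. That second form is what the ordered statement guarantees.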

About the $v_1$: it's the first vector on the list, so that's the distinguished role it plays. Note also that I would state the result a bit differently: I would allow $v_1$ to be a linear combination of the previous, empty, list of vectors, which happens iff $v_1 = 0$. But when thinking about linear dependence early on it is probably best to concentrate on the case when none of the vectors are zero, because putting the $0$ vector into the list automatically makes the list linearly dependent and is a rather degenerate case.

For your other questions:

We pick the largest $j$ with $a_j \neq 0$ because this choice enables us to write $v_j$ as a linear combination of previous vectors in the list only, which is what we are trying to do.

Not all of the $a_2,\ldots,a_m$ can be zero: since we've assumed that $a_1,\ldots,a_m$ are not all zero, if all but $a_1$ are zero then we must have $a_1 \neq 0$. But then $$0 = a_1 v_1 + a_2 v_2 + \ldots + a_m v_m = a_1 v_1,$$ and since $a_1 \neq 0$, we get $v_1 = 0$, and we've assumed it isn't. (Note that you write "because $v_1 = 0$" where you should write "because $v_1 \neq 0$".)
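If you would like to experiment with small lists yourself, here is a minimal Python sketch. The helper names `rank` and `first_dependent_index` are hypothetical and my own, and the approach (comparing ranks of prefixes with exact rational arithmetic) is just one convenient way to test the conclusion, not the argument used in the proof. It returns the smallest $j$ for which $v_j$ lies in the span of $v_1, \ldots, v_{j-1}$, using the convention that the span of the empty list is $\{0\}$, so it returns $1$ exactly when $v_1 = 0$.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def first_dependent_index(vectors):
    """Smallest 1-based j with v_j in span(v_1, ..., v_{j-1}), or None if independent."""
    for j in range(1, len(vectors) + 1):
        # v_j adds nothing new exactly when the rank of the prefix does not grow.
        if rank(vectors[:j]) == rank(vectors[:j - 1]):
            return j
    return None


print(first_dependent_index([(1, 0), (0, 1), (1, 1)]))
print(first_dependent_index([(0, 0), (1, 0)]))
```

Note that the proof picks the largest $j$ with $a_j \neq 0$ in one particular relation, while this sketch searches for the smallest $j$ that works at all; both produce a $j$ satisfying the conclusion.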

Pete L. Clark