More as a private exercise than to answer this particular question, I've tried to write a proof of the main facts leading to the definition of dimension, using only the basic definitions of linear combination, span, linear dependence, spanning set, and basis; apart from that I assume only some intuitively (and formally) obvious facts as that a subset of a linear independent set is linearly independent. I work in finitely generated vector spaces, but since I don't know the existence of bases at the outset, I cannot limit myself to spaces $\Bbb R^n$.
I know that the Steinitz exchange lemma achieves this, doing everything in one fell swoop, but I think the resulting inductive proof is somewhat technical, and I would like to avoid vagueness such as "up to reordering" and "without loss of generality". I've tried to achieve this by splitting out different levels of detail, which leads to avoiding equations almost altogether. Nonetheless I'm somewhat dissatisfied with my formulation, especially of the final proof which looks harder on paper than it did in my imagination.
One must of course at some point use invertibility of nonzero scalars; to get that out of the way, I start with a simple lemma about linear dependence. Then comes a proposition that constructs bases, explicitly, but without concern for counting elements. The latter is done in the final theorem, which is really just a commentary on the construction in the proposition.
I talk mostly about families of vectors rather than about sets, because everything depends in an essential way on the ordering implied by the indices (and because multiple occurrences of the same vector are not excluded); however I use set builder notation for convenience, which is not really consistent with that point of view. I think that the proposition below can be generalised to the infinite dimensional situation using families of vectors indexed by a well-ordered set, but for the theorem would doubt that it can be done.
For the record, all this requires no commutativity, and works painlessly for finitely generated (say) right modules over a division ring. I would expect this to be true for any proof of these matters, if carefully written.
Lemma. If some linear relation between members of a set $X$ of vectors involves some vector $v\in X$ with nonzero coefficient$~c_v$, then $v$ is a linear combination of the remaining vectors: $\def\Span{\operatorname{Span}}v\in\Span(X-\{v\})$.
Write the relation $\sum_{x\in X}c_xx=0$ as $c_vv=-\sum_{x\in X-\{v\}}c_xx$; now multiply by $c_v^{-1}$ (using that $c_v\neq0$).
Proposition. If $S=\{f_1,\ldots,f_k\}$ is a linearly independent family of vectors in$~V$, and $T=\{g_1,\ldots,g_l\}$ is a family of vectors spanning$~V$, then $V$ has a basis $S\cup C$, where $$C=\{\,g_i\mid\,g_i\notin\Span(S\cup\{g_1,\ldots,g_{i-1}\})\}\subseteq T.$$
For $0\leq i\leq l$, put $X_i=S\cup\{g_1,\ldots,g_i\}$ and $X_i'=S\cup(\{g_1,\ldots,g_i\}\cap C)$; we show by induction on $i$ that the set $X_i'$ is a linearly independent family spanning$~\Span(X_i)$ (in other words a basis of that subspace); the instance $i=l$ gives the proposition since $\Span(X_l)\supseteq\Span(T)=V$. For $i=0$ this holds by the linear independence of $S$. For $i>0$ one distinguishes cases $g_i\notin C$ and $g_i\in C$, both of which are easy. For $g_i\notin C$ one uses that adding to $X_{i-1}$ the vector $g_i$ already in its span does not change that span. For $g_i\in C$ one uses adding the same vector $g_i$ to two sets $X_{i-1},X_{i-1}'$ with the same span keeps their span the same, while $g_i\notin\Span(X_{i-1})=\Span(X_{i-1}')$ ensures that the independent family $X_{i-1}'$ remains independent after adding$~g_i$ (the contrapositive of the lemma says that any linear relation involves only $X_{i-1}'$, and so is trivial).
Now, for any independent family$~S$ and spanning family$~T$, denote by $C(S,T)$ the set $C$ of the proposition, and by $E(S,T)$ the basis $S\cup C(S,T)=S\cup C$ of the proposition. We get in particular, for any space admitting a finite spanning set$~T$ (finitely generated space), a basis $E(\emptyset,T)$ of$~V$. On the other hand for any basis $B$ of$~V$ one has $E(B,T)=B$, since $C=\emptyset$ in the proposition. We can now show that dimension is well defined in the case of finitely generated spaces.
Theorem. For any $S,T$ as above, the basis $E(S,T)$ has the same number of elements as $E(\emptyset,T)$.
Proof is by induction on the size of$~S$, the case $S=\emptyset$ being trivial. So consider $S=\{f_1,\ldots,f_k\}$ with $k>0$, and assuming the result holds for $S^-=\{f_1,\ldots,f_{k-1}\}$. From the definition it is clear that $C(S,T)\subseteq C(S^-,T)$, and we must show that the complement $C(S^-,T)\setminus C(S,T)$ contains exactly one element. Since $E(S^-,T)$ is a basis of$~V$, there is a unique expression of $f_k$ as linear combination$~L$ of its elements. That combination involves some element of$~C(S^-,T)$ with nonzero coefficient, because the family$~S$ is linearly independent. If $i$ is maximal such that $g_i\in C(S^-,T)$ appears with nonzero coefficient in$~L$, then by the lemma $g_i\in\Span(f_1,\ldots,f_k,g_1,\ldots,g_{i-1})$, and therefore $g_i\in C(S^-,T)\setminus C(S,T)$. Conversely if $g_j\in C(S^-,T)\setminus C(S,T)$ then $g_j\in\Span(\{f_1,\ldots,f_k,g_1,\ldots,g_{j-1}\}\cap E(S,T))$ and the linear combination witnessing this must involve $f_k$ with nonzero coefficient (because of $g_j\in C(S^-,T)$); by the lemma and the uniqueness of the expression$~L$ for $f_k$ this forces $j=i$, completing the proof.