Why does changing variables work?

Question

I am slightly ashamed to be asking this, but I have been recently reflecting on changing variables in very simple problems. If I missed a question that already discusses this please point it out to me and I will delete this one. Anyhow writing this will probably be a learning experience.

As an example, I take the following equation directly from the Wikipedia page on the topic:

$$x^6 - 9 x^3 + 8 = 0. \, $$

I quickly recognize this as a high school problem and use the methods that were taught to me, namely I set $x^3 = u$ so $x = u^{1/3}$.

Then I proceed to solve the quadratic equation that results from this substitution, and only at the end do I apply the inverse transformation $x = u^{1/3}$ to get an answer in my starting variable. Without much imagination, I always thought that the function used when changing variables (in the above case $f(x) = x^3$) should be bijective on the domain of interest of the starting equation. This is because I need the inverse to return to my "starting variable".

But I notice on Wikipedia that a bit more is required: the change-of-variables function should be a diffeomorphism, so we need differentiability (and even smooth manifolds for the domain and the image).

This is where I realized that I was never taught a proof of why the change of variables method works, or how it works; I was just applying these substitutions blindly.

So could someone kindly point me to a source where I can improve my understanding of this very powerful method, by adding rigour to what I am doing and possibly even a geometric interpretation?

The only reason you need differentiability is so that you can use change-of-variables (i.e. $u$-substitution) when integrating. If $u=f(x)$ is not differentiable, you can't express $du$ in terms of $dx$, so integration becomes a problem. — mweiss, Mar 25 '16 at 13:22
The Wikipedia page is bad. It doesn't make clear in the "formal introduction" section that "change of variables" means different things in different settings. In the setting of multivariable calculus and smooth manifolds, it always means a diffeomorphism. But in algebra, of course your change of variables doesn't need regularity conditions like continuity or differentiability. Changing variables is a general problem-solving strategy, not a precise method, but it can be made precise in particular situations. — symplectomorphic, Mar 25 '16 at 14:35
For example, from one point of view, linear algebra is simply the study of linear changes of variables. More generally, a useful problem-solving heuristic is: if you only pick the right variables, the problem will become trivial. (That is the big idea of linear differential equations, Fourier analysis, etc... You just have to interpret "changing variables" in different ways.) — symplectomorphic, Mar 25 '16 at 14:38

Ethan Bolker · Answer 1 · 2016-03-25T19:30:51.910

Good question, good answers.

As @EricS points out, you don't have to substitute in this particular case - you can do all the work with the original variable. But you can substitute. The advantage is that changing the name of the variable in a systematic way makes the shape of the problem and solution a little clearer.

Since all you're doing here is finding isolated solutions, all you need is bijectivity to make sure your translation from one name space to another and back is faithful.

If you want to do more than algebra you may need a better dictionary - that is, a substitution with better properties. If as @mweiss comments you want to do calculus on the transformed equation then the substitution and its inverse must be differentiable. That's essentially what the wikipedia page is saying, in a more abstract context.

When you study abstract algebra you'll want your "substitutions" to respect the algebraic properties of the domain and range. That's the essence of @symplectomorphic's comment about linear algebra.

With sufficient care in the discussion, bijectivity isn't required. — , Mar 25 '16 at 14:17
@YvesDaoust True, as you note in your answer, but taking care here might obscure the pedagogical/philosophical point I wanted to make. — Ethan Bolker, Mar 25 '16 at 14:20
Demanding bijectivity would be problematic for the very similar case $x^4-9x^2+8=0$. — , Mar 25 '16 at 14:26
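
To make the last comment concrete, here is a minimal Python sketch of the $x^4-9x^2+8=0$ case. It assumes we only want real solutions; the names `u_roots`, `xs` and `residuals` are just illustrative. The map $x\mapsto x^2$ is not bijective on the reals, yet the substitution still works as long as each root $u$ is pulled back to its whole preimage.

```python
import math

# u = x^2 turns x^4 - 9x^2 + 8 = 0 into u^2 - 9u + 8 = (u - 1)(u - 8) = 0.
u_roots = [1.0, 8.0]

# x -> x^2 is two-to-one on the nonzero reals, so every admissible root
# u >= 0 pulls back to the full preimage {+sqrt(u), -sqrt(u)}.
xs = sorted(s * math.sqrt(u) for u in u_roots if u >= 0 for s in (1, -1))

# Check every candidate against the original equation.
residuals = [x**4 - 9 * x**2 + 8 for x in xs]
print(xs)
print(residuals)
```

Taking only the positive square root would silently drop two of the four real solutions, which is the kind of care the comments above allude to.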

score 10 · Answer 2 · answered Mar 25 '16 at 13:17

I must admit I never gave as much thought to this method as you did, and that I do not understand the formal introduction on the Wikipedia page (I assume you referred to this one). But perhaps this may provide you with some more insight into why the method works.

Again consider the equation $x^6-9x^3+8=0$.

$$ x^6-9x^3+8=\left(x^3\right)^2-9\left(x^3\right)+8=0 $$
$$\iff$$
$$ \left(\left(x^3\right)-8\right)\left(\left(x^3\right)-1\right)=0 $$
$$\iff$$
$$ x^3=8\ \lor\ x^3=1 $$
$$\iff$$
$$ x=\sqrt[3]{8}=2\ \lor\ x=1 $$

(The last step takes real cube roots.) Now, we both know this is just substitution without writing it explicitly, but the reason it works is that we simply rewrote the equation in an attractive form and solved a quadratic equation. So I'd say the reason substitution works depends on what kind of equation you're solving and the technique you use for solving such equations. In the above case, substitution worked because you aren't changing the original equation at all, merely rewriting it, and because the quadratic formula works for solving quadratic equations.

Kind of a non-mathematical answer. Again, I hope it helps. If not, please forgive me :)

score 10 · Accepted Answer · 2016-03-25T13:51:55.037

By the rules of algebra,

$$x^6 - 9 x^3 + 8 = 0$$

is strictly equivalent to

$$(x^3)^2 - 9 x^3 + 8 = 0.$$

Then it does no harm to substitute $u=x^3$ and solve

$$u^2-9u+8=0,$$

leading to a set of solutions $u\in S=\{s_k\}$. And this is equivalent to $x^3\in S$, or $x\in\{\sqrt[3]{s_k}\}$.

What matters for the substitution to be valid is that the domain of $u$ includes the range of $x^3$, so that no solution is lost (that is, no value of $x^3$ satisfying the equation falls outside the values $u$ may take); on the other hand, no extraneous solution is introduced when inverting $u=x^3$, as the domain of $x$ takes precedence.
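
As a rough check of the three steps above (solve in $u$, undo the substitution, test in the original equation), here is a small Python sketch. It works over the reals only, and the helpers `real_cbrt` and `solve_quadratic` are hypothetical names introduced just for this illustration.

```python
import math

def real_cbrt(s):
    """Real cube root, valid for negative inputs as well."""
    return math.copysign(abs(s) ** (1.0 / 3.0), s)

def solve_quadratic(a, b, c):
    """Real roots of a*u^2 + b*u + c = 0, in increasing order (assumes a != 0)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

# Step 1: solve u^2 - 9u + 8 = 0, where u stands for x^3.
S = solve_quadratic(1, -9, 8)

# Step 2: undo the substitution; every real u has exactly one real cube root.
xs = [real_cbrt(s) for s in S]

# Step 3: test each candidate in the original equation.
residuals = [x**6 - 9 * x**3 + 8 for x in xs]
print(S, xs, residuals)
```

Because $u$ is allowed to range over all reals here, it covers every value $x^3$ can take, so no real solution can be lost in step 1.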

I don't think that any other condition, such as continuity or differentiability, needs to be imposed on the substitution.

For the sake of illustration, let us consider the substitution $u=x^3-\text{sign}(x)$, which is neither continuous nor invertible. We have a branch with $x<0$, where $u=x^3+1<1$, and another with $x>0$, where $u=x^3-1>-1$.

The equation is split for the two branches

$$\begin{aligned}&u<1\land(u-1)^2-9(u-1)+8=u^2-11u+18=0,\\ &u>-1\land(u+1)^2-9(u+1)+8=u^2-7u=0.\end{aligned}$$

These give the solution sets $u\in\{\}$ (the roots $2$ and $9$ violate $u<1$, so no $u$ is admissible) and $u\in\{0,7\}$. Then for the second branch, $x^3=u+1\in\{1,8\}$, which is correct.
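
The branch bookkeeping can be replayed in a few lines of Python. This is only a sketch under the conventions used above ($x^3=u-1$ on the branch $x<0$, $x^3=u+1$ on the branch $x>0$, with $x=0$ left aside); `real_roots` and `u_of` are hypothetical helper names.

```python
import math

def real_roots(a, b, c):
    """Real roots of a*u^2 + b*u + c = 0, in increasing order (assumes a != 0)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

def u_of(x):
    """The discontinuous substitution u = x^3 - sign(x)."""
    return x**3 - ((x > 0) - (x < 0))

# Branch x < 0: x^3 = u - 1, equation u^2 - 11u + 18 = 0, admissible only if u < 1.
neg = [u for u in real_roots(1, -11, 18) if u < 1]

# Branch x > 0: x^3 = u + 1, equation u^2 - 7u = 0, admissible only if u > -1.
pos = [u for u in real_roots(1, -7, 0) if u > -1]

# Map the admissible u values back to x^3 on their own branch.
cubes = sorted([u - 1 for u in neg] + [u + 1 for u in pos])
print(neg, pos, cubes)
```

The admissibility filters are what keep the non-invertible substitution honest: a root of the transformed equation only counts if it lies in the range of $u$ on the branch that produced it.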

edited Mar 25 '16 at 13:51

answered Mar 25 '16 at 13:36

Could you elaborate a bit on the "branches" you talk about? I think that if I understand your last example I will be set. – Monolite Mar 25 '16 at 17:19
@Monolite: because of the $\text{sign}$ function, you cannot treat $u(x)$ as a whole. – Mar 25 '16 at 17:29
Don't you also need a branch $x=0,u=0$ to verify that that doesn't give a solution? – hvd Mar 26 '16 at 09:36
@hvd: I remained vague about the case of $x=0$, where the sign function can be defined as $0$ or $1$ depending on the mood, considering that the branches $u\in(-\infty,1)$ and $u\in(-1,\infty)$ have a sufficient overlap, and obviously $x=0$ isn't a root. But you are right, rigor should be improved. – Mar 26 '16 at 16:50

Jasper · Answer 4 · 2016-03-26T15:20:37.680

This answer will give an idea on why change of variables works/is allowed when integrating.

I will give you an idea (heuristic) on why these requirements for $u$ are needed. The basic idea behind substitution of variables is that you choose a different basis over which you know the solution.

Basically you want to solve $\int_a^b f(x) dx$.

If you think about a two-dimensional Euclidean $(x,y)$-grid (actually: manifold), then you can think of $dx$ as a vector (actually: covector) that defines the direction-step in the $x$-direction.

In the integral expression, $x$ is just a dummy, so you can choose it as anything you would like it to be, but then you need to change it at any place.

You've chosen $u = x^{\frac{1}{3}}$. This function is "sufficiently nice" in the sense that it is continuous and invertible on $\mathbb{R}$, and its inverse $x = u^3$ can be differentiated infinitely many times.

You can replace $dx$ now by $[\text{Something}]du$. You know $x = u^3$, so $x'(u) = \frac{dx}{du} = 3 u^2$. Multiply both sides by the differential of $u$ to find $dx = 3 u^2 du$.

Now you can transform everything from the basis in $x$ to a basis in $u$, so $$\int_a^b f(x) dx = \int_{a^{1/3}}^{b^{1/3}} f(u^3) (3 u^2 du)$$
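
A quick numerical sanity check of this change of variables, with the integrand evaluated at $x = u^3$: the Python sketch below assumes the sample choices $f=\cos$, $a=1$, $b=8$ (none of which come from the question) and uses a plain midpoint rule, with `midpoint` as a hypothetical helper.

```python
import math

def midpoint(g, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

f = math.cos          # sample integrand, chosen only for illustration
a, b = 1.0, 8.0       # sample limits in x

# Left-hand side: integrate f(x) dx over [a, b].
lhs = midpoint(f, a, b)

# Right-hand side: x = u^3, dx = 3u^2 du, limits a^(1/3) and b^(1/3).
rhs = midpoint(lambda u: f(u**3) * 3 * u**2, a ** (1 / 3), b ** (1 / 3))

# Closed form for this particular f, to compare against.
exact = math.sin(b) - math.sin(a)
print(lhs, rhs, exact)
```

The three numbers should agree up to the discretisation error of the midpoint rule; dropping the factor $3u^2$ or evaluating $f$ at $u$ instead of $u^3$ makes the right-hand side drift away from the other two.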

Why do you need differentiability? For instance, consider $u = \frac{1}{x}$ and suppose that the point $x=0$ is within the interval $\langle a,b\rangle$. What is the value of $u$ when we consider $x = 0$?

The same goes for requiring a diffeomorphism. Suppose that the transformation has a jump, so that it is not surjective onto the new interval of integration. For instance, consider $u = x$ if $x<0$ and $u = x+2$ if $x\geq 0$. There is no value of $x$ that maps to $u=1$. Now try to integrate over $x$ from $-1$ to $1$. Then $\int_{-1}^1 f(x) dx = \int_{-1}^{3} f(?) du$. What would be the value of the latter integral?

Please add a note at the start of your answer stating that it is about change of variables for integration, because that is not what the question was about! — user21820, Mar 26 '16 at 15:18
