So I am studying SICP (Structure and Interpretation of Computer Programs) and doing one of the excercises which is based on the fixed-point method for finding the fixed-point of $f(x)$. In a particular excercise question it asks to find the root of $x^x = 1000$ by the fixed point method. So I transformed it into something like $ x = \frac{\log 1000}{\log x} $, for using the right-hand side function as an input to method. So it discusses two ways to estimate the next guess one is $x_{n+1} = f(x_n)$ and the other is $x_{n+1} = \frac{1}{2}(x_n + f(x_n))$. I tried to find the fixed points of other functions also , it seems likes always the second method converges quickly. Is this always the case or just a coincidence which might fail for some other functions or is there a rigorous proof as to why this happens ?
-
1Trivial counterexample: Suppose $f$ is constant. – Jan 05 '15 at 13:34
2 Answers
Averaging in this fashion does not always accelerate the process, but I wouldn't say that it's just a coincidence either. In fact, we can determine exactly when this process will help by examining the derivative of $f$ at the fixed point.
Suppose that $x_0$ is a fixed point of $f$. Thus, $f(x_0)=x_0$. Let $\lambda=f'(x_0)$. We say that $x_0$ is
- attractive if $|\lambda|<1$,
- repulsive if $|\lambda|>1$, or
- neutral if $|\lambda|=1$.
In the attractive case, the value of $\lambda$ indicates the rate of attraction. Specifically, $$\left|f^n(x)-x_0\right| \sim |\lambda|^n\left|x-x_0\right|.$$ When $\lambda=0$, the rate of attraction is better than exponential for any base. This case is sometimes called super-attractive.
Now let $F(x)=(x+f(x))/2$. Then, $$F'(x_0) = \frac{1+\lambda}{2}.$$
Now, in your case $$f(x)=\log(1000)/\log(x), \: x_0 \approx 4.555, \text{ and } \: f'(x_0) \approx -0.659.$$ Thus, $F'(x_0) \approx 0.170259$ and we see the acceleration.
To produce a counter-example to your conjecture, we simply need a function $f$ with a fixed point $x_0$ where $f'(x_0) = 0$. To be concrete, let $f(x)=2x(1-x)$. Then $f(1/2)=1/2$, $f'(1/2) = 0$, and $F'(1/2)=1/2$. Thus, the convergence of the iteration of $f$ should be must faster than that for $F$. Lets demonstrate by iterating both from $x_1=0.4$: $$ \begin{array}{l|l|l} \text{} & f^n\text{(0.4)} & F^n\text{(0.4)} \\ \hline 0 & 0.4 & 0.4 \\ 1 & 0.48 & 0.44 \\ 2 & 0.4992 & 0.4664 \\ 3 & 0.499999 & 0.482071 \\ 4 & 0.5 & 0.490714 \\ 5 & 0.5 & 0.495271 \\ 6 & 0.5 & 0.497613 \\ \end{array} $$
In general, we can expect this averaging technique to improve the convergence exactly when $$\left|\frac{1+\lambda}{2}\right| < \min(1,\left|\lambda\right|).$$ When $\lambda$ is real this means that $-3<\lambda<-1/3$. In case that $-3<\lambda<-1$, iteration of $f$ can't be expected to converge to $x_0$ yet iteration of $F$ can.
- 30,510
-
I have basically written down the same proof with the same result. However $(1+\lambda)/2$ isn't smaller $\lambda$ if $\lambda>1/2$. For example, if $\lambda=0.9$, then $(1+0.9)/2=0.95>\lambda$. $(1+\lambda)/2$ is only greater $\lambda$ for $\lambda>1$ which is not relevant for us. I just feel like I'm missing something. – Thomas Jan 05 '15 at 13:44
-
I think you have some errors there: $F'(x_0)=(1+f'(x))/2=(1\pm\lambda)/2$, and the subsequent inequalities are incorrect in any case. It should be $|F'(x_0)|\le|f'(x_0)|$ if $f'(x_0)\le-1/3$. – Jan 05 '15 at 13:49
-
1@sonystarmap Yes you're right - the critical issue is the sign of $\lambda$. – Mark McClure Jan 05 '15 at 13:49
-
I also think, that we have to consider the sign. It is relevant, if we oscillate around the fixed point or not. – Thomas Jan 05 '15 at 13:50
-
Informally, when $1/2 < \lambda<1$ the plain iterative method is convergent but oscillates. In this case (only) the "damping" introduced by averaging the tentative new value with the old value accelerates the convergence. – leonbloy Jan 05 '15 at 13:59
-
@sonystarmap I'm not convinced that the case $\lambda>1$ is irrelevant as it may still be the case that $|(1+\lambda)/2|<1$. I'll update my answer to discuss this possibility shortly. – Mark McClure Jan 05 '15 at 14:42
-
@Rahul Thank you for pointing that out, though $f'(x_0)\leq-1/3$ isn't quite the end of the story - we'd also like $f'(x_0)>-3$. Also, your trivial counter-example of a constant function is included in the analysis here, as such point is a super-attractive fixed point. – Mark McClure Jan 05 '15 at 15:11
-
That's a nice nontrivial example. You're still defining $\lambda=|f'(x_0)|$ and using it as if $\lambda=f'(x_0)$ later though; I don't think you want to do that. – Jan 05 '15 at 15:31
-
I'd like to add another answer, since there was some confusion with the sign of $\lambda$ in the answer of Mark McClure. However, in general I agree with his answer.
Assume, that the fixed-point iteration given by $x^{k+1} =f(x^k)$ oscillates around the fixed-point $x^*$ that satisfies $x^*=f(x^*)$. This means, that if $x^*-x^k<0$ then $x^*-x^{k+1}=x^*-f(x^k)>0$ and vice versa. However, we still want $f$ to be contractive, i.e. \begin{align} |f(x)-f(y)| \leq \lambda |x-y| \end{align} with $\lambda \in [0,1)$.
Now let us consider your averaged iteration $x^{k+1} = \frac12(x^k+f(x^k))$ where the non averaged iteration oscillates. Then the error is given by \begin{align} |x^*-x^{k+1}|&=|x^*- \frac12(x^k+f(x^k))| \\ &=\frac12|x^*-x^k+x^*-f(x^k)| \\ &=\frac12|x^*-x^k+f(x^*)-f(x^k)| \end{align} We know, that $x^*-x^k$ and $f(x^*)-f(x^k)$ have different sign. W.l.o.g. assume $x^*-x^k>0$, then $0>f(x^*)-f(x^k) = -\lambda (x^*-x^k)$. Thus \begin{align} \frac12|x^*-x^k+f(x^*)-f(x^k)| &\leq \frac12 |x^*-x^k-\lambda(x^*-x^k)| \\ &=\frac12 |1-\lambda||x^*-x^k| \end{align} So we conclude, that for the original scheme, we have \begin{align} |x^*-x^{k+1}| \leq\lambda |x^*-x^k| \end{align} and for the averaged scheme \begin{align} |x^*-x^{k+1}| \leq\frac12|1-\lambda| |x^*-x^k| \end{align} So for $\lambda<1/3$ the initial scheme converges faster than the averaged, however for $1/3\leq \lambda <1$ we have $\frac12|1-\lambda|<1$. Thus the averaged scheme converges, even if the initial scheme does not if $1<\lambda<3$.
This answer should only be a small extension the answer of Marc McClure for the case where the original iteration oscillates around the fixed-point.
- 4,363