5

I am currently studying the Skiena `Algorithm Design Manual' and need a little help with a proof of correctness. The problem goes as follows:

Prove the correctness of the following algorithm for evaluating a polynomial. $P(x)=a_nx^n+a_{n-1}x^{n-1}+\ldots+a_1x+a_0$

function horner($A,x$)

  1. $p=A_n$
    • for $i$ from $n-1$ to $0$
      • $p=p*x+A_i$
    • return $p$

It is intuitively obvious, that this algorithm gives the right result. But as I want a proof of correctness, I have to make sure this becomes obvious. My idea is proof by induction. We want to use, rigorously, that the inductive hypotheses is true up to and including step $n-1$.

To consider the algorithm, let $(a_i)_{i\in\mathbf{N}_0}$ be some sequence of real numbers and let $P$ be a function on $\mathbf{N}_0\times\mathbf{R}$ into $\mathbf{R}$ given by $$\forall (n,x)\in \mathbf{N}_0\times\mathbf{R} , \quad P(n,x)=\sum_{j=0}^n a_jx^j.$$

So let us consider the base case of $n=0$. This case is no problem, and the algorithm exites with $P(0,x)=a_0$ for all $x\in\mathbf{R}$.

Now, for the inductive step, assume the algorithm holds for all steps up to and including step $n-1$ for some arbitrary $n\in\mathbf{N}$. I want to express the iterative scheme symbolically, but of course, it is a nonsense to write $p=p*x+A_i$, since this just yields a function given by $p=A_i/(1-x)$ (when the division is well-defined). So I will give up the notation of the algorithm.

For this arbitrary $n$, what we want to show is that $$\{[(a_nx+a_{n-1})x+a_{n-2}]x+\ldots\}x+a_0=\sum_{j=0}^n a_jx^j.$$

It is, as I said above, intuitively clear that this equation holds true, but I feel the left hand side is a bit too ambiguous. My problem is that I do not know how to express the left hand side of nested multiplication and summation in unambiguous notation (a product-operator and summation-operator). This is where I would like some help,

Thanks in advance

  • 1
    Write $P(n,x) = \tilde{P}(n-1,x) \cdot x + a_0$. – Daniel Fischer Jul 05 '13 at 12:34
  • This would entail that $\tilde{P}(n-1,x)=\sum_{j=1}^{n}a_{j}x^{j-1}$. I am not sure I understand why this is the inductive hypothesis. I think I am being confused by the `reversed time' induction from $n-1$ to $0$. Could you perhaps elaborate a bit. Thanks again. – semicolon Jul 05 '13 at 12:45
  • 1
    It's a polynomial of degree $n-1$. By the induction hypothesis, Horner evaluates that correctly. – Daniel Fischer Jul 05 '13 at 12:48
  • Yes, I see that it is a polynomial of degree $n-1$. I was wondering why it has no degree zero term. I guess it is not supposed to have one, but I do not quite yet understand why. Thanks for your help. – semicolon Jul 05 '13 at 12:55
  • The constant (degree zero) term of $\tilde{P}$ is $a_1$. Maybe it'll become clearer if you rename the coefficients, $\tilde{P}(n-1,x) = \sum_{k=0}^{n-1} b_k x^k$, with $b_k = a_{k+1}$. – Daniel Fischer Jul 05 '13 at 13:00

2 Answers2

2


Induction Hypothesis:


  • $\mathbf{Consider}$ the polynomial of $\mathbf{n^{th}}$ degree $\mathbf{P_n}$, such that $$\mathbf{P_n} = a_nx^n + a_{n-1}x^{n-1} + ... + a_1x + a_0$$

    Such that $\mathbf{A_n} = {a_n, a_{n-1}, a_{n-2},..., a_1, a_0}$ is the set of all coefficients of $\mathbf{P_n}$


  • $\mathbf{Then}$ we can define the $\mathbf{Honer}$ function of $\mathbf{P_n}$ as follows:

    $$ \mathbf{horner(A_{n-i}, x)} = \begin{cases} a_n, & \text{; $i = n$} \ \mathbf{horner(A_{n-i-1}, x)} \cdot x + a_i, & \text{; $\forall$ i such that $0 \leq i \lt n $} \end{cases} $$


Base case:


  • $ \mathbf{horner(A_1, x)} = \mathbf{P_1} = a_nx + a_0 $
  • Assume this is true up till $\mathbf{A_n}$, then $$ \mathbf{horner(A_n, x)} = (... (a_nx + a_{n-1} )x + a_{n-2})x + ... + a_1)x + a_0 ) $$

Induction step:


  • You can show that $\mathbf{horner(A_{n}, x)} \implies \mathbf{horner(A_{n+1}, x)}$ in two ways.
  • The first one, is by using the induction hypothesis definition. So, starting with $i = 0$

    \begin{align} \mathbf{horner(A_{n+1}, x)} & = {\color{red}{\mathbf{horner(A_{(n+1)-1}, x)}}} \cdot x + a_0 \ & = {\color{red}{(\mathbf{horner(A_{(n+1)-2}, x) \cdot x + a_1)}}} \cdot x + a_0 \ & = \vdots \text{ for k-times} \ & = (\mathbf{horner(A_{(n+1)-{\color{red}{k}}}, x) \cdot x + a_{{\color{red}{k}}-1})} \cdot x + ... + a_1)x + a_0 \ \end{align}

    Assume $ \mathbf{horner(A_{(n+1)-k}, x)} = \mathbf{horner(A_0, x)} $
    Then $ (n+1)-k = 0 \implies (n+1) = k $
    Substitute in back in the equation gives,

    \begin{align}

    \mathbf{horner(A_{n+1}, x)} & = (\mathbf{horner(A_{{\color{red}{k}}-k}, x) \cdot x + a_{{\color{red}{n+1}}-1})} \cdot x + ... + a_1)x + a_0 \ & = ( {\color{red}{\mathbf{horner(A_{0}, x)}}} \cdot x + a_{n}) \cdot x + ... + a_1)x + a_0 \ & = ({\color{red}{a_{n+1}}} \cdot x + a_{n}) \cdot x + ... + a_1)x + a_0 \

    \end{align}


  • The other way is using the fact that $ \mathbf{horner(A_{n+1}, x)} = \mathbf{P_{n+1}} $

    Consider $\mathbf{P_{n+1}}$, we can define it as follows:

    \begin{align}

    \mathbf{P_{n+1}} & = {\color{red}{a_{(n+1)}x^{(n+1)} + a_{n}x^{n}}} + a_{(n-1)}x^{(n-1)} + ... + a_1x + a_0 \ & = {\color{red}{(a_{(n+1)}x + a_n)x^n}} + a_{(n-1)}x^{(n-1)} + ... + a_1x + a_0 \ & = {\color{red}{((a_{(n+1)}x + a_n)x + a_{(n-1)})x^{(n-1)}}} + a_{(n-2)}x^{(n-2)} + ... + a_1x + a_0 \ & = \vdots \text{ keep factoring out k-times} \ & = {\color{red}{(...(a_{(n+1)}x + a_n)x + a_{(n-1)})x}} + ... + a_{(n-k)})x^{(n-k)} + ... + a_1x + a_0 \

    \end{align}

    It turns out that the non-red highlighted terms are nothing but $\mathbf{horner(A_{n-k}, x)}$

    \begin{align}

    \mathbf{P_{n+1}} & = {\color{red}{(...(a_{(n+1)}x + a_n)x + a_{(n-1)})x}} + ... + \mathbf{horner(A_{n-k}, x)} \

    \end{align}

    Assume $ \mathbf{horner(A_{n-k}, x)} = \mathbf{horner(A_1, x)} $
    Then $ n-k = 1 $
    Substitute in back in the equation gives,

    \begin{align}

    \mathbf{P_{n+1}} & = {\color{red}{(...(a_{(n+1)}x + a_n)x + a_{(n-1)})x}} + ... + \mathbf{horner(A_{1}, x)} \ & = {\color{red}{(...(a_{(n+1)}x + a_n)x + a_{(n-1)})x}} + ... + a_1x + a_0 \

    \end{align}

    Since $\mathbf{horner(A_{n+1}, x)} = \mathbf{P_{n+1}}$
    Then $\mathbf{horner(A_{n+1}, x)} = (...(a_{(n+1)}x + a_n)x + a_{(n-1)})x + ... + a_1x + a_0$

Conclusion

$$\mathbf{horner(A_n, x)} \implies \mathbf{horner(A_{n+1}, x)}$$

Shaaer
  • 121
  • 5
  • How would you go about proving that the algorithm as written in the question returns the same value as the recursive function that you defined? – Daniel Schepler Jan 23 '20 at 22:19
  • Actually they are both the same, the induction hypothesis is nothing but the formal definition of the algorithm specifically, the line p = p*x + Ai. The piecewise definition is just applying the loop boundaries. – Shaaer Jan 23 '20 at 23:59
2

In arguing about the properties of algorithms involving a loop, it is often useful to think in terms of a loop invariant: something that you expect to be true after each iteration of the loop. Then, the principle is: if you have a loop where some proposition $P$ related to the current state is true before starting execution of the loop; and for each iteration of the loop, if before the loop body $P$ is true, then after the loop body $P$ is true; then, after the loop terminates, $P$ is still true.

In this algorithm, let us first expand the "for" statement a bit in order to simplify the analysis:

  • function horner(A, x):
    • $p := A_n$
    • $i := n$
    • while $i > 0$:
      • $i := i - 1$
      • $p := p \cdot x + A_i$
    • return $p$

Then the loop invariant we will use is: $$p = \sum_{j=0}^{n-i} A_{n-j} x^{n-i-j} = A_n x^{n-i} + A_{n-1} x^{n-i-1} + \cdots + A_{i+1} x + A_i.$$ Just before the "while" statement starts executing, we indeed have $i = n$ and $p = \sum_{j=0}^0 A_{n-j} x^{n-i-j} = A_n$. Now, when executing the body, let us suppose at the start that $i = i_s$ and $p = A_n x^{n-i_s} + A_{n-1} x^{n-i_s-1} + \cdots + A_{i_s}$. (Here, $i_s$ will be used for the "start" value of $i$ in order to avoid confusion due to the changing value of $i$.) Then after the loop body executes, we will have $i = i_s - 1$ and $$p = (A_n x^{n-i_s} + A_{n-1} x^{n-i_s-1} + \cdots + A_{i_s}) x + A_{i_s-1} = \\ A_n x^{n-i_s+1} + A_{n-1} x^{n-i_s} + \cdots + A_{i_s} x + A_{i_s-1} = \\ A_n x^{n-i} + A_{n-1} x^{n-i-1} + \cdots + A_{i+1} x + A_i.$$ Therefore, the condition will indeed be preserved by each iteration of the loop. But then, after the loop exits, we will have $i = 0$, so $p = A_n x^n + A_{n-1} x^{n-1} + \cdots + A_1 x + A_0$ and the next "return" statement will return the correct value.

(Incidentally, if you're designing an algorithm, then the concept of loop invariant can also be useful in reverse: to come up with a description of exactly what state you want the variables to be in at the end of each execution of the loop body. Then keeping that in mind will aid in avoiding bugs that are due to "jumping the gun", so to speak, and prematurely performing calculations that really belong to the next iteration.)