
In "An Introduction to Statistical Learning with Applications in R", page 139, section 4.4.2, we have that

$$ p_k (x) = \frac{\pi_k f_k (x)}{\sum_{l=1}^K \pi_l f_l(x)} $$

If we assume that $f_k$ is the density of a normal distribution $N(\mu_k, \sigma_k^2)$, and further that $\sigma_1 = \dots = \sigma_K = \sigma$, then we have that

$$ p_k (x) = \frac{\pi_k \frac{1}{\sqrt{2 \pi} \sigma} \exp\left(-\frac{1}{2\sigma^2}(x-\mu_k)^2\right)}{\sum_{l=1}^K \pi_l \frac{1}{\sqrt{2 \pi} \sigma} \exp\left(-\frac{1}{2\sigma^2}(x-\mu_l)^2\right)} $$

We want to assign $X=x$ to the class for which the expression above is largest. Taking the log of it and rearranging the terms, we should be able to show that this is equivalent to assigning the observation to the class for which

$$ \delta_k (x) = \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log(\pi_k) $$

is largest. I can't seem to prove this. I tried taking the logarithm of the numerator of $p_k(x)$, and this is what I got:

$$ \log (\pi_k f_k(x)) = \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2 \sigma^2} + \log(\pi_k) - \frac{x^2}{2 \sigma^2} - \log(\sqrt{2\pi} \sigma) $$

which has two terms too many. Each term of the denominator has the same form, but there the logarithm is applied to a sum of $K \geq 2$ such terms. Any help in getting to the expression $\delta_k (x)$ would be greatly appreciated. I have missed something, but I can't quite see what.

armara

1 Answer


When calculating the discriminant function, we only keep the terms that depend on $k$, since the other terms are the same for every class and so do not change which class gives the largest value. This is why, even though we know that

$$ \log (p_k (x)) = \frac{x\mu_k}{\sigma^2} - \frac{\mu_k^2 + x^2}{2\sigma^2} + \log(\pi_k) - \log(\sqrt{2 \pi \sigma^2}) - \log\left(\sum_{l=1}^K \pi_l f_l(x)\right) $$

we can drop the terms $-\frac{x^2}{2\sigma^2} - \log(\sqrt{2 \pi \sigma^2})$, because they are the same for all $k$. The logarithm of the denominator of $p_k(x)$ can be dropped for the same reason, since it is a sum over all classes. However, in the case of QDA, where each class has its own variance $\sigma_k^2$, these two terms depend on $k$ and cannot be removed. So finally, for the shared-variance case, we get

$$ \delta_k (x) = \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log(\pi_k) $$
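If it helps, the argument can be checked numerically. The following is only a rough sketch, not something from the book: the class means, the shared standard deviation, and the priors are made-up values, and the helper names (`posteriors`, `discriminants`, `argmax`) are mine. It checks that $p_k(x)$ and $\delta_k(x)$ pick the same class on a grid of $x$ values, and that $\log(p_k(x)) - \delta_k(x)$ is the same number for every $k$.

```python
import math

def posteriors(x, mus, sigma, priors):
    """p_k(x) from Bayes' theorem, Gaussian class densities with one shared sigma."""
    weighted = [
        p * math.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
        for m, p in zip(mus, priors)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]

def discriminants(x, mus, sigma, priors):
    """delta_k(x) = x*mu_k/sigma^2 - mu_k^2/(2*sigma^2) + log(pi_k)."""
    return [
        x * m / sigma ** 2 - m ** 2 / (2 * sigma ** 2) + math.log(p)
        for m, p in zip(mus, priors)
    ]

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)

# Made-up parameters for K = 3 classes (assumptions, for illustration only).
mus = [-1.0, 0.5, 2.0]
sigma = 1.3
priors = [0.5, 0.3, 0.2]

# Both rules should choose the same class at every grid point.
xs = [-5.0 + 0.05 * i for i in range(201)]
same_class = all(
    argmax(posteriors(x, mus, sigma, priors)) == argmax(discriminants(x, mus, sigma, priors))
    for x in xs
)

# log p_k(x) - delta_k(x) collects the dropped terms, so it should not vary with k.
x0 = 0.7
gaps = [
    math.log(p) - d
    for p, d in zip(posteriors(x0, mus, sigma, priors), discriminants(x0, mus, sigma, priors))
]
spread = max(gaps) - min(gaps)

print(same_class, spread)
```

Here `spread` should be zero up to floating-point rounding, which is exactly the statement that the dropped terms do not depend on $k$.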

armara