I'm currently learning about mean squared error and gradient descent, and one of the things that's tripping me up is how the mean squared error is continuous.

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - (mx_i + b)\right)^2$$

I'm trying to imagine a scenario with a cost function that involves only one variable and no constant. I can see that changing the coefficient of this variable little by little would have minuscule effects on the summation, but I'm having a hard time imagining that the function is entirely continuous. I'm not good at math, so I may not be expressing myself clearly.

db2791
  • Continuous functions compose. Moreover, sums and products of continuous functions are again continuous. Can you see how this helps to deduce that MSE is continuous? – Jonas Linssen Jun 21 '20 at 15:41
  • All polynomials $\mathbb{R}^n\to\mathbb{R}$ are continuous functions – MPW Jun 21 '20 at 15:47
  • @PrudiiArca I want to make sure I understand correctly. Does this mean that because $y_i - \widetilde{y_i}$ is continuous, as is the square of that difference, the summation is also continuous? If so, that does start to make a little sense – db2791 Jun 21 '20 at 15:49
  • @db2791 yes indeed. I just made this an explicit answer. – Jonas Linssen Jun 21 '20 at 16:00

2 Answers

We can construct the MSE formula as a composite of continuous functions as follows.

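For fixed predictions $\widetilde{y_i}$, the functions $$\begin{align*} f_i:&\Bbb R \rightarrow \Bbb R, y_i \mapsto y_i - \widetilde{y_i}\\ s:&\Bbb R \rightarrow \Bbb R, z \mapsto z^2\\ \Sigma:&\Bbb R^n \rightarrow \Bbb R, (w_1,\dots,w_n) \mapsto \sum \limits_{i=1}^n w_i \end{align*}$$ are all continuous. Hence the composite $$\Bbb R^n \xrightarrow{(f_i)_i} \Bbb R^n \xrightarrow{(s)_i} \Bbb R^n \xrightarrow{\Sigma} \Bbb R$$ is continuous. Multiplying by the constant $\frac{1}{n}$ preserves continuity, so the MSE is continuous. The same argument works in the parameters of the line: with the data fixed, the map $\Bbb R^2 \rightarrow \Bbb R^n, (m, b) \mapsto (y_i - (mx_i + b))_i$ is affine, hence continuous, and can replace the first arrow.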
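As a numerical sanity check, the composition can be sketched in Python. This is only an illustration, not part of the argument: the toy data `xs`, `ys` are made up, the helper names are hypothetical, and (as an assumption) the slope `m` and intercept `b` from the question are treated as the variables while the data stay fixed.

```python
# Hypothetical toy data, generated by y = 2x + 1; any fixed reals would do.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def residuals(m, b):
    # First arrow: (m, b) -> (y_i - (m*x_i + b))_i, affine in (m, b).
    return [y - (m * x + b) for x, y in zip(xs, ys)]

def square_each(zs):
    # Second arrow: apply z -> z^2 to every component.
    return [z * z for z in zs]

def mean(ws):
    # Third arrow: sum the components, then scale by the constant 1/n.
    return sum(ws) / len(ws)

def mse(m, b):
    # The MSE is the composite of the three maps above.
    return mean(square_each(residuals(m, b)))

# Perturb m by smaller and smaller amounts and record the change in MSE.
base = mse(1.5, 0.5)
changes = [abs(mse(1.5 + h, 0.5) - base) for h in (1e-1, 1e-2, 1e-3)]
print(base, changes)
```

Each smaller perturbation of `m` produces a smaller change in the MSE, which is the behaviour continuity predicts. A finite check like this cannot prove continuity, of course; the composition argument does that.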

Jonas Linssen

If you are using this loss function for regression with a line, the mean squared error is
$$\frac{1}{n}\sum_{i=1}^n(y_{i}-wx_{i}-b)^2$$ Notice that although the $y_{i}$ and $x_{i}$ are discrete data points, they are held fixed; the inputs to this function are $w$ and $b$. So $w, b$ play the same role as $x, y$ in $x^2 + y^2$: they range over a continuum of real values. The MSE is a polynomial in $w, b$, so it is differentiable and in particular continuous. Gradient descent then looks for the values of $w, b$ that minimise this function.
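To make the last step concrete, here is a minimal gradient descent sketch in Python for this loss. It is an illustration under stated assumptions, not a reference implementation: the toy data, the starting point, the step size `lr` and the iteration count are all made up, and the gradient is obtained by differentiating the formula above by hand.

```python
# Hypothetical toy data, generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
n = len(xs)

def mse(w, b):
    # (1/n) * sum of (y_i - w*x_i - b)^2, with the data held fixed.
    return sum((y - w * x - b) ** 2 for x, y in zip(xs, ys)) / n

def grad(w, b):
    # Partial derivatives of the MSE with respect to w and b.
    dw = sum(-2 * x * (y - w * x - b) for x, y in zip(xs, ys)) / n
    db = sum(-2 * (y - w * x - b) for x, y in zip(xs, ys)) / n
    return dw, db

w, b = 0.0, 0.0   # arbitrary starting point
lr = 0.05         # assumed step size, small enough for this data
for _ in range(2000):
    dw, db = grad(w, b)
    w, b = w - lr * dw, b - lr * db

print(w, b, mse(w, b))
```

Because the MSE is differentiable in `w` and `b`, the gradient exists at every point, and on this data the iterates approach the line the data were generated from. The step size matters: a value that is too large for a given data set can make the iterates diverge.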

Vivaan Daga