The second statement is false as I shall demonstrate now. Consider the problem of computing the function $f : \mathbb{R} \times \mathbb{R} \rightarrow \mathbb{R}$ given by $$f(x,y) = x - y.$$
I will assume that we are using floating point numbers that respect the IEEE standard. I will assume that $x-y$ is in the representable range. Then there exists a $\delta$ such that the computed value $\hat{z}$ of the real number $z = x - y$ satisfies $$\hat{z} = (x-y)(1+\delta)$$ and $|\delta| \leq u$ where $u$ is the unit roundoff. We observe that $$\hat{z} = f(\bar{x}, \bar{y}),$$ where
$$ \bar{x} = x(1+\delta), \quad \bar{y} = y(1+\delta).$$
This shows that the computed value of $z$ is the exact value of $f$ evaluated at a point $(\bar{x},\bar{y})$ which satisfies $$|x-\bar{x}| \leq u |x|, \quad |y-\bar{y}| \leq u |y|.$$
This shows that the naive algorithm for computing $z$ is componentwise relative backwards stable. This the highest and most demanding form of stability.
I shall now show that problem of computing $z$ is ill-conditioned in the relative sense when $x \approx y$ and well-conditioned in the relative sense when $x$ and $y$ are not close. Specifically, suppose that we desire to compute $z=x-y$, but we only have good approximations $$\hat{x} = x(1+\alpha), \quad \hat{y} = y(1+\beta)$$ of $x$ and $y$, then we cannot hope to do better than compute $\hat{x} - \hat{y}$. The relative error between the number we want and the best approximation that we can hope to compute satisfies
$$\left|\frac{(x-y) - (\hat{x} - \hat{y})}{(x-y)} \right| \leq \max\{|\alpha|,|\beta|\}\frac{|x| + |y|}{|x-y|}$$
Small values of $|\alpha|$ and $|\beta|$ reflect the fact that $\hat{x}$ and $\hat{y}$ are good approximations of $x$ and $y$, but the right-hand side will be large when $x \approx y$. In this case we have no guarantee that the relative error will be small, and in general it will be large. On the other hand if $|x| \ge 2|y|$ or $|y| \ge 2|x|$, then the triangle inequalities implies that $$ \frac{|x| + |y|}{|x-y|} \leq 3$$ and we have a guarantee that the relative error will be at most 3 times the largest relative error associated with either $x$ or $y$.
In summary, the naive algorithm for computing $f(x,y) = x - y$ is always backwards stable in the componentwise relative sense, but the problem is ill-conditioned for some input and well-conditioned for most input.
In general, there is no connection between the condition of a function $g$ and the stability of the algorithm that is used to evaluate $g$. This is a good thing! It dramatically simplifies the analysis of the software that implements the algorithm. First we analyze the function $f$. If $f$ is always well-conditioned, then a stable algorithm should return a good approximation of $y = f(x)$ when supplied with a good approximation $\hat{x} \approx x$. If there are points $x$ where $f$ is ill-conditioned, then even a stable algorithm cannot hope to return an accurate approximation of $y = f(x)$, because we are unable to provide the machine with the exact value of $x$. In general, we cannot hope to do better than $\hat{x} = \text{fl}(x)$, the floating-point representation of $x$.
I shall now demonstrate that the first statement is false. Let $f : \mathbb{R}^{n} \times \mathbb{R}^n \rightarrow \mathbb{R}^{n \times n}$ denote the outer product, i.e., $$f(x,y) = xy^T.$$
I shall now demonstrate that $f$ is well-conditioned, but that no algorithm for computing $f$ can be backwards stable.
We begin by exploring the sensitivity of $f$ to small perturbations of the input. Let $\epsilon>0$ be a small fixed number and let $\Delta x$ and $\Delta y$ be perturbations of $x$ and $y$ such that
$$ |\Delta x| \leq \epsilon |x|, \quad |\Delta y| \leq \epsilon |y|$$
These inequalities should be understood in the componentwise sense, i.e.,
$$ |(\Delta x)_i | \leq \epsilon |x_i|$$
and similarly for $y$. Then
$$ f(x + \Delta x, y + \Delta y) = xy^T + x(\Delta y)^T + (\Delta x)y^T + (\Delta x)(\Delta y)^T.$$
It follows that
$$ f(x + \Delta x, y + \Delta y) - f(x,y) = x(\Delta y)^T + (\Delta x)y^T + (\Delta x)(\Delta y)^T$$
We conclude that
$$ |f(x + \Delta x, y + \Delta y) - f(x,y)| \leq 2 \epsilon |x||y|^T + \epsilon^2|x||y|^T = (2 \epsilon + \epsilon^2) |xy|^T = (2 \epsilon + \epsilon^2)|f(x,y)|$$
Again, this inequality should be understood in the componentwise sense. In short, for every single component of $f$, then error caused by perturbing the input argument is small relative to the corresponding component of $f$. We conclude that $f$ is well-conditioned with respect to perturbations that are small in the componentwise relative sense.
Now suppose that $\hat{z}$ is the computed value of the matrix $z = f(x,y) = xy^T$. In general, we cannot have $\hat{z} = f(\bar{x},\bar{y})$! Why is that? Well, if $x, y \in \mathbb{R}^n$, then we have only $2n$ degrees of freedom, but there are $n^2$ equations to satisfy. We have to find $\Delta x$ and $\Delta y$ such that
$$\hat{z} = (x+\Delta x)(y + \Delta y)^T$$ and in general, this is not possible. In particular, there is no hope of $\Delta x$ and $\Delta y$ that are small relative to $x$ and $y$. We conclude that no algorithm for computing $f$ can be backwards stable in the componentwise relative sense.
I recommend that you set the note aside. These are the two best references that I know
- The paper "Mixed, componentwise and structured condition numbers" by Israel Gohberg and Israel Koltracht. SIAM Journal of Matrix Analysis and Applications, 1993, Vol. 14, No. 3 : pp. 688-704 https://doi.org/10.1137/0614049. The treatment of condition numbers is very systematic.
- The textbook "Accuracy and stability of numerical algorithms" by Nichlas J. Higham. SIAM 2002. https://doi.org/10.1137/1.9780898718027. This is a good book on many different topics. The treatment of the conditioning of linear systems is excellent.
They are not easy but they are good. Many of the answers that I have provided here are derived from these texts.
EDIT: I am not sure that it is possible to repair the statements in the note. I am sure that it the following is safe to teach:
If the problem is well-conditioned, then a stable algorithm will produce an accurate result. If the problem is ill-conditioned, then we cannot expect that a stable algorithm will produce an accurate result.
I can also phase it like this: If the problem is well-conditioned, then it is not theoretically impossible to write a computer program that returns accurate results. If the problem is ill-conditioned, then we cannot expect that any computer program will return accurate results.