It differs in two key ways.
- The function may not be differentiable.
- For multivariate functions, the derivative is itself a multivariate quantity, but we want to distill it into a single scalar "amount." Consider the following example.
Suppose we're given a system of equations:
$$\begin{align*} ax + by &= 0, \\ cx+dy &= 0.\end{align*}$$
Defining $\mathbf{f}(\mathbf{x}) = \begin{pmatrix} ax+by \\ cx+dy \end{pmatrix}$, we can compute the Jacobian matrix:
$$J(\mathbf{x}) = \begin{pmatrix} a & b \\ c & d\end{pmatrix}.$$
As you might know, the Jacobian matrix is the multivariable analogue of the derivative. This makes sense, because we could alternatively write our system as
$$\begin{pmatrix} a & b \\ c & d\end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
"Differentiating" this system in a symbolic sense makes it clear that the Jacobian matrix is like the derivative.
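As a quick sanity check on the claim above, here is a minimal sketch that approximates the Jacobian of $\mathbf{f}$ by forward differences and compares it with the coefficient matrix. The coefficient values, the helper name `jacobian_fd`, and the step size `h` are illustrative assumptions, not part of the original example.

```python
# Hypothetical coefficients chosen only for illustration.
a, b, c, d = 1.0, 2.0, 3.0, 4.0


def f(x, y):
    """The linear map f(x, y) = (a*x + b*y, c*x + d*y)."""
    return (a * x + b * y, c * x + d * y)


def jacobian_fd(func, x, y, h=1e-6):
    """Approximate the 2x2 Jacobian of func at (x, y) by forward differences."""
    f0 = func(x, y)
    fx = func(x + h, y)  # perturb the first input only
    fy = func(x, y + h)  # perturb the second input only
    # Row i holds the partial derivatives of output component i.
    return [[(fx[i] - f0[i]) / h, (fy[i] - f0[i]) / h] for i in range(2)]


# Because f is linear, the result should be close to [[a, b], [c, d]]
# no matter which point we evaluate at.
J = jacobian_fd(f, 0.5, -1.5)
print(J)
```

Evaluating at a different point should give (up to rounding) the same matrix, which is what it means for a linear system to have a constant Jacobian.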
However, how do we translate $J(\mathbf{x})$ into an "amount"? How does it relate to "how much" the function changes due to a small change in the inputs $x, y$? As a single scalar, it doesn't, because the Jacobian encodes the change in each component of the function due to a change in each component of the independent variable. We'd really like a good way to combine all of this into a single number that represents an overall sense of change.
One way to do this is with a matrix norm. In fact, the matrix condition number is directly related to matrix norms. But we can't do this for general nonlinear functions, so we need a new sense of how to measure such a property.
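To make the matrix-norm idea concrete for the linear case, the sketch below collapses a $2 \times 2$ matrix into a single number using the induced infinity norm (largest absolute row sum), and then forms the condition number $\kappa(A) = \|A\| \, \|A^{-1}\|$. The choice of norm, the sample matrix, and the helper names are assumptions made for illustration; other norms give different but related values.

```python
def inf_norm(m):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in m)


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]


def condition_number(m):
    """kappa(A) = ||A|| * ||A^{-1}|| in the infinity norm."""
    return inf_norm(m) * inf_norm(inverse_2x2(m))


# Hypothetical sample matrix, playing the role of the Jacobian above.
A = [[1.0, 2.0], [3.0, 4.0]]
norm_A = inf_norm(A)
kappa_A = condition_number(A)
print(norm_A, kappa_A)
```

Note that this relies on the matrix being invertible and on the function being linear, which is exactly the limitation that motivates a more general measure.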