Two issues are involved:
- Is the negative binomial the distribution of the waiting time until a specified number of "successes" or until a specified number of "failures"?
- Is it the total number of trials needed to get the specified number of "successes" (or, as the case may be, of "failures"), or is it only the number of trials with the opposite outcome that occur before that number of "successes" or "failures" is reached?
So we have four possibilities (here $p$ is the probability of "success" on each trial):
- It's the number of trials needed to get $r$ successes. Then we have
$$
\mathbb E(X) = \frac rp, \quad \operatorname{var}(X) = \frac{r(1-p)}{p^2}.
$$
- It's the number of failures before the $r$th success. Then we have
$$
\mathbb E(X) = r\frac{1-p}{p},\quad \operatorname{var}(X) = \frac{r(1-p)}{p^2}.
$$
- It's the number of trials needed to get $r$ failures. Then we have
$$
\mathbb E(X) = \frac r{1-p}, \quad \operatorname{var}(X) = \frac{rp}{(1-p)^2}.
$$
- It's the number of successes before the $r$th failure. Then we have
$$
\mathbb E(X) = \frac{rp}{1-p},\quad \operatorname{var}(X) = \frac{rp}{(1-p)^2}.
$$
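The four conventions can be checked numerically. The following is only a minimal sketch, not a library routine: the names `nb_moments` and `four_conventions` are made up for illustration, the cutoff of 5000 terms is an arbitrary truncation that assumes $p$ is not too close to $0$ or $1$, and the pmf of the "count of opposite outcomes" is built from the ratio $P(K=k+1)/P(K=k) = \frac{k+r}{k+1}(1-q)$, where $q$ is the per-trial probability of the outcome being waited for.

```python
def nb_moments(r, q, shift=0, cutoff=5000):
    """Mean and variance of K + shift, where K counts the opposite outcomes
    seen before the r-th target outcome and q is the per-trial probability
    of the target. Uses a truncated sum (cutoff is an assumption)."""
    pmf = q ** r                      # P(K = 0)
    mean = second = 0.0
    for k in range(cutoff):
        x = k + shift
        mean += x * pmf
        second += x * x * pmf
        # P(K = k+1) = P(K = k) * (k + r) / (k + 1) * (1 - q)
        pmf *= (k + r) / (k + 1) * (1 - q)
    return mean, second - mean ** 2


def four_conventions(r, p):
    """(mean, variance) for each convention; p is the success probability."""
    return {
        "trials until r successes": nb_moments(r, p, shift=r),
        "failures before r-th success": nb_moments(r, p, shift=0),
        "trials until r failures": nb_moments(r, 1 - p, shift=r),
        "successes before r-th failure": nb_moments(r, 1 - p, shift=0),
    }


if __name__ == "__main__":
    for name, (mean, var) in four_conventions(3, 0.4).items():
        print(f"{name}: mean={mean:.4f}, var={var:.4f}")
```

With $r = 3$ and $p = 0.4$ this should give means of approximately $7.5$, $4.5$, $5$ and $2$, and variances of approximately $11.25$, $11.25$, $3.3333$ and $3.3333$. Adding the constant $r$ shifts the mean but leaves the variance unchanged, which is why the variances agree in pairs.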
One advantage to viewing it as the number of successes before the $r$th failure or as the number of failures before the $r$th success is that then it is an infinitely divisible distribution, and it naturally extends to the case where $r$ is not an integer. Under those two conventions it is supported on the set $\{0,1,2,3,\dots\}$. Under the other two, which count all trials, it is supported on the set $\{r,r+1,r+2,r+3,\dots\}$.
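To illustrate the extension to non-integer $r$, here is a hedged sketch that replaces the binomial coefficient with gamma functions, $P(K=k) = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,q^r(1-q)^k$, where $q$ is the per-trial probability of the outcome being waited for. The function names are hypothetical. It also checks one instance of infinite divisibility: the sum of two independent copies with parameter $r/2$ should have the same distribution as a single copy with parameter $r$.

```python
from math import exp, lgamma, log


def nb_pmf_real_r(k, r, q):
    """P(K = k) for real r > 0, computed on the log scale for stability."""
    log_coeff = lgamma(k + r) - lgamma(k + 1) - lgamma(r)
    return exp(log_coeff + r * log(q) + k * log(1 - q))


def convolve_halves(k, r, q):
    """P(K1 + K2 = k) for independent K1, K2, each with parameter r / 2."""
    return sum(
        nb_pmf_real_r(j, r / 2, q) * nb_pmf_real_r(k - j, r / 2, q)
        for j in range(k + 1)
    )


if __name__ == "__main__":
    # Probabilities for a non-integer r should still sum to about 1.
    print(sum(nb_pmf_real_r(k, 2.5, 0.3) for k in range(2000)))
    # Each row should show two nearly equal numbers.
    for k in range(5):
        print(k, nb_pmf_real_r(k, 3, 0.4), convolve_halves(k, 3, 0.4))
```

The parameter $r/2 = 1.5$ used in the convolution is itself a non-integer, so this only makes sense under the conventions supported on $\{0,1,2,\dots\}$.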