Here's a method that uses the machinery of Markov chains and that incidentally gives us a little more information.
Let $M$ be the number of cakes that we start with and, with probability $p$ reset to. Then, our system has $M + 1$ states, namely, $0$ cakes left, $1$ cake left, ... , $M$ cakes left. The $0$ state is absorbing---once we get there we're stuck there forever. For any other state, say, $m$ cakes left, we have a probability $p$ of resetting to $M$ cakes and a probability $1 - p$ of eating a cake without resetting, that is, moving the state with $m - 1$ cakes left. Ordering our states in increasing number of cakes, our probability transition matrix is
$$P = \pmatrix{1 & 0 & \cdots & 0 & 0 & 0 \\ 1 - p & 0 & \cdots & 0 & 0 & p \\ 0 & 1- p & \cdots & 0 & 0 & p \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1-p & 0 & p\\ 0 & 0 & \cdots & 0 & 1 - p & p} .$$
Since all of our absorbing states are listed first, this transition matrix is in the canonical form
$$\pmatrix{I_r & 0 \\ \ast & Q},$$
where in our case $r = 1$ and $Q$ is the lower-right $M \times M$ minor of $P$.
Then, the so-called fundamental matrix is the matrix $$N := I_M + Q + Q^2 + \cdots = (I_M - Q)^{-1},$$
and it has a useful concrete interpretation: Its $(i, j)$ entry $n_{ij}$ is the expected number of steps that a process starting in the $i$th state will spend in the $j$th state before being absorbed. In particular, the row sum $\sum_{j = 1}^M n_{ij}$ is the expected number of steps that a process starting in the $i$th state will take to be absorbed---in our case, finish eating all the cakes.
We're only interested in the fate of the process starting with $M$ cakes, that is, the entries of the $M$th row. Extracting the $(M, m)$ entry with Cramer's Rule gives that the expected number of steps in the state with $m$ cakes left is simply
$$n_{M, m} = \frac{1}{(1 - p)^m} ,$$
and so the expected total number of cakes eaten is given by the geometric sum
$$\sum_{m = 1}^M n_{M, m} = \color{#df0000}{\boxed{\sum_{m = 1}^M \frac{1}{(1 - p)^m} = \frac{(1 - p)^{-m} - 1}{p}}} .$$
In our case, $n_0 = 3, p = 0.17$, and substituting gives
$$E(3) = 4.40531\!\ldots ,$$
which agrees with the result of your Monte Carlo simulation to within $0.01$.