2

Let's say we have a task whose duration without any restarts is T. And also assume the task faces exponentially distributed restarts/failures with rate of $\lambda$. After each restart it has to wait for a duration of $R$ and then start again from the beginning.

Now I want to compute the expected duration of the task considering failures. Here I found an approach which estimates the expected duration of completing the task as follows:

$E(T) = (1/\lambda)(1 - (1+T\lambda)e^{-T\lambda})(e^{T\lambda}-1)+T$

But this approach does not involve the delay $R$ after each restart. I thought maybe the most naive solution would be to simply multiply the expected number of restarts during $E(T)$ by $R$ which delivers the expected total delay duration until completion as follows:

$total$ $duration = E(T) + E(T)\lambda R$

But I believe this is not a good estimation because the probability of occurring failures during $R$ is ignored in this estimation.

Has anyone any ideas how to compute the expected completion duration?

2 Answers2

1

Using notation from this answer (see Iverson bracket), we can write the wait $W$ before the task is completed as

$$ W = \sum_{i=1}^{+\infty}\,D_i\cdot[D_1\leqslant T, D_2\leqslant T+R, \ldots,D_i\leqslant T+R], $$

where the $D_i$ are i.i.d. with exponential distribution of parameter $\lambda$ and represent the gap in time between failure $i-1$ and failure $i$ $($with failure $0$ being the very beginning, of course$)$. By independence, we have that for $i\geq 2$:

$$\mathbb E(D_i\,|\,D_1\leqslant T, D_2\leqslant T+R, \ldots,D_i\leqslant T+R) = ab^{i-2}c$$

where

\begin{align} &a = \mathbb P(D\leqslant T) = 1 - e^{-\lambda T}\\ &b = \mathbb P(D\leqslant T+R) = 1 - e^{-\lambda (T+R)}\\ &c = \mathbb E(D\,|\,D\leqslant T+R) = \int_0^{T+R}\,x\lambda e^{-\lambda x}\,dx = \frac{1-e^{-\lambda (T+R)}\big(1+\lambda (T+R) \big)}{\lambda} \end{align}

For $i=1$, we have

$$d:= E(D_1\,|\,D_1\leqslant T) = \int_0^{T}\,x\lambda e^{-\lambda x}\,dx = \frac{1-e^{-\lambda T}\big(1+\lambda T \big)}{\lambda}. $$

Hence

\begin{align} \mathbb E(W) &= d + ac\,\sum_{j\geq 0}\,b^j\\ &= d + \frac{ac}{1-b} \end{align}

If my calculations are correct, this yields

\begin{align} \mathbb E(W) &= \frac{e^{\lambda(T+R)}+\lambda Re^{-\lambda T}-\lambda(T+R)-e^{\lambda R}}\lambda \\&= \frac{e^{\lambda R}\left(e^{\lambda T}-1\right)}{\lambda} - T - R\left(1-e^{-\lambda T}\right), \end{align}

and notice that if $R=0$ we recover the original answer.

Now, besides the wait $W$, we also need an additional time $X$ for actually completing the task. If no restarts occurred, then this additional time is simply $T$, otherwise, we need $T+R$. In other words:

$$X = T + R\cdot[D_1\leqslant T]$$

with expectation

$$\mathbb E(X) = T + R\cdot \mathbb P(D_1\leqslant T) = T+Ra = T+R\left(1-e^{-\lambda T}\right).$$

Therefore, the total time needed for completing the task simplifies to

$$\mathbb E(W+X) = \frac{e^{\lambda R}\left(e^{\lambda T}-1\right)}{\lambda}$$

Does this match your simulations? \begin{equation} %------------------ % %>**The answer below is incorrect; see the comments for clarifications.** % %Let $N$ denote the number of restarts and $X$ the time between restarts. %Let $\lambda$ be the (constant) rate at which restarts occur. %Then $N$ has Poisson distribution and $X$ has exponential distribution, both %with parameter $\lambda$, and the expected time taken is % %$$\sum_{n\geq 0}\, %\underbrace{\mathbb P(N=n)}_{\text{exactly $n$ restarts occur}} %\cdot %\underbrace{\Big(T + n\left(R + \mathbb E(X)\right)\Big).}_{\text{average time %taken to complete in exactly $n$ restarts}}$$ % %Now, $\mathbb E(X) = \frac1\lambda$ and $\mathbb P(N=n) = e^{-%\lambda}\,\frac{\lambda^n}{n!}$. %Hence the expected time taken is % %$$\sum_{n\geq 0}\,e^{-\lambda}\,\frac{\lambda^n}{n!} %\cdot %\left(T + n\left(R+\frac1\lambda\right)\right).$$ % %We have % %$$Te^{-\lambda}\,\sum_{n\geq 0}\,\,\frac{\lambda^n}{n!} %=Te^{-\lambda}\,e^\lambda = T,$$ % %indicating of course that the task takes *at least* $T$ time to complete. %The time corresponding to the restarts is % %\begin{align} %e^{-\lambda}\,\sum_{n\geq 0}\,\frac{\lambda^n}{n!} %\cdot %n\left(R+\frac1\lambda\right) %&= %e^{-\lambda}\,\sum_{n\geq 1}\,\frac{\lambda^n}{n!} %\cdot %n\left(R+\frac1\lambda\right) %\\&= %e^{-\lambda}\,\sum_{n\geq 1}\,\frac{\lambda^n}{n!} %\cdot %n\,\left(\frac{R\lambda+1}{\lambda}\right) %\\&= %e^{-\lambda}(R\lambda+1)\, %\sum_{n\geq 1}\,\frac{\lambda^{n-1}}{(n-1)!} %\\&= %e^{-\lambda}(R\lambda+1)\, %\underbrace{\sum_{n\geq 0}\,\frac{\lambda^{n}}{n!}}_{e^{\lambda}} % = R\lambda + 1 %\end{align} % %which should gives us an expected time of % %$$T+R\lambda +1.$$ \end{equation}

Fimpellizzeri
  • 23,126
  • Oh, I thought $\lambda$ was the rate parameter, but I guess it is the time rate. That's some rather misleading notation. I will try and fix the answer. – Fimpellizzeri May 15 '18 at 19:45
  • I am now confused. What do you mean by 'the task faces exponentially distributed restarts/failures with rate of $\lambda$'? – Fimpellizzeri May 15 '18 at 19:56
  • What happens if a failure occurs during a restart? Do we wait out the initial delay period before waiting another delay period, or do we interrupt the current delay period to wait another delay period after which, assuming no other failures occur, we promptly begin the task? – Fimpellizzeri May 15 '18 at 20:12
  • Then my calculation is indeed incorrect. I must think on how to approach this. – Fimpellizzeri May 15 '18 at 20:28
  • I have edited with what hopefully is a closed form answer based on the answer you found. Please check if it performs well. – Fimpellizzeri May 17 '18 at 04:33
  • Oh, you are absolutely correct! I think I've fixed it foor good. Please check it out. – Fimpellizzeri May 17 '18 at 13:47
  • Yes, it matches perfectly to the simulations now! – Gongotar May 17 '18 at 13:54
  • Awesome! Thank you for the nice question and follow up. =) – Fimpellizzeri May 17 '18 at 15:36
0

I found another stack post where the expected waiting time to have a minimum inter-arrival time of $T$ between failures is defined as follows:

$W(T) = \frac{eλT−1−λT}{λ}$

This means $W(T)+T$ should deliver the expected duration of completing the task (without considering delay $R$). This opposes the calculations done here (also mentioned in the question). However my simulations show that $W(T)+T$ delivers the correct answer.

Now to change it in order to consider delay time $R$ I have done calculations as follows:

$E(x) = W(x) + x$ $\texttt{ }\texttt{ }\texttt{ }\texttt{as described above}$

$P(t > x) = e^{-\lambda x}$ $\texttt{ }\texttt{ }\texttt{ }\texttt{ the probability of having an inter-arrival}$

$\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ time greater than x (The first failure occurs}$

$\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ }\texttt{ after completion)}$

$E(T, R) = P(t > T)T + (1 - P(t > T))E(T+R)$

Which delivers the probability of completing the task without having to face any restarts (and thus having to wait extra delay $R$) multiplied by $T$ (task duration without restart delay $R$) plus the probability of facing at-least one restart and having to wait for a minimum inter-arrival time of $T+R$.

The simulations show that this is a relative good estimation.