For a M/M/1/K System, why is utilization = $\rho=\lambda T_s$

Question

Before I ask my question these are the symbols I used:

$w$ = Customers waiting in queue
$\mu$ = Service rate
$T_s = 1/\mu$ = Average service time
$\lambda$ = Average arrival time
$\rho$ = Server utilization

How come $\rho=\lambda T_s$ for a M/M/1/K System (which is the same for as M/M/1)

This is my intuitive reasoning, please tell me where I am going wrong.

Picture a M/M/1/3 queue system.

Assume at one point the queue is full. i.e w=3. At this point, a task (n) is born. This will cause this new task to be rejected.
Now assume a few moments later, the server has processed all the 3 tasks (w=0) and a new task hasn't arrived yet. So the server will be idle until a new task (n+1) arrives

Now in the scenario above, if it was a M/M/1 queue, the task (n) wouldn't have been rejected and so the server wouldn't be idle as shown in the second bullet point above. In other words, a M/M/1 system would serve task (n) and (n+1) where as a M/M/1/3 system would serve only task (n+1)

So clearly, in a M/M/1/K system, the server has more chances to have idle periods, thus lowering its utilization when compared to a M/M/1 system.

So again, $\rho=\lambda T_s$ for a M/M/1/K system doesn't make sense to me. What am I missing?

score 5 · Answer 1 · answered Jun 15 '17 at 13:26

In an $M/M/1/K$ process, where $P_n$ is the proportion of time $n$ people are queuing or being served and $0 \le n \le K$, you can reach balance with

$$P_n=\left(\dfrac{\lambda}{\mu}\right)^n P_0 $$

and, if you write $\rho = \frac{\lambda}{\mu}$ as the ratio of the arrival and service rates, then this gives $P_n=\rho^n P_0$. Since $\sum_{n=0}^K P_n=1$, this results in $$P_n=\rho^n \dfrac{1-\rho}{1-\rho^{K+1}}$$

which with $K \to +\infty$ and $0 \le \rho \lt 1$ would give the $M/M/1$ result of $P_n \to \rho^n (1-\rho)$

With finite K, you no longer have $1-P_0=\rho= \frac{\lambda}{\mu}= {\lambda}T_s$ as the expected proportion of time the server is working, which seems to be the point of your question. Instead you have $$1-P_0=\rho\dfrac{1-\rho^{K}}{1-\rho^{K+1}}$$ (with $1-P_0=\frac{K}{K+1}$ when $\rho=1$), which is less than $\rho$, meaning that the server expects to work less when the queue has limited capacity, much as you might intuitively expect

score 2 · Answer 2 · answered Jan 20 '16 at 08:29

For the $M/M/1/K$ model, you should not interpret $\rho = \lambda/\mu$ as the fraction of time the server is working (utilized). It turns out that $\rho$ is just an easy parameter to work with when you want to derive the equilibrium distribution of the associated Markov process.

Let $p_i$ be the equilibrium probability of having $i$ customers in the system. Then by PASTA, the arrival rate of customers that enter the system is $\lambda(1 - p_K)$ (customers that arrive and see $K$ customers already in the system are rejected). Thus, the utilization (fraction of time the server is working) is

\begin{equation} \text{utilization} = \frac{\lambda(1 - p_K)}{\mu}. \end{equation}

You can also derive this quantity by noting that a fraction $p_0$ of time the server is not working, and in all other states the server is actually working. So, we also have

\begin{equation} \text{utilization} = 1 - p_0. \end{equation}

score 0 · Answer 3 · answered May 02 '23 at 14:14

To better fix ideas, it is interesting to note that the utilization of the output link (or the utilization of the server, it's the same thing) can be computed in both ways and leading to the same results, i.e.:

Utilization is the percentage a resource is not idle: $U = 1 - P_0 = \rho\dfrac{1-\rho^{K}}{1-\rho^{K+1}}$

Utilization is the rateo of net arrival rate and service rate: $U = \dfrac{\lambda(1-P_K)}{\mu} = \rho\dfrac{1-\rho^{K}}{1-\rho^{K+1}}$

For a M/M/1/K System, why is utilization = $\rho=\lambda T_s$

3 Answers3