The extra assumption is not necessary and can be derived from the four assumptions you gave.
We will use the following notation in our proof:
$T$: the total length of the interval over which we compute the Poisson distribution.
$P(K=k; \Delta t)$: the probability of exactly $k$ events happening over the interval of length $\Delta t$.
$E(K; \Delta t)$: the expected number of events happening over the interval $\Delta t$.
Proof
Let's split the interval $T$ into $n$ smaller intervals of length $\Delta t_n = T/n$, and denote by $p_n = P(K > 0; \Delta t_n)$ the probability of at least one event happening within one such smaller interval.
The expected number of events happening over the interval $\Delta t_n$ is at least $p_n$, as can be seen:
$$E(K; \Delta t_n) = \sum_{k=1}^{\infty} k \cdot P(K=k; \Delta t_n) \geq \sum_{k=1}^{\infty} P(K=k; \Delta t_n) = p_n $$
The expected value of $K$ over the full interval $T$ is the sum of the expected values over the $n$ smaller intervals, so $E(K; T) = n \cdot E(K; \Delta t_n)$. Combined with the above inequality, this gives the following upper bound for $p_n$:
$$p_n \leq E(K; T) / n$$
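As a quick sanity check of this bound, here is a minimal sketch that evaluates both sides for a homogeneous Poisson process. The rate `mu`, the helper name `check_bound`, and the sample values are assumptions made for illustration only; the proof itself does not rely on them.

```python
import math

def check_bound(mu, T, n):
    """Return (p_n, E(K; T) / n) for an assumed Poisson process with rate mu."""
    x = mu * T / n          # E(K; T) / n, because E(K; T) = mu * T in this example
    p_n = -math.expm1(-x)   # 1 - exp(-x), i.e. P(K > 0; T / n)
    return p_n, x

for n in (1, 10, 100, 1000):
    p_n, bound = check_bound(mu=2.0, T=3.0, n=n)
    print(n, p_n, bound, p_n <= bound)
```

In this example the gap between the two sides shrinks as `n` grows, which is consistent with multiple events in one small interval becoming rare.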
Now, the probability of no event happening over the whole period $T$ is the probability of no event happening in any of the $n$ small $\Delta t_n$ intervals. As these intervals are non-overlapping, we can assume independence:
$$P(K = 0; T) = (1-p_n)^n \in (0, 1]$$
Note that $P(K = 0; T) = 0$ would mean $p_n = 1$ for every $n$, which would lead to $E(K; T) \geq n$ for all $n \in \mathbb{N}$, contradicting our assumption of a finite expected value.
Given that $P(K = 0; T) > 0$, we can find a $\mu \geq 0$ such that $P(K=0; T) = \exp(-\mu T)$. Solving $(1-p_n)^n = e^{-\mu T}$ for $p_n$ then gives:
$$p_n = 1 - e^{-\mu \Delta t_n}$$
And:
$$\lim_{n\to\infty} \frac{p_n}{\Delta t_n} = \mu$$
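The limit above can be checked numerically without assuming a Poisson distribution: start from any value of $P(K=0; T)$ in $(0, 1]$, define $\mu$ from it, and recover $p_n$ from $(1-p_n)^n = P(K=0; T)$. The sketch below does this; the function names and the sample value `p0 = 0.25` are hypothetical and chosen only for illustration.

```python
import math

def mu_from_p0(p0, T):
    """Solve P(K = 0; T) = exp(-mu * T) for mu."""
    return -math.log(p0) / T

def p_n(p0, n):
    """Solve (1 - p_n)**n = p0 for p_n, i.e. p_n = 1 - p0**(1/n)."""
    return -math.expm1(math.log(p0) / n)

def ratio(p0, T, n):
    """p_n / Delta t_n with Delta t_n = T / n."""
    return p_n(p0, n) / (T / n)

p0, T = 0.25, 2.0
print("mu =", mu_from_p0(p0, T))
for n in (1, 10, 1000, 10**6):
    print(n, ratio(p0, T, n))
```

The printed ratios should approach the printed `mu` from below as `n` grows.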
Obviously $P(K > 0; \delta t)$ is non-decreasing with respect to $\delta t$: the event simply has more opportunity to happen during a longer interval. This means that for any $\delta t \leq T$, choosing $n$ such that $\Delta t_{n+1} \leq \delta t \leq \Delta t_n$, we can bound $P(K > 0; \delta t) / \delta t$ with:
$$\frac{p_n}{\Delta t_{n+1}} \geq \frac{p_n}{\delta t} \geq \frac{P(K > 0; \delta t)}{\delta t} \geq \frac{p_{n+1}}{\delta t} \geq \frac{p_{n+1}}{\Delta t_n}$$
But $\Delta t_n / \Delta t_{n+1} \to 1$ as $n \to \infty$, so both bounds tend to the same limit and thus:
$$\lim_{\delta t \to 0} \frac{P(K > 0; \delta t)}{\delta t} = \mu$$
In other words:
$$P(K > 0; \delta t) = \mu \delta t + o(\delta t)$$
A corollary of the above is that the probability of no event happening over an interval of any length $\tau$ is precisely $P(K=0; \tau) = \exp(-\mu \tau)$, whether $\tau$ is big or small.
Now, we would like to show that:
$$P(K > 1; \delta t) = o(\delta t)$$
Suppose this were not true, so that there is a constant $\nu > 0$ such that:
$$\forall \epsilon > 0: \exists \delta t < \epsilon: P(K > 1; \delta t) > \nu \delta t$$
Let's split some interval $\tau$ into $n+1$ intervals: $n$ intervals of length $\delta t$ and one remainder of length $r < \delta t$, so that $\tau = n \delta t + r$. The probability of two or more events happening in $\tau$ is made up of:
- Probability $P_\textrm{same}$ of two or more events happening within one of the $n$ intervals and no event happening elsewhere:
$$P_\textrm{same} \geq n (\nu \delta t + o(\delta t)) e^{-\mu \tau} + o(\delta t) = \nu \tau + o(\tau)$$
- Probability $P_\textrm{distinct}$ of one or more events happening in at least two distinct intervals:
$$P_\textrm{distinct} \leq \frac{n (n+1)}{2} (\mu \delta t + o(\delta t))^2 = \tfrac{1}{2}(\mu \tau)^2 + o(\tau)$$
Given that:
$$P(K > 1, \tau) = P_\textrm{same} + P_\textrm{distinct}$$
We can always choose $\tau$ so small that the probability of the "same interval" case, conditional on two or more events happening in $\tau$, is higher than any $p \in (0, 1)$ we desire (assuming $\nu > 0$).
But $P_\textrm{same}$ requires two events to happen within a single interval of length $\delta t$, and we can pick $\delta t$ as small as we want while keeping $\tau$ fixed! If we denote by $X$ the distance between the two events, then there is a non-zero probability that $X = 0$. That contradicts the assumption that no two events can happen at the same time.
Therefore, $\nu = 0$, and
$$P(K > 1; \delta t) = o(\delta t)$$
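To see why the "same interval" case would dominate for small $\tau$ if $\nu$ were positive, the sketch below compares only the leading-order terms $\nu \tau$ and $\tfrac{1}{2}(\mu \tau)^2$ from the two bounds. It ignores the $o(\tau)$ corrections, and the function name and the sample values of `nu` and `mu` are hypothetical.

```python
def same_share(nu, mu, tau):
    """Leading-order share of P_same in P_same + P_distinct (o(tau) terms dropped)."""
    p_same = nu * tau                   # lower bound on P_same, leading term
    p_distinct = 0.5 * (mu * tau) ** 2  # upper bound on P_distinct, leading term
    return p_same / (p_same + p_distinct)

for tau in (1.0, 1e-1, 1e-2, 1e-4):
    print(tau, same_share(nu=0.1, mu=2.0, tau=tau))
```

Because the first term is linear in $\tau$ and the second is quadratic, the share tends to 1 as $\tau \to 0$ for any fixed $\nu > 0$.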
We already showed that:
$$P(K > 0; \delta t) = \mu \delta t + o(\delta t)$$
Combined with the result we just proved, we get that:
$$P(K = 1, \delta t) = \mu \delta t + o(\delta t)$$
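Both small-interval statements can be illustrated with the closed-form probabilities of a homogeneous Poisson process, the process this result characterizes. This is a consistency check under that assumption rather than part of the proof; the rate `mu` and the helper name `small_interval_rates` are chosen for illustration.

```python
import math

def small_interval_rates(mu, dt):
    """Return (P(K = 1; dt) / dt, P(K > 1; dt) / dt) for an assumed Poisson process."""
    x = mu * dt
    p_one = x * math.exp(-x)          # P(K = 1; dt)
    p_many = -math.expm1(-x) - p_one  # P(K > 0; dt) - P(K = 1; dt)
    return p_one / dt, p_many / dt

for dt in (1e-1, 1e-2, 1e-3):
    print(dt, small_interval_rates(mu=2.0, dt=dt))
```

As `dt` shrinks, the first ratio should approach `mu` and the second should approach zero.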
Thus we derived the "additional assumptions" from the question.