It is well known, that given two skew lines \begin{align*} g: x &= a + \lambda b \quad \\ h: y &= c + \mu d \end{align*} the distance is given by $$D = \frac{|(a - c) \cdot b \times d|}{|b \times d|}$$ See e.g. here
I thought, it would be a nice exercise to derive the distance using an optimization method, but I get stuck. Short version: Why is \begin{equation*} \begin{aligned} \left\| \left(a - c\right) - \frac{|b|^2\cdot |d|^2}{|b \times d|^2} \left[\vphantom{\frac{b}{b}}\right.\right. & \left.\left. \left(\left(a - c\right)\cdot \left(\frac{b}{|b|} - \frac{b \cdot d}{|b|\cdot|d|} \cdot \frac{d}{|d|}\right)\right) \cdot \frac{b}{|b|} \right.\right. + \\ & \left.\left. \left(\left(a - c\right)\cdot \left(\frac{d}{|d|} - \frac{b \cdot d}{|b|\cdot|d|} \cdot \frac{b}{|b|}\right)\right) \cdot \frac{d}{|d|} \right] \right\| = \frac{|(a - c) \cdot b \times d|}{|b \times d|} \end{aligned} \end{equation*}
It seems clear to me, that if you subtract the orthogonal projections of $a - c$ in direction of $b$ and $d$, the result must be proportional to $b \times d$. On the other hand, there must be a clear line of arguments, why $\frac{|b|^2\cdot |d|^2}{|b \times d|^2}$ exactly does the job.
Long version. For notation and reference see e.g. here: The difference vector between two of those skew lines is $$D = |y - x| = |c - a + \mu d - \lambda b|$$ and the local extrema $(\lambda, \mu)$ is found to be at $$\frac{1}{\mbox{det } A} \left(\begin{matrix} d^2 & d\cdot b \\ d \cdot b & b^2 \end{matrix}\right) \cdot \left(\begin{matrix} (c-a) \cdot b \\ -(c-a) \cdot d \end{matrix}\right)$$
where $$\mbox{det A} = b^2 d^2 - (d \cdot b)^2 = |b \times d|^2$$
If you insert above solution into $D$, you get
\begin{equation*} \begin{aligned} D = |c - a + \mu d - \lambda b| \end{aligned} \end{equation*} \begin{equation*} \begin{aligned} D = \left\| \left(a - c\right) - \left[\vphantom{\frac{b}{b}}\right.\right. & \frac{|d|^2}{|b \times d|^2} \left.\left. \left(\left(a - c\right)\cdot \left(b - \frac{b \cdot d}{|b|\cdot|d|^2} \cdot d\right)\right) \cdot b \right.\right.+\\ & \frac{|b|^2}{|b \times d|^2} \left.\left. \left(\left(a - c\right)\cdot \left(d - \frac{b \cdot d}{|b|^2\cdot|d|} \cdot b\right)\right) \cdot d \right] \right\| \end{aligned} \end{equation*} \begin{equation*} \begin{aligned} D = \left\| \left(a - c\right) - \frac{|b|^2\cdot |d|^2}{|b \times d|^2} \left[\vphantom{\frac{b}{b}}\right.\right. & \left.\left. \left(\left(a - c\right)\cdot \left(\frac{b}{|b|} - \frac{b \cdot d}{|b|\cdot|d|} \cdot \frac{d}{|d|}\right)\right) \cdot \frac{b}{b} \right.\right.+\\ & \left.\left. \left(\left(a - c\right)\cdot \left(\frac{d}{|d|} - \frac{b \cdot d}{|b|\cdot|d|} \cdot \frac{b}{|b|}\right)\right) \cdot \frac{d}{d} \right] \right\| \end{aligned} \end{equation*}
Then I don't see how to proceed...
As pointed out by @ancientmathematician, we could set $u=a-c$, and assume $|b|=|d|=1$ without loss of generality, then \begin{equation*} \begin{aligned} D = \left\| u - \frac{1}{|b \times d|^2} \left[\vphantom{\frac{b}{b}}\right.\right. & \left.\left. \left(u\cdot \left(b - (b \cdot d) \cdot d\right)\right) \cdot b \right.\right.+\\ & \left.\left. \left(u\cdot \left(d - (b \cdot d) \cdot b\right)\right) \cdot d \vphantom{\frac{b}{b}} \right] \right\| \end{aligned} \end{equation*}
and $|b \times d|^{-2}= \sin(\alpha)^{-2}$. But the occurence of the angle between $b$ and $d$ at that place is quite puzzling to me. Do I miss something obvious?
Initially I thought \begin{equation*} \begin{aligned} D = \left\| u - \left[\right.\right. \left.\left. \left(u\cdot \left(b - (b \cdot d) \cdot d\right)\right) \cdot b \right.\right.+ \left.\left. \left(u\cdot \left(d - (b \cdot d) \cdot b\right)\right) \cdot d \right] \right\| \end{aligned} \end{equation*} which is wrong...