The Distribution of the Linear Interpolation of Normally Sampled Points

Question

This is mostly me re-asking this question, because I believe I have an alternative approach to a similar idea, whereas the original seems to add some kind of discretization instead.

Let $(X_i)_{i=0,\cdots,n}$ be i.i.d. standard normal variables. We define the linear interpolation $L_n((X_i)_{i=0,\cdots,n},Y)$ to be:

$$ L_n((X_i)_{i=0,\cdots,n},Y) = \begin{cases} X_0 + (X_1 - X_0)Y & 0 \leq Y \leq 1\\ X_1 + (X_2 - X_1)(Y - 1) & 1 < Y \leq 2 \\ \vdots \\ X_{n-1} + (X_n - X_{n-1})(Y - (n - 1)) & n-1 < Y \leq n \end{cases} $$

For a given sample of $(X_i)_{i=0,\cdots,n}$, this gives the intuitive piecewise-linear plot through the points $(i, X_i)$.
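
The definition above can be sketched in Python as follows. This is only an illustration: the function name `interpolate` is hypothetical, and it assumes the points are indexed from 0, as in $(X_i)_{i=0,\cdots,n}$, so that segment $k$ runs from $X_k$ to $X_{k+1}$.

```python
import math


def interpolate(xs, y):
    """Evaluate the piecewise-linear interpolant through (i, xs[i]) at y."""
    n = len(xs) - 1  # number of segments
    if n < 1:
        raise ValueError("need at least two points")
    if not 0 <= y <= n:
        raise ValueError("y must lie in [0, n]")
    # Segment index: floor(y), except that y == n belongs to the last segment.
    # At an interior integer y this picks the segment to the right rather than
    # the one to the left, which gives the same value because the interpolant
    # is continuous at the knots.
    k = min(math.floor(y), n - 1)
    return xs[k] + (xs[k + 1] - xs[k]) * (y - k)


if __name__ == "__main__":
    xs = [1.0, 3.0, 2.0]
    print(interpolate(xs, 0.5))  # midpoint of the first segment: 2.0
```

Drawing `xs` from a standard normal and `y` uniformly on $[0, n]$ then gives one sample of $L_n$.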

Reasking the original question with this new approach:

Assuming that $Y\sim \mathcal{U}(0,n)$, what is the distribution of $L_n((X_i)_{i=0,\cdots,n},Y)$?
In the limit as $n\to\infty$, what is the limit distribution $L_\infty((X_i)_{i=0,\cdots,n},Y)$?

If you write $L_n(y)$ like that, it means that $x_1, ...,x_n$ are not variables of $L$ and are known (in other words, not random). I suggest to write $L(Y, (X_i) _{i=1,...,n})$ to indicate $L$ is a function of $Y$ and $X_i$ for $i=1,..,n$. $Y$ and $X$ are written in capital letter to indicate that they are random variables. — NN2, Oct 03 '23 at 13:03

Thomas Pluck · Accepted Answer · 2023-10-03T16:01:15.873

Focusing on a single segment, we have a combination of normal and uniform random variables given by:

$$X_i+(X_{i+1}-X_i)Y$$

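So we're looking at $AB + C$, where $A\sim\mathcal{N}(0,2)$ (variance $2$) is the difference $X_{i+1}-X_i$, $B\sim \mathcal{U}(0,1)$ is the offset of $Y$ within the segment, and $C\sim\mathcal{N}(0,1)$ is $X_i$. This isn't a "named" distribution. Strictly speaking, $A$ and $C$ both involve $X_i$ and so are correlated; what follows treats them as independent. We'll start with $AB$, the product of a uniform and a normal variable, whose PDF is: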

$$ \text{PDF}_{AB}(z) = \frac{\Gamma(0, \frac{z^2}{4})}{4 \sqrt{\pi}} $$
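
As a check on where this formula comes from, here is a short derivation sketch. It assumes $A$ and $B$ are independent and reads $\mathcal{N}(0,2)$ as variance $2$. Conditioning on $B=b$, the product $AB$ is $\mathcal{N}(0, 2b^2)$, so for $z \neq 0$

$$ \text{PDF}_{AB}(z) = \int_0^1 \frac{1}{2b\sqrt{\pi}}\, e^{-z^2/(4b^2)}\, db = \frac{1}{4\sqrt{\pi}} \int_{z^2/4}^{\infty} \frac{e^{-t}}{t}\, dt = \frac{\Gamma(0, \frac{z^2}{4})}{4\sqrt{\pi}} $$

using the substitution $t = z^2/(4b^2)$, for which $db/b = -dt/(2t)$.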

Here $\Gamma$ is the upper incomplete gamma function. The density looks like this:

This is not a normal distribution, and computing $AB+C$ requires a convolution that I'm not sure how to do:

$$ \text{PDF}_{AB+C}(z) = \frac{\Gamma(0, \frac{z^2}{4})}{4 \sqrt{\pi}} * \frac{1}{\sqrt{2\pi}}e^{-z^2/2} $$
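
One way to sidestep the convolution, offered here as a sketch rather than as the route taken above, is to condition on the uniform variable instead. Write the segment as $Z = (1-B)X_i + BX_{i+1}$; the symbols $Z$ and $\sigma^2$ are introduced here for illustration, and $B$ is assumed independent of the $X_i$. Then $Z \mid B=b \sim \mathcal{N}(0, \sigma^2(b))$ with $\sigma^2(b) = (1-b)^2 + b^2$, so

$$ \text{PDF}_{Z}(z) = \int_0^1 \frac{1}{\sqrt{2\pi\,\sigma^2(b)}}\, e^{-z^2/(2\sigma^2(b))}\, db $$

This form accounts for $X_i$ appearing in both $A$ and $C$. The moments follow directly: $E[Z^2] = \int_0^1 \sigma^2(b)\,db = \frac{2}{3}$ and $E[Z^4] = 3\int_0^1 \sigma^4(b)\,db = \frac{7}{5}$, which gives an excess kurtosis of $\frac{7/5}{(2/3)^2} - 3 = 0.15$ under this representation. If $A$ and $C$ are instead drawn independently, the same calculation with $\sigma^2(b) = 2b^2 + 1$ gives about $0.38$. No closed form for the integral is attempted here.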

However, using Monte Carlo simulations, we get the following distribution which isn't normal:

The sample fails the Shapiro-Wilk, Anderson-Darling and Kolmogorov-Smirnov normality tests and has an excess kurtosis of $0.369$, so it is very slightly leptokurtic, or fat-tailed.
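
A minimal Monte Carlo sketch of the kurtosis estimate, using only the Python standard library. It is not the code behind the figures quoted above, and all function names are hypothetical. It offers two samplers as assumptions about how the segment might be drawn: one where $X_i$ is shared between the two terms, and one where $A$, $B$ and $C$ are drawn independently. The two give different excess kurtosis estimates, so neither is guaranteed to match the value quoted above.

```python
import math
import random


def sample_shared(rng):
    """One draw of X_i + (X_{i+1} - X_i) * Y, with X_i shared by both terms."""
    x_left = rng.gauss(0.0, 1.0)
    x_right = rng.gauss(0.0, 1.0)
    y = rng.random()  # offset within the segment, uniform on [0, 1)
    return x_left + (x_right - x_left) * y


def sample_independent(rng):
    """One draw of A * B + C with A, B and C drawn independently."""
    a = rng.gauss(0.0, math.sqrt(2.0))  # N(0, 2), read as variance 2
    b = rng.random()
    c = rng.gauss(0.0, 1.0)
    return a * b + c


def excess_kurtosis(values):
    """Sample excess kurtosis m4 / m2**2 - 3 (plain moment estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def simulate(sampler, num_samples=200_000, seed=0):
    """Return the excess kurtosis of num_samples draws from sampler."""
    rng = random.Random(seed)
    return excess_kurtosis([sampler(rng) for _ in range(num_samples)])


if __name__ == "__main__":
    print("shared X_i:       ", simulate(sample_shared))
    print("independent terms:", simulate(sample_independent))
```

Both samplers give a positive excess kurtosis, which is consistent with the slightly fat-tailed behaviour described above.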

Extending this to an arbitrary number of intervals: $Y$ always falls in exactly one of the unit intervals defined above, and every segment has the same distribution, so we always sample from this same distribution, even in the limit $n\to\infty$.
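
This step can be written out with the law of total probability. The sketch assumes $Y$ is independent of the $X_i$ and writes $F$ for the common CDF of a single segment $X_k + (X_{k+1}-X_k)U$ with $U\sim\mathcal{U}(0,1)$; the symbols $F$ and $U$ are introduced here for illustration.

$$ P(L_n \le z) = \sum_{k=0}^{n-1} P(k < Y \le k+1)\, P(L_n \le z \mid k < Y \le k+1) = \sum_{k=0}^{n-1} \frac{1}{n}\, F(z) = F(z) $$

Given $k < Y \le k+1$, the offset $Y-k$ is uniform on $(0,1)$, which is why each conditional term equals $F(z)$. The right-hand side does not depend on $n$, so neither does the distribution of $L_n$.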
