
Let $(Y_1, \dots, Y_n)$ be a random sample from $N(\mu, \mu)$ with $\mu > 0$. I want to use the Fisher–Neyman factorization theorem, of the form $L(\mu; \mathbf{y}) = g(T(\mathbf{y}), \mu) \times h(\mathbf{y})$, to factor $\exp{\left\{ \dfrac{1}{2 \mu} \left( \sum_{i = 1}^n y_{2i}^2 - \sum_{i = 1}^n y_{1i}^2 \right) + \left( \sum_{i = 1}^n y_{1i} - \sum_{i = 1}^n y_{2i} \right) \right\}}$ and show that the statistic $\left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$ is sufficient for $\mu$ but not minimal sufficient.

So we immediately know that $T(\mathbf{Y}) = \left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$. Since $\dfrac{1}{2 \mu} \left( \sum_{i = 1}^n y_{2i}^2 - \sum_{i = 1}^n y_{1i}^2 \right)$ involves the parameter $\mu$, and $\left( \sum_{i = 1}^n y_{1i} - \sum_{i = 1}^n y_{2i} \right)$ carries the data for $\sum_{i = 1}^n Y_i$, I would say that we require $g(T(\mathbf{y}), \mu) = \dfrac{1}{2 \mu} \left( \sum_{i = 1}^n y_{2i}^2 - \sum_{i = 1}^n y_{1i}^2 \right) + \left( \sum_{i = 1}^n y_{1i} - \sum_{i = 1}^n y_{2i} \right)$, and so $h(\mathbf{y}) = 1$. This way, we get the Fisher–Neyman factorization

$$L(\mu; \mathbf{y}) = \left( \dfrac{1}{2 \mu} \left( \sum_{i = 1}^n y_{2i}^2 - \sum_{i = 1}^n y_{1i}^2 \right) + \left( \sum_{i = 1}^n y_{1i} - \sum_{i = 1}^n y_{2i} \right) \right) \times 1 = \left( \dfrac{1}{2 \mu} \left( \sum_{i = 1}^n y_{2i}^2 - \sum_{i = 1}^n y_{1i}^2 \right) + \left( \sum_{i = 1}^n y_{1i} - \sum_{i = 1}^n y_{2i} \right) \right)$$

And so this shows that $\left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$ is a sufficient statistic.

Have I done this correctly? If not, then what is the correct way to do this?

The Pointer
  • For a $N(\mu,\mu)$ model, a minimal sufficient statistic is $\sum Y_i^2$. Then $(\sum Y_i^2,T)$ is trivially sufficient for any statistic $T$, but of course no longer minimal sufficient. – StubbornAtom Apr 18 '21 at 11:22
  • @StubbornAtom I already found that $\sum Y_i^2$ is minimal sufficient, but how do I justify that $\left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$ is actually a sufficient statistic? That's what I was trying to do here. My reasoning was to show that $\sum Y_i^2$ is a minimal sufficient statistic (and $\sum Y_i$ is not), and then show that $T(\mathbf{Y}) = \left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$ is sufficient. So, therefore, we will have shown that $T(\mathbf{Y}) = \left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$ is sufficient but not minimal sufficient. – The Pointer Apr 18 '21 at 11:25
  • On a related note, my work for showing minimal sufficiency was posted here https://stats.stackexchange.com/q/520269/163242 – The Pointer Apr 18 '21 at 11:26
  • @StubbornAtom And when you say "then $(\sum Y_i^2,T)$ is trivially sufficient for any statistic $T$," what is the reasoning that justifies this? – The Pointer Apr 18 '21 at 11:28

1 Answer


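In a $N(\mu,\mu)$ model with $\mu > 0$, the statistic $\sum_{i = 1}^n Y_i^2$ is minimal sufficient. This means that $\sum_{i = 1}^n Y_i^2$ induces the coarsest partition of the sample space among all sufficient statistics, giving the maximum possible data reduction without losing information about the parameter $\mu$. Since $\sum_{i=1}^n Y_i^2$ is already sufficient, the pair $\left(\sum_{i = 1}^n Y_i^2, T \right)$ remains sufficient for any statistic $T$: the likelihood factors through $\sum_{i=1}^n Y_i^2$ alone, so it also factors through the pair, and the component $T$ makes no further contribution to data reduction. The pair is no longer minimal in general, because it induces a finer partition than $\sum_{i=1}^n Y_i^2$ does. For example, $\left(\sum_{i = 1}^n Y_i, \sum_{i = 1}^n Y_i^2 \right)$ is not a function of $\sum_{i=1}^n Y_i^2$. This is the intuitive justification.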
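As a sketch of the factorization behind this argument, assuming $N(\mu,\mu)$ means mean $\mu$ and variance $\mu$, the likelihood can be written out directly. The symbols $g$ and $h$ follow the notation of the question; $\tilde{g}$ is a name introduced here only for illustration.

$$L(\mu; \mathbf{y}) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\mu}} \exp\left\{ -\frac{(y_i - \mu)^2}{2\mu} \right\} = \underbrace{(2\pi\mu)^{-n/2} \exp\left\{ -\frac{1}{2\mu} \sum_{i=1}^n y_i^2 - \frac{n\mu}{2} \right\}}_{g\left(\sum y_i^2,\ \mu\right)} \times \underbrace{\exp\left\{ \sum_{i=1}^n y_i \right\}}_{h(\mathbf{y})}$$

Here $h(\mathbf{y})$ does not involve $\mu$, so $\sum_{i=1}^n y_i^2$ is sufficient. Defining $\tilde{g}\big((t_1, t_2), \mu\big) = g(t_2, \mu)$, which simply ignores its first component, gives the same factorization with $T(\mathbf{y}) = \left(\sum_{i=1}^n y_i, \sum_{i=1}^n y_i^2\right)$, so the pair is sufficient as well.

For two sample points $\mathbf{y}_1$ and $\mathbf{y}_2$, the same expansion gives the likelihood ratio

$$\frac{L(\mu; \mathbf{y}_1)}{L(\mu; \mathbf{y}_2)} = \exp\left\{ \frac{1}{2\mu} \left( \sum_{i=1}^n y_{2i}^2 - \sum_{i=1}^n y_{1i}^2 \right) + \left( \sum_{i=1}^n y_{1i} - \sum_{i=1}^n y_{2i} \right) \right\}$$

which is free of $\mu$ exactly when $\sum_{i=1}^n y_{1i}^2 = \sum_{i=1}^n y_{2i}^2$. The exponential expression quoted in the question appears to be this ratio rather than the likelihood itself, which would explain why it is the tool for checking minimality and not the object to factor when showing sufficiency.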

StubbornAtom
  • Hmm, interesting. I guess that makes sense – it sounds analogous to 'minimum spanning sets' in linear algebra. – The Pointer Apr 18 '21 at 17:08